How to organize tests and where to get and store input data

tmadlener commented 3 years ago

As mentioned in #5, we would like to have the different converters under test (ideally), or at least the converter part. The former would also allow us to check that we actually do the setup in the different readers correctly. This issue should serve as a sort of discussion thread, because there are quite a few moving parts involved.

All of the above-mentioned options require input files that we can use in the tests. For some of the readers I think this is not really a problem, because the inputs basically just comprise a few text files (e.g. Pythia), but for others we would need some binary input files (e.g. STDHEP or ROOT). Since the scale of these tests will probably be rather limited, it might be feasible to just add a few binary input files to the repository.

Another question is how we want to organize these tests. As a starting point I would suggest writing a small test utility that takes an edm4hep input file and a delphes input file and then simply compares them event by event to see if all the things expected from delphes are actually present in the edm4hep output. We could then run all the converters and the corresponding delphes readers and use the outputs from those as inputs to this test utility.
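To make the event-by-event comparison a bit more concrete, here is a minimal sketch in Python. It is not tied to the actual delphes or edm4hep file formats: it assumes both outputs have already been read into plain Python structures (a list of events, each a dict mapping a collection name to a list of objects with numeric fields). The function name `compare_events`, the collection names, and the field names are all hypothetical and only serve as illustration.

```python
import math


def compare_events(delphes_events, edm4hep_events, rel_tol=1e-6, abs_tol=1e-9):
    """Compare two event lists and return a list of mismatch descriptions.

    Assumption: each event is a dict {collection name: [ {field: number}, ... ]}.
    Everything present on the delphes side has to be present on the edm4hep
    side; extra content on the edm4hep side is not reported.
    """
    mismatches = []
    if len(delphes_events) != len(edm4hep_events):
        mismatches.append(
            f"event count differs: {len(delphes_events)} vs {len(edm4hep_events)}"
        )

    # zip stops at the shorter list, so only events present in both are compared
    for i, (d_evt, e_evt) in enumerate(zip(delphes_events, edm4hep_events)):
        for coll, d_objs in d_evt.items():
            e_objs = e_evt.get(coll)
            if e_objs is None:
                mismatches.append(f"event {i}: collection '{coll}' is missing")
                continue
            if len(d_objs) != len(e_objs):
                mismatches.append(
                    f"event {i}: '{coll}' has {len(d_objs)} vs {len(e_objs)} objects"
                )
                continue
            for j, (d_obj, e_obj) in enumerate(zip(d_objs, e_objs)):
                for field, d_val in d_obj.items():
                    e_val = e_obj.get(field)
                    if e_val is None or not math.isclose(
                        d_val, e_val, rel_tol=rel_tol, abs_tol=abs_tol
                    ):
                        mismatches.append(
                            f"event {i}: '{coll}'[{j}].{field}: {d_val} vs {e_val}"
                        )
    return mismatches
```

A test would then simply assert that the returned list is empty. Returning all mismatches instead of stopping at the first one is a deliberate choice here, since it makes a failing CI log more useful.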

In the interest of covering the most possible cases with the smallest possible set of different inputs, another question is which inputs we should use and where to get them.

If there are simpler / easier solutions to have at least parts of the whole thing under test, I am also very much open to ideas in that direction.

vvolkl commented 3 years ago

Those are some very good points. I think we should mostly focus on testing the converter, since the readers are actually Delphes code (and could be replaced at some point, either with changes in Delphes that allow refactoring or by using the framework integration, which would prepare the input instead).

For the input files, we can add small config files and even small binaries to the repo, no problem, but better to produce data files on the fly. Delphes has a download for stdhep: http://cp3.irmp.ucl.ac.be/downloads/z_ee.hep.gz
For the testing organization, in my opinion it's really simple: the more tests the better, that's really just limited by how much time we can put into it :) Ideally there are granular tests for specific behaviors, but even having overlapping expensive tests is good - if needed we can always have different test suites to run at different times to keep the CI snappy :)
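For getting input files on the fly, a small helper along the following lines could fetch and unpack a gzipped file once and reuse it afterwards. This is only a sketch: the function name `fetch_gzipped_input` and the destination path are hypothetical, and how such a step would be hooked into the build or CI is left open. The `opener` argument only exists so that the helper can be exercised without network access.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path


def fetch_gzipped_input(url, dest, opener=urllib.request.urlopen):
    """Download a gzipped file from url and store it decompressed at dest.

    If dest already exists nothing is downloaded, so repeated test runs
    reuse the same file.
    """
    dest = Path(dest)
    if dest.exists():
        return dest

    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first, so an interrupted download does not
    # leave a truncated file that later runs would mistake for a valid input
    tmp = dest.with_name(dest.name + ".part")
    with opener(url) as response:
        with gzip.GzipFile(fileobj=response, mode="rb") as unzipped:
            with open(tmp, "wb") as out:
                shutil.copyfileobj(unzipped, out)
    tmp.replace(dest)
    return dest


# Possible usage with the STDHEP file linked above (destination is hypothetical):
# fetch_gzipped_input("http://cp3.irmp.ucl.ac.be/downloads/z_ee.hep.gz",
#                     "test_inputs/z_ee.hep")
```

Whether downloading at test time is preferable to committing a small binary depends on how reliable the download location is. A failed download would show up as a test failure unrelated to the converter itself.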

vvolkl commented 3 years ago

Also, it's usually good practice to add a test for every bug found that would have caught it. An example: https://github.com/key4hep/k4SimDelphes/blob/main/converter/src/DelphesEDM4HepConverter.cc#L356 :)

tmadlener commented 3 years ago

> An example: https://github.com/key4hep/k4SimDelphes/blob/main/converter/src/DelphesEDM4HepConverter.cc#L356

Oh my, that one was quite obvious in hindsight. Already fixed in #5, and since a few minutes ago also with an accompanying test. ;)

key4hep / k4SimDelphes

How to organize tests and where to get and store input data #12