JeffersonLab / JANA2

Multi-threaded HENP Event Reconstruction
https://jeffersonlab.github.io/JANA2/

Handling multiple consecutive event sources #146

Open nathanwbrei opened 2 years ago

nathanwbrei commented 2 years ago

In JANA1, the user could specify a list of event sources. Each event source would be opened, fully processed, and closed, and then JANA1 would move on to the next event source.

In JANA2, event sources currently run in parallel, which is generally not the desired behavior. Running them one after another instead should be a straightforward change to JEventSourceArrow and JTopologyBuilder.

nathanwbrei commented 1 year ago

@faustus123 What is the desired interaction between nskip/nevents and multiple event sources? Our options are:

  1. Give people the option to set nskip/nevents on a per-JEventSource basis, e.g. by specifying something like jana myinput.root:10:100 to have JANA skip the first 10 events and process the next 100.

  2. Apply the global jana:nskip and jana:nevents to each source (unless the user already overrode the values for that particular source). So jana -Pjana:nskip=10 -Pjana:nevents=100 file1.root file2.root will skip the first 10 events from file1, process the next 100 from file1, close file1 and open file2, skip the first 10 from file2, and process the next 100 from file2. This is what we currently do.

  3. Apply the global jana:nskip and jana:nevents to the stream of all JEvents emitted from all sources, e.g. jana -Pjana:nskip=10 -Pjana:nevents=100 file1.root file2.root file3.root (where each file contains 50 events) will drop the first 10 events from file1.root, process the next 40 events from file1.root, process all 50 events from file2.root, and process the first 10 events from file3.root before exiting.
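To make the difference between options 2 and 3 concrete, here is a minimal sketch that computes which entries each file would contribute under either policy. Everything in it (`SourceInfo`, `SourceRange`, `PlanPerSource`, `PlanGlobal`) is a hypothetical name made up for illustration and not part of JANA2, and treating nevents == 0 as "no limit" is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one input file; not a JANA2 type.
struct SourceInfo {
    std::string name;
    uint64_t event_count;
};

// Entries [first, first + count) that would be processed from one source.
struct SourceRange {
    std::string name;
    uint64_t first;
    uint64_t count;
};

// Option 2: nskip and nevents are applied to every source independently.
// Assumption of this sketch: nevents == 0 means "no limit".
std::vector<SourceRange> PlanPerSource(const std::vector<SourceInfo>& sources,
                                       uint64_t nskip, uint64_t nevents) {
    std::vector<SourceRange> plan;
    for (const auto& src : sources) {
        uint64_t skipped = std::min(nskip, src.event_count);
        uint64_t available = src.event_count - skipped;
        uint64_t take = (nevents == 0) ? available : std::min(available, nevents);
        if (take > 0) plan.push_back({src.name, skipped, take});
    }
    return plan;
}

// Option 3: nskip and nevents are applied to the concatenated stream of all sources.
std::vector<SourceRange> PlanGlobal(const std::vector<SourceInfo>& sources,
                                    uint64_t nskip, uint64_t nevents) {
    std::vector<SourceRange> plan;
    const bool unlimited = (nevents == 0);
    uint64_t to_skip = nskip;      // events still to drop from the overall stream
    uint64_t remaining = nevents;  // events still to process (ignored if unlimited)
    for (const auto& src : sources) {
        if (!unlimited && remaining == 0) break;
        uint64_t skipped = std::min(to_skip, src.event_count);
        to_skip -= skipped;
        uint64_t available = src.event_count - skipped;
        uint64_t take = unlimited ? available : std::min(available, remaining);
        if (!unlimited) remaining -= take;
        if (take > 0) plan.push_back({src.name, skipped, take});
    }
    return plan;
}
```

With three 50-event files, nskip=10 and nevents=100, `PlanGlobal` yields entries 10 to 49 of file1, all of file2, and the first 10 of file3, matching the example in option 3, while `PlanPerSource` yields entries 10 to 49 of every file.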

faustus123 commented 1 year ago

Good question. I would guess 3. would cover the most common use case. However, this will be very annoying for those who actually want use case 1 or 2. I'm not sure how hard 1 is, but it would mostly cover case 2 as well. Would it be hard to implement 1 and 3? You could throw an exception if inconsistent values are specified.

If it seems very complicated, I would probably prefer 3. and then defer the more complex scenarios until a user requests them.

nathanwbrei commented 1 year ago

I'll give it a shot, but I'll put it in a separate pull request.

nathanwbrei commented 1 month ago

Event sources have run one after another since #176. What has been missing so far is sensible handling of nevents/nskip. nevents/nskip is awkward because it shouldn't live at the JEventSource level at all, but rather in the JEventSourceArrow. Furthermore, nevents/nskip shouldn't read and immediately discard data; it should jump directly to the nskip'th event. Right now, JEventSource doesn't support random file access. It is easy to add, though: we just need bool Seek(uint64_t entry_nr) and uint64_t GetEntryCount() callbacks, with the default implementation returning false to indicate that the seek failed.
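A rough sketch of how such callbacks could look, assuming a simplified stand-in class rather than the real JEventSource interface. Only the `Seek` and `GetEntryCount` signatures come from the comment above; `SourceSketch`, `ReadNext`, `Position`, `SkipTo`, and the default entry count of 0 are hypothetical choices made for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for an event source; NOT the real JEventSource API.
class SourceSketch {
public:
    virtual ~SourceSketch() = default;
    // Read the next entry; returns false once the input is exhausted.
    virtual bool ReadNext() = 0;
    // Default: random access is unsupported, so the seek fails.
    virtual bool Seek(uint64_t /*entry_nr*/) { return false; }
    // Default: entry count unknown, reported as 0 (an assumption of this sketch).
    virtual uint64_t GetEntryCount() { return 0; }
};

// A source that can only be read front to back.
class SequentialSource : public SourceSketch {
public:
    explicit SequentialSource(uint64_t size) : m_size(size) {}
    bool ReadNext() override {
        if (m_pos >= m_size) return false;
        ++m_pos;
        return true;
    }
    uint64_t Position() const { return m_pos; }
protected:
    uint64_t m_size;
    uint64_t m_pos = 0;
};

// A source that additionally supports jumping to an arbitrary entry.
class RandomAccessSource : public SequentialSource {
public:
    using SequentialSource::SequentialSource;
    bool Seek(uint64_t entry_nr) override {
        if (entry_nr > m_size) return false;  // out of range: report failure
        m_pos = entry_nr;                     // next ReadNext() yields this entry
        return true;
    }
    uint64_t GetEntryCount() override { return m_size; }
};

// Skip the first nskip entries: jump if the source supports it, otherwise
// fall back to reading and discarding. Returns how many entries were discarded.
uint64_t SkipTo(SourceSketch& src, uint64_t nskip) {
    if (nskip == 0 || src.Seek(nskip)) return 0;
    uint64_t discarded = 0;
    while (discarded < nskip && src.ReadNext()) ++discarded;
    return discarded;
}
```

Because the default `Seek` returns false, sources that never override it keep working through the read-and-discard fallback, and only sources that opt in get the fast path.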