JeffersonLab / halld_recon

Reconstruction for the GlueX Detector

std histo Independent/IsEvent reports incorrect trigger count #489

Open rjones30 opened 3 years ago

rjones30 commented 3 years ago

Comparing the event statistics from a raw data reconstruction run on the OSG with a prior pass through the same data, Alexander noted a mismatch in the total number of physics triggers reported in the standard monitoring histogram Independent/IsEvent by the two passes through the same raw data input file. This could indicate a problem with the raw data processing on the OSG, so I made a detailed study to investigate. I created a simple plugin that writes a ROOT tree saving the following information on every event that passes through the JEventProcessor::evnt method.

  1. event_no = eventnumber
  2. trig_mask = DL1Trigger::trig_mask
  3. fp_mask = DL1Trigger::fp_trig_mask
  4. is_event = DTrigger->Get_IsPhysicsEvent()

This Get_IsPhysicsEvent() value is what is histogrammed in Independent/IsEvent in the standard monitoring histograms. I ran hd_root with just this one plugin and nothing else over the hd_rawdata_071728_000.evio input file. Each time I ran, I got a slightly different answer for the total number of physics triggers recorded with is_event > 0 in my tree, differing from one run to the next by something on the order of 10. However, if I cut on eventnumber > 0, then I always get the same number with is_event > 0. This suggests that DTrigger->Get_IsPhysicsEvent() is returning true in some cases where it shouldn't, namely when the input record is a non-trigger type, indicated by eventnumber = 0.
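The comparison described above can be sketched in a few lines of standalone C++. This is only an illustration of the two counts, not the ecounter plugin itself: the `TriggerRecord` struct, its field types, and the two helper functions are hypothetical names, and no JANA or ROOT calls are used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-event record mirroring the four tree branches listed above.
// The field types are assumptions made for this sketch.
struct TriggerRecord {
    std::uint64_t event_no;   // eventnumber; 0 marks a non-trigger record
    std::uint32_t trig_mask;  // copy of DL1Trigger::trig_mask
    std::uint32_t fp_mask;    // copy of DL1Trigger::fp_trig_mask
    int is_event;             // Get_IsPhysicsEvent() stored as 0 or 1
};

// Count records flagged as physics triggers, with no cut on the event number.
std::size_t count_flagged(const std::vector<TriggerRecord>& recs) {
    std::size_t n = 0;
    for (const TriggerRecord& r : recs) {
        if (r.is_event > 0) ++n;
    }
    return n;
}

// Same count, restricted to records with event_no > 0.
std::size_t count_flagged_with_cut(const std::vector<TriggerRecord>& recs) {
    std::size_t n = 0;
    for (const TriggerRecord& r : recs) {
        if (r.event_no > 0 && r.is_event > 0) ++n;
    }
    return n;
}
```

If the two counts differ for the same input, the difference is exactly the number of eventnumber = 0 records that were flagged as physics triggers.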

rjones30 commented 3 years ago

Here are some results of a concrete test. I ran single-threaded over hd_rawdata_071728_000.evio under 3 distinct scenarios.

  1. the entire input file is fed in one giant stream into hd_root, with ecounter plugin enabled, saving triggers tree to hd_root.root;
  2. the input file is split into 68 segments consisting of 5 evio blocks each (an evio block contains ~4.5K triggers), and passed to hd_root as a sequence of 68 input files listed on a single command line;
  3. the input file is split into 338 segments consisting of 1 evio block each, and passed to hd_root as a sequence of 338 input files listed on a single command line;

The triggers tree from the ecounter plugin was saved for each of the 3 cases, and then compared afterward.

  1. 1393546 events - fluctuates by ~2 events from one run to the next
    • 37 repeats of eventnumber = 0, indicating non-trigger records
    • remaining 1393509 events have valid eventnumber > 0, and seem ok
  2. 1393699 events - fluctuates by ~20 events from one run to the next
    • 190 repeats of eventnumber = 0
    • remaining 1393509 events have valid eventnumber > 0, same sequence as case #1 above
  3. 1394276 events - fluctuates by ~200 events from one run to the next
    • 767 repeats of eventnumber = 0
    • remaining 1393509 events have valid eventnumber > 0, same sequence as case #1 above

Based on these observations, I would say that all of the non-reproducibility we are seeing in the IsEvent histogram is coming from these variable numbers of eventnumber=0 records that are being incorrectly tagged as physics triggers by DTrigger->Get_IsPhysicsEvent(). This is confusing, and probably should be fixed.
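The arithmetic behind this conclusion can be checked directly. The sketch below uses only the totals and eventnumber = 0 counts quoted for the three cases; the struct and constant names are made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Numbers copied from the three test cases above.
struct CaseCounts {
    std::int64_t total;         // entries in the triggers tree
    std::int64_t zero_records;  // entries with eventnumber = 0
};

constexpr CaseCounts kOneStream{1393546, 37};
constexpr CaseCounts kFiveBlockSegments{1393699, 190};
constexpr CaseCounts kOneBlockSegments{1394276, 767};

// Entries that remain after the cut eventnumber > 0.
constexpr std::int64_t valid_events(const CaseCounts& c) {
    return c.total - c.zero_records;
}

// All three slicing modes leave the same number of valid events.
static_assert(valid_events(kOneStream) == valid_events(kFiveBlockSegments),
              "cases 1 and 2 disagree");
static_assert(valid_events(kOneStream) == valid_events(kOneBlockSegments),
              "cases 1 and 3 disagree");
```

In each case the total minus the eventnumber = 0 records gives 1393509, so the whole spread between the three totals is carried by the eventnumber = 0 records.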

The non-reproducibility is especially surprising given that the JANA "events processed" and "events read" statistics printed to stdout at the end of each run are consistent from one run to the next, and also between the 3 ways listed above for feeding events into hd_root. All of these tests were carried out with just a single worker thread. Weird.

-Richard Jones

rjones30 commented 3 years ago

I looked at the code in libraries/TRIGGER/DTrigger_factory.cc where the DTrigger object is populated from the evio input event, and found a path through the evnt method that never initializes the dL1TriggerBits and dL1FrontPanelTriggerBits members of the DTrigger JANA object. By simply re-ordering the lines, it was possible to ensure that these are always set to zero if the event is not of a physics trigger type.
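The shape of the fix can be sketched as follows. This is not the code in DTrigger_factory.cc: `TriggerSketch`, `RawTriggerSketch`, and `make_trigger` are hypothetical stand-ins, and the rule that a nonzero L1 bit pattern marks a physics event is an assumption made only so the example is self-contained. The point is the ordering: both bit fields are zeroed before any early return, so a non-trigger record can never carry leftover or indeterminate bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the DTrigger object; the real class has more members.
struct TriggerSketch {
    std::uint32_t l1_bits;  // plays the role of dL1TriggerBits
    std::uint32_t fp_bits;  // plays the role of dL1FrontPanelTriggerBits

    // Assumption for this sketch: any nonzero L1 bit marks a physics event.
    bool is_physics() const { return l1_bits != 0; }
};

// Hypothetical stand-in for the raw trigger data; absent for non-trigger records.
struct RawTriggerSketch {
    std::uint32_t trig_mask;
    std::uint32_t fp_trig_mask;
};

// Build the trigger object for one input record.
// Passing nullptr models a record that carries no trigger information.
TriggerSketch make_trigger(const RawTriggerSketch* raw) {
    TriggerSketch t;
    // Zero both fields first, before any path can return.
    t.l1_bits = 0;
    t.fp_bits = 0;
    if (raw == nullptr) {
        return t;  // non-trigger record: bits stay zero
    }
    t.l1_bits = raw->trig_mask;
    t.fp_bits = raw->fp_trig_mask;
    return t;
}
```

With the assignments placed after the early return instead, the non-trigger path would hand back an object whose bit fields were never written, which is the kind of path described above.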

To test this fix, I have added a similar fix to my ecounter plugin so that I can reliably count physics triggers in the interim, between now and the time when this fix makes its way into the standard production libraries. With this fix installed, I ran again on hd_rawdata_071728_000.evio in the 3 input data slicing modes described above. Now the counts with is_event > 0 are all the same, 1393508, the actual number of valid event triggers that were contained in this input file.

I have submitted a PR for this fix under the patch branch fix_broken_IsEvent_trigger_flag_rtj. As soon as this is accepted, this issue can be closed. -Richard