elaird / supy

analyze events stored in TTrees in parallel

Optional slice-level cross-check for nExpected #199

Closed betchart closed 9 years ago

betchart commented 9 years ago

Add a new option (in defaults.py, off by default) that additionally calculates and caches the number of events per file when file lists are created. The cached counts are used later as a cross-check that the output of each slice contains the expected number of events. This cross-check requires no hand-entry of expected event counts, and if some events go missing it causes only the affected slice to fail, rather than the entire job. It can be used independently of the nCheck argument in the sample specification.
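As a rough illustration of the cross-check described above, the sketch below compares the number of events a slice processed against the sum of the cached per-file counts. The function names (expected_events, check_slice) and the error type are hypothetical and not taken from supy; the sketch also assumes that a count of None means the feature was off when the list was cached.

```python
# Hypothetical sketch; none of these names come from supy itself.

def expected_events(file_list):
    """Sum the cached per-file counts; return None if any count is missing."""
    counts = [n for _, n in file_list]
    if any(n is None for n in counts):
        # Assumption: None means no count was cached, so there is nothing to check.
        return None
    return sum(counts)


def check_slice(file_list, n_processed):
    """Raise if a slice processed a different number of events than was cached."""
    n_expected = expected_events(file_list)
    if n_expected is None:
        return  # cross-check not available for this file list
    if n_processed != n_expected:
        raise RuntimeError(
            "slice processed %d events, expected %d" % (n_processed, n_expected)
        )
```

Because the check runs per slice, a mismatch raises only in the slice whose files lost events, which matches the intent that the rest of the job is unaffected.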

This feature is inspired by issue #197, and its implementation has helped to confirm, via the FNAL condor stderr messages, that missing events occur after a failure to open a file. It is not clear why such a file-open failure does not cause the supy slice to fail immediately, or whether the issue is confined to FNAL.

NB: Cached file lists must be recreated upon adoption of this feature, regardless of whether the feature is enabled. The new format is of type [(str,int)] or [(str,None)] rather than [str].
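The format change can be pictured with the minimal sketch below. The helpers build_file_list and is_legacy, and the count_events callback, are hypothetical names introduced for illustration; how supy actually counts events or stores its cache is not shown here. The sketch assumes that [(str,None)] corresponds to the feature being disabled.

```python
# Hypothetical sketch; these helpers are not part of supy.

def build_file_list(file_names, count_events=None):
    """Return [(name, count)] if a counter is supplied, else [(name, None)]."""
    return [
        (name, count_events(name) if count_events is not None else None)
        for name in file_names
    ]


def is_legacy(cached):
    """True if a cached list uses the old [str] format and must be recreated."""
    return any(isinstance(entry, str) for entry in cached)
```

For example, build_file_list(["a.root"]) yields [("a.root", None)], while a cached list such as ["a.root"] is flagged by is_legacy as needing to be recreated.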