cms-nanoAOD / nanoAOD-tools

Tools for working with NanoAOD (requiring only python + root, not CMSSW)
42 stars 326 forks source link

Fix crash when no entries are pre-selected in a crab job #305

Closed meisonlikesicecream closed 1 year ago

meisonlikesicecream commented 1 year ago

TL;DR: Quick fix to avoid crashes when no events in a job gets pre-selected when running code via e.g. crab.

Long version: in a previous PR I solved an issue with NanoAODTools crashing if no events got pre-selected in one of the input files (https://github.com/cms-nanoAOD/nanoAOD-tools/pull/303). However, in that solution we stopped processing the event straight after the pre-selection, thus not adding anything to the job report or creating any output files. This was fine for running interactively, but caused issues if you submitted a job to e.g. crab if no events in the job got pre-selected. Then it tried to run the haddnano.py script merging the non-existent output files and crashing, and the crab jobs were also looking for the outputs and crashed when they were not found. In this PR I have moved the pre-selection check to the event loop instead, so it still creates output files (albeit empty) even if no events got pre-selected to avoid any issues with crab.

meisonlikesicecream commented 1 year ago

@clelange you seem to have reviewed my previous PRs (thank you!) and not sure if git gave you any notifications when I marked this PR as ready... so I thought I'd ping you :-)

Also, if you have any opinions on the creation of empty output root files let me know. Implementation is easy (as you might notice from this one line PR), and the user might be less confused about any "missing" output files. Although maybe wasteful to create empty files....

clelange commented 1 year ago

Hi @meisonlikesicecream - thanks for this fix, sounds reasonable to me. We'll soon have a better responsibility model set up for this repository. I personally think that it's better to have empty files (since one can then test for their existence as proof that the job ran). What would be an alternative besides log files?

meisonlikesicecream commented 1 year ago

Hi @meisonlikesicecream - thanks for this fix, sounds reasonable to me. We'll soon have a better responsibility model set up for this repository. I personally think that it's better to have empty files (since one can then test for their existence as proof that the job ran). What would be an alternative besides log files?

Not sure what an alternative would be... I just wasn't sure if empty files was considered a good solution. :-) And I agree that it's good proof for if the job was run.