cms-tau-pog / TauFW

Analysis framework for tau analysis at CMS using NanoAOD

Event-based splitting of jobs. #7

Closed IzaakWN closed 1 year ago

IzaakWN commented 3 years ago

Right now, jobs are split by number of files. However, the number of entries varies wildly between nanoAOD files. If the submission routine in pico.py allowed event-based splitting of jobs, jobs and output files could be made more uniform in length and size, and batch submission parameters such as the maximum run time would be easier to fine-tune. With event-based splitting, smaller files can be combined into one job, or a single large file can be split over several jobs.
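As a rough illustration of the idea, here is a minimal sketch of a greedy splitter, assuming the number of events per file is already known. The function name `chunkify_by_events` and the tuple layout `(filename, first_event, nevents)` are hypothetical and are not taken from pico.py.

```python
def chunkify_by_events(nevents_per_file, maxevts):
    """Split files into chunks holding at most `maxevts` events each.

    `nevents_per_file` is a list of (filename, nevents) pairs.
    Returns a list of chunks; each chunk is a list of
    (filename, first_event, nevents) tuples (a hypothetical layout).
    Files with zero events are skipped.
    """
    if maxevts <= 0:
        raise ValueError("maxevts must be positive")
    chunks = []
    current = []    # pieces collected for the chunk being filled
    room = maxevts  # number of events that still fit in the current chunk
    for fname, nevts in nevents_per_file:
        first = 0
        while first < nevts:
            take = min(room, nevts - first)  # fill the chunk or finish the file
            current.append((fname, first, take))
            first += take
            room -= take
            if room == 0:  # chunk is full: close it and start a new one
                chunks.append(current)
                current, room = [], maxevts
    if current:  # leftover, partially filled chunk
        chunks.append(current)
    return chunks


# Hypothetical event counts: one large file and two small ones
files = [("nano_1.root", 2500), ("nano_2.root", 300), ("nano_3.root", 400)]
chunks = chunkify_by_events(files, 1000)
for i, chunk in enumerate(chunks):
    print(i, chunk)
```

In this example the large file is spread over three jobs, and the tail of it shares a job with the two small files. A real implementation might prefer to keep small files intact rather than cut them at the chunk boundary, as this greedy version does with nano_3.root.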

I think it would not be too hard to implement.

The post-processor already allows one to define a start event index and a maximum number of events, so "all" one needs to do is add these as options to the job argument list.
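A minimal sketch of what such job arguments could look like is given below. The option names `--firstevt` and `--maxevts` and both helper functions are hypothetical. The keyword names `firstEntry` and `maxEntries` are assumed to match the nanoAOD-tools PostProcessor constructor and should be checked against the installed version.

```python
from argparse import ArgumentParser


def parse_job_args(argv):
    """Parse a hypothetical job argument list with event-range options."""
    parser = ArgumentParser(description="Hypothetical job argument list")
    parser.add_argument("-i", "--infiles", nargs="+", default=[],
                        help="input nanoAOD files")
    parser.add_argument("--firstevt", type=int, default=0,
                        help="index of the first event to process")
    parser.add_argument("--maxevts", type=int, default=None,
                        help="maximum number of events to process")
    return parser.parse_args(argv)


def postprocessor_kwargs(args):
    """Translate the parsed options into keyword arguments.

    Assumption: the post-processor accepts `firstEntry` and `maxEntries`.
    """
    kwargs = {"firstEntry": args.firstevt}
    if args.maxevts is not None and args.maxevts > 0:
        kwargs["maxEntries"] = args.maxevts  # omitted: process up to the end
    return kwargs


args = parse_job_args(["-i", "nano_1.root", "--firstevt", "1000", "--maxevts", "1000"])
kwargs = postprocessor_kwargs(args)
print(kwargs)
```

Leaving `maxEntries` out when no maximum is given keeps the default behavior for jobs that are still split by file.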

But first one needs to split the files into chunks, which may or may not overlap. Right now chunks are made here: https://github.com/cms-tau-pog/TauFW/blob/4a6311c37766a0f3786a92c098996c9c102b218d/PicoProducer/scripts/pico.py#L671

Currently, the chunks are saved as a dictionary in the JSON job config file for bookkeeping during resubmission, e.g.

"chunkdict": {
  "0": [ "nano_1.root", "nano_2.root" ],
  "1": [ "nano_2.root", "nano_3.root" ],
  ...
}

The trickiest part is saving the event ranges in this config format for bookkeeping in the resubmission and status routines. This is where a lot of bugs might creep in if the information is not stored and retrieved correctly. The simplest and most compact solution would be to append the event range to the usual filename in the chunk dictionary of the config JSON file,

"chunkdict": {
  "0": [ "nano_1.root:0:1000" ],
  "1": [ "nano_1.root:1000:2000" ],
  ...
}

and parse it in checkchunks.
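The parsing step could look roughly like the sketch below. It assumes the suffix means `filename:start:stop`, which is one reading of the example above; the issue does not say whether the last number is a stop index or a maximum number of events. The helper names `encode_piece` and `parse_piece` and the host in the usage example are hypothetical, and this is not the actual code in checkchunks.

```python
def encode_piece(fname, start=None, stop=None):
    """Append an event range to a filename; no suffix means the whole file."""
    if start is None:
        return fname
    return "{}:{}:{}".format(fname, start, stop)


def parse_piece(entry):
    """Return (filename, first_event, max_events) for a chunkdict entry.

    Assumes the 'filename:start:stop' reading. Entries without a numeric
    suffix are returned as (entry, 0, None), meaning the whole file.
    """
    # Split from the right so that colons in URLs such as
    # 'root://host//path/file.root' stay part of the filename.
    parts = entry.rsplit(":", 2)
    if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
        fname, start, stop = parts[0], int(parts[1]), int(parts[2])
        if stop < start:
            raise ValueError("stop < start in %r" % entry)
        return fname, start, stop - start
    return entry, 0, None


print(parse_piece("nano_1.root:1000:2000"))
print(parse_piece("root://xrootd.example.org//store/nano_1.root:0:1000"))
print(parse_piece("nano_2.root"))
```

Splitting from the right and requiring both trailing fields to be numeric means that old config files containing plain filenames, with or without an XRootD prefix, would still be read correctly.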

It should be possible. I plan to implement it in the near future.

IzaakWN commented 3 years ago

Implemented as per commit https://github.com/cms-tau-pog/TauFW/commit/4e46663de2a69b208143ede8ec98110817207686.

Tested with some samples, and so far the submission, resubmission and status checks seem to work as expected. For skimming, however, there seems to be a bug in nanoAOD-tools; see issue https://github.com/cms-nanoAOD/nanoAOD-tools/issues/269. Until that is fixed, one should probably set self.firstEntry to 0 in this line: https://github.com/cms-nanoAOD/nanoAOD-tools/blob/25a793ec55b30fe7107af263c4523f20ff1a5fbd/python/postprocessing/framework/output.py#L175-L176

IzaakWN commented 1 year ago

Fix https://github.com/cms-nanoAOD/nanoAOD-tools/pull/276 was merged.