cms-nanoAOD / cmssw

CMS NanoAOD software integration repository
http://cms-sw.github.io/
Apache License 2.0

Split NanoAOD into smaller files in v7 #511

Closed andreypz closed 1 year ago

andreypz commented 4 years ago

May I request that, for the upcoming v7 production, the samples /DY*JetsToLL_M-50_LHEZpT_*_TuneCP5_13TeV-amcnloFXFX-pythia8/ be split into smaller files? In v5 and v6 many of these samples have files larger than 1 GB. This causes problems when running post-processing CRAB jobs with fileBased splitting: the jobs exceed the wall-clock time and memory limits. I believe we cannot split the jobs more finely than one file per job, otherwise the normalization would break (genEventSumw is stored in the Runs tree).

adewit commented 3 years ago

Not a real solution to the problem, but with something like this: https://github.com/adewit/nanoAOD-tools/commit/e2ec493d7fce5804ec87f143becbe210d24ff921 you can track genEventSumw when splitting into more than one file per job (a "central" solution would be better, but my knowledge of the inner workings of nanoAOD isn't good enough to know how to solve this better :-) )
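
For illustration, here is a minimal sketch of the bookkeeping idea, not the code from the commit linked above. It assumes that genEventSumw equals the sum of the per-event genWeight values over all events before any selection, so each job that processes only a slice of a file can record its own partial sum, and the partial sums add up to the file total. The function names (`split_ranges`, `partial_sumw`, `norm_factor`) and the plain list of weights are hypothetical stand-ins for whatever the post-processor actually reads.

```python
import math
from typing import List, Sequence, Tuple


def split_ranges(n_events: int, n_jobs: int) -> List[Tuple[int, int]]:
    """Return half-open [start, stop) event ranges that cover one file.

    Hypothetical helper: the first (n_events % n_jobs) jobs get one extra event.
    """
    base, extra = divmod(n_events, n_jobs)
    ranges = []
    start = 0
    for i in range(n_jobs):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def partial_sumw(gen_weights: Sequence[float], start: int, stop: int) -> float:
    """Sum of generator weights for the events one job processes.

    Assumption: this must be computed BEFORE any skimming or selection,
    otherwise the total no longer matches genEventSumw.
    """
    return math.fsum(gen_weights[start:stop])


def norm_factor(xsec_pb: float, lumi_invpb: float, partial_sums: Sequence[float]) -> float:
    """Per-event scale factor: cross section * luminosity / total sum of weights."""
    return xsec_pb * lumi_invpb / math.fsum(partial_sums)


# Toy file with 7 events (amc@NLO-style positive and negative weights), 3 jobs.
weights = [1.0, -1.0, 1.0, 1.0, 1.0, -1.0, 1.0]
ranges = split_ranges(len(weights), 3)
partials = [partial_sumw(weights, a, b) for a, b in ranges]
total_sumw = math.fsum(partials)
```

Each job would write its partial sum alongside its output, and the normalization is computed only after all partial sums for a sample have been added together.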

vlimant commented 1 year ago

please ping/re-open in cms-sw if needed