Open mcremone opened 5 years ago
@trtomei will make sure to get the recipe for this. @mcremone will check how to generate files with larger size.
Update on this: I made many changes to the repository with new config files for data and for 2018 MC. The 2016 and 2017 MC configs still need to be updated. Per Doug’s request, I added the SingleMuon datasets to the lists to be processed.
I took the instructions for data from here: https://twiki.cern.ch/twiki/bin/viewauth/CMS/PdmVDataReprocessingNanoAODv6
We will keep track of the progress of the production here: https://docs.google.com/spreadsheets/d/1TDXd3DoiKnz5FBoAxOB62en2sRIbkt4Zr_DQ9V-J2IE/edit?usp=sharing
We are only at 16 TB out of 60 TB in lpccoffea, so we should have plenty of space.
Outstanding issues:
- We need to test the config files to make sure they work.
- We need to double check whether we need to add the variable GenJetAK15_hadronFlavour, or whether the GenJetAK8_hadronFlavour variable is sufficient.
- We should possibly ask UCSB if any updates are needed to the DeepAK15 code.
- We need to find a balance between increasing the file size and making sure the jobs run and finish.
@mcremone have you double checked that the compression settings are what we want for the NanoAOD production?
@areinsvo unfortunately LZMA is used. We need to change this to LZ4. I have an outstanding issue myself to keep track of this for the production:
Let's see if it's possible
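One way the switch to LZ4 might be made without hand-editing the generated config is cmsDriver's `--customise_commands` option. The sketch below only builds the option string. The output-module name (`NANOEDMAODSIMoutput`, guessed from the event content), the compression level, and whether this CMSSW release accepts `LZ4` are all assumptions to be checked against the dumped config.

```python
def compression_customise(output_module, algorithm="LZ4", level=4):
    """Return a --customise_commands body that overrides output compression.

    output_module, algorithm and level are assumptions: verify them against
    the config that cmsDriver.py actually writes before relying on this.
    """
    prefix = "process.{}".format(output_module)
    commands = [
        "{}.compressionAlgorithm = cms.untracked.string('{}')".format(prefix, algorithm),
        "{}.compressionLevel = cms.untracked.int32({})".format(prefix, level),
    ]
    # Statements are joined with ';' so they fit in one quoted shell argument.
    return "; ".join(commands)


if __name__ == "__main__":
    # Hypothetical module name for --eventcontent NANOEDMAODSIM.
    body = compression_customise("NANOEDMAODSIMoutput")
    print('--customise_commands "{}"'.format(body))
```

The printed option would be appended to the cmsDriver.py command line. Whether LZ4 really gives the size and read-speed trade-off we want still has to be measured on a test file.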
I have updated our repository to match the code found in the 102X branch of Huilin's repository: https://gitlab.cern.ch/hqu/NanoTuples/tree/prod/102X. I also tested that the 2016 config file for data successfully runs.
I add here the full commands that I was able to ascertain:
RunIIAutumn18 MC
cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--mc --step NANO --eventcontent NANOEDMAODSIM --datatier NANOAODSIM --nThreads 2 \
--era Run2_2018,run2_nanoAOD_102Xv1 --conditions 102X_upgrade2018_realistic_v20
RunIISummer16 MC
cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--mc --step NANO --eventcontent NANOEDMAODSIM --datatier NANOAODSIM --nThreads 2 \
--era Run2_2016,run2_nanoAOD_94X2016 --conditions 102X_mcRun2_asymptotic_v7
2016 data
cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--data --step NANO --eventcontent NANOEDMAOD --datatier NANOAOD --nThreads 2 \
--era 'Run2_2016,run2_nanoAOD_94X2016' --conditions 102X_dataRun2_v12
2017 data
cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--data --step NANO --eventcontent NANOEDMAOD --datatier NANOAOD --nThreads 2 \
--era 'Run2_2017,run2_nanoAOD_94XMiniAODv2' --conditions 102X_dataRun2_v12
2018 data ABC
cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--data --step NANO --eventcontent NANOEDMAOD --datatier NANOAOD --nThreads 2 \
--era 'Run2_2018,run2_nanoAOD_102Xv1' --conditions 102X_dataRun2_v12
2018 data D
cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--data --step NANO --eventcontent NANOEDMAOD --datatier NANOAOD --nThreads 2 \
--era 'Run2_2018,run2_nanoAOD_102Xv1' --conditions 102X_dataRun2_Prompt_v15
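Since the six commands differ only in the data/MC switch, the era, and the global tag, they could be generated from a single table to avoid copy-paste slips. This is a minimal sketch, assuming every campaign uses the same `cmsDriver.py step1` prefix; `CAMPAIGNS` and `cmsdriver_command` are illustrative names, not part of the repository.

```python
# (label, is_data, era, conditions), copied from the commands listed above.
CAMPAIGNS = [
    ("RunIIAutumn18 MC", False, "Run2_2018,run2_nanoAOD_102Xv1", "102X_upgrade2018_realistic_v20"),
    ("RunIISummer16 MC", False, "Run2_2016,run2_nanoAOD_94X2016", "102X_mcRun2_asymptotic_v7"),
    ("2016 data", True, "Run2_2016,run2_nanoAOD_94X2016", "102X_dataRun2_v12"),
    ("2017 data", True, "Run2_2017,run2_nanoAOD_94XMiniAODv2", "102X_dataRun2_v12"),
    ("2018 data ABC", True, "Run2_2018,run2_nanoAOD_102Xv1", "102X_dataRun2_v12"),
    ("2018 data D", True, "Run2_2018,run2_nanoAOD_102Xv1", "102X_dataRun2_Prompt_v15"),
]


def cmsdriver_command(is_data, era, conditions):
    """Build one cmsDriver.py command line for the NANO step."""
    if is_data:
        kind, content, tier = "--data", "NANOEDMAOD", "NANOAOD"
    else:
        kind, content, tier = "--mc", "NANOEDMAODSIM", "NANOAODSIM"
    return " ".join([
        "cmsDriver.py step1",
        '--filein "file:input.root" --fileout "file:output.root"',
        kind, "--step NANO",
        "--eventcontent", content, "--datatier", tier, "--nThreads 2",
        "--era '{}'".format(era), "--conditions", conditions,
    ])


if __name__ == "__main__":
    for label, is_data, era, conditions in CAMPAIGNS:
        print("# " + label)
        print(cmsdriver_command(is_data, era, conditions))
```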
@mcremone Regarding JEC/JER, currently the AK8Puppi JEC is applied on the AK15 fatjet and the AK4Puppi JEC is applied on the subjets when producing the NanoAOD samples. JER is not applied in the production but applied afterwards.
In the Hcc analysis, we took the second approach, i.e., using the corrected subjets to reconstruct the AK15.
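For reference, rebuilding the AK15 jet from corrected subjets amounts to scaling each subjet four-momentum by its JEC factor and summing the results. The sketch below uses plain Python rather than the actual analysis code; the function names and the input tuple layout are hypothetical, and the correction factors are assumed to come from the AK4Puppi JEC lookup.

```python
import math


def to_cartesian(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to (px, py, pz, E)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return px, py, pz, energy


def fatjet_from_subjets(subjets):
    """Sum JEC-corrected subjets into one jet.

    subjets: iterable of (raw_pt, eta, phi, raw_mass, jec_factor).
    Returns (pt, eta, phi, mass) of the summed four-momentum.
    """
    px = py = pz = energy = 0.0
    for pt, eta, phi, mass, jec in subjets:
        # The JEC scales the whole four-momentum: pt and mass change,
        # the direction (eta, phi) does not.
        spx, spy, spz, se = to_cartesian(jec * pt, eta, phi, jec * mass)
        px, py, pz, energy = px + spx, py + spy, pz + spz, energy + se
    pt = math.hypot(px, py)
    mass2 = energy * energy - (px * px + py * py + pz * pz)
    mass = math.sqrt(max(mass2, 0.0))  # guard against rounding below zero
    eta = math.asinh(pz / pt) if pt > 0 else 0.0
    phi = math.atan2(py, px)
    return pt, eta, phi, mass
```

With a single subjet the result is just that subjet scaled by its factor, which makes a convenient sanity check.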
@hqucms can JER be applied at production level? Similarly, can the corrected subjets be used to reconstruct the AK15 at production level?
@hqucms can JER be applied at production level?
That should be possible using some producer in CMSSW.
Similarly, can the corrected subjets be used to reconstruct the AK15 at production level?
If one writes a custom producer that should still be possible.
@hqucms one more question. In our current NanoAOD, we have GenJetAK8_hadronFlavour and GenJet_hadronFlavour. Do we also need the GenJetAK15 hadron flavor in order to calculate scale factors for the tagger? If so, do you have example code for adding this variable that you can point us to?
@areinsvo There is no need to add that as we already store the number of b or c hadrons.
Sounds good. Thanks for the clarification @hqucms
We are getting close to being able to launch this production. The main remaining issue is the compression settings. Once that's settled, then everyone can test the config for their own part of the production.
It would also be useful if we could define more concretely the recommendations for what splitting to use.
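As a starting point for that recommendation, the splitting could be chosen as the smaller of two limits: the number of events that fills the target output size and the number that fits in the allowed wall time. This is a rough sketch only; `events_per_job` is a hypothetical helper and every number in the example is a placeholder, not a measurement from our jobs.

```python
def events_per_job(target_size_mb, kb_per_event, sec_per_event, max_hours):
    """Largest event count per job that respects both size and time limits.

    All four inputs must be measured on a test job; nothing here is tuned.
    """
    by_size = int(target_size_mb * 1024 / kb_per_event)  # fills the target file size
    by_time = int(max_hours * 3600 / sec_per_event)      # finishes within wall time
    return min(by_size, by_time)


if __name__ == "__main__":
    # Placeholder inputs: 500 MB files, 2 kB/event, 0.5 s/event, 10 h limit.
    print(events_per_job(500, 2.0, 0.5, 10))
```

When the time limit is the binding one, the output files come out smaller than the target, which is the trade-off between file size and job completion mentioned earlier in the thread.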
@hqucms I saw that you updated the crab script in your repo. Do you recommend that we implement those changes as well?
@areinsvo Maybe -- there are a few robustness improvements here and there, so hopefully the updated script will be more convenient to use.
If we use the new script, we also need to update the one used to run over privately produced MiniAOD.
To be done once recipes are out:
https://indico.cern.ch/event/849900/contributions/3594750/attachments/1936055/3208343/Slides_191031_ExoWorkshopMCI.pdf (slide 7)