LPC-DM / NanoHarvester


NanoAODv6 Private Production #16

Open mcremone opened 5 years ago

mcremone commented 5 years ago

To be done once recipes are out:

https://indico.cern.ch/event/849900/contributions/3594750/attachments/1936055/3208343/Slides_191031_ExoWorkshopMCI.pdf (slide 7)

mcremone commented 5 years ago

@trtomei will make sure to get the recipe for this. @mcremone will check how to generate files with larger size.

areinsvo commented 4 years ago

Update on this: I made many changes to the repository with new config files for data and for 2018 MC. The 2016 and 2017 MC configs still need to be updated. Per Doug’s request, I added the SingleMuon datasets to the lists to be processed.

I took the instructions for data from here: https://twiki.cern.ch/twiki/bin/viewauth/CMS/PdmVDataReprocessingNanoAODv6

We will keep track of the progress of the production here: https://docs.google.com/spreadsheets/d/1TDXd3DoiKnz5FBoAxOB62en2sRIbkt4Zr_DQ9V-J2IE/edit?usp=sharing

We are only at 16 TB out of 60 TB in lpccoffea, so we should have plenty of space.

Outstanding issues: We need to test the config files to make sure they work. We need to double check if we need to add the variable GenJetAK15_hadronFlavour, or if the GenJetAK8_hadronFlavour variable is sufficient. We should possibly ask UCSB if any updates are needed to the DeepAK15 code. We need to find a balance between increasing the file size and making sure the jobs run and finish. @mcremone have you double checked that the compression settings are what we want for the NanoAOD production?

mcremone commented 4 years ago

@areinsvo unfortunately LZMA is used. We need to change this to LZ4. I have some outstanding issues myself to keep track of for this production:

Let's see if it's possible

areinsvo commented 4 years ago

I have updated our repository to match the code found in the 102X branch of Huilin's repository: https://gitlab.cern.ch/hqu/NanoTuples/tree/prod/102X. I also tested that the 2016 config file for data runs successfully.

trtomei commented 4 years ago

I add here the full commands that I was able to ascertain:

RunIIAutumn18 MC

cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--mc --step NANO --eventcontent NANOEDMAODSIM --datatier NANOAODSIM  --nThreads 2 \
--era Run2_2018,run2_nanoAOD_102Xv1 --conditions 102X_upgrade2018_realistic_v20

RunIISummer16 MC

cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--mc --step NANO --eventcontent NANOEDMAODSIM --datatier NANOAODSIM --nThreads 2 \
--era Run2_2016,run2_nanoAOD_94X2016 --conditions 102X_mcRun2_asymptotic_v7

2016 data

cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--data --step NANO --eventcontent NANOEDMAOD --datatier NANOAOD  --nThreads 2 \
--era 'Run2_2016,run2_nanoAOD_94X2016' --conditions 102X_dataRun2_v12

2017 data

cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--data --step NANO --eventcontent NANOEDMAOD --datatier NANOAOD  --nThreads 2 \
--era 'Run2_2017,run2_nanoAOD_94XMiniAODv2' --conditions 102X_dataRun2_v12 

2018 data ABC

cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--data --step NANO --eventcontent NANOEDMAOD --datatier NANOAOD  --nThreads 2 \
--era 'Run2_2018,run2_nanoAOD_102Xv1' --conditions 102X_dataRun2_v12  

2018 data D

cmsDriver.py step1 --filein "file:input.root" --fileout "file:output.root" \
--data --step NANO --eventcontent NANOEDMAOD --datatier NANOAOD  --nThreads 2 \
--era 'Run2_2018,run2_nanoAOD_102Xv1' --conditions 102X_dataRun2_Prompt_v15  
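
Since the commands above differ only in the --mc/--data switch, the event content and data tier, the era, and the conditions, they could be collected in one dry-run helper. The following is a minimal sketch, not part of the repository: the function name nano_cmd and the campaign labels (mc2018, data2018D, and so on) are hypothetical, while the era and conditions strings are copied from the commands listed above. No 2017 MC entry is included because no command for it was given. The helper only prints the command; it does not run cmsDriver.py.

```shell
# Dry-run helper: prints (does not execute) the cmsDriver.py command for a campaign.
# Campaign labels are hypothetical; era/conditions values are copied from the thread.
nano_cmd() {
  local campaign="$1" kind era conditions content tier
  case "$campaign" in
    mc2018)      kind=mc;   era="Run2_2018,run2_nanoAOD_102Xv1";      conditions="102X_upgrade2018_realistic_v20" ;;
    mc2016)      kind=mc;   era="Run2_2016,run2_nanoAOD_94X2016";     conditions="102X_mcRun2_asymptotic_v7" ;;
    data2016)    kind=data; era="Run2_2016,run2_nanoAOD_94X2016";     conditions="102X_dataRun2_v12" ;;
    data2017)    kind=data; era="Run2_2017,run2_nanoAOD_94XMiniAODv2"; conditions="102X_dataRun2_v12" ;;
    data2018ABC) kind=data; era="Run2_2018,run2_nanoAOD_102Xv1";      conditions="102X_dataRun2_v12" ;;
    data2018D)   kind=data; era="Run2_2018,run2_nanoAOD_102Xv1";      conditions="102X_dataRun2_Prompt_v15" ;;
    *) echo "unknown campaign: $campaign" >&2; return 1 ;;
  esac

  # MC and data differ in event content and data tier.
  if [ "$kind" = mc ]; then
    content=NANOEDMAODSIM; tier=NANOAODSIM
  else
    content=NANOEDMAOD; tier=NANOAOD
  fi

  # echo joins its arguments with single spaces, giving a one-line command.
  echo "cmsDriver.py step1 --filein file:input.root --fileout file:output.root" \
       "--$kind --step NANO --eventcontent $content --datatier $tier --nThreads 2" \
       "--era $era --conditions $conditions"
}
```

For example, nano_cmd data2018D prints the 2018 data D command on one line, which can be inspected before being pasted into a CMSSW environment.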

hqucms commented 4 years ago

@mcremone Regarding JEC/JER, currently the AK8Puppi JEC is applied on the AK15 fatjet and the AK4Puppi JEC is applied on the subjets when producing the NanoAOD samples. JER is not applied in the production but applied afterwards.

In the Hcc analysis, we took the second approach, i.e., using the corrected subjets to reconstruct the AK15.

mcremone commented 4 years ago

@hqucms can JER be applied at production level? Similarly, can the corrected subjets be used to reconstruct the AK15 at production level?

hqucms commented 4 years ago

@hqucms can JER be applied at production level?

That should be possible using some producer in CMSSW.

Similarly, can the corrected subjets be used to reconstruct the AK15 at production level?

If one writes a custom producer that should still be possible.

areinsvo commented 4 years ago

@hqucms one more question. In our current NanoAOD, we have GenJetAK8_hadronFlavour and GenJet_hadronFlavour. Do we also need the GenJetAK15 hadron flavor in order to calculate scale factors for the tagger? If so, do you have example code for adding this variable that you can point us to?

hqucms commented 4 years ago

@areinsvo There is no need to add that as we already store the number of b or c hadrons.

areinsvo commented 4 years ago

Sounds good. Thanks for the clarification @hqucms

areinsvo commented 4 years ago

We are getting close to being able to launch this production. The main remaining issue is the compression settings. Once that's settled, then everyone can test the config for their own part of the production.

It would also be useful if we could define more concretely the recommendations for what splitting to use.
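
One way to make the splitting discussion concrete is to estimate, from a target output file size, how many events a job must process and roughly how long it would run. The sketch below is only back-of-the-envelope arithmetic: the function names are hypothetical, and the example inputs (500 MB target, 2 kB per event, 10 events per second) are placeholders rather than measurements from this production. Real numbers would have to come from a test job.

```shell
# Rough sizing helpers; all example inputs are placeholders, not measurements.

# events_per_job TARGET_MB KB_PER_EVENT
# Number of events needed to reach the target output file size.
events_per_job() {
  echo $(( $1 * 1024 / $2 ))
}

# est_hours EVENTS EVENTS_PER_SEC
# Wall-clock hours for one job, rounded up to a whole hour.
est_hours() {
  echo $(( ($1 + $2 * 3600 - 1) / ($2 * 3600) ))
}

events=$(events_per_job 500 2)   # 500 MB target at 2 kB/event
hours=$(est_hours "$events" 10)  # at 10 events/s
echo "events per job: $events, estimated wall time: ${hours} h"
```

If the estimated wall time comes out too long for jobs to finish reliably, the target file size would need to come down, or smaller outputs could be merged after the jobs complete.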

@hqucms I saw that you updated the crab script in your repo. Do you recommend that we implement those changes as well?

hqucms commented 4 years ago

@areinsvo Maybe -- there are a few robustness improvements here and there, so hopefully the updated script will be more convenient to use.

mcremone commented 4 years ago

If we use the new script, we also need to update the one that runs over privately produced MiniAOD.