cms-analysis / flashgg


Add ParticleNet SV tagging for H+c analysis #1284

Closed colizz closed 2 years ago

colizz commented 2 years ago

A pull request to add ParticleNet SV tagging for Higgs+charm analysis.

Goal:

The analysis aims to use soft-charm tagging on secondary vertices (SVs). We hope to include the SV collection together with a flavour-tagging score for each SV. We adopt the ParticleNet model to infer the tagging score of each SV in an event, using the information of all PF candidates associated with that SV. Please see the details in the [BTV talk] by @p-masterson.
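As a rough illustration of the "one SV, all of its associated PF candidates" input layout, here is a minimal sketch in plain Python. It is not the code in this PR: the `sv_index` field, the feature names (`pt`, `eta`, `phi`), the pT ordering, and the `max_cands` padding length are all assumptions made for the example, and the real association and feature set are the ones described in the talk.

```python
from collections import defaultdict


def build_sv_inputs(candidates, n_sv, max_cands=4, features=("pt", "eta", "phi")):
    """Group PF candidates by SV and pad each group to a fixed length.

    `candidates` is a list of dicts; each dict is assumed (hypothetically)
    to carry an `sv_index` key plus the requested feature keys.
    Returns (inputs, masks): per-SV feature rows and a 1/0 validity mask.
    """
    grouped = defaultdict(list)
    for cand in candidates:
        grouped[cand["sv_index"]].append(cand)

    inputs, masks = [], []
    for sv in range(n_sv):
        # Keep the highest-pT candidates first and truncate to max_cands.
        cands = sorted(grouped[sv], key=lambda c: c["pt"], reverse=True)[:max_cands]
        rows = [[float(c[f]) for f in features] for c in cands]
        mask = [1] * len(rows)
        # Zero-pad so that every SV has exactly max_cands rows.
        n_pad = max_cands - len(rows)
        rows += [[0.0] * len(features) for _ in range(n_pad)]
        mask += [0] * n_pad
        inputs.append(rows)
        masks.append(mask)
    return inputs, masks
```

A fixed-length, masked layout like this is a common way to feed variable-size candidate sets to a network; whether the model used here expects exactly this layout is an assumption.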

This feature needs:

(1) this PR which includes the following changes:

(2) the inference model stored under /eos/cms/store/group/phys_higgs/cmshgg/flashgg-data/MicroAOD/data/, which is already there. (I gave it a try: it seems I have permission to copy files into this directory, but I cannot remove them.)

Validation:

The test command runs cmsRun MicroAOD/test/microAODstd.py processType=sig datasetName=hh conditionsJSON=MetaData/data/MetaConditions/Era2017_legacy_v1.json

More validations to come.


Pinging the relevant people: @p-masterson, @chenzhou36, @XuanhaoZhang, @chernyavskaya, @gouskos, @hqucms, @leaca, @mhl0116, @missirol, @TizianoBevilacqua, @selvaggi

colizz commented 2 years ago

Hello, we did a more concrete validation under UL16 and UL17. We tried the following test commands with UL16 (pre/postVFP) on a variety of MC/data samples.

N.B. It turns out that when running the current cms-analysis:dev_legacy_runII branch on UL datasets, customizePDFs needs to be commented out: https://github.com/cms-analysis/flashgg/blob/aed0a5a142d7d65998df27f74abee57f8035a708/MicroAOD/python/MicroAODCustomize.py#L248 With this change, the following commands all work.

## VH
cmsRun MicroAOD/test/microAODstd.py processType=sig datasetName=vh conditionsJSON=MetaData/data/MetaConditions/Era2016_legacyPreVFP_v1.json fileNames=/store/mc/RunIISummer20UL16MiniAODAPVv2/VHToGG_M125_TuneCP5_13TeV-amcatnloFXFX-madspin-pythia8/MINIAODSIM/106X_mcRun2_asymptotic_preVFP_v11-v1/2430000/39E916A0-ECA9-354E-9477-425345A201F2.root
## VBF
cmsRun MicroAOD/test/microAODstd.py processType=sig datasetName=vbf conditionsJSON=MetaData/data/MetaConditions/Era2016_legacyPreVFP_v1.json fileNames=/store/mc/RunIISummer20UL16MiniAODAPVv2/VBFHToGG_M125_TuneCP5_13TeV-amcatnlo-pythia8/MINIAODSIM/106X_mcRun2_asymptotic_preVFP_v11-v2/80000/19B273D2-28FD-2749-9CFE-FA3DEA5E29C9.root
## TTH
cmsRun MicroAOD/test/microAODstd.py processType=sig datasetName=tth conditionsJSON=MetaData/data/MetaConditions/Era2016_legacyPostVFP_v1.json fileNames=/store/mc/RunIISummer20UL16MiniAODv2/ttHJetToGG_M125_TuneCP5_13TeV-amcatnloFXFX-madspin-pythia8/MINIAODSIM/106X_mcRun2_asymptotic_v17-v2/2520000/19D706D4-54F1-3340-921C-C0CF51FCB104.root
## GGH
cmsRun MicroAOD/test/microAODstd.py processType=sig datasetName=ggh conditionsJSON=MetaData/data/MetaConditions/Era2016_legacyPostVFP_v1.json fileNames=/store/mc/RunIISummer20UL16MiniAODv2/GluGluHToGG_M125_TuneCP5_13TeV-amcatnloFXFX-pythia8/MINIAODSIM/106X_mcRun2_asymptotic_v17-v2/2520000/13FC15EF-5E2B-454E-8E41-CC0668C6663B.root

## Data: DoubleEG
cmsRun MicroAOD/test/microAODstd.py processType=data datasetName=DoubleEG conditionsJSON=MetaData/data/MetaConditions/Era2016_legacyPreVFP_v1.json fileNames=/store/data/Run2016B/DoubleEG/MINIAOD/ver2_HIPM_UL2016_MiniAODv2-v3/100000/0002A5D0-8C99-5642-B7D3-543F005B22AB.root
cmsRun MicroAOD/test/microAODstd.py processType=data datasetName=DoubleEG conditionsJSON=MetaData/data/MetaConditions/Era2016_legacyPostVFP_v1.json fileNames=/store/data/Run2016H/DoubleEG/MINIAOD/UL2016_MiniAODv2-v1/110000/15FBCF29-A289-7A41-95F1-B0970E78CAFD.root
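For anyone who wants to re-run this matrix, the commands above all follow the same pattern, so they can be generated from a small table. The sketch below only assembles the command lines; the helper name `build_command` and the `TESTS` table are made up for illustration, and the `fileNames` values from the commands above would need to be passed in as the optional fourth argument.

```python
SCRIPT = "MicroAOD/test/microAODstd.py"
CONDITIONS_DIR = "MetaData/data/MetaConditions"


def build_command(process_type, dataset_name, conditions, file_name=None):
    """Return the cmsRun argument list for one test configuration."""
    args = [
        "cmsRun",
        SCRIPT,
        f"processType={process_type}",
        f"datasetName={dataset_name}",
        f"conditionsJSON={CONDITIONS_DIR}/{conditions}",
    ]
    if file_name:
        # fileNames is optional; the first test command in this thread omits it.
        args.append(f"fileNames={file_name}")
    return args


# (processType, datasetName, conditions JSON) as used in the commands above.
TESTS = [
    ("sig", "vh", "Era2016_legacyPreVFP_v1.json"),
    ("sig", "vbf", "Era2016_legacyPreVFP_v1.json"),
    ("sig", "tth", "Era2016_legacyPostVFP_v1.json"),
    ("sig", "ggh", "Era2016_legacyPostVFP_v1.json"),
    ("data", "DoubleEG", "Era2016_legacyPreVFP_v1.json"),
    ("data", "DoubleEG", "Era2016_legacyPostVFP_v1.json"),
]

for test in TESTS:
    # None of the arguments contain spaces, so a plain join is enough here.
    print(" ".join(build_command(*test)))
```

To actually launch a job inside a CMSSW environment, each list can be handed to `subprocess.run(cmd, check=True)`.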

Besides, the additional SV collection adds very little to the size of the output μAOD: only around 0.4%.

I set this PR ready for review. It would be great to see this feature added to μAOD production. Thanks a lot! - Congqiao

youyingli commented 2 years ago

Hi @colizz, thanks for this contribution. As for the PDF extraction (customizePDFs), I have made PR #1285 to fix it. Could you please pull PR #1285 and test whether it works together with your PR? Then I will merge both PRs.

colizz commented 2 years ago

Hi @youyingli! Thanks for the fix. I have tested that the commands all work well when this PR is combined with PR #1285 and customizePDFs is enabled as usual.

colizz commented 2 years ago

Hi @youyingli, I have added two more commits to this PR. The first is a small update to the implementation. The second removes the ONNXRuntime setup to adapt to CMSSW_10_6_29, following your suggestion.

I have tested that all the above test commands work when this PR is combined with #1286.

Please merge this PR when #1286 is merged. Thanks!