cms-analysis / flashgg


Shower shapes corrections for 2018 Legacy #1265

Closed maxgalli closed 3 years ago

maxgalli commented 3 years ago

The trained regressors for UL2018 are now available at /eos/cms/store/group/phys_higgs/cmshgg/flashgg-data/Taggers/data/PhoIdInputsCorrections/2018_Legacy and the summary can be found at /eos/cms/store/group/phys_higgs/cmshgg/flashgg-data/Taggers/data/PhoIdInputsCorrections/corrections_summary_2018_Legacy.json. The path to the correct json file is thus updated in MetaData/data/MetaConditions/Era2018_legacy_v1.json.
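To double-check that a MetaConditions file really points at the new summary, one option is to scan the JSON for any string value that mentions the summary file name. The following is only a minimal sketch: the key names and the nesting in `example_conditions` are hypothetical stand-ins and are not taken from Era2018_legacy_v1.json, whose actual layout is not shown in this PR.

```python
import json


def find_string_values(node, needle, path=""):
    """Recursively collect (json_path, value) pairs whose string value contains `needle`."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits.extend(find_string_values(value, needle, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_string_values(value, needle, f"{path}/{index}"))
    elif isinstance(node, str) and needle in node:
        hits.append((path, node))
    return hits


# Stand-in for the content of a MetaConditions file.
# The key names below are HYPOTHETICAL; only the summary file name comes from this PR.
example_conditions = json.loads("""
{
  "hypotheticalBlock": {
    "hypotheticalCorrectionsKey": "PhoIdInputsCorrections/corrections_summary_2018_Legacy.json"
  },
  "otherSetting": 1
}
""")

hits = find_string_values(example_conditions, "corrections_summary_2018_Legacy.json")
for json_path, value in hits:
    print(json_path, "->", value)
```

For a real check, `example_conditions` would be replaced by `json.load(open(...))` on the MetaConditions file in a local checkout. An empty result would suggest that the path was not updated.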

Please note that this PR shouldn't be merged until https://github.com/cms-analysis/flashgg/pull/1249 is merged as well. The correct procedure would be to push 50f8feb1e4b2eef328a9db0c31b31804da310a1f to https://github.com/cms-analysis/flashgg/pull/1249, but since 6cc65e994d605ded4154e43808fd50b3133c8d9f (which is already in master) is also needed to correctly reproduce the results, it made more sense to cherry-pick fc1a59fc75195a225f54cf6b000ca7bb666f92e3 and 5896b9843c9af66ee0125d8fc90472e665a6a7d5 and add 50f8feb1e4b2eef328a9db0c31b31804da310a1f on top. This should also cause less confusion at merge time.
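For anyone wanting to rebuild the same branch layout locally, the sequence described above could look roughly like the sketch below. It assumes the current branch was created from an up-to-date master and that the last commit is applied with a cherry-pick as well (the comment above only says it was added "on top"). The helper names `build_plan` and `run_plan` are made up for illustration, and by default the commands are only printed, not executed.

```python
import subprocess

# Commit hashes quoted in this PR description.
NEEDED_IN_MASTER = "6cc65e994d605ded4154e43808fd50b3133c8d9f"
CHERRY_PICKED = [
    "fc1a59fc75195a225f54cf6b000ca7bb666f92e3",
    "5896b9843c9af66ee0125d8fc90472e665a6a7d5",
]
ON_TOP = "50f8feb1e4b2eef328a9db0c31b31804da310a1f"


def build_plan():
    """Return the git commands, in order, as argument lists."""
    return [
        # Exits with a non-zero status if the commit is not already in the current history.
        ["git", "merge-base", "--is-ancestor", NEEDED_IN_MASTER, "HEAD"],
        # The two commits taken over from PR 1249.
        ["git", "cherry-pick", *CHERRY_PICKED],
        # ASSUMPTION: the commit of this PR is applied with a cherry-pick too.
        ["git", "cherry-pick", ON_TOP],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; run it inside the current repository only if dry_run is False."""
    for command in plan:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


plan = build_plan()
run_plan(plan)  # dry run: only prints the commands
```

Calling `run_plan(plan, dry_run=False)` inside a flashgg checkout would execute the steps and stop at the first failing command, for example on a cherry-pick conflict.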

Feedback is needed from @simonepigazzini and/or @rchatter on whether or not to include https://github.com/maxgalli/flashgg/commit/0618a9664bc9768c27d8bc17ebf606f48037cc79, since it was applied for the training.

rchatter commented 3 years ago

Hi Max,

Please include the updated global tags in maxgalli@0618a96. Since these are the correct tags for the UL18 data and MC, for the non-central recipes it won't hurt anything, but for other things (JEC/JES, etc.) it's best we use the updated GTs, like you did.

Thanks, Rajdeep

rchatter commented 3 years ago

Hi @maxgalli, just confirming: this PR has the final SS corrections for UL18, with no changes foreseen, right? In that case, can it be merged to the default branch?

Thanks, Rajdeep

maxgalli commented 3 years ago

Hi @rchatter, yes, the corrections are final and there are no changes foreseen. Provided that the two commits (fc1a59fc75195a225f54cf6b000ca7bb666f92e3 and 5896b9843c9af66ee0125d8fc90472e665a6a7d5) that I cherry-picked from https://github.com/cms-analysis/flashgg/pull/1249 haven't changed recently, everything should be in place to be merged into the default RunII branch.

rchatter commented 3 years ago

Thanks! @youyingli, could you please merge the shower shape corrections for UL18 from @maxgalli to the default branch? Thanks in advance.

rchatter commented 3 years ago

Hi @maxgalli, would you have the updated systematics soon? Otherwise, @Prasant1993 wanted to get going with the central SFs and update the Photon ID validation plot later. Best, Rajdeep

simonepigazzini commented 3 years ago

Hi @rchatter ,

We are checking the EE training scheme. We realized that for 2018 the mis-modeling of the preshower variables might be visible and might play a role in the discrepancy we saw in the last presentation. We need to dump these variables and take a look at the agreement, which might take a while. For EB everything is finalized, but we are again above quota on EOS, so we can't copy the systematics files there... (will email the group shortly)

rchatter commented 3 years ago

Hi @maxgalli ,

Thanks for this information. I see that you think the ES variables might be responsible for the EE discrepancy, though they seem to be fairly low-priority features, looking at @Prasant1993's slide 12 here: https://indico.cern.ch/event/996680/contributions/4208099/attachments/2179705/3681731/Update_XGBoost_TMVA_BDT_UL2018_prasant.pdf . I guess Prasant and @JunquanTao already have the Data-MC comparisons of these variables from Z->ee and Z->MuMuG, so we can check the degree of discrepancy from there.

For the space issues, we are hopeful that we can remove at least ~160 GB very soon.

Best, Rajdeep

maxgalli commented 3 years ago

Update 17/05/21: since the results for EE were considerably worse than usual, we decided to also correct the preshower variable esEnergyOverSCRawEnergy. The changes needed to implement this further step are described in detail in the message of commit b417fd8.

The files /eos/cms/store/group/phys_higgs/cmshgg/flashgg-data/Systematics/data/SystematicsIDMVA_LegRunII_v1_UL2018.root and /eos/cms/store/group/phys_higgs/cmshgg/flashgg-data/Systematics/data/SystematicsIDMVA_LegRunII_v1_UL2018_noEdgeCorr.root have been updated with the newly computed systematics. The relevant plots can be found here (systematics) and here (shapes).

youyingli commented 3 years ago

OK, thanks. Merged!