Closed ktht closed 2 years ago
Interim update: JMS and JMR are now enabled. Implemented LO-to-NLO corrections for W+jets samples. The associated uncertainty is already named `Vpt_nlo`, so there is no need to rename it. One thing that concerns me a bit is that the W+jets background will increase by roughly 50% after this change. Either this background is so small that it didn't show up in the data/MC comparison, or there's going to be a problem.
As for the PS weights, after closer inspection it looks like
Thus, I see little point in implementing it for the signal samples. I'll work on subjet b-tagging next week.
So I did a couple of sanity checks before starting to implement the solution for determining hadron flavor of AK8 subjets. The first test was to verify that AK8 subjet hadron flavor can be determined from the number of matching b-hadrons and c-hadrons considered in the ghost clustering in the following way: if the number of matching b-hadrons is greater than zero, assign hadron flavor 5; if the number of matching b-hadrons is zero but the number of matching c-hadrons is greater than zero, then assign 4 as the hadron flavor; otherwise, assign zero as the hadron flavor. I performed the test on a fully hadronic UL ttbar sample (because UL samples have all this info available out-of-the-box) and it seems to work out perfectly. Thus, the information available in NanoAODv7 is enough to determine AK8 subjet b-tagging SFs.
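The assignment rule described above can be sketched as a small function. The argument names are hypothetical stand-ins for the per-subjet counts of ghost-matched b- and c-hadrons; they are not the actual branch names.

```python
def subjet_hadron_flavor(n_b_hadrons: int, n_c_hadrons: int) -> int:
    """Return the hadron flavor of a subjet from its ghost-matched hadron counts.

    n_b_hadrons and n_c_hadrons are hypothetical argument names for the
    number of b- and c-hadrons matched to the subjet via ghost clustering.
    """
    if n_b_hadrons > 0:
        return 5  # at least one b-hadron: b flavor, regardless of c-hadrons
    if n_c_hadrons > 0:
        return 4  # no b-hadrons but at least one c-hadron: c flavor
    return 0      # neither: light flavor


if __name__ == "__main__":
    for counts in [(1, 2), (0, 1), (0, 0)]:
        print(counts, "->", subjet_hadron_flavor(*counts))
```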
However, we're not using NanoAODv7, so we need another proxy to determine the flavor composition of AK8 subjets. Here are some ideas:
Based on NanoAODv7, both options yield comparable results for fully hadronic ttbar. However, for samples like DY the idea based on generator-level quarks doesn't work at all, because the only quarks available in our post-processed Ntuples come either from V boson decays or from top decays (including hadronic W decay products). The problem is that the extra jets added at the ME level descend from neither of those, at least not according to the Pythia gen particle listing. The situation is similar for other samples such as W+jets, in which the extra jets are introduced at the ME level. Thus, it really leaves the other option, the hadron flavor of the matched AK4 gen jets, as our only viable way to determine these hadron flavors.
Below are confusion tables determined from ttbar (fully hadronic) and DY events separately using NanoAODv7 samples. Each cell shows how many AK8 subjets with the hadron flavor given by the row correspond to an AK4 gen jet with the hadron flavor given by the column. Numbers in parentheses are row-normalized, i.e. they give the classification rate for AK8 subjets with the hadron flavor given by the row. The true positive rate for identifying b-flavored, c-flavored and light-flavored AK8 subjets is 89% (75%), 78% (51%) and 98% (99%) based on ttbar (DY) events. The method is clearly suboptimal in identifying c-flavored AK8 subjets, but I think it's the best we can do with our post-processed Ntuples at the moment.
+-----------------+-----------------------------------------------+
| ttbar           |                 AK4 gen jets                  |
| (168000 events) +---------------+---------------+---------------+
|                 |       5       |       4       |       0       |
+-------------+---+---------------+---------------+---------------+
|             | 5 | 17993 (89.2%) |    336 (1.7%) |   1840 (9.1%) |
| AK8 subjets | 4 |    216 (1.5%) | 11070 (77.8%) |  2948 (20.7%) |
|             | 0 |    621 (1.1%) |    696 (1.3%) | 53894 (97.6%) |
+-------------+---+---------------+---------------+---------------+

+-----------------+-----------------------------------------+
| DY              |              AK4 gen jets               |
| (191928 events) +------------+-------------+--------------+
|                 |     5      |      4      |      0       |
+-------------+---+------------+-------------+--------------+
|             | 5 | 45 (75.0%) |    0 (0.0%) |   15 (25.0%) |
| AK8 subjets | 4 |   1 (0.5%) | 108 (51.2%) |  102 (48.3%) |
|             | 0 |   7 (0.2%) |   22 (0.7%) | 3285 (99.1%) |
+-------------+---+------------+-------------+--------------+
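For reference, the percentages in parentheses can be reproduced by normalizing each row of the tables to its own total. The sketch below hard-codes the counts from the two tables; the function and variable names are illustrative only.

```python
# Row order and column order are both (5, 4, 0): rows are AK8 subjet
# flavors, columns are the flavors of the corresponding AK4 gen jets.
FLAVORS = (5, 4, 0)

TTBAR = [
    [17993, 336, 1840],
    [216, 11070, 2948],
    [621, 696, 53894],
]
DY = [
    [45, 0, 15],
    [1, 108, 102],
    [7, 22, 3285],
]


def row_rates(matrix):
    """Return the matrix with every row normalized to percent of its row total."""
    rates = []
    for row in matrix:
        total = sum(row)
        rates.append([100.0 * n / total if total else 0.0 for n in row])
    return rates


def diagonal_rates(matrix):
    """Fraction (in percent) of subjets of each flavor that land on the diagonal."""
    rates = row_rates(matrix)
    return {flavor: round(rates[i][i], 1) for i, flavor in enumerate(FLAVORS)}


if __name__ == "__main__":
    print("ttbar:", diagonal_rates(TTBAR))
    print("DY:   ", diagonal_rates(DY))
```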
I'll try to implement the SFs today. If I'm not done today, I'll finish it tomorrow.
I cannot find the place in the code where we require b-tagged subjets to have pT > 30 GeV. It has some (arguably minor) implications in determining the event-level weight from subjet b-tagging SFs.
I guess it is implemented here?
https://github.com/HEP-KBFI/hh-bbww/blob/master/src/RecoJetCollectionSelectorAK8_hh_bbWW_Hbb.cc#L226
Thanks, that settles it.
I discussed the details of computing the b-tagging SFs for AK8 subjets with Louvain. The recommendation is to use method 1a in the BTV Twiki. This implies that we compute the subjet b-tagging efficiencies from MC in bins of pT, |eta| and hadron flavor of the AK8 subjet. I'll create another workflow in `tthProjection.py` that determines these efficiencies.
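As a rough illustration of what method 1a involves: the MC efficiencies are the tagged fractions per (flavor, pT, |eta|) bin, and the event weight is the ratio of the probability of the observed tag configuration in data to that in MC, w = P(data) / P(MC), with P(MC) the product of eff over tagged subjets times the product of (1 - eff) over untagged ones, and P(data) the same with eff replaced by SF * eff. The sketch below is not the `tthProjection.py` workflow; all function names, the tuple layouts and the bin edges are assumptions made for illustration.

```python
from bisect import bisect_right


def bin_index(edges, value):
    """Index of the bin containing value; under/overflow is clamped to the edge bins."""
    idx = bisect_right(edges, value) - 1
    return min(max(idx, 0), len(edges) - 2)


def efficiency_map(subjets, pt_edges, abseta_edges):
    """Tagging efficiency per (flavor, pT bin, |eta| bin).

    subjets: iterable of (pt, eta, flavor, is_tagged) tuples (hypothetical layout).
    """
    counts = {}
    for pt, eta, flavor, is_tagged in subjets:
        key = (flavor, bin_index(pt_edges, pt), bin_index(abseta_edges, abs(eta)))
        passed, total = counts.get(key, (0, 0))
        counts[key] = (passed + int(is_tagged), total + 1)
    return {key: passed / total for key, (passed, total) in counts.items()}


def event_weight(subjets):
    """Event weight P(data) / P(MC) for one event.

    subjets: iterable of (eff_mc, sf, is_tagged) tuples (hypothetical layout).
    """
    p_mc = 1.0
    p_data = 1.0
    for eff, sf, is_tagged in subjets:
        if is_tagged:
            p_mc *= eff
            p_data *= sf * eff
        else:
            p_mc *= 1.0 - eff
            p_data *= 1.0 - sf * eff
    # Guard against a vanishing MC probability (e.g. eff == 0 for a tagged subjet).
    return p_data / p_mc if p_mc > 0.0 else 1.0


if __name__ == "__main__":
    mc = [(50.0, 0.5, 5, True), (60.0, -0.3, 5, False), (250.0, 1.0, 0, False)]
    print(efficiency_map(mc, [30.0, 100.0, 300.0], [0.0, 1.2, 2.4]))
    print(event_weight([(0.5, 0.9, True), (0.5, 0.9, False)]))
```

Because untagged subjets also enter the product, which subjets are counted (for instance, whether a pT threshold is applied first) changes the event weight.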
Well... I made a mistake in my assessment that we have access to hadron flavors of AK4 gen jets in our post-processed Ntuples. I had this impression because we have a dedicated `GenJetReader` that could've been easily modified to read the hadron flavor branch -- but I never actually looked at what's inside the post-processed Ntuples. Turns out that we used `GenParticleWriter` to transcribe the `GenJet` collection, so the hadron flavor information is completely lost in post-production.
This has two implications:
- (... `GenQuarkFromTop`, `GenBQuarkFromTop`, `GenWZQuark` and `GenHiggsDaughters`) and dR-match them to subjets;
- If these SFs cause any problems in dedicated SL fits or the SL+DL combination, we always have the option to post-process the Ntuples again such that we could use the hadron flavor of matched AK4 gen jets as a proxy for the AK8 subjet flavor. I would delay these efforts unless really necessary.
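A minimal sketch of the dR-matching idea, assuming each generator-level quark is available as (eta, phi, pdgId) and each subjet as (eta, phi). The matching cone `dr_max=0.4` and the b-over-c priority are assumptions chosen for illustration; the thread does not state the values actually used.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def flavor_from_quarks(subjet, quarks, dr_max=0.4):
    """Proxy flavor of a subjet from dR-matched generator-level quarks.

    subjet: (eta, phi); quarks: iterable of (eta, phi, pdg_id).
    dr_max is a hypothetical matching cone, not a value taken from the thread.
    """
    matched = {
        abs(pdg_id)
        for eta, phi, pdg_id in quarks
        if delta_r(subjet[0], subjet[1], eta, phi) < dr_max
    }
    if 5 in matched:
        return 5  # a matched b quark takes priority
    if 4 in matched:
        return 4  # otherwise a matched c quark
    return 0      # no matched heavy quark: treat as light


if __name__ == "__main__":
    # The b quark sits across the phi = +/-pi boundary from the subjet.
    print(flavor_from_quarks((0.0, 3.1), [(0.1, -3.1, 5)]))
```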
All corrections and uncertainties are implemented. However, there seems to be a problem with the LHE Vpt reweighting that needs to be investigated on a longer time scale. For now I've disabled the corrections (and associated systematics). I'll keep the thread open until this last item is resolved.
@saswatinandan Please update this repository, the ttH repository and the `$CMSSW_BASE/src/PhysicsTools/NanoAODTools` repository before submitting any jobs.
Ok so we're not going to apply these corrections to LO samples at all but try out NLO W+jets samples instead. I'll refresh my memory on how to do it all over again and start with the Ntuple production asap.
Thank you very much, Karl!
@saswatinandan samples are ready. You now have to specify in the command line which samples you want to use:
- `-W lo` if you want to use LO samples;
- `-W nlo` if you want to use NLO samples.
After yesterday's discussion it seems that we're still missing some systematic uncertainties that other groups have implemented:
The first two points are easy enough; the third requires some investigation. The fourth item is the most challenging, because our Ntuples simply lack the information needed to compute these SFs. Will need to discuss what our options are, because I don't think we can just ignore them since they rank relatively high in terms of impact.