Closed ktht closed 3 years ago
Just noticed that we don't have the `GenPhotonAll` collection in our Ntuples, probably because I reduced the number of gen collections to save computing time spent on post-production. I'll post-process the W+gamma and W+jets samples shortly to include the missing collection.
edit: need to post-process single top/+gamma, ttbar/+gamma and Z/+gamma samples as well.
Since my last post the situation has developed quite a bit. @kramerto crafted a recipe that makes the merging of X+gamma and X+jets samples more effective (in terms of yields) and accurate (in terms of lepton pT spectrum).
The veto goes as follows: use the event from X+gamma samples if it has a photon passing
`isHardProcess` electrons, muons, taus, quarks (except for tops) and gluons, with no requirements on their Pythia status (except for quarks and gluons that should not have Pythia status equal to 21 in the veto): |pdgId| = 11, 13, 15, 21 or < 6).

The photons considered in the event-level veto are:
- `isPrompt` photons with no requirements on their Pythia status;
- `isPrompt` SFOS lepton pairs, where `mu -> mu + e+ + e-`);

In case of X+jets samples, the event-level veto is inverted: if there exists a photon or a photon candidate in the event passing all of the above conditions, the event is vetoed.
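The inverted logic between the two sample types can be sketched as below. This is only a minimal illustration, not the actual analysis code: the function name, the `"x_gamma"` / `"x_jets"` labels and the pre-computed photon count are all hypothetical, and the per-photon selection itself is assumed to have been applied upstream.

```python
def passes_gen_photon_filter(sample_kind, n_veto_photons):
    """Decide whether an event is kept after the overlap-removal veto.

    sample_kind    -- hypothetical label: "x_gamma" or "x_jets"
    n_veto_photons -- number of genuine or proxy photons in the event
                      that pass all per-photon veto conditions
    """
    has_veto_photon = n_veto_photons > 0
    if sample_kind == "x_gamma":
        # X+gamma: keep only events that do contain such a photon
        return has_veto_photon
    if sample_kind == "x_jets":
        # X+jets: inverted veto, drop events that contain such a photon
        return not has_veto_photon
    raise ValueError("unknown sample kind: %s" % sample_kind)
```

With the same per-photon selection on both sides, every event topology is kept by exactly one of the two sample types, which is what removes the double counting.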
Our Ntuples currently contain:
- `GenPhoton`: `isPrompt` photons with Pythia status = 1 (because we use them in the estimation of conversion background);
- `GenLep`: `isPrompt` or `isDirectPromptTauDecayProduct` electrons and muons with additional `isLastCopy` condition and Pythia status = 1 (because we use them to match to reconstructed leptons);
- `GenTau`: all generator-level tau leptons, regardless of their Pythia status codes or flags.

The `GenPhoton` collection is too limited for our purposes, so we need an additional collection of gen photons that does not impose any conditions on the Pythia status code. The problem with the `GenLep` and `GenTau` collections is that their mother-daughter relationship is lost.
Thus, we need the following collections added to our Ntuples:
- `isPrompt` electrons, muons (with Pythia status = 1) or tau leptons (with Pythia status = 2) that are either parentless or come from the same lepton parent (`GenPhotonCandidates`);
- `isPrompt` photons with no requirement on the Pythia status ~(`GenPromptPhotons`)~ (edit: we'll use the same `GenPhoton` collection but filter out status = 1 photons for gen matching and histogram filling at the analysis level);
- `isHardProcess` particles with |pdgId| = 11, 13, 15, 21 or < 6, and with no requirements on their Pythia status (~`GenFromHardProcess`~ `GenIsHardProcess`).

We need the three additional collections only in the relevant (X+gamma and X+jets) samples.
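For illustration, the flag-based selections above could look roughly like this. It is a sketch under assumptions: the bit positions (0 for `isPrompt`, 7 for `isHardProcess`) follow the usual NanoAOD `GenPart_statusFlags` convention and should be checked against the actual Ntuple content, and the function names are made up for this example.

```python
# Assumed bit positions in the status-flag word (NanoAOD-style convention)
IS_PROMPT_BIT = 0
IS_HARD_PROCESS_BIT = 7


def has_flag(status_flags, bit):
    """Return True if the given bit is set in the status-flag word."""
    return bool((status_flags >> bit) & 1)


def is_prompt_photon(pdg_id, status_flags):
    """Prompt photon, with no requirement on the Pythia status code."""
    return pdg_id == 22 and has_flag(status_flags, IS_PROMPT_BIT)


def is_hard_process_candidate(pdg_id, status_flags):
    """isHardProcess particle with |pdgId| = 11, 13, 15, 21 or < 6."""
    abs_id = abs(pdg_id)
    is_lepton_or_gluon = abs_id in (11, 13, 15, 21)
    is_light_quark = 0 < abs_id < 6  # d, u, s, c, b; excludes the top quark
    return has_flag(status_flags, IS_HARD_PROCESS_BIT) and (
        is_lepton_or_gluon or is_light_quark
    )
```

Neither function looks at the Pythia status code, in line with the "no requirements on their Pythia status" wording; any status cut would then be applied explicitly at the analysis level.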
Plan of action:
- Replace the `GenPhoton` collection with a broader one and make the Pythia status condition more explicit when estimating conversions (edit: we're going with it).

I'll remove my previous development branch from this repository since it has become irrelevant.
Updated my previous post: we also need to pay attention to the mother-daughter relations when finding the SFOS lepton pairs for reconstructing the photon candidates.
Second edit: I forgot that decays such as `mu -> mu + e+ + e-`, where the intermediate photon may be dropped, can (and are even more likely to) happen.
@ktht So does this mean you will add in parent / daughter info for GEN particles (or at least GEN leptons) into the NTuples that was not there before?
No, not possible. The mother-daughter relationship works by looking up the `GenPart_genPartIdxMother` branch of a daughter particle to find out the position of its mother in the `GenPart` collection. We drop the `GenPart` collection during post-processing of the Ntuples because of technical and somewhat historical reasons (reading the full collection is expensive in our analysis FW), so this relationship is completely lost. The post-processing step should be rewritten for Run 3.
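To make the lookup concrete, here is a toy sketch with plain Python lists standing in for the `GenPart` branches. The helper names are hypothetical, and the convention that a negative index means "no mother" is an assumption based on the usual NanoAOD layout.

```python
def mother_of(idx, gen_mother_idx):
    """Return the index of the mother of particle idx, or None if it has none.

    gen_mother_idx plays the role of the GenPart_genPartIdxMother branch:
    entry i holds the position of particle i's mother in the same collection.
    """
    mother = gen_mother_idx[idx]
    return mother if mother >= 0 else None


def have_common_mother(i, j, gen_mother_idx):
    """True if particles i and j both have a mother and it is the same one."""
    mother_i = mother_of(i, gen_mother_idx)
    mother_j = mother_of(j, gen_mother_idx)
    return mother_i is not None and mother_i == mother_j


# Toy event: mu- -> mu- + e- + e+ (indices 1..3 all point back to index 0)
toy_pdg_id = [13, 13, 11, -11]
toy_mother_idx = [-1, 0, 0, 0]
```

The indices are only meaningful relative to the full collection. Once `GenPart` is dropped or filtered, the stored positions no longer point at the right entries, which is why this kind of check has to run during post-processing.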
So, for this short-term task, the logic of determining good candidates for photon reconstruction has to be partially done in the post-processing stage. It's a bit risky because the price of making a mistake there is higher (due to longer turnaround times spent on post-production) compared to analysis level.
There are some corner cases that need to be addressed that make this task somewhat non-trivial. For instance, there could be a decay like `e+ -> e+ + e+ + e-`, where any of the three decay products undergoes further generator/showering steps before they become final state leptons. To resolve this, something like this could work:
At the analysis level, when applying the veto, we read genuine photons, proxy photons and particles that are part of the hard process.
edit: if my memory serves right and the intermediate steps don't conserve 4-momentum for some odd reason, then we should use the "true parents" to reconstruct the proxy photons.
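As a rough illustration of what reconstructing a proxy photon from a lepton pair could mean, the sketch below simply adds the two lepton four-momenta. That the proxy photon is defined as the plain four-vector sum is an assumption made for this example, as are the function names and the (pt, eta, phi, mass) input layout.

```python
import math


def to_cartesian(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to a Cartesian (px, py, pz, E) tuple."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return px, py, pz, energy


def proxy_photon(lepton1, lepton2):
    """Sum two lepton four-momenta; return (pt, mass) of the pair.

    Each lepton is a (pt, eta, phi, mass) tuple.
    """
    px, py, pz, energy = (
        a + b for a, b in zip(to_cartesian(*lepton1), to_cartesian(*lepton2))
    )
    pt = math.hypot(px, py)
    mass_squared = energy * energy - (px * px + py * py + pz * pz)
    # Guard against tiny negative values caused by floating-point rounding
    return pt, math.sqrt(max(mass_squared, 0.0))
```

If the intermediate generator steps really do not conserve four-momentum, the inputs would have to be the "true parents" rather than the final state leptons, as noted above.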
I'm wondering how we should treat the ttbar samples that were produced with different top masses, widths, tune variations, etc. in this context. We don't need these samples in the HH-multilepton analysis, but they become relevant in analyses such as bbWW DL, where the irreducible ttbar background is huge (hence motivating additional NP on the ttbar background).
I tend to think that we should apply the same veto for the extra ttbar samples as well because the associated systematic uncertainties are derived wrt the nominal ttbar background that is subject to the veto.
I'll probably move these lines to the `analysisConfig.py` base class:
https://github.com/HEP-KBFI/hh-multilepton/blob/42931681d687f5e14166a8ecacb0a6403a5f9385/python/samples/reclassifySamples.py#L107-L119
because not all analyses in this repository or in the HH-bbww repository (or any of the ttH analyses) have the gen photon filter implemented. So this feature should be enabled on a per-channel basis, since I have no plans to update another 20+ executables.
The gen photon filter has now been implemented in our analysis FW. I also added a validation workflow that basically reproduced the same plots that @kramerto showed us for W+jets in his studies. The plots are available here: testGenPhotonFilter.pdf
A few comments:
From these plots it's apparent that the pT spectrum of the "extra" lepton in each case is smooth if the gen photon filter is properly applied.
The Ntuple post-processing will likely finish tomorrow (unless some jobs exceed the 48h mark like some 2018 jobs did); reskimming shouldn't take more than a day.
At the moment we're using the same `GenPhoton` collection, which consists of prompt status = 1 gen photons and serves to estimate the conversion background, also in the gen photon filter that resolves the overlap between e.g. W+gamma and W+jets samples. There is no reason to use only stable photons in the veto, because we see that the ME photon in W+gamma (which the current veto is based on) can also decay into a pair of leptons. Thus, we should use a different gen photon collection (`GenPhotonAll`) that includes unstable photons in order to apply the veto effectively.

Since we're still discussing other options to make the veto more effective, I'm currently just creating the issue.