Apparent true q2 bin outliers (validating step 1 Bd2DstMuNu MC ntuples)

afernez commented 3 years ago

As shown on slides 4 and 5 of this presentation, the true q2 histograms for our run 1 and run 2 Bd2DstMuNu MC seem to have 2 bins that are overfilled in the reconstructed tree wrt the non-reconstructed tree, for an unknown reason. The "outlier" bins in question are near q2=0 and q2=4:

I don't think I've identified the "cause" of these outliers, but there are a few points I've investigated that I wanted to highlight.

1. When an event has multiple (>1) candidates and only one candidate is kept, cutting the "extra" candidates seems to significantly reduce the outlier bins (and almost exclusively these bins); it seems these outlier bins contain most of the events with multiple candidates. In this plot, the red histogram includes all candidates in all events, while the blue histogram only includes one candidate per event. Notably, even if all candidates were excluded for events with multiple candidates (which would create a new histogram whose outlier bins would lie, at most, a distance below the blue histogram equal to the difference between the blue and red histograms in these bins), it seems the exact shape of the non-reconstructed q2 distribution (plot above) would not be recovered. Also as a note: I've only plotted this for run 1 because run 1 has higher statistics than run 2 (the run 2 ntuple I used only uses a subset of all the .dst files for the decay).
2. In the above presentation, one thing I'm comparing is the yields (wrt the number of events on disk) between Phoebe's and our run 1 ntuples. These should be equal, or at least close, since the ntuples are produced from the same .dst files and I apply "compensating" cuts when making the plots to try to account for any differences between the productions of the ntuples (the main differences between Phoebe's and our run 1 ntuples are due to the different reco scripts and, relatedly, DaVinci versions). As of right now, these yields (found in the presentation) are relatively close (19.7% for Phoebe vs 18.0% for us). Out of curiosity, I plotted the true q2 distribution for Phoebe's ntuple vs ours to see if it had similar outlier bins (Phoebe didn't have a non-reconstructed tree like we do in our ntuples, so I'm comparing reconstructed events). The remaining excess events in Phoebe's ntuple wrt ours, after I apply compensating cuts, seem to be primarily in the two outlier true q2 bins (about 7000 out of about 10000, to be precise). That is, Phoebe's ntuple also seems to have these strange outlier bins (not confirmed outliers in her case, since there's no non-reconstructed tree to compare to, but I suspect these bins would be outliers for her like they are for us), and in fact hers are much more pronounced than ours. As a note for the first plot after this point: all candidates are included for each event. Related to point 1 above, the second plot following this point shows the number of candidates in each event, and it appears Phoebe has many more multiple-candidate events than us (and even some events with 3 or more candidates).

   It would be interesting to see if Phoebe's outlier excess in the first plot became significantly reduced if only one candidate were kept per event. My expectation is that the outliers would be reduced, along with her yield becoming more similar to ours. As a side note, I didn't produce this plot for now because I'm not sure how to keep track of event numbers when using plot_scripts, which is why I wrote my own macro to make plots with shared events between trees and with only one candidate per event; however, I haven't implemented the compensating cuts inside my macro. Of course, I could address one of these two issues and make the plot, if there's sufficient interest.
3. To see if the outlier q2 bins were correlated with distortions in the histograms of any variables I could think of, I plotted a bunch of variables with custom q2 "cuts": 0<q2<1 (containing the first outlier bin), 4<q2<5 (containing the second outlier bin), q2 outside of the 0-1 and 4-5 intervals, and all q2. The plots used our run 1 reconstructed tree, included all candidates for each event, had all of my compensating cuts applied, and had each histogram normalized to 1 to compare shapes. For the most part, I did not observe anything in these plots that I found obviously meaningful for the question of what caused the outlier bins, but I'll highlight the plots where the various q2 cuts did visibly affect shapes:
- These plots (and other similar momentum plots) demonstrate an expected physics correlation. Notably, the momentum variables plotted aren't truth variables, but the reconstructed q2 is still correlated with the true q2, so the reconstructed variables should still reflect the trends that truth variables would. With q2 being the squared mass of the off-shell W in the decay, and higher q2 corresponding to more energy carried away from the D* by the muon (and neutrino), it's expected that higher q2 should correspond to overall lower D* momentum (i.e. fewer high p events) and overall higher muon momentum (i.e. more high p events), and vice versa. For the chosen cuts, the green histograms have (relatively) the most high (true) q2 events, with black next, then red, and finally the blue histogram has the most low q2 events. The plots indeed show the expected trends, though this does not tell us much about the origin of the outlier bins:
  - D* most to least high p events: blue, red, black, green
  - muon most to least high p events: green, black, red, blue
- This plot reflects the already suspected point: the outlier bins contain the majority of events with multiple candidates (in particular, the outlier bin at q2=0 has the most)
- I'm not sure how to interpret these final plots; they also don't jump out to me as being indicative of something causing (or caused by) the outlier q2 bins, but I wanted to include them in case somebody else had any thoughts about them. Actually, the B (measured) mass plot raises an unrelated question for me: why are these histograms not peaked around the nominal B mass (around 5280 MeV)? This very wide peak centered at the wrong (?) value seems to be common amongst all the ntuples (ie. Phoebe's and our run 1 and run 2). I also cannot think of a reason why different q2 cuts would distort the shape of this histogram, but I don't think the distortions can be attributed solely to the "excess" of the outliers (the distortions are too significant).
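The q2 "cuts" and unit normalization described in point 3 can be sketched in plain Python as below. This is only an illustration under assumptions, not the macro or plot_scripts configuration actually used: the field names `q2_true` and `mu_p`, the bin edges, and the toy values are all hypothetical, and the strict inequalities simply follow the cuts as written above.

```python
from collections import defaultdict

def q2_region(q2):
    """Assign a true-q2 value to one of the regions used for the shape comparison."""
    if 0 < q2 < 1:
        return "0<q2<1"   # contains the first outlier bin
    if 4 < q2 < 5:
        return "4<q2<5"   # contains the second outlier bin
    return "other"

def normalized_histogram(values, edges):
    """Fill a histogram with the given bin edges and normalize its integral to 1."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else counts

def shapes_by_q2_region(candidates, variable, edges):
    """Return one unit-normalized histogram of `variable` per q2 region, plus 'all'."""
    groups = defaultdict(list)
    for cand in candidates:
        groups[q2_region(cand["q2_true"])].append(cand[variable])
        groups["all"].append(cand[variable])
    return {region: normalized_histogram(vals, edges) for region, vals in groups.items()}

# Toy candidates (hypothetical values, not taken from the ntuples).
toy = [
    {"q2_true": 0.5, "mu_p": 12.0},
    {"q2_true": 0.7, "mu_p": 35.0},
    {"q2_true": 4.5, "mu_p": 55.0},
    {"q2_true": 8.0, "mu_p": 70.0},
]
edges = [0.0, 25.0, 50.0, 100.0]
shapes = shapes_by_q2_region(toy, "mu_p", edges)
for region, shape in shapes.items():
    print(region, shape)
```

Normalizing each region separately is what lets the shapes be compared even though the regions contain very different numbers of candidates.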

Final thoughts: I'm fairly confident in the correlation between multiple candidates and the excesses in the outlier bins, but I'm not sure what this correlation could possibly imply. Perhaps true q2=0 events are more likely to be reconstructed with multiple candidates (but why?), but why would q2=4 also be special? Even more, I note in point 1 that I don't believe multiple candidate events can account for all of the excess in the outlier bins, so even if this correlation were explained, where is the other excess originating from? We can discuss this in the meeting Tuesday, but for now just FYI @manuelfs
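For reference, the three selections compared in point 1 (all candidates, one candidate per event, and no multiple-candidate events at all) can be sketched as follows. This is a minimal illustration, not the macro mentioned above: the keys `runNumber`, `eventNumber`, and `q2_true` are assumed names, and keeping the first candidate encountered is an arbitrary choice made only for the example.

```python
from collections import Counter

def event_key(cand):
    """Identify an event by (run number, event number); both names are assumptions."""
    return (cand["runNumber"], cand["eventNumber"])

def candidates_per_event(candidates):
    """Count how many candidates each event contains."""
    return Counter(event_key(c) for c in candidates)

def keep_one_candidate_per_event(candidates):
    """Keep only the first candidate seen in each event (arbitrary choice)."""
    seen = set()
    kept = []
    for cand in candidates:
        key = event_key(cand)
        if key not in seen:
            seen.add(key)
            kept.append(cand)
    return kept

def drop_multi_candidate_events(candidates):
    """Remove every candidate belonging to an event with more than one candidate."""
    counts = candidates_per_event(candidates)
    return [c for c in candidates if counts[event_key(c)] == 1]

# Toy candidates (hypothetical values, not taken from the ntuples).
toy = [
    {"runNumber": 1, "eventNumber": 10, "q2_true": 0.2},
    {"runNumber": 1, "eventNumber": 10, "q2_true": 0.3},
    {"runNumber": 1, "eventNumber": 11, "q2_true": 6.5},
    {"runNumber": 2, "eventNumber": 10, "q2_true": 4.1},
    {"runNumber": 2, "eventNumber": 10, "q2_true": 4.4},
    {"runNumber": 2, "eventNumber": 10, "q2_true": 4.2},
]

multiplicity = Counter(candidates_per_event(toy).values())
one_per_event = keep_one_candidate_per_event(toy)
single_only = drop_multi_candidate_events(toy)

print("events by number of candidates:", dict(multiplicity))
print("one candidate per event:", [c["q2_true"] for c in one_per_event])
print("single-candidate events only:", [c["q2_true"] for c in single_only])
```

Keying on both run and event number matters because event numbers alone are not guaranteed to be unique across runs.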

manuelfs commented 3 years ago

Phoebe suggested that the peaks were due to the truth-matching failing (the one at q2 ~ 4 GeV^2 would be the D* truth-matching succeeding, but the B failing), and Alex confirmed this by applying abs(b_TRUEID)==511 && abs(dst_TRUEID)==413.
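As a rough sketch of what this selection does, the same requirement can be written in Python over a list of candidates. The branch names `b_TRUEID` and `dst_TRUEID` and the IDs 511 (B0) and 413 (D*+) come from the cut above; the dictionary layout, the `q2_true` key, and the toy values (including using 0 as the ID of an unmatched particle) are assumptions made for illustration only.

```python
B0_PDG_ID = 511    # PDG Monte Carlo ID of the B0
DST_PDG_ID = 413   # PDG Monte Carlo ID of the D*(2010)+

def is_truth_matched(cand):
    """Apply abs(b_TRUEID)==511 && abs(dst_TRUEID)==413 to one candidate."""
    return (abs(cand["b_TRUEID"]) == B0_PDG_ID
            and abs(cand["dst_TRUEID"]) == DST_PDG_ID)

def split_by_truth_matching(candidates):
    """Return (matched, failed) lists of candidates."""
    matched = [c for c in candidates if is_truth_matched(c)]
    failed = [c for c in candidates if not is_truth_matched(c)]
    return matched, failed

# Toy candidates (hypothetical values; 0 stands in for an unmatched particle).
toy = [
    {"b_TRUEID": 511, "dst_TRUEID": -413, "q2_true": 6.5},   # both matched
    {"b_TRUEID": -511, "dst_TRUEID": 413, "q2_true": 2.1},   # both matched (conjugate)
    {"b_TRUEID": 0, "dst_TRUEID": 413, "q2_true": 4.0},      # B fails, D* succeeds
    {"b_TRUEID": 0, "dst_TRUEID": 0, "q2_true": 0.0},        # both fail
]

matched, failed = split_by_truth_matching(toy)
print("matched q2:", [c["q2_true"] for c in matched])
print("failed q2:", [c["q2_true"] for c in failed])
```

The absolute values accept both the particle and its charge conjugate, which is why the second toy candidate passes.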

afernez commented 3 years ago

Just to provide plots for this point that the outlier bins were due to truth-matching failing, here are plots of q2 for runs 1 and 2, with shared events from the no reco tree in black, events from the reco tree in grey, and reco events with truth-matching applied for the various particles in the decay in the various colors:

Notable things:

- Referencing the plots of true IDs (plots included with the ntuples; I won't include them here), it's clear that the particles with the most truth-matching failures are the B and D*.
- In this plot we see that once the B is truth-matched, the outlier bins are reduced to normal. Actually, it seems most of the D* truth-matching failures are also B truth-matching failures, since we only get 1 fewer entry when adding D* matching on top of B matching (but the other way around isn't true, i.e. there are more B failures than D* failures, and hence the peak at q2=4).
- Most particles seem to have some truth-matching failures, except the D (I'm not sure why; as a check, the plots included with the ntuples also show that the D does not have any truth-matching failures); though, again, the dominant effect is the B and D* truth-matching.
- It seems the events that had truth-matching fail (namely, for the B or for both B and D*) come slightly more commonly from high-q2 events (high q2 in the non-reconstructed event, that is); i.e. the two peaks at q2=0,4 are really events that slightly more commonly had high q2 before DaVinci reconstruction.
- Finally, I just wanted to note that although truth-matching failures and multiple candidates are correlated, they aren't one-to-one. This plot of the total candidates in each event accepts multiple candidates (unlike the other plots in this comment), and it has truth-matching applied for all particles in the decay, as well as the compensating cuts that I made for validation purposes (see slide 5 here).
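One way to check that truth-matching failures and multiple candidates are correlated but not one-to-one is to count events in a 2x2 table. The sketch below assumes hypothetical keys `runNumber` and `eventNumber`, reuses the B and D* requirement quoted earlier in the thread as the only truth-matching condition (the plots above apply it to more particles), and runs on toy data rather than the ntuples.

```python
from collections import Counter, defaultdict

def is_truth_matched(cand):
    """B and D* truth-matching only: abs(b_TRUEID)==511 && abs(dst_TRUEID)==413."""
    return abs(cand["b_TRUEID"]) == 511 and abs(cand["dst_TRUEID"]) == 413

def cross_tabulate(candidates):
    """Count events by (has multiple candidates, has a truth-matching failure)."""
    by_event = defaultdict(list)
    for cand in candidates:
        by_event[(cand["runNumber"], cand["eventNumber"])].append(cand)
    table = Counter()
    for cands in by_event.values():
        has_multiple = len(cands) > 1
        has_failure = any(not is_truth_matched(c) for c in cands)
        table[(has_multiple, has_failure)] += 1
    return table

# Toy candidates (hypothetical values; 0 stands in for an unmatched particle).
toy = [
    {"runNumber": 1, "eventNumber": 1, "b_TRUEID": 511, "dst_TRUEID": 413},
    {"runNumber": 1, "eventNumber": 1, "b_TRUEID": 0, "dst_TRUEID": 413},
    {"runNumber": 1, "eventNumber": 2, "b_TRUEID": 0, "dst_TRUEID": 0},
    {"runNumber": 1, "eventNumber": 3, "b_TRUEID": 511, "dst_TRUEID": 413},
    {"runNumber": 1, "eventNumber": 3, "b_TRUEID": -511, "dst_TRUEID": -413},
    {"runNumber": 1, "eventNumber": 4, "b_TRUEID": 511, "dst_TRUEID": 413},
]

table = cross_tabulate(toy)
for (has_multiple, has_failure), n_events in sorted(table.items()):
    print(f"multiple={has_multiple} failure={has_failure}: {n_events} event(s)")
```

Nonzero off-diagonal entries (multiple candidates without a failure, or a failure in a single-candidate event) are what "not one-to-one" would look like in such a table.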

umd-lhcb / lhcb-ntuples-gen

Apparent true q2 bin outliers (validating step 1 Bd2DstMuNu MC ntuples) #60