umd-lhcb / lhcb-ntuples-gen

ntuples generation with DaVinci and in-house offline components

Potential truth-matching problem #119

Closed yipengsun closed 1 year ago

yipengsun commented 2 years ago

Discussed $D_1^0$ truth-matching, and she thinks things to check are:

  1. Make sure to also use dst_gd_gd_mom for truth-matching (this might be the reason why D0PiPi events get killed).
  2. Sometimes, for the Dst channel, the truth-matching is all OK except that BKGCAT = 50 instead of 0. In this case, ignore the BKGCAT and consider the event truth-matched.
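
A minimal sketch of the BKGCAT rule in item 2, assuming the decay-chain check has already been reduced to a single boolean. The function name and its arguments are hypothetical and not taken from the repository; the assumption that BKGCAT = 0 is otherwise required is inferred from "instead of 0".

```python
def truth_match_ok(decay_chain_ok: bool, bkgcat: int, is_dst_channel: bool) -> bool:
    """Hypothetical helper: combine the decay-chain check with BKGCAT."""
    if not decay_chain_ok:
        # The MC decay chain itself does not match: never accept.
        return False
    if bkgcat == 0:
        # Assumed nominal case: BKGCAT = 0.
        return True
    # Rule from the discussion: in the Dst channel, BKGCAT = 50 is
    # tolerated as long as everything else in the truth-matching is OK.
    return is_dst_channel and bkgcat == 50
```

For example, `truth_match_ok(True, 50, True)` accepts the event, while the same BKGCAT outside the Dst channel is rejected.
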
yipengsun commented 1 year ago

The new truth-matching might be (NOT sure!) problematic. While we're working on this, I need to get the basic D** templates ready, so I've created another branch that keeps the OLD truth-matching (buggy-truth-matching).

We'll continue on master for latest truth-matching efforts.

yipengsun commented 1 year ago

This is what a D** template looks like, with the new truth-matching:

[Figure: updated_truth_match]

The old template can be seen at https://github.com/umd-lhcb/rdx-run2-analysis/blob/master/docs/fit_templates/pre_final_reweighting/22_08_25-2016_Dstst.pdf

yipengsun commented 1 year ago

So, either the updated truth-matching is still buggy, or we don't have enough MC for this template.

yipengsun commented 1 year ago

I think it is more likely that the TM is bugged: the plot I showed above is for D_0, which doesn't have PiPi templates (only D_1 does).

yipengsun commented 1 year ago

Or, it could be due to PID problems: https://github.com/umd-lhcb/lhcb-ntuples-gen/issues/120

When I copied the auxiliary ntuples, I forgot to remove the aux PID ones (applying PID weights is fast, and the weights may change more frequently, so I typically never cache them).

Cache validation is indeed pretty hard. I'll try re-running and see what happens.

yipengsun commented 1 year ago

OK, I can confirm that the issue is on the ProbNNk side. NOT truth-matching! False alarm!

yipengsun commented 1 year ago

I think the D1PiPi truth-match problem has been solved.

yipengsun commented 1 year ago

Alex and I checked the truth-matching, and we think there's no more problem.

@afernez can you document the problem and the fix regarding D**PiPi and close this issue?

afernez commented 1 year ago

Sure, actually I'll write here about the two problems we thought about for completeness, though one was seemingly never a truth-matching bug. I think both problems still have some thinking remaining to be done, but the $D^0\pi\pi$ issue can probably be considered resolved for now.

First, we found that in the $D^{*}$ sample we had surprisingly low $D^{**}\tau$ stats compared to Phoebe (see slides 8-9 here). We worried that this was because of a bug on our side, but the majority of the stats difference is explained by Phoebe using separate requests/.dec files for her $D^{*}$ and $D^0$ $D^{**}$ MC, while we shared the $D^{**}$ MC between $D^{*}$ / $D^0$ (note: if we end up deciding we need more MC, probably we should use her $D^{*}$ .dec file). Still, the important ratio is not Our MC/Phoebe MC but Our MC/Our Data (we would ideally like to be at least about 10x); this check is being continued in this issue.

Second, we found that in the $D^0$ sample, the $D_1\rightarrow D^0 \pi\pi$ templates were very low on events (see slides 5-6 here). In these templates, Phoebe allows for events that proceed as

  1. $B \rightarrow D_1 [\rightarrow D^0 \pi \pi ] l \nu_l$
  2. $B \rightarrow D_1 [\rightarrow D^{*} (\rightarrow D^0 \pi) \pi \pi ] l \nu_l$
  3. $B \rightarrow D_1 [\rightarrow D_0^{*} (\rightarrow D^0 \pi) \pi \pi ] l \nu_l$ (note: 95% of $D_0^{*}$ goes to $D^0 \pi$ in .dec files)

(note: 1. and 2. are only allowed in principle; there are no such $D_1$ decays included in the .dec files!) while cutting out $B \rightarrow D_1 [\rightarrow D_0^{*} (\rightarrow D^{*} [\rightarrow D^0 \pi ] \pi \pi ) \pi ] l \nu_l$ and all other $D^{**}$. On the other hand, the nominal (not $\pi\pi$) templates include events that go as

  i. $B \rightarrow D^{**} [\rightarrow D^0 \pi] l \nu_l$
  ii. $B \rightarrow D^{**} [\rightarrow D^{*} (\rightarrow D^0 \pi) \pi] l \nu_l$

It took some staring to figure out, but the reason for the low $\pi\pi$ stats was a bug in my truth-matching code that was classifying 3. events as ii. (and so incorrectly putting the events in the not $\pi\pi$ templates).
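
To illustrate the distinction that matters here, below is a minimal sketch, not the actual truth-matching code. It assumes each event has been reduced to a label for the $D^{**}$ and a label for the intermediate resonance (if any) on the way to the $D^0$; the string labels and the function name are hypothetical. Pion counting and channels 1. and 2. are left out, since those decays are not in the .dec files anyway.

```python
from typing import Optional

# Illustrative labels only; real code would presumably key on MC truth IDs.
D_STAR = "D*"
D0_STAR = "D_0*"


def classify_d0_sample(dstst: str, intermediate: Optional[str]) -> str:
    """Assign a D** event in the D0 sample to 'pipi', 'nominal', or 'other'."""
    if intermediate == D0_STAR:
        # Channel 3.: D_1 -> D_0* (-> D0 pi) ... belongs in the pipi templates.
        # Any other D** going through a D_0* is cut.
        return "pipi" if dstst == "D_1" else "other"
    if intermediate is None or intermediate == D_STAR:
        # Channels i. (direct D** -> D0 pi) and ii. (via D*): nominal templates.
        return "nominal"
    return "other"
```

The point of the sketch is that 3. and ii. differ only in which resonance sits between the $D^{**}$ and the $D^0$, so a check that ignores the identity of the intermediate state would lump them together.
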

With this bug fixed, I ran on a subset of our tracker-only MC (001/002 files for 11874430/12873450 and 1/2 files for 11874440/12873460 in this folder) to check the BFs looked okay wrt the .dec files. For this, I looked at all $D^{**}$ $\pi\pi$ templates (not all used in fit, but my code still assigns them all nonzero truth-matching values) and included all the types of decays 1., 2., 3. to get the fullest comparison to the .dec files (Phoebe creates these templates too for comparison purposes and just doesn't use them in the fit).

** Some decays included in the .dec files are cut by truth-matching, so to compare to the expected BFs from the .dec files, one would need to renormalize the BFs. However, only a very small fraction of events are cut, so I don't bother renormalizing. Also, these comparisons are naive because the reco is better/worse for some decays (i.e. all decays in the .dec file are weighted the same in the ** column, regardless of reco eff.).
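
For reference, the renormalization that is skipped here would amount to dividing each kept BF by the sum of the kept BFs. A minimal sketch follows; the mode names and BF values are made-up placeholders, not numbers from the .dec files.

```python
def renormalize_bfs(bfs, kept_modes):
    """Rescale the BFs of the kept modes so that they sum to 1."""
    kept_total = sum(bfs[mode] for mode in kept_modes)
    if kept_total <= 0:
        raise ValueError("kept modes have no branching fraction to renormalize")
    return {mode: bfs[mode] / kept_total for mode in kept_modes}


# Placeholder numbers: 2% of the decays are cut by truth-matching.
example_bfs = {"mode_a": 0.60, "mode_b": 0.38, "mode_cut": 0.02}
renormed = renormalize_bfs(example_bfs, ["mode_a", "mode_b"])
```

With only a small fraction cut, the rescaling factor stays close to 1 (here 1/0.98), which is why skipping it barely changes the comparison.
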

Interestingly, the reco. eff. consideration noted in ** only seems to considerably affect the $B^- \rightarrow D^{**0}$ decays (or, more accurately, it seems to affect the $D^{**+}$ BFs differently from $D^{**0}$). The obvious thing to try to explain the difference is the presence of neutral vs charged pions, but I can't seem to justify the $D^{**+}$ vs $D^{**0}$ difference this way. I again checked my code to make sure that nothing there could be causing the difference, and it shouldn't be truth-matching related (the $D^{**(+,0)}$ are treated identically).

If this difference is ignored, generally the ratios seem consistent with the truth-matching being okay now. At the least, the $D^0\pi\pi$ problem has been fixed.

yipengsun commented 1 year ago

Hmm, so for the charged $D^{**}$, the BF obtained from reco & truth-matching is larger than that specified in the .dec file, whereas for neutral $D^{**}$, the 2 BFs are more or less the same, right?

Can't think of a good reason to explain this either.

afernez commented 1 year ago

I think the $D^{**+}$ BFs seem fairly consistent actually (with a slight $D_0^{*}$ bump), at least with the naive estimation. And the neutral $D^{**0}$ are showing more $D_0^{*}$ and $D_1\rightarrow D^0 \pi \pi$ (which proceeds via $D_1 \rightarrow D_0^{*} \rightarrow D^0$) and less of the other $D^{**}$ relative to what's naively expected from the .dec files.

Maybe more $D_0^{*}$ events get reconstructed because the $D_0^{*}$ is the lightest $D^{**}$, so the pions from its decay are slowest/get lost more readily? And then that bump is enhanced for $D_0^{* 0} \rightarrow D^0 \pi^0$ (relative to $D_0^{* +} \rightarrow D^0 \pi^+$ because at least that pion is charged). But that's really just a guess.