Add D0 to the DaVinci script

yipengsun commented 4 years ago

We want to add D0 selection to the DaVinci script, for 2 purposes:

Use this as a filter for D* mu combo
Create a new tuple tree for D0 mu decays.

yipengsun commented 4 years ago

@CoffeeIntoScience I've included D0 Mu combo as a filter for D* Mu combo. I have these questions:

In your YCandReco3.py and YCandRecoD04.py, the _MyDmup and _MyDmum have the following decay descriptors:
- _MyDmup: [B+ -> D0 mu+]cc
- _MyDmum: [B+ -> D0 mu-]cc
But in the stripping, we have: [ '[B- -> D0 mu-]cc' , '[B+ -> D0 mu+]cc' ].

I think this is likely a typo in your scripts, can you confirm that?
In both of your scripts, the LoKi_Filters are named as DmuFltr, but I don't see them used anywhere in the code, so I put them in the EventPreSelector of various SelectionSequence, is this the right way to do it?

yipengsun commented 4 years ago

I still can't get this to work.

My understanding on Phoebe's instruction

I checked the comments in the YCandReco3.py, and my understanding on making D0-mu combo as a pre-filter is the following:

Define the D0-mu combo selection
Put the D0-mu selection in a selection sequence
Append the selection sequence to the DaVinci main sequence, before the Dst-mu sequence

I'm still doubtful about this, because I don't see a place where explicit filtering is happening.

The code runs, and here's the summary from the log, suggesting that indeed, the D0-mu combo selection is at least executed:

TimingAuditor.T...   INFO    DaVinciAnalysisSeq                             |    93.124 |    91.501 |    8.804    2166.1   106.62 |     749 |    68.534 |
TimingAuditor.T...   INFO     DaVinciUserSequence                           |    93.084 |    91.460 |    8.768    2166.1   106.62 |     749 |    68.504 |
TimingAuditor.T...   INFO      StateSmear                                   |     1.775 |     1.566 |    0.313      21.4     0.91 |     749 |     1.173 |
TimingAuditor.T...   INFO      MyProtoPMaker                                |     0.267 |     0.357 |    0.229       2.0     0.16 |     749 |     0.268 |
TimingAuditor.T...   INFO      StdNoPIDsVeloPions                           |     0.040 |     0.041 |    0.018       0.5     0.03 |     749 |     0.031 |
TimingAuditor.T...   INFO      SeqMyD0MuCombo                               |    34.552 |    34.326 |    5.220     224.2    16.29 |     749 |    25.711 |
TimingAuditor.T...   INFO       SELECT:AllStreams/Phys/b2D0MuXB2DMuForTauMuL|     7.850 |     7.939 |    2.474      96.6     5.87 |     749 |     5.947 |
TimingAuditor.T...   INFO       SelMyStrippedChargedK                       |     0.013 |     0.027 |    0.014       3.0     0.11 |     749 |     0.021 |
TimingAuditor.T...   INFO       SelMyStrippedChargedPi                      |     0.000 |     0.014 |    0.006       0.1     0.01 |     749 |     0.011 |
TimingAuditor.T...   INFO       SelMyD0                                     |    23.337 |    23.189 |    8.169 18446744949882880.0    12.17 |     749 |    17.369 |
TimingAuditor.T...   INFO       SELECT:Phys/StdAllLoosePions                |     0.984 |     1.044 |    0.287      15.8     0.81 |     721 |     0.753 |
TimingAuditor.T...   INFO       SelMyD0MuCombo                              |     0.180 |     0.146 |    0.106       1.2     0.07 |     721 |     0.106 |
TimingAuditor.T...   INFO      SeqMyB0                                      |     1.802 |     1.927 |    0.008      22.0     1.79 |     749 |     1.444 |
TimingAuditor.T...   INFO       SelMyDst                                    |     0.221 |     0.246 |    0.110       1.9     0.13 |     721 |     0.178 |
TimingAuditor.T...   INFO       SelMyStrippedMu                             |     0.022 |     0.015 |    0.008       0.1     0.01 |     436 |     0.007 |
TimingAuditor.T...   INFO       SelMyB0                                     |     1.032 |     1.168 |    0.835       6.3     0.46 |     436 |     0.509 |
TimingAuditor.T...   INFO       SelMyRefitB02DstMu                          |     0.097 |     0.261 |    0.209       1.0     0.09 |     410 |     0.107 |
TimingAuditor.T...   INFO      TupleB0                                      |    52.002 |    50.467 |    0.007    1918.2    95.65 |     749 |    37.800 |
TimingAuditor.T...   INFO     MonitoringSequence                            |     0.000 |     0.010 |    0.005       0.4     0.02 |     749 |     0.008 |

Here I attach the YCandReco3.py: YCandReco3.txt (as a .txt)

My implementation can be found here.

The result

The generated ntuple is exact to the previous one. I tried to tighten the momentum cuts for the D0-mu combo, in the hope that this will reduce number of input events. This doesn't change the output

More investigation

If I try to define a filter:

from PhysConf.Filters import LoKi_Filters
fltr_D0Mu_combo = LoKi_Filters(
    VOID_Code="(CONTAINS({} > 0))".format(sel_D0Mu_combo.outputLocation())
)

Then put this filter in EventPreSelector:

seq_B0 = SelectionSequence('SeqMyB0',
                           TopSelection=sel_refit_B02DstMu,
                           EventPreSelector=[fltr_D0Mu_combo])

then I get this DaVinci error:

Traceback (most recent call last):
  File "/opt/lhcb/lhcb/GAUDI/GAUDI_v32r2/InstallArea/x86_64-centos7-gcc8-opt/scripts/gaudirun.py", line 540, in <module>
    applyConfigurableUsers()
  File "/workspace/build/GAUDI/GAUDI_v32r2/InstallArea/x86_64-centos7-gcc8-opt/python/GaudiKernel/Configurable.py", line 1640, in applyConfigurableUsers_old
  File "/workspace/build/LHCB/LHCB_v45r2/InstallArea/x86_64-centos7-gcc8-opt/python/LHCbAlgs/Configuration.py", line 267, in __apply_configuration__
  File "/workspace/build/LHCB/LHCB_v45r2/InstallArea/x86_64-centos7-gcc8-opt/python/LHCbAlgs/Configuration.py", line 259, in definePersistency
  File "/workspace/build/LHCB/LHCB_v45r2/InstallArea/x86_64-centos7-gcc8-opt/python/GaudiConf/IOHelper.py", line 412, in setupSrvices
  File "/workspace/build/LHCB/LHCB_v45r2/InstallArea/x86_64-centos7-gcc8-opt/python/GaudiConf/IOHelper.py", line 809, in convertStreams
  File "/workspace/build/LHCB/LHCB_v45r2/InstallArea/x86_64-centos7-gcc8-opt/python/GaudiConf/IOHelper.py", line 781, in activeStreams
  File "/workspace/build/LHCB/LHCB_v45r2/InstallArea/x86_64-centos7-gcc8-opt/python/GaudiConf/IOHelper.py", line 210, in _isOutputStream
AttributeError: 'NoneType' object has no attribute 'split'

I searched the keyword outputLocation, and I don't see the output of the D0-mu combo used (also not in comment)

CoffeeIntoScience commented 4 years ago

please post more of the logs with davinci-related issues

yipengsun commented 4 years ago

Here's the full DaVinic log: Dst-20_07_08-cutflow_mc.log

manuelfs commented 4 years ago

Yipeng, did you try adding , EventPreSelector=[fltr])?

That is the part that Phoebe has commented out, but I don't see it in the reco script

SelMyDmum = Selection("SelMyDmum",Algorithm = _MyDmum, RequiredSelections = [SelMyD0,mulist_pre])
SeqStrip = SelectionSequence('SeqStrip', TopSelection = SelMyDmup)#, EventPreSelector=[fltr])
MyPreSelection = SeqStrip.sequence()

yipengsun commented 4 years ago

The fltr is just filtering on the stripping line:

fltr = HDRFilter ('StrippedBCands', Code = "HLT_PASS('Strippingb2D0MuXB2DMuForTauMuLineDecision')"

I put the same filter to the DaVinci().EventPreFilters to speed up execution, and we consider them mostly equivalent (We have some discussion in one of the issues, but can't find it right now).

yipengsun commented 4 years ago

I found some example from the starter kit on re-using particles build by other selections: https://lhcb.github.io/starterkit-lessons/second-analysis-steps/filter-in-trees.html

Which seems to be coincide with what Phoebe said. Unfortunately I still haven't been able to get it work. I've include D0 ntuples for debugging, but that is also not working yet.

I believe the direction is correct, probably somewhere in the decay chain I messed up some signs. Still working on it.

yipengsun commented 4 years ago

Now it's mostly working: I have D0 tree, and when I cut D0 Mu combo harder, I see fewer events in the Dst tree.

The bug is really stupid, once I spot it: When trying to combine B-, instead of feeding a D0 and a Mu, I fed a D0 and a Pion!

yipengsun commented 4 years ago

The remain tasks are:

Think about how to handle wrong-sign particles. I imagine we should apply the same cuts for wrong sign Mu and Pi samples, but I need to think more about the details
Propagate the changes (lots of them) to run 2 script
Think about how to validate this

yipengsun commented 4 years ago

Now I'm no longer apply D0 Mu combo cuts to the Dst Mu combo. In the Dst tree, the total number of events goes up from 429 to 451. I believe this is what we wanted.

yipengsun commented 4 years ago

Validation on the D* Mu combo, run 1:

Reconstruction script	D0 `cutflow`	D0 `dv_strip`	D* `cutflow`	D* `dv_strip`
old `reco_Dst.py` / `test_reco_Dst_Bminus.py`	714	717*	429	?
new `reco_Dst_D0.py`	714	719	452	455
`nSPDhits` cut	-	717	-	454

yipengsun commented 4 years ago

I'm still having problems reconciling run 2 results. Need to stare at the run 2 reconstruction script more closely!

yipengsun commented 4 years ago

I finally found the problem: I forgot to apply the run 2 HLT2 filter, which is a part of the stripping FOR RUN 2 ONLY.

yipengsun commented 4 years ago

Validation on the D* Mu combo, run 2:

Reconstruction script	D0 `cutflow`	D0 `dv_strip`	D* `cutflow`	D* `dv_strip`
old `reco_Dst.py`	?	?	396	?
new `reco_Dst_D0.py`	708	713	408	410
`nSPDhits` cut	-	708	-	408

manuelfs commented 4 years ago

Are these numbers without momentum smearing and the nSPDhits<600 cut? In any case, they look quite good

yipengsun commented 4 years ago

Yes, without momentum smearing and without nSPDhits<600.

manuelfs commented 4 years ago

so do they get closer if you apply nSPDhits<600 to dv_strip?

yipengsun commented 4 years ago

I've updated the numbers after applying nSPDhits cut. Run 1 still doesn't agree, but is consistent with @manuelfs's previous result (see #41). Run 2 does agree with cutflow numbers.

yipengsun commented 4 years ago

My guess on the run 1 discrepancy: The DaVinci version used for that stripping (stripping 21) is too old, and that old DaVinci is no longer compatible with more recent DaVinci.

yipengsun commented 4 years ago

I think I've mostly figured out the wrong-sign sample generation. The solution is: Previously, I was blindly copy Phoebe's wrong-sign decay descriptor: [B+ -> D0 mu-]cc, and the result is: Somehow the wrong-sign tree has the same number of events as the right-sign sample.

Now I blindly switched to the descriptor used in the stripping line: [B+ -> D0 mu+]cc (here the wrong-sign-ness is more subtle: D0 is wrong-sign, right-sign should be D~0), and now in the wrong-sign tree, the size is:

For run 1, ~17% of right-sign.
For run 2, ~41%.

Now, I'd like @manuelfs to check my selection logic, namely the inputs of each selection (in the RequiredSelections field for each selection) to make sure my trees are logically correct.

As a side note, for real data, we now have 5 trees:

B-
B- wrong-sign
B0
B0 wrong-sign Pion
B0 wrong-sign Muon

For MC, we have 2 trees:

B-
B0

yipengsun commented 4 years ago

Another thing that doesn't make sense to me: Since we are strictly following the stripping line reconstructions for B- trees, we should be able to reconstruct a B- from EVERY EVENT that passes stripping.

However, that is not the case, both in MC and real data. Here's a MC log of cutflow mode generation, to show the problem:

TimingAuditor.T...   INFO Algorithm                              (millisec) |    <user> |   <clock> |      min       max    sigma | entries | total (s) |
TimingAuditor.T...   INFO ----------------------------------------------------------------------------------------------------------------------
TimingAuditor.T...   INFO EVENT LOOP                                        |     3.483 |     3.444 |    0.168    2288.9    23.97 |   26633 |    91.736 |
TimingAuditor.T...   INFO  DaVinciEventSeq                                  |     3.056 |     3.020 |    0.096    2287.7    23.95 |   26633 |    80.457 |
TimingAuditor.T...   INFO   DaVinciInitAlg                                  |     0.032 |     0.033 |    0.025       0.6     0.01 |   26633 |     0.890 |
TimingAuditor.T...   INFO   FilteredEventSeq                                |     3.012 |     2.977 |    0.063    2287.7    23.95 |   26633 |    79.300 |
TimingAuditor.T...   INFO    StrippedBCands                                 |     0.097 |     0.100 |    0.061       1.6     0.04 |   26633 |     2.675 |
TimingAuditor.T...   INFO    DaVinciEventInitSeq                            |     0.026 |     0.007 |    0.006       0.1     0.00 |     754 |     0.005 |
TimingAuditor.T...   INFO     PhysInitSeq                                   |     0.000 |     0.001 |    0.001       0.0     0.00 |     754 |     0.001 |
TimingAuditor.T...   INFO     AnalysisInitSeq                               |     0.013 |     0.001 |    0.001       0.0     0.00 |     754 |     0.001 |
TimingAuditor.T...   INFO    DaVinciAnalysisSeq                             |   102.785 |   101.504 |   27.504    2287.6   101.29 |     754 |    76.534 |
TimingAuditor.T...   INFO     DaVinciUserSequence                           |   102.785 |   101.497 |   27.497    2287.5   101.29 |     754 |    76.529 |
TimingAuditor.T...   INFO      MyProtoPMaker                                |     7.440 |     7.338 |    0.303      22.1     2.06 |     754 |     5.533 |
TimingAuditor.T...   INFO      StdNoPIDsVeloPions                           |     0.079 |     0.032 |    0.010       0.1     0.02 |     754 |     0.024 |
TimingAuditor.T...   INFO      SeqMyB-                                      |    35.397 |    35.623 |    9.540     222.7    14.28 |     754 |    26.860 |
TimingAuditor.T...   INFO       SELECT:AllStreams/Phys/b2D0MuXB2DMuForTauMuL|     4.721 |     4.743 |    0.901      69.0     4.92 |     754 |     3.577 |
TimingAuditor.T...   INFO       SelMyStrippedChargedK                       |     0.026 |     0.009 |    0.007       0.0     0.00 |     754 |     0.007 |
TimingAuditor.T...   INFO       SelMyStrippedChargedPi                      |     0.000 |     0.004 |    0.002       0.0     0.00 |     754 |     0.004 |
TimingAuditor.T...   INFO       SelMyD0                                     |    27.427 |    27.485 |    6.023     197.9    11.55 |     754 |    20.724 |
TimingAuditor.T...   INFO       SelMyStrippedMu                             |     0.013 |     0.006 |    0.004       0.0     0.00 |     720 |     0.004 |
TimingAuditor.T...   INFO       SelMyB-                                     |     0.458 |     0.541 |    0.365       1.3     0.12 |     720 |     0.390 |
TimingAuditor.T...   INFO       SelMyRefitB-2D0Mu                           |     0.141 |     0.150 |    0.138       0.6     0.02 |     707 |     0.107 |

Notice that there are 754 events passed stripping, but we only reconstructed 707 B-'s in the end. I don't understand why.

However, I think for now I'm mostly satisfied with my progress on D0, and I'll move onto the Hammer task. Once Manuel finishes checking my selection logic, I'll merge the d0 branch, and add some sample ntuples as for future references.

manuelfs commented 4 years ago

First of all, nice post!

In the past you some times have omitted key information so that it is difficult to follow what you did, but here you stated clearly the problem, a possible solution, why you think it make sense, the results, and as bonus you added a helpful guide with the trees currently in the ntuples. The one thing that you should also add is a link to the script you wanted me to check. I think you mean reco_Dst.py, but it took me a while to remember that it is in the d0 branch (and renamed reco_Dst_D0.py). And even if the script is obvious, it is good etiquette to provide the link to reduce the clicking to readers of the post. But other than that, thanks for the clear and comprehensive post!

I think what you wrote is right. [B+ -> D0 mu-]cc probably is the same thing as [B- -> D0 mu-]cc because the reconstruction only cares about the final particles. When you changed it to [B+ -> D0 mu+]cc then you are forcing wrong combinations of the final particles.

manuelfs commented 4 years ago

As for "we should be able to reconstruct a B- from EVERY EVENT that passes stripping", I'm not sure that is right since our reconstruction may be tighter than the stripping.

yipengsun commented 4 years ago

As for "we should be able to reconstruct a B- from EVERY EVENT that passes stripping", I'm not sure that is right since our reconstruction may be tighter than the stripping.

But our reconstruction for the B- (D0 Mu combo) has the same cuts as defined in the stripping line (which has been checked by me several times), so I think what I claimed makes sense.

yipengsun commented 4 years ago

Phoebe confirmed that we should be able to reconstruct a D0 Mu combo from events that pass stripping, at least mostly the case. We are suspecting that we have tighter cuts somewhere.

CoffeeIntoScience commented 4 years ago

Is that log snippet above Run1 or Run2?

CoffeeIntoScience commented 4 years ago

In any case, I see you are loosing events fitting the D0 -> Kpi. The cuts look right... I am wondering if you are secretly not using the same fitter algorithm in combineparticles as the stripping did. IIRC the stripping lines should all be using LoKi::VertexFitter now. You should be sure you're not picking up OfflineVertexFitter as a default.

Side note, when you refit the particles you do expect to lose some, it's this D0 building that's strange.

manuelfs commented 4 years ago

Thank you Phoebe, good point. If the events on the log line are the input to the module, we lose 34 events in SelMyD0 and 13 in SelMyB- (which could be due to a difference induced by SelMyD0). There may be additional events lost in SelMyRefitB-2D0Mu, but for that we'd have to check the number of event in the ntuples.

The cuts in reco_Dst_D0.py

algo_D0 = CombineParticles('MyD0')
algo_D0.DecayDescriptor = '[D0 -> K- pi+]cc'
algo_D0.DaughtersCuts = {
    'K+': '(PT > 300*MeV) & (MIPCHI2DV(PRIMARY) > 45.0) &' + \
          '(TRCHI2DOF < 4) & (PIDK > 4) & (TRGHOSTPROB < 0.5)',
    'pi-': '(PT > 300*MeV) & (MIPCHI2DV(PRIMARY) > 45.0) &' + \
           '(TRCHI2DOF < 4) & (PIDK < 2) & (TRGHOSTPROB < 0.5)'
}
algo_D0.CombinationCut = "(ADAMASS('D0') < 200*MeV)"
algo_D0.MotherCut = "(ADMASS('D0') < 100*MeV) & (VFASPF(VCHI2/VDOF) < 100)"

do not seem tighter than those in the Run 1 stripping

so it may be worth asking in the DaVinci list if something has changed under the hood.

yipengsun commented 4 years ago

My internet proxy is not working, so I can't send emails or slack messages for now (note that github is NOT blocked in China).

Here's my draft email, @manuelfs can you send the email for me if you think my email is OK?

Events get cut unexpectedly in DaVinci reconstruction

Dear experts,

We are trying to reconstruct B- -> D0 mu- with the identical inputs and cuts as in the run 1 stripping line: http://lhcbdoc.web.cern.ch/lhcbdoc/stripping/config/stripping21/semileptonic/strippingb2d0muxb2dmunufortaumuline.html

We are running over 2012 MagDown real data, and we are filtering on the stripping line above. Ideally, we should be able to reconstruct a B- from every event that passes the stripping. However, from the DaVinci log: https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/a50e0e054a2a778120f66863fa0df3751907c7d4/run1-b2D0MuXB2DMuNuForTauMuLine/logs/Dst_D0-20_08_04-std.log#L4185-L4206

After the StrippedBCands and Hlt2TriggeredD0, we have 2285 events remaining. But after the SelMyD0, we have 2276 events, and after SelMyB-, we have only 1953 events left.

The SelMyD0 is identical to that of the stripping, as far as we are concerned: https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/a50e0e054a2a778120f66863fa0df3751907c7d4/run1-b2D0MuXB2DMuNuForTauMuLine/reco_Dst_D0.py#L277-L294 https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/a50e0e054a2a778120f66863fa0df3751907c7d4/run1-b2D0MuXB2DMuNuForTauMuLine/reco_Dst_D0.py#L328-L332

Same for the SelMyB-: https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/a50e0e054a2a778120f66863fa0df3751907c7d4/run1-b2D0MuXB2DMuNuForTauMuLine/reco_Dst_D0.py#L336-L348 https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/a50e0e054a2a778120f66863fa0df3751907c7d4/run1-b2D0MuXB2DMuNuForTauMuLine/reco_Dst_D0.py#L373-L377

Why do we lost events in these two selections that involves CombineParticles?

System info:

DaVinci/v45r4

Regards.

yipengsun commented 4 years ago

I've also investigated a bit about the default VertexFitter in CombineParticles, and I found the following:

The vertex fitter seems to NOT user-configurable: https://gitlab.cern.ch/lhcb/Phys/-/blob/master/Phys/ParticleCombiners/src/CombineParticles.h
No idea about the default particleCombiner: https://gitlab.cern.ch/lhcb/Phys/-/blob/master/Phys/ParticleCombiners/src/CombineParticles.cpp#L487 https://gitlab.cern.ch/lhcb/Phys/-/blob/master/Phys/ParticleCombiners/CombKernel/ParticleCombiner.h

manuelfs commented 4 years ago

Thank you Yipeng for the draft email. Your formatting seemed designed for GitHub?

I think you've sent good emails in the past asking questions, but just in case, a few general pointers that I try to follow

Summarize upfront what the email is going to be, and give some key details
Make it easy to read by avoiding too many links in the email
- I don't like to hide the actual link too much, so the most important links I move down, though some time I leave a few hyperlinked or inline
Copy the most relevant snippets in the email so that it may be possible to respond without clicking any link
Give thanks at the end

yipengsun commented 4 years ago

I just realized that since @manuelfs has checked my inputs to various selections (see #48 for more detail), The d0 branch can now be merged to master.

I'll do that, and delete the d0 branch.

umd-lhcb / lhcb-ntuples-gen