umd-lhcb / lhcb-ntuples-gen

ntuples generation with DaVinci and in-house offline components
BSD 2-Clause "Simplified" License

Simulation of triggers for tracker-only MC #59

Closed yipengsun closed 3 years ago

yipengsun commented 3 years ago

Given that the current DaVinci scripts are ready for FullSim but not for tracker-only, I'll develop on a tracker-only branch so as not to interfere with the FullSim submission work.

manuelfs commented 3 years ago

I think the FullSim scripts still need to have the PID removed, like the FastSim. The only difference between the two may be the trigger

yipengsun commented 3 years ago

@Svende here's my question:

For L0TOS emulation (which will be taken from data directly), this script uses a variable called _L0Calo_HCAL_xProjection. If you search for that keyword in the same repository, it is unclear how these variables are added.

My guess is that this comes with the real data sample, but we probably need an additional TupleTool for it. So the question is: what is the input for the L0TOS emulation? Do we need to add additional TupleTools for this sample (something from real data, presumably)?

Svende commented 3 years ago

I think the corresponding TupleTool is called TupleToolL0Calo/HCALtool and is added here: https://gitlab.cern.ch/lhcb-slb/B02DplusTauNu/-/blob/master/tuple_production/B2Dmu_AllSpecies_MC2016.py#L315. Lucia mentioned it in our last meeting with them, if I remember correctly. I think this script also adds the other TupleTools which are needed.

yipengsun commented 3 years ago

Svende is right. The TupleToolL0Calo is in Analysis/Phys/DecayTreeTupleTrigger.

yipengsun commented 3 years ago

Here's what I've done so far.

  1. Add the RelatedInfoTool to the DaVinci source, then compile. It works out of the box with DaVinci/v45r6. For more info, see this README.
  2. Add TupleToolTrackPosition so that variables like k_X are added. https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/910ae3eb8cff5689608576245a127ae483ac89bf/run2-rdx/reco_Dst_D0.py#L798
  3. Add RelInfoHLT1Emulation to DaVinci's mainSequence. https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/910ae3eb8cff5689608576245a127ae483ac89bf/run2-rdx/reco_Dst_D0.py#L636-L652
  4. Define the info that we need to extract from HLT1 variables https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/910ae3eb8cff5689608576245a127ae483ac89bf/run2-rdx/reco_Dst_D0.py#L612-L633 https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/910ae3eb8cff5689608576245a127ae483ac89bf/run2-rdx/reco_Dst_D0.py#L798-L806
  5. Compute HLT1 MVA BDT scores of certain particle combination https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/910ae3eb8cff5689608576245a127ae483ac89bf/run2-rdx/reco_Dst_D0.py#L809-L824

I've removed the TupleToolL0Calo and ran over the 3 sample tracker-only DST files that I have locally. This confirms that TupleToolL0Calo is not needed provided that step 3 is followed. The resulting ntuple still has the branches added by the emulation, but with nonsensical values.

So the problems remain:

  1. The newly added branches have nonsensical values
  2. No HLT2 line gets added at all, despite my configuration

Svende commented 3 years ago

Thanks Yipeng, I will take a look at that later. Do you also have your tuple somewhere so that I can take a look at it as well?

yipengsun commented 3 years ago

My locally generated ntuple can be accessed at:

/afs/cern.ch/user/s/suny/public/trigger_emulation/mc.root

yipengsun commented 3 years ago

BTW, RD+ people also added lots of additional variables in their DaVinci script, but these variables seem unused and the sample DaVinci ntuple doesn't have these variables either, so I didn't add them for now.

yipengsun commented 3 years ago

Just an aside: initially, these lines were in my script. I removed them because, once I add the RelInfoTool, this code makes no difference whether it is present or not:

  tuple.addTool(TupleToolL0Calo, name="HCALtool")
  tuple.HCALtool.WhichCalo = "HCAL"
  tuple.HCALtool.TriggerClusterLocation = "/Event/Trig/L0/Calo"
  tuple.ToolList += ["TupleToolL0Calo/HCALtool"]

yipengsun commented 3 years ago

As for the HLT2 emulation, I previously claimed that they did exactly the same thing as us. That is not entirely true. In their code, the only trigger-related part is this:

evtTuple                 = EventTuple()
evtTuple.ToolList       += ["TupleToolEventInfo", "TupleToolTrigger"]

But this evtTuple is not used anywhere else, so I'm even more confused as to why the sample DaVinci ntuple they provided has all these trigger branches at all.

yipengsun commented 3 years ago

Another thing that we should ask: They used TupleToolApplyIsolationMC instead of the regular TupleToolApplyIsolation. The former adds more truth info:

                    // -- Retrieve truth-level information
                    truepid = 0;
                    const LHCb::MCParticle* mcmaxpart(NULL);
                    for ( std::vector<IParticle2MCAssociator*>::const_iterator iMCAss = m_p2mcAssocs.begin(); iMCAss != m_p2mcAssocs.end(); ++iMCAss ) {
                        mcmaxpart = (*iMCAss)->relatedMCP(maxpart);
                        if ( mcmaxpart ) {
                            truepid = mcmaxpart->particleID().pid();
                            break;
                        }
                    }

It also adds PT and ETA:

                ptransverse = maxpart->momentum().Pt();
                eta = maxpart->momentum().Eta();

I'm not sure if we need this info. I think Phoebe didn't use it for run 1 MC. We should ask if this additional (mostly truth) info would benefit our analysis.

Svende commented 3 years ago

Thanks for documenting everything you did. I have been looking at your code and comparing it to theirs. I have some questions for you:

  1. I can't find where you specify the input of the RelatedInfoTool, as is done in this line: https://gitlab.cern.ch/lhcb-slb/B02DplusTauNu/-/blob/master/tuple_production/B2Dmu_AllSpecies_MC2016.py#L157
  2. Also, in their code they add the tool LoKi::Hybrid::TupleTool/Hlt1TwoTrackMVAEmulation here, which I can't find in your script. Do you know what it does or what it is used for?

Svende commented 3 years ago

I think the HLT1 emulation happens in this script: https://gitlab.cern.ch/lhcb-slb/B02DplusTauNu/-/blob/master/tuple_processing_chain/emulate_HLT1_cuts.py

yipengsun commented 3 years ago

For the LoKi::Hybrid::TupleTool/Hlt1TwoTrackMVAEmulation, I think it is just a specific instance of LoKi::Hybrid::TupleTool with the name Hlt1TwoTrackMVAEmulation. As stated previously, I don't see these variables getting used in the HLT1 offline script, so I didn't add them.

I fixed the bug that the output location is not specified. Please take a look to see if you agree with my changes. Still the output ntuple stayed the same.

yipengsun commented 3 years ago

If you agree with my statement above, we've reached a point of no known problems yet the emulation doesn't work. I think we should ask Simone and Julian.

yipengsun commented 3 years ago

For the TupleTool, from starter kit:

Its name, specified in the addTupleTool call after a /. This is very useful (and recommended) if we want to have different LoKi::Hybrid::TupleTool for each of our branches.

So I think my statement on that is right.
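
To make the "Type/Name" convention concrete, here is a small plain-Python sketch (not DaVinci code) that splits such a specification into its tool type and instance name. The helper name split_tool_spec is made up for illustration, and the fallback to the type name when no "/" is given is my assumption about the default behavior.

```python
def split_tool_spec(spec):
    """Split a 'Type/Name' tool specification into (tool_type, instance_name).

    Assumption: when no '/Name' part is given, the instance name
    falls back to the tool type itself.
    """
    tool_type, sep, name = spec.partition("/")
    return tool_type, (name if sep and name else tool_type)


# Both specifications appear in this thread
print(split_tool_spec("LoKi::Hybrid::TupleTool/Hlt1TwoTrackMVAEmulation"))
print(split_tool_spec("TupleToolL0Calo/HCALtool"))
```

Read this way, Hlt1TwoTrackMVAEmulation is only a label for one instance of LoKi::Hybrid::TupleTool, not a separate tool class.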

yipengsun commented 3 years ago

The latest ntuple with ALL branches added is available at:

/afs/cern.ch/user/s/suny/public/trigger_emulation/mc_more_brs.root

yipengsun commented 3 years ago

And all the newly added branches are identically -2000. I suspect the RelatedInfoTool is not properly added/incompatible with DaVinci/v45r6. If not, the tracker-only MC may be problematic (unlikely).
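
For reference, a check along these lines can be sketched in plain Python. It assumes the branches have already been read into a dict of lists (how they are read from the ROOT file is left out); the helper name constant_branches, the b0_ prefix, and the toy values are all made up for illustration and are not taken from mc_more_brs.root.

```python
def constant_branches(branches, sentinel=-2000):
    """Return the sorted names of branches whose entries all equal `sentinel`."""
    return sorted(
        name
        for name, values in branches.items()
        if len(values) > 0 and all(v == sentinel for v in values)
    )


# Toy input: hypothetical branch names and made-up values
toy = {
    "b0_DOCA_COMB_1_2": [-2000, -2000, -2000],
    "b0_PT": [5123.4, 8745.1, 6310.0],
}
print(constant_branches(toy))
```

Empty branches are deliberately not reported, since "no entries" is a different failure than "every entry is the default".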

yipengsun commented 3 years ago

My locally generated ntuple can be accessed at:

/afs/cern.ch/user/s/suny/public/trigger_emulation

L0

  1. In our correspondence, it seems that L0 TOS is entirely simulated offline. My interpretation is that this simulation only uses common variables that are available in DaVinci ntuples, and we don't add additional TupleTools for this. Am I correct?

HLT1

  1. First I tried to add the RelatedInfoTool to DaVinci. I stole your code, rearranged it according to the correct hierarchy, and placed it here. It compiled fine with DaVinci/v45r6.
  2. I added this tool and all related configuration to our DaVinci reconstruction script:

    https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/f9f3a5ef31b8beeb4fc9b2abbdddd72e9f3ad193/run2-rdx/reco_Dst_D0.py#L605-L650

    https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/f9f3a5ef31b8beeb4fc9b2abbdddd72e9f3ad193/run2-rdx/reco_Dst_D0.py#L835-L868

    Note that the TupleToolL0Calo is not added. This is because once I add RelInfoHLT1Emulation to DaVinci's main sequence, all the L0 Calo branches are automatically added.

  3. However, most of the branches added by the RelatedInfoTool, such as _DOCA_COMB_1_2, are identically -2000. We suspect that we didn't add the RelatedInfoTool properly. What DaVinci version were you using? Could you take a look and see if there's an obvious mistake?

HLT2

HLT2 is supposed to be emulated by DaVinci, and indeed, your script doesn't contain specific treatment for HLT2, AFAIK. However, we specified an HLT2 line in TupleToolTrigger, yet we don't even have that branch in our tracker-only ntuple. We are at a loss here. Could you give us more hints?

yipengsun commented 3 years ago

@Svende @manuelfs Above is my draft regarding our questions to Julian and Simone. Feel free to edit as you see fit. Also, in our email, can we send a link pointing to that github comment directly? Because the code is better displayed there.

simeloni commented 3 years ago

@yipengsun just a follow up on my email today. The first thing I would check is if you are able to save into your ntuples the variables that we add with the lines you removed (see this message):

For the LoKi::Hybrid::TupleTool/Hlt1TwoTrackMVAEmulation, I think it is just a specific instance of LoKi::Hybrid::TupleTool with the name Hlt1TwoTrackMVAEmulation. As stated previously, I don't see these variables getting used in the HLT1 offline script, so I didn't add them.

I fixed the bug that the output location is not specified. Please take a look to see if you agree with my changes. Still the output ntuple stayed the same.

I suspect the RELINFO Filter is not able to get the correct numbers for some reason.

yipengsun commented 3 years ago

Hey Simone, thanks for the quick reply!

The comment you quote is no longer valid. Before sending the email, I already added all these variables back, and yes, all of them are at -2000.

Thanks for the hint that -2000 is likely just the standard output of the MVA when its inputs are null. I'll dig deeper and see if we screwed up the input somewhere or if the TES has changed for our tracker-only MC.

simeloni commented 3 years ago

Sorry I did not see the relevant reply! A few comments below:

  1. In our correspondence, it seems that L0 TOS is entirely simulated offline. My interpretation is that this simulation only uses common variables that are available in DaVinci ntuples, and we don't add additional TupleTools for this. Am I correct?

That's correct! Some things are needed from the TupleToolL0Calo.

  1. However, most of the branches added by the RelatedInfoTool, such as _DOCA_COMB_1_2, are identically -2000. We suspect that we didn't add the RelatedInfoTool properly. What DaVinci version were you using? Could you take a look and see if there's an obvious mistake?

We are using v42

HLT2 is supposed to be emulated by DaVinci, and indeed, your script doesn't contain specific treatment for HLT2, AFAIK. However, we specified an HLT2 line in TupleToolTrigger, yet we don't even have that branch in our tracker-only ntuple. We are at a loss here. Could you give us more hints?

It is correct that this branch is absent from the tracker-only ntuple. HLT2 requires HLT1 to be run, hence HLT2 must also be emulated offline. Given that HLT2 already works on offline-quality data, we did not expect big differences. Additionally, HLT2 is written in the DaVinci-combineParticles framework, so it is simple to emulate lines. We just looked into the HLT2 line and added by hand the cuts on the relevant objects to our DaVinci script. PID selections required by HLT2 are emulated with PIDCalib.

yipengsun commented 3 years ago

I'm trying to reduce the problem to a bare minimum, and now I have something like this:

relinfo = AddRelatedInfo('RelInfo_HLT1_' + seq_B0.name())  # Only add for B0 sequence
relinfo.addTool(RelInfoHLT1Emulation, 'RelInfoHLT1Emulation')
relinfo.Tool = "RelInfoHLT1Emulation"
relinfo.Location = 'HLT1Emulation'
relinfo.Inputs = [seq_B0.outputLocation()]

dt_hlt1_emu = getattr(relinfo, 'RelInfoHLT1Emulation')
dt_hlt1_emu.Variables = []
dt_hlt1_emu.nltValue = int(DaVinci().DataType[2:])  # figure out the year by yourself, DaVinci!

DaVinci().appendToMainSequence([relinfo])

relinfo_output = tp.Inputs[0].replace('Particles', 'HLT1Emulation')
print('!!!! SOME DEBUG STUFF')
print(tp.Inputs[0])
print(relinfo_output)
print('!!!!')

tt_hlt1_emu = getattr(tp, B_meson).addTupleTool(
    'LoKi::Hybrid::TupleTool/Hlt1TwoTrackMVAEmulation')
tt_hlt1_emu.Preambulo = []

tt_hlt1_emu.Variables['alt_ndaughters'] = "RELINFO('" + relinfo_output + "', 'NDAUGHTERS', 0)"

The printed debug info:

!!!! SOME DEBUG STUFF
Phys/SelMyB0/Particles
Phys/SelMyB0/HLT1Emulation
!!!!

Yet the branch _alt_ndaughters is still identically -2000. I need to think harder on why the input to AddRelatedInfo is not working.
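
As a cross-check of the string handling only, here is a standalone plain-Python sketch of how the related-info location and the RELINFO functor string are built. It does not touch DaVinci; the helper names relinfo_location and relinfo_functor are hypothetical, and only the location and variable names from the debug output above are used.

```python
def relinfo_location(particle_location, relinfo_name="HLT1Emulation"):
    """Swap the trailing 'Particles' component of a TES path for `relinfo_name`."""
    head, sep, tail = particle_location.rpartition("/")
    if not sep or tail != "Particles":
        raise ValueError("expected a location ending in '/Particles'")
    return head + "/" + relinfo_name


def relinfo_functor(location, variable, default=0):
    """Build a RELINFO functor string of the same form as in the snippet above."""
    return "RELINFO('{}', '{}', {})".format(location, variable, default)


loc = relinfo_location("Phys/SelMyB0/Particles")
print(loc)
print(relinfo_functor(loc, "NDAUGHTERS"))
```

Unlike str.replace, this only touches the last path component, so a selection name that happens to contain "Particles" would not be rewritten by accident. The strings come out as expected, which suggests the problem is not in how the location is spelled.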

yipengsun commented 3 years ago

I think I know why. Previously I was mindlessly adding the RelatedInfoTool to DaVinci's main sequence first, and only then adding the selection sequences and tupling sequences. Thinking about it, when the RelatedInfoTool was running, the required input that is supposed to be provided by a selection sequence was not there yet! That's why we get a bunch of -2000s.
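
The ordering problem can be illustrated with a toy model in plain Python. This is not Gaudi or DaVinci code: the per-event store is just a dict, the three functions are stand-ins for the selection, related-info, and tupling steps, and the daughter count of 2 is made up.

```python
DEFAULT = -2000  # value seen in the ntuple when the related info is missing


def run_event(sequence):
    """Run toy 'algorithms' in order against a fresh per-event store."""
    store = {}
    for algorithm in sequence:
        algorithm(store)
    return store


def selection(store):
    # Stand-in for the selection sequence that writes the candidates
    store["Phys/SelMyB0/Particles"] = [{"ndaughters": 2}]


def related_info(store):
    # Stand-in for the related-info step: it can only act on existing candidates
    candidates = store.get("Phys/SelMyB0/Particles")
    if candidates:
        store["Phys/SelMyB0/HLT1Emulation"] = {
            "NDAUGHTERS": candidates[0]["ndaughters"]
        }


def tupling(store):
    # Stand-in for the tupling step: falls back to DEFAULT if nothing was written
    info = store.get("Phys/SelMyB0/HLT1Emulation", {})
    store["ntuple"] = {"alt_ndaughters": info.get("NDAUGHTERS", DEFAULT)}


wrong = run_event([related_info, selection, tupling])
right = run_event([selection, related_info, tupling])
print(wrong["ntuple"]["alt_ndaughters"])  # -2000
print(right["ntuple"]["alt_ndaughters"])  # 2
```

In the wrong order the related-info step finds no candidates, writes nothing, and the tupling step silently fills the default, which matches the symptom of every branch being identically -2000.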

yipengsun commented 3 years ago

Hmm, the chi2, fdchi2, nt, sumpt still don't look correct.

yipengsun commented 3 years ago

Also, branches like pi_L0Calo_HCAL_{TriggerET, TriggerHCALET, xTrigger, yTrigger} are either at -1 or 0. This is still wrong.

yipengsun commented 3 years ago

@simeloni I'm confused about the usage of TupleToolL0Calo. I'm trying to figure out why the pi_L0Calo_HCAL_{TriggerET, TriggerHCALET, xTrigger, yTrigger} branches are either all 0 or all -1, so I looked at the source.

It turns out that m_fillTriggerEt is always set to false, and this is not configurable. So these branches are never really populated, rather a default value is filled.

I then changed that TupleTool to something like this:

TupleToolL0Calo::TupleToolL0Calo( const std::string& type,
                                  const std::string& name,
                                  const IInterface*  parent )
    : TupleToolBase( type, name, parent ),
      m_caloDe( 0 ),
      m_adcsHcal( NULL )

{
  declareInterface<IParticleTupleTool>( this );
  declareProperty( "WhichCalo", m_calo = "HCAL" );
  declareProperty( "TriggerClusterLocation", m_location = "" );
  declareProperty( "FillTriggerEt", m_fillTriggerEt = false );
}

This should make FillTriggerEt configurable. I set this property to True in my DaVinci script, but I see no effect.

My current suspicion is that TupleToolL0Calo is somehow invoked by the RelatedInfoTool automatically, without the filling flag set. This is because I somehow still get these L0 branches even without adding the TupleToolL0Calo at all! But I can't find any mention of this tool in your code. Could you give me some pointers on why the L0 Calo branches are added automatically? Did this happen with your DaVinci v42?

simeloni commented 3 years ago

Hi @yipengsun, we did not write the TupleToolL0Calo tool, so I don't know it in full detail. I guess what you are looking at are objects from the trigger. Since the sample is tracker-only, no trigger has been run, so those variables are expected to be filled with standard values.

The branches you are interested in are others from that tool. They are the ones that represent the coordinates of the projection of the track on the HCAL plane, the Real ET, and the CALO region code.

The RelInfoHLT1Emulation tool should not be calling anything from the TupleToolL0Calo, so I honestly don't know why you already have the variables in the ntuples.

yipengsun commented 3 years ago

I figured out why we have branches added by TupleToolL0Calo in our ntuple even if I don't copy Simone's code: the tool was already added to our script (what a surprise).

yipengsun commented 3 years ago

We consider HLT1 emulation to be fully working. L0 Global might need some further checks and definitely needs another plot showing the decay mode independence. L0 Hadron still needs more work on figuring out the regression variable for the BDT.

The L0 Hadron process will be tracked in umd-lhcb/TrackerOnlyEmu#3.