trackreco / mkFit

Vectorized, Parallelized Tracking
https://trackreco.github.io/
Apache License 2.0

Loss of high pT tracks and high eta tracks #73

Closed kmcdermo closed 6 years ago

kmcdermo commented 7 years ago

Still needs to be investigated: will plot nHits as a function of barrel and endcap and see what comes out. Same with last layer the track ended up on (I will submit another PR on this addition to the validation).

https://kmcdermo.web.cern.ch/kmcdermo/full-det-tracking-validation/SNB_ToyMC_Barrel_EFF_pt.png
https://kmcdermo.web.cern.ch/kmcdermo/full-det-tracking-validation/SNB_ToyMC_Barrel_FR_pt.png
https://kmcdermo.web.cern.ch/kmcdermo/full-det-tracking-validation/SNB_ToyMC_Barrel_EFF_eta.png

Remember: the fake rate (FR) vs. pT is "high" at pT < 1 and pT > 10 because we only simulate tracks with 1 < pT < 10. Since the reconstructed pT can be anything, there are no real tracks at pT < 1 or pT > 10 to bring this rate down.
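A minimal sketch of why the edge bins look bad, assuming the fake rate is computed per reconstructed-pT bin as (unmatched reco tracks) / (all reco tracks). The function name, the bin edges, and the input format are illustrative assumptions, not the validation code itself:

```python
from bisect import bisect_right

def fake_rate_by_pt(reco_tracks, bin_edges):
    """reco_tracks: iterable of (reco_pt, is_matched) pairs (hypothetical format).
    Returns one fake rate per bin, or None for a bin with no reco tracks."""
    n_bins = len(bin_edges) - 1
    total = [0] * n_bins
    fakes = [0] * n_bins
    for pt, is_matched in reco_tracks:
        i = bisect_right(bin_edges, pt) - 1  # index of the bin containing pt
        if i < 0 or i >= n_bins:
            continue  # outside the binned range
        total[i] += 1
        if not is_matched:
            fakes[i] += 1
    return [f / t if t else None for f, t in zip(fakes, total)]
```

With sim tracks generated only in 1 < pT < 10, matched reco tracks mostly land in the bins inside that range, so bins outside it are dominated by unmatched tracks and the ratio goes towards 1.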

cerati commented 7 years ago

@kmcdermo, on which kind of events are these plots made? Can you check whether the inefficiency at high eta is also there when running on single-track events? In other words, is there an issue with the Kalman filter itself, or does it just get confused by the density?

kmcdermo commented 7 years ago

@cerati , I made these with 10k tracks per event x 10 events, with the toyMC with the default parameters inside Config.cc for full-det-tracking.

I will run the validation with 1 track/event and see what happens, should be quick.

kmcdermo commented 7 years ago

Okay, so a few things.

  1. By construction, single-track MC events have efficiency = 100%, FR = 0%, DR = 0%. With only one sim track per event, all hits in the reco track always come from the same sim track, so the fraction of reco hits matched to sim hits is 100%!

  2. However, I made a plot of nHits per reco track, split into barrel (blue) and endcap (red), shown below in linear and log scale (figures: nhits_region_lin, nhits_region_log).

  3. I encountered an infinite loop in the clone engine on single-track events that I currently do not have the brainpower to solve. It happened as early as event 23. I simply turned on the debug output and attached the log for that event. It never went past "processing lay=-1" after lay 27. @osschar, any idea what is happening here? log: blah.txt

As such, the plot was only produced in BH, but it shouldn't matter for single-track events, because again, only one track to explore!

With all that said, it looks like the building in the endcap is markedly worse than in the barrel -- only half of the tracks make it to the end. Also, we should probably fix whatever bug there is here for the CE.
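To illustrate point 1 above, here is a minimal sketch of hit-based matching, assuming each reco hit carries the id of the sim track that produced it. The function name and input format are hypothetical, and no matching threshold is implied:

```python
from collections import Counter

def best_match(reco_hit_sim_ids):
    """Given the sim-track id of each hit on one reco track, return
    (best_sim_id, fraction_of_hits_from_that_sim_track)."""
    if not reco_hit_sim_ids:
        return None, 0.0
    sim_id, n_shared = Counter(reco_hit_sim_ids).most_common(1)[0]
    return sim_id, n_shared / len(reco_hit_sim_ids)
```

In a single-track event every hit has the same sim id, so the fraction is 1.0 no matter how many hits the reco track picked up. That is why nHits per track is the more informative quantity here.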

cerati commented 7 years ago

Hi @kmcdermo, thanks for the update. I guess the next step is to check why some hits are missed in the endcap: do we check the correct layers? If yes, are the hits in the window? If yes, it should be the chi2... how far is it from the cut value?
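The three checks could be sketched as a small classifier applied to each expected-but-missing hit. All names and inputs below are hypothetical; this is not the mkFit hit-selection code, only the order of questions:

```python
def classify_candidate_hit(layer_checked, d_phi, d_q, win_phi, win_q, chi2, chi2_cut):
    """Return (reason, chi2_margin) for one expected hit.
    d_q / win_q stand for the second window coordinate (assumed here to be
    z in the barrel and r in the endcap). chi2_margin is chi2 - chi2_cut,
    or None if the chi2 step was never reached."""
    if not layer_checked:
        return "layer not checked", None
    if abs(d_phi) > win_phi or abs(d_q) > win_q:
        return "outside search window", None
    margin = chi2 - chi2_cut
    if margin > 0:
        return "failed chi2 cut", margin
    return "accepted", margin
```

Counting the returned reasons separately for barrel and endcap would show which of the three steps is responsible for the lost hits.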

osschar commented 7 years ago

I don't remember 100% ... but didn't we also have some performance loss in CMSSW endcaps?

kmcdermo commented 7 years ago

Some more on this issue. I made PR #76 to address the question of which layers we are checking.

Running 100k single-track events with BH building (as CE is still broken in the single-track case... will open a separate issue on this), I made a plot of the difference between the last layer the simulated track ended up on and the last layer the reconstructed track ended up on (figures: lyrdiff, shown linear in nEvents and in log scale normalized to unity).

The mean of this difference is in the legend for each subdetector. As you can see, the endcaps are nearly identical in the last layer they end up on. And it looks like the tracks are propagating to the right sections of the detector, otherwise the diffs would have a much larger range than stopping at 7 -- meaning at worst, tracks stop at their last respective seeding layer.

Example: if a simulated track ended up on the last disk in EM (lay=27), and the reco track ended up on the second disk in EP after seeding (lay=12), 27-12 = 15, which we obviously do not see.

FYI, the macro to produce such a plot is here: https://github.com/kmcdermo/mictest/blob/full-det-tracking/test/lastlyr.C. Given its utility in testing out various geometries with simulation and reconstruction, I figured I would push it to the test directory for future use.

kmcdermo commented 7 years ago

Okay, so I lied a little. In making the above layer diff plot, I restricted the plotting range to 0-10, so naturally, no tracks with a diff of 15 or -20 or something odd would show up (and being an overlaid histogram, I dropped the stats boxes with UF/OF).
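A minimal sketch of how a restricted range hides outliers, assuming the usual histogram convention that values outside [lo, hi) go to underflow/overflow counters rather than visible bins. The function and the example values are illustrative only:

```python
def fill_hist(values, lo, hi, n_bins):
    """Fill a 1D histogram over [lo, hi); return (bins, underflow, overflow)."""
    width = (hi - lo) / n_bins
    bins = [0] * n_bins
    underflow = overflow = 0
    for v in values:
        if v < lo:
            underflow += 1
        elif v >= hi:
            overflow += 1  # e.g. a layer diff of 15 with hi = 10
        else:
            bins[int((v - lo) / width)] += 1
    return bins, underflow, overflow

# Hypothetical layer diffs: four ordinary tracks and one diff of 27 - 12 = 15
bins, uf, of = fill_hist([0, 1, 0, 7, 15], 0, 10, 10)
```

Here the diff of 15 only shows up in the overflow counter, which is exactly the information lost when the stats boxes are dropped.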

Expanding to a larger range reveals that we do in fact have some outliers, although out of 100k single-track events, only 8 showed this problem:

entry: 18350 gen_peta:  0.69 gen_meta: -1.46 mclyr_meta: -1.46 buildlyr_meta: -1.62 mc_lyr: 27 build_lyr: 2
entry: 38520 gen_peta: -0.39 gen_meta: -1.48 mclyr_meta: -1.48 buildlyr_meta: -1.46 mc_lyr: 27 build_lyr: 2
entry: 49528 gen_peta: -3.73 gen_meta:  1.50 mclyr_meta:  1.50 buildlyr_meta:  1.50 mc_lyr: 18 build_lyr: 2
entry: 64591 gen_peta:  1.72 gen_meta:  1.44 mclyr_meta:  1.44 buildlyr_meta:  1.45 mc_lyr: 18 build_lyr: 2
entry: 75681 gen_peta:  3.68 gen_meta: -1.44 mclyr_meta: -1.44 buildlyr_meta: -1.45 mc_lyr: 27 build_lyr: 2
entry: 81991 gen_peta: -1.93 gen_meta: -1.44 mclyr_meta: -1.44 buildlyr_meta: -1.45 mc_lyr: 27 build_lyr: 2
entry: 84280 gen_peta: -3.26 gen_meta:  1.46 mclyr_meta:  1.46 buildlyr_meta:  1.47 mc_lyr: 18 build_lyr: 2

Here, "peta" == geometrical position eta, while "meta" == momentum eta. Currently the validation only stores momentum-related variables, which could of course be extended to position ones if needed.

"gen" == values taken at the point of the track origin (trackstate from simtracks)
"mclyr" == last layer index the simtrack ended up on
"buildlyr" == last layer index the recotrack ended up on

As you can see, all of these errant tracks were supposed to end up in the endcaps but wound up stuck in the barrel. The momentum etas at gen, at the MC last layer, and at the reco last layer are nearly identical; my hypothesis is that the geometrical eta is sufficiently different to force the reco track through the transition region, where it fails to extend beyond the last seeding barrel layer (entry 81991 is the outlier to this hypothesis).
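For anyone who wants to rerun this check, a minimal sketch of parsing the dump lines above into dictionaries, assuming the "key: value" layout shown in the printout. The helper name is hypothetical:

```python
import re

INT_KEYS = {"entry", "mc_lyr", "build_lyr"}
PAIR_RE = re.compile(r"(\w+):\s*(-?\d+(?:\.\d+)?)")

def parse_dump_line(line):
    """Turn one 'key: value key: value ...' dump line into a dict."""
    out = {}
    for key, val in PAIR_RE.findall(line):
        out[key] = int(val) if key in INT_KEYS else float(val)
    return out

row = parse_dump_line(
    "entry: 18350 gen_peta:  0.69 gen_meta: -1.46 mclyr_meta: -1.46 "
    "buildlyr_meta: -1.62 mc_lyr: 27 build_lyr: 2"
)
lyr_diff = row["mc_lyr"] - row["build_lyr"]  # sim last layer minus reco last layer
```

Selecting rows with a large lyr_diff and comparing abs(gen_peta) to abs(gen_meta) is one way to test the transition-region hypothesis on a bigger sample.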

Most interestingly, this happens at a momentum eta of ~|1.45|, which happens to be right at the inside eta edge of the barrel layer. Namely, from the output of root -l test/CylCowWLids.C, the edge for layer 2 (i.e. the last barrel seeding layer) is |1.55|, and at z += -3 cm from the edge of layer 2, eta = |1.45|. The closest edge of the first building layer in the endcap is at eta = |1.28|, so it is odd that the building doesn't try to hop over...
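For reference, the position eta of a point at cylindrical radius r and longitudinal coordinate z follows from the standard definition eta = -ln(tan(theta/2)), which is equivalent to asinh(z / r). A minimal sketch; the helper names are hypothetical and the layer radius has to be taken from the geometry printout, it is not assumed here:

```python
import math

def eta_from_rz(r, z):
    """Position pseudorapidity of a point at radius r (> 0) and coordinate z."""
    return math.asinh(z / r)

def z_from_r_eta(r, eta):
    """Inverse relation: the z at which a cylinder of radius r reaches a given eta."""
    return r * math.sinh(eta)
```

Given a layer radius r, eta_from_rz(r, z_from_r_eta(r, 1.55) - 3.0) gives the eta 3 cm inside an edge at eta = 1.55, which can be compared against the values quoted above.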

I guess the main thing here is that we seem to be unintentionally exploring the transition region even with the eta generation hack in place, which only covers the momentum eta, and not also the geometrical eta. Is it worth adding another hack to fully isolate the transition region?

osschar commented 7 years ago

The barrel layers are progressively shorter as one goes outwards ... and the switchover eta is intentionally staggered to produce a larger transition region. Run test/CylCowWLids.C, it plots the cow and prints out layer info / limits.

For the purpose of the test I set |eta_min| according to the first non-seeding layer.

bld_lyr==2 means no hits were found past the seed ... and that the last hit was in the barrel, as you say. It's possible the track actually hit another barrel layer before going into endcap. Maybe it's worth printing out hit layer/index of simtrack.

osschar commented 7 years ago

I was thinking a bit more ... there are several places where this could go wrong:

  1. Simulation. Smearing sometimes places hits outside of the detector (will fix that ... I don't think this is really an issue here).
  2. PropagateToZ ... both positions and errors
  3. The search window becomes too small (for some reason). I thought we had a minimal search window ... let me check ... well, yes and no, Config::minDPhi and Config::minDZ are 0. Also, Barrel and Endcap processing is slightly different ... I will review this.
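On point 3, a minimal sketch of what a floor on the search window does, assuming the half-width is some number of sigmas of the propagated uncertainty clamped from below. The function and the n_sigma parameter are hypothetical and do not reproduce the actual SelectHitsIndices logic:

```python
def search_half_width(sigma, n_sigma, min_width):
    """Half-width of the hit search window in one coordinate (phi or z)."""
    return max(n_sigma * sigma, min_width)
```

With min_width = 0, as reported for Config::minDPhi and Config::minDZ, the max() never kicks in, so an underestimated propagated error translates directly into a window that can be too small to contain the hit.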

Seeing 2D histogram N_found_hits vs. eta might be instructive.

osschar commented 7 years ago

I fixed the outlaying hits ... on to differences in SelectHitsIndices.

kmcdermo commented 7 years ago

Hi @osschar, what do you mean by outlaying hits? Ones that are outside of the chi2? Or ones that are smeared off the detector in the simulation?

Also, could you push this commit so I could stare at it :)?

osschar commented 7 years ago

One that is smeared / scattered out of the detector in simulation.

Sorry, forgot to push :)

kmcdermo commented 6 years ago

I think we can close this issue for now, as ultimately we need to reassess low pT matching.