cms-patatrack / cmssw

CMSSW fork of the Patatrack project
https://patatrack.web.cern.ch/patatrack/index.html
Apache License 2.0

Understand non-perfect reproducibility in tracks to PV association #397

Open fwyzard opened 4 years ago

fwyzard commented 4 years ago

During the validation of recent PRs (e.g. #395, #396) we have observed a non-perfect reproducibility of the performance of the pixel tracks associated to the primary vertex, in the TTbar realistic sample.

The results seem to oscillate between these two sets, even when no relevant changes are introduced:

| | development-10824.52 | testing-10824.52 |
|---|---|---|
| Number of TrackingParticles (after cuts) | 4950 | 5017 |
| Number of matched TrackingParticles | 2757 | 2790 |
| Number of tracks | 4371 | 4416 |
| Number of true tracks | 3860 | 3905 |
| Number of fake tracks | 511 | 511 |
| Number of pileup tracks | 0 | 0 |
| Number of duplicate tracks | 0 | 0 |

The discrepancy is at the 1% level, and the validation plots show only very small differences between the "development" and "testing" points - affecting only a couple of bins, and well within the errors.
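For reference, a small sketch that recomputes the relative differences from the numbers in the table above. The helper name `relDiffPercent` is only illustrative and is not part of any validation tool.

```cpp
#include <cstdio>

// Relative difference of `testing` with respect to `development`, in percent.
double relDiffPercent(double development, double testing) {
  return 100.0 * (testing - development) / development;
}

// Print the relative differences for the non-zero rows of the table above.
void printRelDiffs() {
  std::printf("TrackingParticles (after cuts): %.2f%%\n", relDiffPercent(4950, 5017));
  std::printf("matched TrackingParticles:      %.2f%%\n", relDiffPercent(2757, 2790));
  std::printf("tracks:                         %.2f%%\n", relDiffPercent(4371, 4416));
  std::printf("true tracks:                    %.2f%%\n", relDiffPercent(3860, 3905));
  std::printf("fake tracks:                    %.2f%%\n", relDiffPercent(511, 511));
}
```

Calling `printRelDiffs()` from a `main` gives differences between roughly 1.0% and 1.4% for the first four rows. The number of tracks and the number of true tracks both change by 45, while the number of fake tracks does not change.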

fwyzard commented 4 years ago

@VinInn @makortel I'm moving this here, because it seems unrelated to #395 and #396.

makortel commented 4 years ago

Number of TrackingParticles (after cuts) | 4950 | 5017

Is it known why the MC truth gives different numbers?

VinInn commented 4 years ago

This is realistic (but running as ideal as far as the cluster-shape cut is concerned) Quadruplets, isn't it? Never seen a discrepancy for "design"?

fwyzard commented 4 years ago

Yes, I think I've seen it only in the realistic sample.

VinInn commented 4 years ago

Could you check the vertex table, as in http://innocent.home.cern.ch/innocent/RelVal/pixOnlyRun3_cpu/plots_vertex.html ? It may be enough for just one vertex not to be associated to the MC-truth PV.

fwyzard commented 4 years ago

Indeed, see here:

Pixel vertices

| | development-10824.52 | testing-10824.52 |
|---|---|---|
| Events | 100 | 100 |
| PV reco+tag efficiency | 0.9200 | 0.9300 |
| Efficiency | 0.5623 | 0.5623 |
| Fake rate | 0.0829 | 0.0829 |
| Merge rate | 0.1144 | 0.1144 |
| Duplicate rate | 0.0069 | 0.0069 |

Selected pixel vertices

| | development-10824.52 | testing-10824.52 |
|---|---|---|
| Events | 100 | 100 |
| PV reco+tag efficiency | 0.9500 | 0.9600 |
| Efficiency | 0.4358 | 0.4358 |
| Fake rate | 0.0249 | 0.0249 |
| Merge rate | 0.1274 | 0.1274 |
| Duplicate rate | 0.0037 | 0.0037 |

But I do not know if it means one less associated vertex, or one less reconstructed vertex.
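As a rough check of what a 0.01 shift in "PV reco+tag efficiency" corresponds to, here is a sketch that converts the efficiencies back to event counts. It assumes that the denominator of this efficiency is the total number of events (100); that is an assumption, not something stated in the tables, and `taggedEvents` is a hypothetical helper.

```cpp
#include <cmath>

// Hypothetical helper: number of events corresponding to an efficiency,
// ASSUMING the denominator is the total number of events.
long taggedEvents(double efficiency, long nEvents) {
  // Round to the nearest integer, since e.g. 0.92 * 100 is not exact in binary.
  return std::lround(efficiency * static_cast<double>(nEvents));
}
```

Under that assumption, `taggedEvents(0.93, 100) - taggedEvents(0.92, 100)` and `taggedEvents(0.96, 100) - taggedEvents(0.95, 100)` are both 1, so in both tables the difference would amount to a single event.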

fwyzard commented 4 years ago

Same behaviour in the re-validation of #396, and in the validation of #398.

At this point I wonder if this is simply triggered by re-compiling some extra packages?

fwyzard commented 4 years ago

Mhm, no - development and testing are checking out the same packages.

VinInn commented 4 years ago

Efficiency is the same (and the other rates as well), so it looks like association. In principle it should be visible in the first of these plots: http://innocent.home.cern.ch/innocent/RelVal/pixOnlyRun3_cpu/plots_pixelVertices/pvtagging.pdf

fwyzard commented 4 years ago

Indeed (orange vs black): image

VinInn commented 4 years ago

So in black the real PV is in position 1 instead of 0. It could be sorting (or splitting, even if the number of vertices is the same), or the pt of some track...

The sum-pt2 is computed in parallel in float: it is not stable "by design" (one of the main subjects of my lecture on floating point...). We can accumulate in double...

Still, this would mean that the two highest sum-pt2 vertices have very close sum-pt2...
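A minimal sketch of the floating-point effect described here; it is not the actual vertex code. The function name `sumPt2` and the pt values are hypothetical, and the values are chosen artificially so that the rounding is visible with only three tracks. A parallel sum does not guarantee the order in which the terms are added, which is emulated here by summing the same tracks in two different orders.

```cpp
#include <vector>

// Sum of pt^2 over the tracks, accumulated in type Acc, in the order given.
// A parallel reduction may add the same terms in a different order on each run.
template <typename Acc>
Acc sumPt2(const std::vector<float>& pts) {
  Acc sum = 0;
  for (float pt : pts) {
    sum += static_cast<Acc>(pt) * static_cast<Acc>(pt);
  }
  return sum;
}

// Same three (artificial) tracks, two accumulation orders.
// 4096^2 = 2^24, above which consecutive floats are spaced by 2.
const std::vector<float> largeFirst{4096.0f, 1.0f, 1.0f};
const std::vector<float> smallFirst{1.0f, 1.0f, 4096.0f};
```

With IEEE 754 single precision, `sumPt2<float>(largeFirst)` gives 16777216 because each added 1 is lost to rounding, while `sumPt2<float>(smallFirst)` gives 16777218. With `sumPt2<double>` both orders give 16777218. If a second vertex had a sum-pt2 between those two float results, which of the two is ranked first would depend on the accumulation order.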