Closed yipengsun closed 4 years ago
@CoffeeIntoScience was able to adapt our ntuple_options.py
to run on DaVinci/v36r1p2
. An interesting thing is that her ntuple now contains only 34 entries, whereas ours contains 243 entries.
I've updated 2 more comparisons here: https://umd-lhcb.github.io/lhcb-ntuples-gen/data/comparison_run1_RDst/comparison/
I'd say the the ntuples generated by the same script with different DaVinci
's look much more similar. This means that there's some bug in my DaVinci
script in the sense that my script doesn't fully replicate @CoffeeIntoScience's functionality.
The main difference between ntuples generated by DaVinci/v36r1p2
and DaVinci/v42r8p1
seems to be in the isolation variables.
Phoebe suggested that this is likely due to the way the NoPID pions is set up. This can be checked by looking into Y_ISOLATION_Type
variables.
@CoffeeIntoScience I've updated the comparison between the following variables generated by DaVinci/v36r1p2
and DaVinci/v42r8p1
:
Y_ISOLATION_Type
Y_ISOLATION_Type2
Y_ISOLATION_Type3
Y_ISOLATION_Type4
The comparison can be found here.
this is very curious. For your education:
track type 1 is VELO only, with no other compatible segments found, track type 3 is Forward tracks, which are VELO+(TT)+downstream segment (TT hits optional) track type 4 is Upstream, VELO+TT
what's not obvious to me at all is whether what we're looking at is your version picking up MORE velo tracks, or FEWER other types. Like are we looking at the VELO tracks filling up a deficit or are we looking at VELO tracks being shifted to higher BDT values by some other change in the tracking or fitting?
I would suggest, as a next step in this thread, doing this comparison and the comparison of BDT scores again excluding VELO tracks and seeing if we get agreement then
By excluding VELO tracks, do you mean commenting out the following code: https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/a2caa86e3cb196567f9d6738ff9915911528511f/2012-b2D0MuXB2DMuNuForTauMuLine/reco_Dst.py#L34-L43
Easier would be commenting out the one line https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/a2caa86e3cb196567f9d6738ff9915911528511f/2012-b2D0MuXB2DMuNuForTauMuLine/reco_Dst.py#L59
OK then. I'm about to generate the new ntuple with that line commented. Could you re-run my script with DaVinci/v36r1p2
? You probably need the following files:
2012-b2D0MuXB2DMuNuForTauMuLine/reco_Dst.py
2012-b2D0MuXB2DMuNuForTauMuLine/conds/cond-data-Dst.py
The raw data files should be placed in data
folder at the same level of reco_Dst.py
, the generated ntuples will be placed at the same folder of the reco_Dst.py
so there's no need to create gen
folder.
I'm invoking DaVinci
in the following way:
lb-run DaVinci/v42r8p1 ./conds/cond-data-Dst.py reco_Dst.py
Now I only see track type 3 and 4, as expected.
@CoffeeIntoScience I've updated ntuples generated with your TupleToolApplyIsolation
. There's no obvious improvement.
Before cutting on the Y_ISOLATION_BDT
variables, for 2012 Mag Down data:
The difference (= ABS(Ours-Phoebe's)/Phoebe's) is about 10%.
After the Y_ISOLATION_BDT < 0.15
cut:
The difference is about 14%.
To disable DecayTreeFitter
, I should replace the TopSelection
option with sel_B0
?
https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/22b34a8e1851efd1d9cd9cade9bc6fb515f9325e/run1-b2D0MuXB2DMuNuForTauMuLine/reco_Dst.py#L419-L435
Also, I start to remember things, and noticed this line: https://github.com/umd-lhcb/lhcb-ntuples-gen/blob/22b34a8e1851efd1d9cd9cade9bc6fb515f9325e/run1-b2D0MuXB2DMuNuForTauMuLine/reco_Dst.py#L226-L227
I think I'll uncomment that for better backward compatibility.
Weird. The generated .root
file cannot be opened by root.
Turns out my root installation is messed up.
@CoffeeIntoScience I've created a script for older version of DaVinci. You can download it from here. It should be able to run on DaVinci/v36r1p2 without any editing.
@CoffeeIntoScience @manuelfs I've updated ntuples generated with DaVinci/v36r1p2
and DaVinci/v42r8p1
without final refitting. The results are available here.
Now the momentum of the Kaons and Pions fully agree, but the momentum of D0 still differs. This makes sense---D0 is reconstructed by CombineParticle
, which uses some fitting tool.
Actually, the isolation variables now have larger differences.
Can you rerun your script (just over like 300 events is fine) with the output level set to DEBUG and send me the log? (This used to be done via from Gaudi.Configuration import *
followed by MessageSvc().OutputLevel=DEBUG
)
The debug option you provided still works. I reran my script over exact 300 events with debug option enabled, and the log file is ~60 MB. The log file can be found here.
DaVinci
change logs regarding fitStarting from DaVinci/v36r1p2
, with Phys/v19r8
.
End at DaVinci/v42r8p1
, with Phys/v23r9p1
.
version | log |
---|---|
v36r6 |
In Phys/Tesla v1r4 : Support refitted PV persistence. |
v37r2 |
Include as patch packages RelatedInfoTools , DaVinciTools , LoKiCore , LoKiPhys , LoKiAlgo and GaudiConf |
v40r1 |
Added TrackSys to used configurables. It configures the tracking related tools used by LoKi::VertexFitter and DistanceCalculator |
v42r4 |
Allow to add selections directly to the list of UserAlgorithms . No need in the explicit appearence of SelectionSequence |
Phys
change logs regarding fitversion | log |
---|---|
v19r2 |
ParticleClassificator : Add merged (=="stable") pi0 into list of "di-gamma-like" particles. |
v19r3 |
VertexFit, VertexFitWithTracks, FastVertex : replace all matrix inversions by faster method based on Cholesky's decomposition. |
v19r4 |
Particle2State : use vectorized similarity transform from LHCbMath -- which is (in this case) 4.8x faster |
v19r5 |
KalmanFilter : Use cholesky decompositon for inversion of positively definite covariance matrices. |
v19r6 |
|
v19r7 |
KalmanFilter : Added IParticleClassifier interface needed for bypassing full vertex fit. |
v20r0 |
|
v20r1 |
|
v20r3 |
|
v21r0 |
|
v21r1 |
VertexFit : PropertimeFitter : Fixed bug (the tool was returning c.tau (mm) in case of negative covariance and tau (ns) when the fit is OK) |
v21r3 |
LokiFitters : VertexFitter : Change default of transporter to DaVinci::ParticleTransporter:PUBLIC . |
v21r4 |
KalmanFilter, LoKiFitters : Added protection against extrapolations to very large z values. |
v21r5 |
LoKiFitters : Switch on warnings in LoKi::PVReFitter when a PV refit fails with a repetition rate of 1. |
v23r3 |
DaVinciFilters, PhysSelPython : Add new algorithm FilterUnique and extend MergedSelection |
Obviously the log comparison is still in-progress, but the first possibly-suspicious thing I see is we are connecting to different locations to fetch the latest DB tags and ending up with different ones. Mine if fetching
dddb-20130929-1
cond-20141007
while yours finds
dddb-20171030-2
cond-20150409-1
I think it would be worth your time to attempt to run forcing the DB tags that mine thinks are the "latest" to see if anything changes. I should also double check if the momentum scaling map is the same, though it looked okay on the first pass
EDIT: formatting
Specifying tag doesn't help at all.
Phoebe found some duplicated tracks in our ntuples generated by both versions of DaVinci
This is likely due to the fact that the ms_velo_pions
are not set correctly to only pick VELO particles. Examples of correct usage can be found at: https://twiki.cern.ch/twiki/bin/view/LHCb/RestrippingMDST
(Dark Magic warning).
But the trackSelector
is undefined.
I still see duplicated events:
If we scan over the momenta of all 3 tracks, the events are sharing tracks, but no duplicate event is present:
The trackSelector
and updateDoD
are defined at: https://gitlab.cern.ch/lhcb/Stripping/blob/2017-patches/Phys/CommonParticlesArchive/python/CommonParticlesArchive/Stripping33r1/CommonParticles/Utils.py
After fixing a comparison bug, I see:
These csv files are selected by: abs(bdt_score_diff) > 0.1
.
@CoffeeIntoScience I've included more Y-branches in the csv files linked above. I still think the values are mostly similar.
just for future reference, y is more sensitive than x to fitting differences because the uncertainty in that direction is typically greater (less precise downstream measurements).
Hmmm.
would still like to see PT and maybe, I guess, IPCHI2 of the Dst's slow pion
I've updated the track type vs chi^2 plots. Indeed there's a significant shift for VELO only tracks.
Once Phoebe uploads her new ntuple to the shared folder, I'll redo plots for ntuples with both refitting and momentum rescaling disabled.
We've decided to move up to DaVinci/v45
, the recommended version for both run 1 and 2, and see if we still have these disagreements.
The progress will be tracked at umd-lhcb/TupleToolSemileptonic#1.
@CoffeeIntoScience I've finally updated my local DaVinci
to v45r3
, the recommended version for all run 1 and run 2 analyses. I regenerated the no-refit ntuple, and made comparisons between that and v36, v42 ntuples.
The result is: v45 is still different from v36, v45 is identical to v42. Please take a look at: https://umd-lhcb.github.io/lhcb-ntuples-gen/data/comparison_run1_Dst/comparison/#comparison-between-davinci-v36r1p2-and-v45r3-without-refitting
okaaay? I didn't think we expected anything different in these comparisons?
On Thu, Nov 14, 2019 at 9:53 PM Yipeng Sun notifications@github.com wrote:
@CoffeeIntoScience https://github.com/CoffeeIntoScience I've finally updated my local DaVinci to v45r3, the recommended version for all run 1 and run 2 analyses. I regenerated the no-refit ntuple, and made comparisons between that and v36, v42 ntuples.
The result is: v45 is still different from v36, v45 is identical to v42. Please take a look at:
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/umd-lhcb/lhcb-ntuples-gen/issues/16?email_source=notifications&email_token=ALEZBSCXMPPNA2TOXGJ7EZTQTYFJVA5CNFSM4H5I5FI2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEEEDVFI#issuecomment-554187413, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALEZBSEUAYBGRMKINEOQN5TQTYFJVANCNFSM4H5I5FIQ .
I thought the idea (your idea) was that given v45 is the officially recommended version for both run 1 and run 2, it may automatically fallback to the previous algorithm.
On Fri, Nov 15, 2019, 12:53 Phoebe H notifications@github.com wrote:
okaaay? I didn't think we expected anything different in these comparisons?
On Thu, Nov 14, 2019 at 9:53 PM Yipeng Sun notifications@github.com wrote:
@CoffeeIntoScience https://github.com/CoffeeIntoScience I've finally updated my local DaVinci to v45r3, the recommended version for all run 1 and run 2 analyses. I regenerated the no-refit ntuple, and made comparisons between that and v36, v42 ntuples.
The result is: v45 is still different from v36, v45 is identical to v42. Please take a look at:
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub < https://github.com/umd-lhcb/lhcb-ntuples-gen/issues/16?email_source=notifications&email_token=ALEZBSCXMPPNA2TOXGJ7EZTQTYFJVA5CNFSM4H5I5FI2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEEEDVFI#issuecomment-554187413 , or unsubscribe < https://github.com/notifications/unsubscribe-auth/ALEZBSEUAYBGRMKINEOQN5TQTYFJVANCNFSM4H5I5FIQ
.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/umd-lhcb/lhcb-ntuples-gen/issues/16?email_source=notifications&email_token=AIBM3QBRZHDAAXJTBQ6L6Z3QT3N4ZA5CNFSM4H5I5FI2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEEGGB7I#issuecomment-554459389, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIBM3QEF3475YESHCNBAX43QT3N4ZANCNFSM4H5I5FIQ .
Hmmm. I guess that was not impossible. I don't know where the switch to the simplified geometry is done, and its very possible its not easily togglable.
Anyway, the bigger point is that lack of better agreement here shouldn't be a road block and you should resume trying to get efficiencies from the inclusive MC samples and close the loop on this factor of 5 as quickly as is reasonable
Obviously the isolation differences will be an issue going forward -- it might just be that you need to retrain the BDT for Run2 due to this difference, because the change does seem to have affected VELO and everything else differently
We've tried to use DaVinci v45r3, and we have the same "problem". Closed for now, as this problem might be resolved by retraining the BDT for run 2.
We are comparing:
/2012-b2D0MuXB2DMuNuForTauMuLine/data/sample/YCands_sample-mag_down-data.root
/2012-b2D0MuXB2DMuNuForTauMuLine/gen/YCands.root
It seems that the muon momentum
muplus_P
and all its components have a ~5% deviation from Phoebe's. We need to investigate why.