Closed Chriisbrown closed 1 year ago
Without knowing which line in L1TrackNtupleMaker throws the error I can't help. Who wrote that Analyzer?
It is this line: https://github.com/cms-L1TK/cmssw/blob/fd9a5da928b0c633034f0ebba44df76eb09984dc/L1Trigger/TrackFindingTracklet/test/L1TrackNtupleMaker.cc#L918
I don't know how the hitpattern helper is different between the Old/New KFs but it is something in there
I think that was written by @Jingyan95. Can you help Jack?
I think that was written by @Jingyan95. Can you help Jack?
Hello @tschuh , I saw that new KF is outputting -122.402 as the z0 (with an eta of 0.307698) for the track that is causing the crash, which sounds a bit strange. Is this expected ?
Well the KF has no mechanism of killing tracks, if that is the result it will send it. The Track Quality should mark this track as garbage and downstream consumer should not take it into account.
I ran CMSSW with the gdb debugger, which crashes with traceback below, so I guess in the original message reported by @Chriisbrown of "_n (which is 38)", it was the value of binCot which was 38, and so outside the expected range.
It's important to realize that when we run with HYBRID_NEWKF, the TTTrack collection is produced by the Tracklet pattern recognition. The new KF currently does not write a TTTrack collection (though Thomas tells me we should add this sometime). It only writes an EDProduct of its digitised output stream. Therefore it's plausible that the tracks lie outside the expected eta sectors etc., and this causes the crash. We should probably disable the running of L1TrackNtupleMaker.cc when using HYBRID_NEWKF to avoid confusion?
@tomalin This makes sense as the source of the issue. If we can't run the L1TrackNtupleMaker with the HYBRID_NEWKF how should standard track plots be made so the performance can be compared against the previous HYBRID approach as currently the workflow is to create the ntuples with the L1TrackNtupleMaker and then run them through the L1TrackNtuplePlot root macro?
Hello @tomalin and @Chriisbrown ,
Thank you for checking. I am working on a fix to avoid out of bound error like this. I will make a PR soon.
Hello @tomalin and @Chriisbrown ,
Thank you for checking. I am working on a fix to avoid out of bound error like this. I will make a PR soon.
Thanks @Jingyan95 . Even when using the old HYBRID, people do sometimes choose to disable the DR & KF, so that the TTTrack collection is produced only using Tracklet pattern reco. This is useful to understand how DR & KF affect performance. I guess the same crash can occur then, so it would be good if HitPatternHelper could avoid crashing, even if it just disables itself in that cfg. (I'm not sure it's worth the complexity of adding lots of new code).
@tomalin This makes sense as the source of the issue. If we can't run the L1TrackNtupleMaker with the HYBRID_NEWKF how should standard track plots be made so the performance can be compared against the previous HYBRID approach as currently the workflow is to create the ntuples with the L1TrackNtupleMaker and then run them through the L1TrackNtuplePlot root macro?
@Chriisbrown we can already compare L1TrackNtuplePlot results for old HYBRID with results from TrackFindingTracklet/test/AnalyzerKFout.cc for HYBRID_NEWKF, but you're right that it would be better to use L1TrackNtuplePlot for everything.
To achieve that, we'd have to modify TrackFindingTracklet/plugins.ProducerKFout.cc so it wrote a TTTrack collection. @tschuh could say if this would be easy enough that someone could do it under his guidance. Function "consume" https://github.com/cms-L1TK/cmssw/blob/L1TK-dev-12_6_0_pre5/L1Trigger/TrackerTFP/test/AnalyzerKF.cc#L234 illustrates how to convert the streams of stub & track digi data from the KF into TTTrack objects. (For some reason, TrackFindingTracklet/test/AnalyzerKFout.cc doesn't do this). I guess we'd have to add similar code to ProducerKFout. We could consider that after your KFout PR is merged.
My original understanding was incorrect. @tschuh tells me that his New KF does actually write a TTTrack collection, which is written by the EDProducer named ProducerTT. It runs on the output of the KF rather than the KFout (which we should fix sometime).
I've discovered that the track which causes the crash has a crazy value of z0 stored in the TTTrack. If one prints tmp_trk_z0 at https://github.com/cms-L1TK/cmssw/blob/L1TK-dev-12_6_0_pre5/L1Trigger/TrackFindingTracklet/test/L1TrackNtupleMaker.cc#L937 , it has a value of -122cm. The tanL = +0.33 is sensible, but because of the crazy z0, HitPatternHelper::Setup::etaRegion() assigns it to a KF eta sector in the forward region of the Tracker, which makes a very big angle to the track, causing the digitisation of HitPatternHelper::cotBin_ to overflow.
@tschuh says the Hybrid KF applies no cuts at all to the output fitted tracks, so they can be crazy. (Though he thinks this track implies a KF bug). He can see the crash on the RAL linux machine in event 2 of /opt/ppd/data/cms/L1TrkMC/609bdf8d-43a5-435f-8726-105cf1d97243.root .
Well the KF has no mechanism of killing tracks, if that is the result it will send it. The Track Quality should mark this track as garbage and downstream consumer should not take it into account.
@tomalin I think according to Thomas' comment above, new KF keeps every track regardless of the track quality.
@Jingyan95 I think I've spotted a bug. HitPatternHelper::useNewKF_ is "false", despite L1TrackNtupleMaker_cfg.py being run with HYBRID_NEWKF. Therefore the HPH thinks the NewKF is not being used. If you agree, I'll make PR to set process.HitPatternHelperSetup.useNewKF = True in L1TrackNtupleMaker_cfg.py if NewKF is being used.
So fixing above bug via PR https://github.com/cms-L1TK/cmssw/pull/223, and rejecting tracks with abs(z0) > 30cm in L1TrackNtupleMaker.cc stops crash. In ttbar+200PU, I observe that 0.2% of tracks have abs(z0) > 30cm. These tracks all have chi2/dof > 100 and only 4 stubs. Their Pt & eta are normal.
I'm closing this, but I think cuts to reject crazy tracks really should be applied at the output of the KF.
When running the L1TrackNtupleMaker with the algo set to "HYBRID_NEWKF" over the recent Phase2Fall22DRMiniAOD large TTbar sample the ntuple maker occasionally throws an error on events in the sample, this happens frequently across the sample but here is a specific example taken from store/mc/Phase2Fall22DRMiniAOD/TT_TuneCP5_14TeV-powheg-pythia8/GEN-SIM-DIGI-RAW-MINIAOD/PU200_PUTP_125X_mcRun4_realistic_v2-v1/2540000/609bdf8d-43a5-435f-8726-105cf1d97243.root
Event number 2, track number 55
This issue happens before line 1087 of the track ntuple maker as when running with the debug as none of the track debug is being outputted from this track
I can't recreate this issue when setting the algo to "HYBRID" so is specifically in the NEWKF
This is using the current L1TK-dev-12_6_0_pre5 branch