KFin emulator - Githubissues

tschuh commented 2 years ago

This PR replaces the KFin simulator with an emulator.

skinnari commented 2 years ago

@tschuh please also note that the CI validation checks have failed for this PR.

tschuh commented 2 years ago

@tschuh please also note that the CI validation checks have failed for this PR.

I was also wondering what that is about. I can run HybridTracksNewKF_cfg.py without problems locally. Have you any clue? @tomalin did not respond yet.

skinnari commented 2 years ago

For the CI, I see that you made the PR from your personal repo. Unfortunately that won't run the CI correctly then. the CI's require the PR to be made from a branch pushed to the cms-L1TK to the development branch of cms-L1TK.

tschuh commented 2 years ago

For the CI, I see that you made the PR from your personal repo. Unfortunately that won't run the CI correctly then. the CI's require the PR to be made from a branch pushed to the cms-L1TK to the development branch of cms-L1TK.

What do you mean with personal repo? I am not doing a PR from cms-L1TK/tschuh_kfin2 to cms-L1TK/L1TK-dev-12_0_0_pre4?

skinnari commented 2 years ago

that is what you need to do for the CI scripts to run...

tschuh commented 2 years ago

that is what you need to do for the CI scripts to run...

but that is what I do...

tomalin commented 2 years ago

The reason this fails CI is that you need to add TBout and the associated analyzer to L1TrackNTupleMaker_cfg.py . (A corrected example is on RAL linux in ~tomalin/ggg/CMSSW_12_0_0_pre4/src/L1Trigger/TrackFindingTracklet/test/L1TrackNTupleMaker_cfg.py ).

With this fix, it still fails, complaining that the FW & SW are inconsistent (which is unsurprising as I am running the emulation with no FW).

Jingyan95 commented 2 years ago

The code crashes when running new KF on the following sample: /store/relval/CMSSW_11_3_0_pre6/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/PU_113X_mcRun4_realistic_v6_2026D76PU200-v1/00000/1934efdd-ad80-4a9b-a8e2-4795b57080ff.root @tschuh could you maybe take a look ?

Jingyan95 commented 2 years ago

Old KF is not affected by this PR, here are some plots: L1TK_elec-pu0_eff_pt L1TK_elec-pu0_notgenuine_pt

Jingyan95 commented 2 years ago

it looks like the crash is caused by KFout module and specifically the size of the container on L171: https://github.com/cms-L1TK/cmssw/blob/tschuh_kfin2/L1Trigger/TrackFindingTracklet/plugins/ProducerKFout.cc#L171 does not agree with size of the container on L181: https://github.com/cms-L1TK/cmssw/blob/tschuh_kfin2/L1Trigger/TrackFindingTracklet/plugins/ProducerKFout.cc#L181 This causes a seg-fault at L248: https://github.com/cms-L1TK/cmssw/blob/tschuh_kfin2/L1Trigger/TrackFindingTracklet/plugins/ProducerKFout.cc#L248 Maybe @Chriisbrown can comment on this?

Chriisbrown commented 2 years ago

it looks like the crash is caused by KFout module and specifically the size of the container on L171: https://github.com/cms-L1TK/cmssw/blob/tschuh_kfin2/L1Trigger/TrackFindingTracklet/plugins/ProducerKFout.cc#L171 does not agree with size of the container on L181: https://github.com/cms-L1TK/cmssw/blob/tschuh_kfin2/L1Trigger/TrackFindingTracklet/plugins/ProducerKFout.cc#L181 This causes a seg-fault at L248: https://github.com/cms-L1TK/cmssw/blob/tschuh_kfin2/L1Trigger/TrackFindingTracklet/plugins/ProducerKFout.cc#L248 Maybe @Chriisbrown can comment on this?

So having had a look at the code I believe the issue is that when creating the input containers for the eta router I only allow them to be as large as the maximum number of tracks that can be outputted, but in rare cases the number of input tracks from the KF can be larger. I already deal with this truncation later on and never saw this issue when testing previously so I suggest changing lines 159 & 171 to: for (int iTrack = 0; iTrack < setup_->numFramesIO(); iTrack++)

tschuh commented 2 years ago

I pushed chris's suggestion, could you please check if that solved it @Jingyan95?

Jingyan95 commented 2 years ago

I pushed chris's suggestion, could you please check if that solved it @Jingyan95?

OK, the problem has been fixed, the overall efficiency appears to consistent when running the new KF L1TK_elec-pu0_eff_eta However, the track rate seems to be much higher (~3x more tracks) when running the ntuple-maker L1TK_elec-pu0_ntrkPerSector_all I can confirm this is not caused by your last commit (Chris's suggestion). Can you @tschuh help check the track rate ?

tschuh commented 2 years ago

This New KF now corresponds to what we have in f/w. And in f/w we have no duplicate removal.

Jingyan95 commented 2 years ago

This New KF now corresponds to what we have in f/w. And in f/w we have no duplicate removal.

OK, many of the metrics used to evaluate tracking performance won't be directly comparable given that this new KF is running without a duplicate removal. @skinnari

Jingyan95 commented 2 years ago

z0 resolution seems to be worse, is this expected @tschuh ? L1TK_elec-pu0_resVsEta_z0_68

tschuh commented 2 years ago

the maths are now bit accurate while before it was double precision. So some differences are expected, but that looks quite big. can you add to the plot a line for the tracks before KF fit?

Jingyan95 commented 2 years ago

the maths are now bit accurate while before it was double precision. So some differences are expected, but that looks quite big. can you add to the plot a line for the tracks before KF fit?

tschuh commented 2 years ago

many thanks @Jingyan95 . at least the kf is not doing total nonesense. One further difference is that the f/w tracklet code does not provide multiple stubs per layer. Given the tiny impact on efficiency I wouldn't expect big impact on z0 resolution. But you could also try to deactivate that feature with main&&noDR. Doing this plot per seed type might also be telling.

Jingyan95 commented 2 years ago

it seems like some of your recent commits (commits you made after Chris's suggestion) are causing an exception when running new KF on /store/relval/CMSSW_11_3_0_pre6/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/PU_113X_mcRun4_realistic_v6_2026D76PU200-v1/00000/1f4f191f-b64a-4079-9f4c-ebadd6121262.root it appears to be related to the range of rinv: https://github.com/cms-L1TK/cmssw/blob/tschuh_kfin2/L1Trigger/TrackerTFP/src/KalmanFilterFormats.cc#L75

tschuh commented 2 years ago

yes I know, will fix that when I have time

tschuh commented 2 years ago

yes I know, will fix that when I have time

I just realized that you are not referring to the CI events, I fixed the config for those now and likely that is also fixing your issue. Please let me know if not, I will then also download and test those events. When I look into my standard z0 resolution plots I also see an asymmetry between + and - eta for big eta.

tschuh commented 2 years ago

resZ0.pdf I compared now the z0 resolution after KF fit when using TrackBuilder stub residuals vs TTStubRef residuals. The latter gives the expected performance while the first gives us the current performance. For me it looks like that this are the consequences of the residual corruption in the TrackBuilder I have reported some months ago. I will play now a bit with the TB stub residuals, maybe we just have some sign definition problems.

tschuh commented 2 years ago

resZ0Improved.pdf I was able to improve the z0 resolution by multiplying the TrackBuilder r residual with with -tanL instead of |tanL| to get a z residual.

Jingyan95 commented 2 years ago

yes I know, will fix that when I have time

I just realized that you are not referring to the CI events, I fixed the config for those now and likely that is also fixing your issue. Please let me know if not, I will then also download and test those events. When I look into my standard z0 resolution plots I also see an asymmetry between + and - eta for big eta.

the issue I reported has been fixed, also we can see some improvement from your recent commit L1TK_elec-pu0_resVsEta_z0_68 I will play with the new KF on the main branch a bit more to understand the differences in performance

tschuh commented 2 years ago

Many thanks @Jingyan95, regarding the remaining performance difference: as far as I can see the main source is the residual corruption in the TrackBuilder which are originated from the MatchCalculator I think.

skinnari commented 2 years ago

Many thanks @Jingyan95, regarding the remaining performance difference: as far as I can see the main source is the residual corruption in the TrackBuilder which are originated from the MatchCalculator I think.

@tschuh Can you please elaborate on this? I don't think I follow what the problem is / what needs to be fixed? thanks

tschuh commented 2 years ago

@tschuh Can you please elaborate on this? I don't think I follow what the problem is / what needs to be fixed? thanks

When you look at resZ0Improved.pdf you see the z0 resolution over eta after KF for a) using stub residuals calculated from TTStubRefs and TTTrackParameter in c++ (red) and b) using stub residuals calculated in f/w (blue). Using residuals from f/w reduces the z0 resolution by half a cm at high |eta|. So fixing the residual calculation in MatchCalculator should fix the issue.

tschuh commented 2 years ago

Just found a bug in the z residual base. Could you please redo the z0 resolution plots @Jingyan95 ?

Jingyan95 commented 2 years ago

Just found a bug in the z residual base. Could you please redo the z0 resolution plots @Jingyan95 ? It seems like this fixes much of the problem in high eta: Things in barrel still look the same though

tomalin commented 2 years ago

@tschuh & @Jingyan95 if I understand correctly, we're still seeing a substantial loss of z0 resolution with this PR. This needs to be understood before it can be merged. In addition, this PR apparently needs rebasing to resolve the git conflict reported above.

tomalin commented 2 years ago

I see that at least half the code changes in this PR are just format changes, presumably caused by the automatic CMS formatting tool? This makes it harder to identify the real changes. It also increases the risk of git conflicts with other PRs. We should figure out how to avoid this in future.

tomalin commented 2 years ago

The comments in TrackFindingTracklet/plugins/ProducerKFin.cc are out of date. They should refer to tt::Frames instead of TTDTC::Frames. Also please add '#include "DataFormats/L1TrackTrigger/interface/TTTypes.h"' to TrackFindingTracklet/interface/KFin.h to make it more obvious where types such as StreamsStub & FrameStub are defined.

tschuh commented 2 years ago

The comments in TrackFindingTracklet/plugins/ProducerKFin.cc are out of date. They should refer to tt::Frames instead of TTDTC::Frames. Also please add '#include "DataFormats/L1TrackTrigger/interface/TTTypes.h"' to TrackFindingTracklet/interface/KFin.h to make it more obvious where types such as StreamsStub & FrameStub are defined.

done

tschuh commented 2 years ago

In addition, this PR apparently needs rebasing to resolve the git conflict reported above.

done

tschuh commented 2 years ago

I see that at least half the code changes in this PR are just format changes, presumably caused by the automatic CMS formatting tool? This makes it harder to identify the real changes. It also increases the risk of git conflicts with other PRs. We should figure out how to avoid this in future.

as soon as it is impossible to merge code when the CI fails and having a CI failing when code-format introduces changes this will never happen. What confuses me is that the CI already checks that I believe, that means one can merge even when CI fails ...

tschuh commented 2 years ago

scram b code-format does not change a line, could you please look into the CI @tomalin

cms-L1TK / cmssw

KFin emulator #143