cms-L1TK / cmssw

Fork of CMSSW where improvements to L1 tracking code are developed.
http://cms-sw.github.io/
Apache License 2.0
4 stars 5 forks source link

Track Processing update #285

Open tschuh opened 1 month ago

tschuh commented 1 month ago

This PR updates mainly the modules TrackerTFP, TrackFindingTracklet and TrackTrigger. It brings the Kalman Filter, Track Multiplexer, Duplicate Removal and Track Quality steps up to date.

Detailed list for Ian:

  1. DataFormats/L1TrackTrigger
    • TTBV update
    • added kf chi2 output product (pair of doubles)
  2. L1Trigger/TrackFindingTracklet
    • HitPatternHelper updated to cope changes, likely broken
    • KFin deleted
    • KalmanFilter added
    • 5 parameter fit option added (https://indico.cern.ch/event/1449862/contributions/6116147/attachments/2924526/5133676/thomas.pdf)
    • ProducerKalmanFilter added
    • State added
    • DRin renamed in TrackMultiplexer
    • L1FPGATrackProducer updated
    • DR renamed in DuplicateRemoval and updated
    • KFout removed
    • TBout removed
    • ChannelAssignment updated
    • FitTrack updated
    • L1TrackNtupleMaker updated with layer encoding specifics
  3. L1Trigger/TrackTrigger
    • SensorModule updated
    • Setup updated
    • L1TrackQuality moved to trackerTFP
  4. L1Trigger/TrackerDTC
    • DTC updated
    • LayerEncoding updated
    • Stub updated
  5. L1Trigger/TrackerTFP
    • L1TrackQuality renamed in TrackQuality and turned into ESProduct
    • TrackFindingProcessor added.
    • CleanTrackBuilder added
    • DataFormats updated
    • Demonstrator updated
    • Duplicate Removal added
    • GeometricProcessor updated
    • HoughTransform updated
    • KalmanFilter updated
    • LayerEncoding updated
    • MiniHoughTransform deleted
    • State updated
    • ZHoughTransform deleted
    • KFin deleted
tomalin commented 1 month ago

As you're changing 160 files, the text describing this PR needs to be longer and more detailed. What are the changes that have been implemented? And how do they affect tracking performance? Have you given a talk which describes in detail the changes and the effect on performance. If so, please link it from here.

tomalin commented 1 month ago

This fails CI quite spectacularly https://gitlab.cern.ch/cms-l1tk/cmssw_CI/-/pipelines/7858190 . e.g. For HYBRID_NEWKF, the tracking efficiency is up by 0.3% (good), but the number of reconstructed tracks is down by 80% (very weird, given that no efficiency is lost!), and the z0 resolution is a factor 5 worse (bad).

For comparison, this is the git CI for a dummy PR that changed no code https://gitlab.cern.ch/cms-l1tk/cmssw_CI/-/pipelines/7924082 .

tomalin commented 1 month ago

This fails CI quite spectacularly https://gitlab.cern.ch/cms-l1tk/cmssw_CI/-/pipelines/7858190 . e.g. For HYBRID_NEWKF, the tracking efficiency is up by 0.3% (good), but the number of reconstructed tracks is down by 80% (very weird, given that no efficiency is lost!), and the z0 resolution is a factor 5 worse (bad).

For comparison, this is the git CI for a dummy PR that changed no code https://gitlab.cern.ch/cms-l1tk/cmssw_CI/-/pipelines/7924082 .

==== Thomas says CI failure explained by poor z0 resolution. He's retuning the digitisation ranges to fix this. But he believes that the tracklet helix parameters are wrong (inconsistent with seed positions by ~1cm) for a small fraction of the tracklets, which is also contributing to poor z0 resolution.

tschuh commented 1 month ago

new KF maths are fixed now, also passing CI using new KF. However, CI using old KF fails due to slightly lower z0 resolution. My PR should not impact those...

tomalin commented 3 weeks ago

I understand that the ProducerTM usually orders tracks according to their seed type. This is important for the functioning of the DR, so should be described in the comment at the top of the class header file, which explains what the class does. And at the line of the code that causes this ordering to happen, an additional comment should appear to highlight it. A comment mentioning this inside ProducerDR would also be wise.

tschuh commented 3 weeks ago

I understand that the ProducerTM usually orders tracks according to their seed type. This is important for the functioning of the DR, so should be described in the comment at the top of the class header file, which explains what the class does. And at the line of the code that causes this ordering to happen, an additional comment should appear to highlight it. A comment mentioning this inside ProducerDR would also be wise.

done

tomalin commented 3 weeks ago

@Chriisbrown & @cgsavard -- Thomas's overview description of this PR says "L1TrackQuality renamed in TrackQuality and turned into ESProduct". Are you both and Andreas happy with this change?

tomalin commented 3 weeks ago

At the top of the header of the KF class where you do the maths (KalmanFilter.h?), add the comments copied from L7-31 of https://github.com/cms-L1TK/cmssw/blob/tschuh/L1Trigger/TrackFindingTMTT/interface/KFParamsComb.h#L7 , so the maths is documented. Although as you treat r-phi and r-z planes independently, you may need to modify a few lines of this.

tomalin commented 3 weeks ago

Overview description of this PR should mention that new class TrackFindingTracklet/ProducerKF has been added.

tomalin commented 2 weeks ago

All your code uses a few classes a lot, such as TTBV, Frame, FrameStub & FrameTrack, StreamStub & StreamTrack, DataFormats & DataFormat, Setup. But they're only documented in a few sentences in /L1Trigger/TrackerTFP/README.md . I suggest adding a dedicated section to this README, explaining these things and giving examples of common manipulations, such as extracting a float from the digi data.

tomalin commented 2 weeks ago

Please improve the comments at the top of the various .h and _cfg.py files with "Demonstrator" in the title, so that anyone wishing to use this code to check SW vs FW would understand what the various modules and doing and how they need to set up Vivado etc. to work with them. e.g. TrackerTFP/test/AnalyzerDemonstrator.cc & both have the same very short comment, despite the fact that one is an ESProducer and the other is an EDAnalyzer. Please make clear what the difference between them is. e.g. What is the ESProducer for? What is the EDAnalyzer doing?

tomalin commented 2 weeks ago

Expand existing comment at start of LayerEncoding.h, to explain what the encoding actually is. i.e. How are the layers encoded?

tomalin commented 2 weeks ago

Some points in PR description are too short to understand. e.g. "code to create clean TP and association maps added". Please explain what the purpose of this is. And provide evidence (e.g. link to talk) that it is useful. I guess it's goal is to save CPU?

tomalin commented 2 weeks ago

If it's true that this PR contains a 5-param KF fit, please so say in the description, and give link to a talk where you explain how to use it. e.g. Just one just need to set ParameterFit5 = True in Setup_cfi.py?

tschuh commented 2 weeks ago

I added the comments I could in the parts that I am familiar with. Unfortunately there are a lot of files I am unfamiliar with and so I couldn't be of much help there. I am not opposed to changing tracker quality to its own EDProducer if you feel that it is a better setup. Although right now the HYBRID version still uses it as a function and not EDProducer. Can we make that consistent?

There are three main concerns I brought up:

1. The change in naming scheme from chi2rz and chi2rphi to chi20 and chi21 only adds confusion to any others who look in. I think it's better to change these to the naming scheme that is generally accepted.

2. The MVA1 variable is not being calculated correctly as many of the BDT inputs are wrong. The correct variables that should be input can be seen further down in the TrackQuality.cc file and I made notes on changes need to each variable.

3. The tttracks being created in TrackFindingProcessor.cc have incorrect chi2 and mva variables. The chi2rz and chi2rphi should not be pdof and the mva should be in the 0-1 range and not binned upon tttrack initialization.

Thanks for your review. I also spotted that there are two mva calculations, one used in the emulator one in the simulator. And those are different... As soon as the new KF replaces the old one the simulation function call will become obsolete and can be removed. to 1) sorry my bad no problem to stick whatever convention you have. to 2. I do not now which emulation reflects the f/w, therefore I do not want to change the behavior. Best person to debug this is @Chriisbrown. to 3. my bad again, I fixed it.

tschuh commented 2 weeks ago

I made a forced push squashing all previous commits to make it less painful when I have to rebase in future. I also used the opportunity to remove files which shouldn't be in the PR to begin with.

tschuh commented 1 week ago

@tomalin the scram b code-format on the ci machine seems to be different as on my machine. I can't fix the CI failure. Could you please have a look?

tschuh commented 1 week ago

@tomalin another I realized is that ci complains about L1Trigger/TrackFindingTracklet/interface/Settings.h failing format, however L1Trigger/TrackFindingTracklet/interface/Settings.h gets not touched in this PR ...

cgsavard commented 1 week ago

I am seeing some pretty big differences in the tracks being produced before and after this big restructuring. For example, the tracks have less nstub, as we can see in this plot where I test 100 evts from ttbar 200 PU: Screenshot 2024-09-17 at 4 38 47 PM And very weirdly, the tracks seem to have a pT cutoff at 15 GeV as seen here: Screenshot 2024-09-17 at 4 25 41 PM 56e22-5331-4e8b-9833-91d9c8825ee5)

Other differences involve much smaller chi2 variables and mostly all tracks coming from seed 0 after these changes. Is this all understood?

tschuh commented 6 days ago

For the pt I found a bug (factor 4) in the TTTrack conversion. The seed 0 dominance originates most likely from the DR, tending to keep lower seeds when duplicates are found. The nStub thing likely originates from the move to a single KF worker. You might want to double check by setting this to False. For the chi2s I must confess that I might have changed the behavior. I did not fully understand the original code and replaced it with something which emulates what I would write in f/w. I will look deeper into that.

tschuh commented 6 days ago

The Chi2s between the old calculation and floating point look similar enough, i don't think that the original code was buggy. Do you look at chi2 or chi2 / ndf?

cgsavard commented 6 days ago

For the pt I found a bug (factor 4) in the TTTrack conversion.

This indeed looks like it's fixed.

The seed 0 dominance originates most likely from the DR, tending to keep lower seeds when duplicates are found.

Does this mean the DR behavior has been changed? Because I am seeing that before this PR the seed distribution looks very different. Screenshot 2024-09-18 at 11 07 43 AM

The nStub thing likely originates from the move to a single KF worker. You might want to double check by setting this to False.

I do see that the nstub distribution looks more similar before and after the PR when setting this variable to false. Based on the variable name and description though, it sounds like this is disabling truncation and not changing the number of KF workers. Is this the same thing? And was this truncation not enabled before and that's why we see differences?

The Chi2s between the old calculation and floating point look similar enough, i don't think that the original code was buggy. Do you look at chi2 or chi2 / ndf?

Here are the distributions of chi2 and chi2/dof before and after this PR. We can see that it has significantly changed so something does still seem to be off. Screenshot 2024-09-18 at 11 13 07 AM Screenshot 2024-09-18 at 11 13 33 AM

tschuh commented 6 days ago

Does this mean the DR behavior has been changed?

Yes.

tschuh commented 6 days ago

I do see that the nstub distribution looks more similar before and after the PR when setting this variable to false. Based on the variable name and description though, it sounds like this is disabling truncation and not changing the number of KF workers. Is this the same thing? And was this truncation not enabled before and that's why we see differences?

less KF workers means more truncation, therefore is disabling truncation a good test to simulate a many worker scenario

tschuh commented 6 days ago

Here are the distributions of chi2 and chi2/dof before and after this PR. We can see that it has significantly changed so something does still seem to be off.

Well the KF got debugged. The old code got sometimes the parameter very wrong leading to bad resolutions and high chi2s. I guess we just see now the debugged chi2 plots. A good cross check would be to compare against oldKF.

cgsavard commented 1 day ago

Well the KF got debugged. The old code got sometimes the parameter very wrong leading to bad resolutions and high chi2s. I guess we just see now the debugged chi2 plots. A good cross check would be to compare against oldKF.

Interestingly, I see different behaviors comparing chi2rz and chi2rphi with the old and new kf before these changes were implemented. For chi2rz/dof, I see that the current distribution looks like the old KF before changes. For chi2rphi/dof, I see that the current distribution looks like the new KF before changes. If we assume the oldKF distributions are correct, then it seems the issue might be in the chi2rphi calculation specifically. Here are both distributions below: Screenshot 2024-09-23 at 11 33 59 AM Screenshot 2024-09-23 at 11 34 27 AM

cgsavard commented 1 day ago

The nStub thing likely originates from the move to a single KF worker. You might want to double check by setting this to False.

Is the fact that we now have mostly 4 stub tracks acceptable? Don't we want to be producing more 5-6 stub tracks as they tend to be better tracks? I understand you said this is probably because there is only 1 KF worker, but does this indicate that we should be using more than one worker to get better tracks?

tschuh commented 19 hours ago

If we assume the oldKF distributions are correct, then it seems the issue might be in the chi2rphi calculation specifically

Well the newKF did show improved r-phi parameter resolution. So it is not unexpected to see lower chi2rphi. The difference is quite big though. I wonder if the oldKF is scaling the chi2rphi up, I think it does that when calculating overall chi2. @tomalin do you know what the oldKF stores in the TTTracks?

tschuh commented 18 hours ago

Is the fact that we now have mostly 4 stub tracks acceptable? Don't we want to be producing more 5-6 stub tracks as they tend to be better tracks? I understand you said this is probably because there is only 1 KF worker, but does this indicate that we should be using more than one worker to get better tracks?

That is a thing wee need to decide. I am not sure if more worker is sensible as long as we are not able to build f/w. It would be good to understand what we are actually gaining by the additional stubs.