cms-L1TK / cmssw

Fork of CMSSW where improvements to L1 tracking code are developed.
http://cms-sw.github.io/
Apache License 2.0
4 stars 5 forks source link

Fill MVA1 when using newKF #228

Closed cgsavard closed 3 months ago

cgsavard commented 1 year ago

Copy of closed PR #224.

PR description:

When using the new KF, TTTracks are converted from the KFTracks here. The trk_MVA1 variable is set to 0 however, and so this PR runs the track quality code to fill this variable.

PR validation:

The L1TrackNtupleMaker ntuple is now filled with non-zero trk_MVA1 variables.

cgsavard commented 1 year ago

@tomalin I do agree that it makes more sense to convert everything after KFOut is run and ideally doing this means that we don't need to rerun track quality again (since it's already done in KFOut with the track word variables).

However, there are 2 reasons I see why we will need to rerun the track quality code again in this way even when ProducerTT is done after KFOut:

  1. Bendchi2 is not yet implemented in emulation which means that the tq mva variable in the track word is currently meaningless. In fact, bendchi2 is actually being calculated in DataFormats here and I followed this model to calculate the MVA using an actual bendchi2 variable. When bendchi2 is implemented in emulation/firmware, both of these issues can be fixed. But for now we need to recalculate MVA with the actual bendchi2.
  2. The tqMVA variable in the trk word is only given 3 bits, and so the conversion to the TTTrack MVA will only give 8 distinct values. If we want to allow people to study the MVA variable and how different cuts on the MVA will affect the performance of their physics object reconstruction, for example, I think it is more useful to pass through the full 0-1 regression output of the MVA than the binned values. These studies will help inform us how we should be binning the tq value. If it's already binned, then we lose all of this information and have no way of studying how the bins affect things. I'm personally trying to figure out what the bins should be but I can't do it unless I have the full 0-1 mva value, unbinned. It is possible to NOT rerun the track quality if we set the TTTracks MVA value in KFOut before the tq digitisation is done (we will need access to TTTracks), but this can't be done right now because, again, bendchi2 is not implemented in emulation yet.

So I think that the ProducerTT can (more likely should) be run after KFOut, but I don't think that will change how I currently have things set up until (1) bendchi2 is implemented and (2) we are certain on the bins set for MVA and are okay with not allowing anyone to access the true 0-1 tq MVA output (always nice to give an option so future people can study the mva again).

If you think there are other solutions then what I have proposed, let me know. My suggestion would be to merge this and then we can have another MR to change the order of ProducerTT and KFOut (I'm probably not the right person to do this).

cgsavard commented 1 year ago

Commenting since there has been no activity in a bit @tomalin @skinnari

skinnari commented 1 year ago

this also seems OK to me. @tomalin can we merge?

tomalin commented 1 year ago

This PR makes it possible to study TQ with the NewKF, which is good, and I can understand Clare's motivation.

However, as I make clear above, it is a bodge. It calls the TQ from in front of the KFout modules instead of from after it. As such, it will not give us a bit accurate emulation of the Hybrid chain with the NewKF. We need such an emulation for the ESR (Nov. 2023) to validate the FW, so this work must be done in a non-bodged way in the coming months. And if we merge this PR, then when this work is done, whoever does it will need to undo all the changes in this PR.

There is also the question of coding style. This PR is calling the TQ from the DataFormat class, which as its name suggests is not intended to run algorithm steps. And it means that the TQ is being called twice, one from DataFormat and once from KFout, which wastes CPU.

My suggestion would be to leave this as a branch, so Clare can do her TQ studies, but not merge it. And find a volunteer to do it properly ...

cgsavard commented 1 year ago

If this solution is not implemented, then we should really find another way to send a correct TQ value down the chain for others to use before we make the newKF the standard in central CMSSW. I'm looking at using the bdt for track jets now which won't be possible anymore with the newKF switch, and vertexing has been looking at using this variable as well in their algorithm. I understand that this is a temporary solution until bendchi2 is implemented, but we need something there so that it doesn't ruin the algorithms using this variable down the line.

My current suggestion is a temporary fix that exactly mimics the temporary fix for bendchi2, which we will also have to undo when it is implemented correctly in firmware.

Another solution that might work with Ian's points is to set the bendchi2 with the simulated version (knowing that this will need to eventually be changed to the emulated version) in track quality inputs ProducerKF.out here so that the bdt can actually function as expected and we don't need to run the classifier twice. Then we can set the TTTrack value MVA1 with the output before the digitization occurs here. Is this more along what you were imagining for the KFout->TQ chain? Currently we have the emulation TQ classifier running towards the end of the KFout module, not after it.

tomalin commented 3 months ago

@cgsavard I assume that this PR, with solved the "missing TQ" info with a bodge, is no longer needed, since we've merged your new PR https://github.com/cms-L1TK/cmssw/pull/281 , which fixes the "missing TQ" issue properly. I'll therefore close this PR. Let me know if you object.