cms-L1TK / cmssw

Fork of CMSSW where improvements to L1 tracking code are developed.
http://cms-sw.github.io/
Apache License 2.0
4 stars 5 forks source link

Fixing integer range check in FPGAWord. #125

Closed tschuh closed 2 years ago

tschuh commented 2 years ago

As reported on last November I see overflowing z residuals out of TrackBuilder. This PR fixes an existing range check inside the tracklet code. This reveals not only overflowing z residuals, there are several overflows per event across the tracklet code. As earlier this gets merged, and this is quasi a one-line-PR, as earlier people can start fixing the actual bugs.

The error messages revealed by this PR are:

1) value too large:256 256 (/net/home/ppd/tomalin/UK_TrackTrig/hybrid12/CMSSW_12_0_0_pre4/src/L1Trigger/TrackFindingTracklet/src/Residual.cc:28) , which is overflow of barrel layers 1+2 z residual.

2) value too large:2048 2048 (/net/home/ppd/tomalin/UK_TrackTrig/hybrid12/CMSSW_12_0_0_pre4/src/L1Trigger/TrackFindingTracklet/src/Projection.cc:35) , which is overflow of variable irzproj the endcap (layerdisk = 8)

aryd commented 2 years ago

Git did odd things to the code I pasted in... a bit hard to read.

aryd commented 2 years ago

I proposed a different fix in the discussion about this PR. Instead of adding these arbitrary offsets I think a cleaner fix is to change some <= to < etc.

-Anders

Anders Ryd @.**@.>

On Jan 28, 2022, at 2:22 PM, Brent R. Yates @.**@.>> wrote:

@bryates commented on this pull request.


In L1Trigger/TrackFindingTracklet/src/FPGAWord.cchttps://github.com/cms-L1TK/cmssw/pull/125#discussion_r794789989:

@@ -38,8 +38,8 @@ void FPGAWord::set(int value, int nbits, bool positive, int line, const char* fi } assert(value < (1 << nbits)); } else {

  • if (value > (1 << (nbits - 1))) {
  • edm::LogProblem("Tracklet") << "value too large:" << value << " " << (1 << (nbits - 1)) << " (" << file << ":"
  • if (value > (1 << (nbits - 1)) - 1) {
  • edm::LogProblem("Tracklet") << "value too large:" << value << " " << (1 << (nbits - 1)) - 1 << " (" << file << ":" << line << ")"; } assert(value <= (1 << (nbits - 1)));

I confirm Thomas's observation that the z residuals overflow the digitisation range, and that this was previously not noticed because of the bug fixed by this PR. This may hypothetically lead to disagreement with the MatchCalculator HLS code, since either the LUT tables containing the digitized matching windows or the test data of matched stubs may be wrong.

The problem stubs are in layer 1+2 of the barrel, with the cut at https://github.com/cms-L1TK/cmssw/blob/L1TK-dev-12_0_0_pre4/L1Trigger/TrackFindingTracklet/src/MatchCalculator.cc#L223 being slightly too loose to reject the stubs that are outside the digi range. A solution would be to replace "15.0" by "15.0-0.06" at L685-686 of https://github.com/cms-L1TK/cmssw/blob/L1TK-dev-12_0_0_pre4/L1Trigger/TrackFindingTracklet/interface/Settings.h . Though ideally a check that the match windows do not exceed the digi range could be added, for example, to TrackletLUT::initmatchcut().

@bryateshttps://github.com/bryates Since this affects the MatchCalculator, could you please add a fix for this to Thomas's PR?

The problem projection is in layerdisk=8 of the endcap. I suspect the problem may be that whilst TrackletCalculatorBase.cchttp://TrackletCalculatorBase.cc checks that the barrel projection izproj lies within the digi range, it doesn't make the same check for the endcap one irprojdisk .

@aeharthttps://github.com/aehart Since this may affect agreement with the TC HLS, would you be able to check?

I've pushed my changes https://github.com/cms-L1TK/cmssw/blob/83769fa8cddb52a029ad3e452ed877fb0dbdd6d6/L1Trigger/TrackFindingTracklet/interface/Settings.h#L683-L686

— Reply to this email directly, view it on GitHubhttps://github.com/cms-L1TK/cmssw/pull/125#discussion_r794789989, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ABI4LTNZD6HPJLWGQOZPRODUYLUH7ANCNFSM5M6SV3FQ. Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub. You are receiving this because your review was requested.Message ID: @.***>

aryd commented 2 years ago

I have just committed/pushed fixes to this. I undid Brents changes to add -0.06 and changed the assert in FPGAWord so that it is consistent with the error message and then fixed the code in TrackletCalculatorBase, MatchCalculator, and MatchProcessor such that we don't trip the overflow. As far as I'm concerned this is ready to do so I will close this.

tschuh commented 2 years ago

Hi Anders, I did look at your code changes and they look good to me. Could you please approve and merge with squash isntead of close, thanks.