Open smuzaffar opened 2 years ago
A new Issue was created by @smuzaffar Malik Shahzad Muzaffar.
@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
assign reconstruction
New categories assigned: reconstruction
@jpata,@clacaputo,@mandrenguyen you have been requested to review this Pull request/Issue and eventually sign? Thanks
by the way, the following check at https://github.com/cms-sw/cmssw/blob/master/RecoVertex/KinematicFit/interface/KinematicConstrainedVertexUpdatorT.h#L141 allowed the workflow 10803.0 to run for el9_aarch64_gcc11
val += g * delta_alpha;
lambda = v_g_sym * val;
+ if (std::isnan(lambda[0]) || std::isinf(lambda[0])) {
+ return RefCountedKinematicVertex();
+ }
but I had to drop ofast-flag from https://github.com/cms-sw/cmssw/blob/master/RecoEgamma/EgammaPhotonAlgos/BuildFile.xml#L1 for isnan and isinf to work properly.
but I also need to drop ofast-flag from
what about edm::isFinite?
ah ok, let me try that. thanks @slava77
thanks @slava77, the following patch after https://github.com/cms-sw/cmssw/blob/master/RecoVertex/KinematicFit/interface/KinematicConstrainedVertexUpdatorT.h#L141 (without dropping the ofast-math dep) allowed the failing relval to run (on both el8_amd64 and el9_aarch64)
+ if (! edm::isFinite(lambda[0])) {
+ //edm::LogWarning("KinematicConstrainedVertexUpdatorFailed") << "some error/warnings message\n";
+ //LogDebug("KinematicConstrainedVertexUpdatorFailed") << "some error/warning message\n";
+ return RefCountedKinematicVertex();
+ }
@cms-sw/reconstruction-l2, if this is the correct fix, can you please open a PR with a suitable error/warning message?
Well @smuzaffar , I would defer to @slava77 here as tracking POG convener. If it helps, I can make a pull request adding the message "Kinematic constrained vertex updator failed". Are you recommending just LogDebug or also LogWarning?
@mandrenguyen, the previous check at https://github.com/cms-sw/cmssw/blob/master/RecoVertex/KinematicFit/interface/KinematicConstrainedVertexUpdatorT.h#L131-L135 has both LogDebug and LogWarning, so that is why I suggested adding both for this check too
urgent
The failure came back in workflow 14.0 step 3 in CMSSW_13_3_X_2023-08-21-2300 on el8_aarch64_gcc11
Begin processing the 95th record. Run 1, Event 2493, LumiSection 13 on stream 1 at 22-Aug-2023 05:43:16.615 CEST
GammaContinuedFraction::a too large, ITMAX too small
GammaContinuedFraction::a too large, ITMAX too small
----- Begin Fatal Exception 22-Aug-2023 05:43:17 CEST-----------------------
An exception of category 'Vertex' occurred while
[0] Processing Event run: 1 lumi: 13 event: 2494 stream: 3
[1] Running path 'dqmofflineOnPAT_1_step'
[2] Prefetching for module SingleTopTChannelLeptonDQM_miniAOD/'singleTopElectronMediumDQM_miniAOD'
[3] Prefetching for module PATMuonSlimmer/'slimmedMuons'
[4] Prefetching for module PATMuonSelector/'selectedPatMuons'
[5] Prefetching for module PATMuonProducer/'patMuons'
[6] Prefetching for module MuonProducer/'muons'
[7] Prefetching for module PFProducer/'particleFlowTmp'
[8] Prefetching for module PFBlockProducer/'particleFlowBlock'
[9] Prefetching for module PFElecTkProducer/'pfTrackElec'
[10] Prefetching for module PFConversionProducer/'pfConversions'
[11] Calling method for module ConversionProducer/'allConversions'
Exception Message:
Refitted track not found in list.
pt used for comparison: -nan
----- End Fatal Exception -------------------------------------------------
Should we reopen this issue or open a new one?
We can reopen this, but I am not sure whether GitHub will close it again, as #39298 (which claims to fix this issue) is merged. Let's reopen it, and if GitHub closes it then we can open a new one
Looks like this has been fixed. We have not seen this exception for a long time (a few months). I would suggest closing this issue
Most of the errors reported in https://github.com/cms-sw/cmssw/issues/36788 were fixed by https://github.com/cms-sw/cmssw/pull/39183. We still have 13 workflows failing with the same error [a]. Looks like we still have some nan/inf number generation at https://github.com/cms-sw/cmssw/blob/master/RecoVertex/KinematicFit/interface/KinematicConstrainedVertexUpdatorT.h#L156. The values of val and lambda here are:
el9_aarch64_gcc11 IBs
el8_amd64_gcc10 IBs (where tests do not fail but we still see inf numbers)
@cms-sw/reconstruction-l2, can you please look into this and provide a fix?
[a]