cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

TrackingTruthAccumulator: Something has gone wrong with the indexing #46469

Open vlimant opened 1 month ago

vlimant commented 1 month ago

I was looking at the recent premix production and there are jobs in "cool off" (meaning they will get resubmitted) that failed with the following error:

Fatal Exception (Exit Code: 8002)
An exception of category 'StdException' occurred while
   [0] Processing  Event run: 1 lumi: 1890 event: 1889249 stream: 1
   [1] Running path 'PREMIXoutput_step'
   [2] Prefetching for module PoolOutputModule/'PREMIXoutput'
   [3] Calling method for module MixingModule/'mix'
Exception Message:
A std::exception was thrown.
TrackingTruthAccumulator: Something has gone wrong with the indexing. Parent track index is 476.

one log is available under https://eoscmsweb.cern.ch/eos/cms/store/logs/prod/recent/PRODUCTION/pdmvserv_task_PPD-RunIIISummer24PrePremix-00004__v1_T_241021_203252_7210/PPD-RunIIISummer24PrePremix-00004_0/cmsgwms-submit12.fnal.gov-10291964-0-log.tar.gz

Chances are that the resubmitted job will use a different seed and pass. In the end, though, this might truncate the pileup (PU) profile that is effectively produced, because of selection bias.
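To make the selection-bias concern concrete, here is a minimal sketch under loudly stated assumptions: the failure probability depends only on the number of pileup interactions in an event, and every retry is an independent redraw. The function name `effectiveProfile` and both input vectors are hypothetical and do not come from CMSSW or from the production in question. Under those assumptions, the profile that survives is the intended one reweighted by the pass probability and renormalised.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not CMSSW code).
// intended[n]    : intended probability of n pileup interactions
// failureProb[n] : ASSUMED probability that an event with n interactions fails
// Returns the profile of events that pass, renormalised to unit sum.
std::vector<double> effectiveProfile(const std::vector<double>& intended,
                                     const std::vector<double>& failureProb) {
  assert(intended.size() == failureProb.size());

  std::vector<double> surviving(intended.size());
  double norm = 0.0;
  for (std::size_t n = 0; n < intended.size(); ++n) {
    // Weight each bin by the chance that an event in it passes.
    surviving[n] = intended[n] * (1.0 - failureProb[n]);
    norm += surviving[n];
  }

  // Renormalise, leaving an all-zero result untouched.
  if (norm > 0.0) {
    for (double& weight : surviving) {
      weight /= norm;
    }
  }
  return surviving;
}
```

With a toy two-bin profile of {0.5, 0.5} and assumed failure probabilities of {0.0, 0.5}, the surviving profile becomes {2/3, 1/3}: the bin that fails more often is depleted even though every failed job is eventually replaced by a passing one.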

https://cmsweb.cern.ch/reqmgr2/fetch?rid=pdmvserv_task_PPD-RunIIISummer24PrePremix-00004__v1_T_241021_203252_7210

    "CMSSWVersion": "CMSSW_14_0_18",
    "GlobalTag": "140X_mcRun3_2024_realistic_v26",

cmsbuild commented 1 month ago

A new Issue was created by @vlimant.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

davidlange6 commented 1 month ago

Are there any notable differences in the GT relative to v20, which was used to make the gen-sim?

makortel commented 1 month ago

assign simulation

cmsbuild commented 1 month ago

New categories assigned: simulation

@civanch,@kpedro88,@mdhildreth you have been requested to review this Pull request/Issue and eventually sign? Thanks

makortel commented 1 month ago

The exception is thrown from https://github.com/cms-sw/cmssw/blob/8340b52ad5b3272e98a300259ff0c67959520800/SimGeneral/TrackingAnalysis/plugins/TrackingTruthAccumulator.cc#L888-L911 (btw, the exception type should be changed to cms::Exception)
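As a rough illustration of the suggestion above, the following self-contained sketch shows an index check that throws an exception carrying a category, rather than a bare std::exception that the framework can only report as 'StdException'. `CategorizedException`, `parentOf`, and the "LogicError" category are hypothetical stand-ins introduced here so the example compiles on its own; they are not CMSSW code. In CMSSW itself cms::Exception would play this role, and the actual condition checked at the linked lines may differ.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a framework exception with a category.
// It is NOT cms::Exception; it only mimics the "category + message" idea.
class CategorizedException : public std::runtime_error {
public:
  CategorizedException(const std::string& category, const std::string& message)
      : std::runtime_error(message), category_(category) {}

  const std::string& category() const noexcept { return category_; }

private:
  std::string category_;
};

// Hypothetical helper: return the parent track at parentIndex, or throw a
// categorized exception if the index does not point into the collection.
template <typename Track>
const Track& parentOf(const std::vector<Track>& tracks, std::size_t parentIndex) {
  if (parentIndex >= tracks.size()) {
    std::ostringstream message;
    message << "TrackingTruthAccumulator: Something has gone wrong with the "
            << "indexing. Parent track index is " << parentIndex << ".";
    throw CategorizedException("LogicError", message.str());
  }
  return tracks[parentIndex];
}
```

A caller can then catch `CategorizedException` and inspect `category()` separately from `what()`, so the job report can name a specific category while keeping the original message text.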

vlimant commented 1 month ago

@davidlange6 : https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/140X_mcRun3_2024_realistic_v20/140X_mcRun3_2024_realistic_v26 nothing sticks out that would/could explain this (I would not know what to look for though ...)

davidlange6 commented 1 month ago

for sure the beam spot changing is interesting...(also from the point of view of usefulness of the resulting premixed sample)

vlimant commented 1 month ago

it's the reco beamspot, not the gen/sim one