cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

Strange stack trace for ARM job which crashed at end of job #40995

Open Dr15Jones opened 1 year ago

Dr15Jones commented 1 year ago

The following are the pruned thread stack traces for the failed job

https://cmssdt.cern.ch/SDT/cgi-bin/buildlogs/raw/el8_aarch64_gcc11/CMSSW_13_1_X_2023-03-07-2300/pyRelValMatrixLogs/run/11634.15_TTbar_14TeV+2021_JMENano/step3_TTbar_14TeV+2021_JMENano.log

This appears to be happening during the 'endJob' phase, as many Services have already reported their job summaries.

Thread 8 (Thread 0x4000bf8b9250 (LWP 2578704) "cmsRun"):
[A thread controlled by Eigen started by tensorflow]
#3  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:41
#4  0x000040004ebc63a0 in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-03-07-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
...

Thread 7 (Thread 0x4000bf6a9250 (LWP 2578701) "cmsRun"):
[another Eigen thread started by tensorflow]
#3  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:41
#4  0x000040004ebc63a0 in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-03-07-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2

Thread 6 (Thread 0x4000bf499250 (LWP 2578700) "cmsRun"):
[yet another Eigen thread started by tensorflow]
#3  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:41
#4  0x000040004ebc63a0 in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-03-07-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2

Thread 5 (Thread 0x400071a09250 (LWP 2577395) "cmsRun"):
...
#3  0x00004000243a71b8 in sig_dostack_then_abort () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x000040006683621c in tensorflow::CancellationManager::StartCancel() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-03-07-2300/external/el8_aarch64_gcc11/lib/libtensorflow_framework.so.2
#6  0x00004000668364d8 in tensorflow::CancellationManager::StartCancel() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-03-07-2300/external/el8_aarch64_gcc11/lib/libtensorflow_framework.so.2
#7  0x000040005575ed24 in tensorflow::DirectSession::Close() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-03-07-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#8  0x000040004bd8a850 in tensorflow::closeSession(tensorflow::Session*&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libPhysicsToolsTensorFlow.so
#9  0x000040004bd8d808 in TfGraphDefWrapper::~TfGraphDefWrapper() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libPhysicsToolsTensorFlow.so
#10 0x000040006ad0b5f8 in edm::eventsetup::CallbackProxy<edm::eventsetup::Callback<edm::ESProducer, edm::ESProducer::setWhatProduced<TfGraphDefProducer, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> >, TfGraphRecord, edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> >(TfGraphDefProducer*, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> > (TfGraphDefProducer::*)(TfGraphRecord const&), edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> const&, edm::es::Label const&)::{lambda(TfGraphRecord const&)#1}, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> >, TfGraphRecord, edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> >, TfGraphRecord, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> > >::invalidateCache() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/pluginPhysicsToolsTensorFlowPlugins.so
#11 0x000040001cdfd07c in edm::eventsetup::EventSetupRecordImpl::invalidateProxies() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#12 0x000040001cdfd0dc in edm::FunctorWaitingTask<edm::eventsetup::EventSetupRecordIOVQueue::startNewIOVAsync(edm::WaitingTaskHolder const&, edm::WaitingTaskList&)::{lambda(edm::LimitedTaskQueue::Resumer)#1}::operator()(edm::LimitedTaskQueue::Resumer)::{lambda(std::__exception_ptr::exception_ptr const*)#1}>::execute() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#13 0x000040001cd89c80 in tbb::detail::d1::function_task<edm::WaitingTaskHolder::doneWaiting(std::__exception_ptr::exception_ptr)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#14 0x000040001e7f2a64 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x4000d1f9a000, waiter=<synthetic pointer>..., this=0x40001f623f00) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/task_dispatcher.h:322
[cut ]

Thread 4 (Thread 0x40006db29250 (LWP 2577394) "cmsRun"):
[waiting TBB thread]

Thread 3 (Thread 0x40006d119250 (LWP 2577393) "cmsRun"): 
#3  <signal handler called>
#4  0x000040001ed113c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#5  0x000040001ed0a030 in pthread_mutex_lock () from /lib64/libpthread.so.0
#6  0x000040001e32ffe0 in malloc_mutex_lock_final (mutex=0x40006f0132d8) at include/jemalloc/internal/mutex.h:151
#7  je_malloc_mutex_lock_slow (mutex=mutex@entry=0x40006f0132d8) at src/mutex.c:90
#8  0x000040001e320ec8 in malloc_mutex_lock (mutex=0x40006f0132d8, tsdn=0x40006d11ee40) at include/jemalloc/internal/mutex.h:217
#9  je_edata_cache_put (tsdn=0x40006d11ee40, edata_cache=0x40006f0132c0, edata=0x40006f8a8380) at src/edata_cache.c:37
[je malloc doing memory consolidation?]
#38 operator delete (ptr=<optimized out>, size=<optimized out>) at src/jemalloc_cpp.cpp:200
#39 0x0000400046643ab0 in CaloTPGTranscoderULUT::~CaloTPGTranscoderULUT() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libCalibCalorimetryCaloTPG.so
#40 0x0000400046643b84 in CaloTPGTranscoderULUT::~CaloTPGTranscoderULUT() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libCalibCalorimetryCaloTPG.so
#41 0x0000400046612f58 in edm::eventsetup::CallbackProxy<edm::eventsetup::Callback<edm::ESProducer, edm::ESProducer::setWhatProduced<CaloTPGTranscoderULUTs, std::unique_ptr<CaloTPGTranscoder, std::default_delete<CaloTPGTranscoder> >, CaloTPGRecord, edm::eventsetup::CallbackSimpleDecorator<CaloTPGRecord> >(CaloTPGTranscoderULUTs*, std::unique_ptr<CaloTPGTranscoder, std::default_delete<CaloTPGTranscoder> > (CaloTPGTranscoderULUTs::*)(CaloTPGRecord const&), edm::eventsetup::CallbackSimpleDecorator<CaloTPGRecord> const&, edm::es::Label const&)::{lambda(CaloTPGRecord const&)#1}, std::unique_ptr<CaloTPGTranscoder, std::default_delete<CaloTPGTranscoder> >, CaloTPGRecord, edm::eventsetup::CallbackSimpleDecorator<CaloTPGRecord> >, CaloTPGRecord, std::unique_ptr<CaloTPGTranscoder, std::default_delete<CaloTPGTranscoder> > >::invalidateCache() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/pluginCalibCalorimetryCaloTPGPlugins.so
#42 0x000040001cdfd07c in edm::eventsetup::EventSetupRecordImpl::invalidateProxies() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#43 0x000040001cdfd0dc in edm::FunctorWaitingTask<edm::eventsetup::EventSetupRecordIOVQueue::startNewIOVAsync(edm::WaitingTaskHolder const&, edm::WaitingTaskList&)::{lambda(edm::LimitedTaskQueue::Resumer)#1}::operator()(edm::LimitedTaskQueue::Resumer)::{lambda(std::__exception_ptr::exception_ptr const*)#1}>::execute() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#44 0x000040001cd89c80 in tbb::detail::d1::function_task<edm::WaitingTaskHolder::doneWaiting(std::__exception_ptr::exception_ptr)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
[cut ]

Thread 2 (Thread 0x4000289b9250 (LWP 2576851) "cmsRun"):
[stack trace helper thread]

Thread 1 (Thread 0x40001e94d070 (LWP 2574070) "cmsRun"):
[je malloc memory consolidation?]
#18 operator delete (ptr=<optimized out>, size=<optimized out>) at src/jemalloc_cpp.cpp:200
#19 0x000040006b24e000 in std::default_delete<HepPDT::ParticleDataTable>::operator()(HepPDT::ParticleDataTable*) const [clone .part.0] () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/pluginSimGeneralHepPDTESSource.so
#20 0x000040006b24ea04 in edm::eventsetup::CallbackProxy<edm::eventsetup::Callback<edm::ESProducer, edm::ESProducer::setWhatProduced<HepPDTESSource, std::unique_ptr<HepPDT::ParticleDataTable, std::default_delete<HepPDT::ParticleDataTable> >, PDTRecord, edm::eventsetup::CallbackSimpleDecorator<PDTRecord> >(HepPDTESSource*, std::unique_ptr<HepPDT::ParticleDataTable, std::default_delete<HepPDT::ParticleDataTable> > (HepPDTESSource::*)(PDTRecord const&), edm::eventsetup::CallbackSimpleDecorator<PDTRecord> const&, edm::es::Label const&)::{lambda(PDTRecord const&)#1}, std::unique_ptr<HepPDT::ParticleDataTable, std::default_delete<HepPDT::ParticleDataTable> >, PDTRecord, edm::eventsetup::CallbackSimpleDecorator<PDTRecord> >, PDTRecord, std::unique_ptr<HepPDT::ParticleDataTable, std::default_delete<HepPDT::ParticleDataTable> > >::invalidateCache() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/pluginSimGeneralHepPDTESSource.so
#21 0x000040001cdfd07c in edm::eventsetup::EventSetupRecordImpl::invalidateProxies() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#22 0x000040001cdfd0dc in edm::FunctorWaitingTask<edm::eventsetup::EventSetupRecordIOVQueue::startNewIOVAsync(edm::WaitingTaskHolder const&, edm::WaitingTaskList&)::{lambda(edm::LimitedTaskQueue::Resumer)#1}::operator()(edm::LimitedTaskQueue::Resumer)::{lambda(std::__exception_ptr::exception_ptr const*)#1}>::execute() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#23 0x000040001cd89c80 in tbb::detail::d1::function_task<edm::WaitingTaskHolder::doneWaiting(std::__exception_ptr::exception_ptr)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
[doing a TBB synchronous wait]
#27 0x000040001cda1668 in edm::EventProcessor::taskCleanup() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02775/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-06-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#28 0x000000000040ba14 in tbb::detail::d1::task_arena_function<main::{lambda()#1}::operator()() const::{lambda()#1}, void>::operator()() const ()
#29 0x000040001e7f0f5c in tbb::detail::r1::task_arena_impl::execute (ta=..., d=...) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/arena.cpp:694
#30 0x000000000040f73c in main::{lambda()#1}::operator()() const ()
#31 0x0000000000406fc4 in main ()

Notice that threads 1, 3, and 5 are all doing 'IOV cleanup' tasks. I would have thought those should have been completed in the 'end processing loop' stage?

Dr15Jones commented 1 year ago

assign core

Dr15Jones commented 1 year ago

@wddgit FYI

cmsbuild commented 1 year ago

New categories assigned: core

@Dr15Jones,@smuzaffar,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild commented 1 year ago

A new Issue was created by @Dr15Jones Chris Jones.

@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

wddgit commented 1 year ago

The sequence should be:

  1. endJob
  2. Last IOVs are ended
  3. Services destroyed

If the Service summaries are printing in endJob, this looks like intentional Framework behavior. If the summaries are printing in the Service destructors, then something is wrong in the Framework. The last IOVs get ended by a sentry that goes out of scope just after endJob in cmsRun.cpp. The failure does seem to be occurring while waiting for the last IOVs to end. I don't know why. Maybe a tensorflow problem???
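
To make the intended ordering concrete, here is a minimal sketch of how nested scopes produce the sequence above. It is not the code in cmsRun.cpp: `Services`, `IOVCleanupSentry`, `Log` and `runJob` are hypothetical stand-ins, and the only thing it illustrates is that a sentry in an inner scope runs its destructor after endJob but before the outer object holding the Services is destroyed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names come from cmsRun.cpp.
using Log = std::vector<std::string>;

struct Services {
  explicit Services(Log& log) : log_(log) {}
  ~Services() { log_.push_back("services destroyed"); }
  Log& log_;
};

// Sentry whose destructor stands in for "end the last open IOVs".
struct IOVCleanupSentry {
  explicit IOVCleanupSentry(Log& log) : log_(log) {}
  ~IOVCleanupSentry() { log_.push_back("last IOVs ended"); }
  Log& log_;
};

Log runJob() {
  Log log;
  {
    Services services(log);  // constructed first, so destroyed last
    {
      IOVCleanupSentry sentry(log);  // goes out of scope just after endJob
      log.push_back("endJob");       // Service summaries would print here
    }
  }
  return log;
}
```

Calling `runJob()` in this sketch yields the three steps in the listed order, so summaries printed during endJob appearing before the IOV cleanup is the expected behavior.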

makortel commented 1 year ago

I see tensorflow::CancellationManager::StartCancel() makes use of TensorFlow's mutex https://github.com/tensorflow/tensorflow/blob/v2.6.4/tensorflow/core/framework/cancellation.cc#L38-L82 that we've seen before to be unreliable on ARM.

I'm not sure of the relevance for this problem though, as the stack trace shows only one thread in TensorFlow. But maybe there were multiple earlier that led to corruption of CancellationManager's state?

What level of concurrency do we have in the IOV cleanup?

wddgit commented 1 year ago

I'll look more carefully tomorrow. Based on a quick look and my memory of how this works, the IOVs are all ending concurrently.

I don't recall anything that waits for an IOV to actually end, except that future IOVs will not start until there are open slots. So in theory there could be multiple IOVs ending per record, although the vast majority of the time only the last IOV would be open for each record.

wddgit commented 1 year ago

After looking more carefully, I do not see any Framework problems in this job. The summary messages are all printing in endJob and then the IOVs are closing. The Framework is waiting for the IOVs to close when the seg fault occurs. The Framework seems to be behaving correctly.

The IOVs close concurrently. I can see that from the code and also in the stack traces we can see the IOV cache objects being deleted at the same time on multiple threads (~CaloTPGTranscoderULUT, ~TfGraphDefWrapper and std::default_delete).

I have not looked in tensorflow now (or ever). Matti, I won't look in there unless you ask me to. I've got no expertise in tensorflow. It seems curious that StartCancel appears twice at the end of the stack trace, but it is possible that is normal. Maybe it is the mutex you mentioned... I don't know.

Another comment, unrelated to this issue. In this job there is only 1 luminosity block, so there is only 1 IOV per record, and what I am about to say is not relevant for this seg fault. But I just noticed that when we migrated to using group there were changes in how we handle waiting for the IOVs to end. The current code correctly waits for the last IOV for each record to end, but it no longer waits for the other IOVs to end. Practically it is probably exceedingly rare, but I see nothing to prevent the next-to-last IOV from still being in the process of ending when the wait ends. Before PR #32804 we were waiting for them all with waitForIOVsInFlight_, but we don't anymore. To hit a problem with this would require multiple IOVs in flight, with the next-to-last IOV for a record still in the process of ending when the wait ended, and then bad things might happen. Probably this should be fixed, although the probability might be so low that the problem does not occur practically. Or maybe I am missing something...
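
As an illustration of the difference between waiting for only the last IOV and waiting for every IOV still in flight, here is a minimal sketch using a counter and a condition variable. It is an assumption-laden toy, not the Framework's implementation: `InFlightCounter` and `endAllIOVs` are hypothetical names, plain `std::thread` stands in for TBB tasks, and how waitForIOVsInFlight_ actually worked is not shown in this thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: tracks how many IOV cleanup tasks are still running.
class InFlightCounter {
public:
  void start() {
    std::lock_guard<std::mutex> lock(m_);
    ++count_;
  }
  void finish() {
    std::lock_guard<std::mutex> lock(m_);
    if (--count_ == 0) cv_.notify_all();
  }
  // Blocks until every started task has finished, not just the last one started.
  void waitForAll() {
    std::unique_lock<std::mutex> lock(m_);
    cv_.wait(lock, [this] { return count_ == 0; });
  }

private:
  std::mutex m_;
  std::condition_variable cv_;
  int count_ = 0;
};

// Ends nIOVs toy "IOVs" concurrently and returns how many cleanups had
// completed at the moment the wait returned.
int endAllIOVs(int nIOVs) {
  InFlightCounter inFlight;
  std::mutex doneMutex;
  int done = 0;
  std::vector<std::thread> workers;
  for (int i = 0; i < nIOVs; ++i) {
    inFlight.start();  // register before the cleanup task is launched
    workers.emplace_back([&] {
      {
        std::lock_guard<std::mutex> lock(doneMutex);
        ++done;  // stands in for deleting the IOV's cached objects
      }
      inFlight.finish();
    });
  }
  inFlight.waitForAll();
  int observed = 0;
  {
    std::lock_guard<std::mutex> lock(doneMutex);
    observed = done;
  }
  for (auto& w : workers) w.join();
  return observed;
}
```

Because every cleanup is registered before the wait begins, the wait cannot return while a next-to-last IOV is still ending, which is the property described as missing above.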

makortel commented 1 year ago

Thanks David.

> I have not looked in tensorflow now (or ever). Matti, I won't look in there unless you ask me to. I've got no expertise in tensorflow. It seems curious that StartCancel appears twice at the end of the stack trace, but it is possible that is normal. Maybe it is the mutex you mentioned... I don't know.

No need. From what I saw in the TF code, a CancellationManager can contain other CancellationManager objects as well, and the StartCancel() call propagates to the contained objects.
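
A toy model of that propagation, to show why two StartCancel frames on one stack can be normal. This is not TensorFlow's class: `ToyCancellationManager` and its members are hypothetical, and the mutex the real code takes is left out entirely.

```cpp
#include <cassert>
#include <vector>

// Toy model only; NOT tensorflow::CancellationManager.
class ToyCancellationManager {
public:
  void addChild(ToyCancellationManager* child) { children_.push_back(child); }

  // Cancels this manager, then forwards the call to each contained manager.
  // The depth argument records how many startCancel frames are on the stack.
  void startCancel(int depth = 1) {
    cancelled_ = true;
    depthSeen_ = depth;
    for (ToyCancellationManager* child : children_) {
      child->startCancel(depth + 1);
    }
  }

  bool cancelled() const { return cancelled_; }
  int depthSeen() const { return depthSeen_; }

private:
  std::vector<ToyCancellationManager*> children_;
  bool cancelled_ = false;
  int depthSeen_ = 0;
};
```

With one parent holding one child, the child's call runs at depth 2, which matches a backtrace showing StartCancel() called from StartCancel().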

> Probably this should be fixed, although the probability might be so low that the problem does not occur practically. Or maybe I am missing something...

@Dr15Jones Could you take a look? We tend to eventually run into rare problems, so if there is a chance of misbehavior, I'd like to get it fixed.

wddgit commented 1 year ago

Actually, there could be different IOVs for beginRun, beginLumi and endRun. So it is within the realm of possibility this is the problem. I don't know what establishes the IOV start and end for this record.

wddgit commented 1 year ago

Ignore the last comment, the IOV for that record is run 1 to the end of time. There should only be 1 IOV.

makortel commented 1 year ago

The crash in CMSSW_13_1_X_2023-03-12-2300 11605.0 step 3 has an interesting stack trace:

Thread 5 (Thread 0x400074a09250 (LWP 912275) "cmsRun"):
#3  0x000040002814a1ec in sig_dostack_then_abort () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x0000400059a7eda4 in nsync::nsync_dll_splice_after_(nsync::nsync_dll_element_s_*, nsync::nsync_dll_element_s_*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#6  0x0000400059a7edcc in nsync::nsync_dll_make_first_in_list_(nsync::nsync_dll_element_s_*, nsync::nsync_dll_element_s_*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#7  0x0000400059a7ee0c in nsync::nsync_dll_make_last_in_list_(nsync::nsync_dll_element_s_*, nsync::nsync_dll_element_s_*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#8  0x0000400059a7effc in nsync::nsync_mu_lock_slow_(nsync::nsync_mu_s_*, nsync::waiter*, unsigned int, nsync::lock_type_s*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#9  0x0000400059a7f0ec in nsync::nsync_mu_lock(nsync::nsync_mu_s_*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#10 0x000040006a5361c8 in tensorflow::CancellationManager::StartCancel() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/external/el8_aarch64_gcc11/lib/libtensorflow_framework.so.2
#11 0x000040006a5364d8 in tensorflow::CancellationManager::StartCancel() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/external/el8_aarch64_gcc11/lib/libtensorflow_framework.so.2
#12 0x000040005945ed24 in tensorflow::DirectSession::Close() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#13 0x000040004fa8a850 in tensorflow::closeSession(tensorflow::Session*&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/libPhysicsToolsTensorFlow.so
#14 0x000040004fa8d808 in TfGraphDefWrapper::~TfGraphDefWrapper() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/libPhysicsToolsTensorFlow.so
#15 0x000040006ea0b5f8 in edm::eventsetup::CallbackProxy<edm::eventsetup::Callback<edm::ESProducer, edm::ESProducer::setWhatProduced<TfGraphDefProducer, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> >, TfGraphRecord, edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> >(TfGraphDefProducer*, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> > (TfGraphDefProducer::*)(TfGraphRecord const&), edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> const&, edm::es::Label const&)::{lambda(TfGraphRecord const&)#1}, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> >, TfGraphRecord, edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> >, TfGraphRecord, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> > >::invalidateCache() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/pluginPhysicsToolsTensorFlowPlugins.so
#16 0x0000400020bfd1ec in edm::eventsetup::EventSetupRecordImpl::invalidateProxies() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#17 0x0000400020bfd24c in edm::FunctorWaitingTask<edm::eventsetup::EventSetupRecordIOVQueue::startNewIOVAsync(edm::WaitingTaskHolder const&, edm::WaitingTaskList&)::{lambda(edm::LimitedTaskQueue::Resumer)#1}::operator()(edm::LimitedTaskQueue::Resumer)::{lambda(std::__exception_ptr::exception_ptr const*)#1}>::execute() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#18 0x0000400020b89dc0 in tbb::detail::d1::function_task<edm::WaitingTaskHolder::doneWaiting(std::__exception_ptr::exception_ptr)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#19 0x00004000225f2a64 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x4000d5ac9b00, waiter=<synthetic pointer>..., this=0x400023423f00) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/task_dispatcher.h:322
#20 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (waiter=<synthetic pointer>..., t=0x0, this=0x400023423f00) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/task_dispatcher.h:458
#21 tbb::detail::r1::arena::process (tls=..., this=0x400023423780) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/arena.cpp:137
#22 tbb::detail::r1::market::process (this=0x40002344b080, j=...) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/market.cpp:599
#23 0x00004000225fac5c in tbb::detail::r1::rml::private_worker::run (this=0x400023b8c000) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/private_server.cpp:271
#24 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x400023b8c000) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/private_server.cpp:221
#25 0x0000400022b078b8 in start_thread () from /lib64/libpthread.so.0
#26 0x0000400022b63afc in thread_start () from /lib64/libc.so.6

Thread 3 (Thread 0x400070e19250 (LWP 912268) "cmsRun"):
#2  0x00004000281469cc in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x000040002213ac88 in util_prefetch_write (ptr=<optimized out>) at include/jemalloc/internal/util.h:101
#5  util_prefetch_write_range (sz=1112, ptr=<optimized out>) at include/jemalloc/internal/util.h:117
#6  tcache_bin_flush_metadata_visitor (alloc_ctx=<synthetic pointer>, szind_sum_ctx=<synthetic pointer>) at src/tcache.c:257
#7  emap_edata_lookup_batch (result=result@entry=0x4000723c02c0, metadata_visitor_ctx=<synthetic pointer>, metadata_visitor=<optimized out>, ptr_getter_ctx=ptr_getter_ctx@entry=0x4000723c0280, ptr_getter=<optimized out>, nptrs=nptrs@entry=256, emap=<optimized out>, tsd=tsd@entry=0x400070e18528) at include/jemalloc/internal/emap.h:353
#8  tcache_bin_flush_edatas_lookup (tsd=tsd@entry=0x400070e1ee40, arr=arr@entry=0x0, nflush=nflush@entry=32, edatas=edatas@entry=0x400070e18460, binind=1178501120) at src/tcache.c:288
#9  0x000040002213b444 in tcache_bin_flush_impl (small=true, nflush=32, ptrs=0x0, binind=1178501120, cache_bin=0x400070e1f1c8, tcache=0x0, tsd=0x20) at src/tcache.c:331
#10 tcache_bin_flush_bottom (small=<optimized out>, rem=<optimized out>, binind=<optimized out>, cache_bin=<optimized out>, tcache=<optimized out>, tsd=tsd@entry=0x20) at src/tcache.c:519
#11 je_tcache_bin_flush_small (tsd=tsd@entry=0x400070e1ee40, tcache=0x0, cache_bin=0x400070e1f1c8, binind=4294538056, rem=<optimized out>) at src/tcache.c:529
#12 0x00004000220f0bdc in tcache_dalloc_small (slow_path=false, binind=<optimized out>, ptr=0x4001ccb1e000, tcache=<optimized out>, tsd=0x400070e1ee40) at include/jemalloc/internal/tcache_inlines.h:157
#13 arena_sdalloc (slow_path=<optimized out>, caller_alloc_ctx=<optimized out>, tcache=<optimized out>, size=<optimized out>, ptr=<optimized out>, tsdn=<optimized out>) at include/jemalloc/internal/arena_inlines_b.h:418
#14 isdalloct (slow_path=<optimized out>, alloc_ctx=<optimized out>, tcache=<optimized out>, size=<optimized out>, ptr=<optimized out>, tsdn=<optimized out>) at include/jemalloc/internal/jemalloc_internal_inlines_c.h:133
#15 isfree (slow_path=false, tcache=<optimized out>, usize=<optimized out>, ptr=0x4001ccb1e000, tsd=0x400070e1ee40) at src/jemalloc.c:2982
#16 je_sdallocx_default (ptr=0x4001ccb1e000, size=<optimized out>, flags=<optimized out>) at src/jemalloc.c:3924
#17 0x0000400022141b90 in sizedDeleteImpl (size=<optimized out>, ptr=<optimized out>) at src/jemalloc_cpp.cpp:195
#18 operator delete (ptr=<optimized out>, size=<optimized out>) at src/jemalloc_cpp.cpp:200
#19 0x000040004a343ab0 in CaloTPGTranscoderULUT::~CaloTPGTranscoderULUT() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/libCalibCalorimetryCaloTPG.so
#20 0x000040004a343b84 in CaloTPGTranscoderULUT::~CaloTPGTranscoderULUT() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/libCalibCalorimetryCaloTPG.so
#21 0x000040004a312f58 in edm::eventsetup::CallbackProxy<edm::eventsetup::Callback<edm::ESProducer, edm::ESProducer::setWhatProduced<CaloTPGTranscoderULUTs, std::unique_ptr<CaloTPGTranscoder, std::default_delete<CaloTPGTranscoder> >, CaloTPGRecord, edm::eventsetup::CallbackSimpleDecorator<CaloTPGRecord> >(CaloTPGTranscoderULUTs*, std::unique_ptr<CaloTPGTranscoder, std::default_delete<CaloTPGTranscoder> > (CaloTPGTranscoderULUTs::*)(CaloTPGRecord const&), edm::eventsetup::CallbackSimpleDecorator<CaloTPGRecord> const&, edm::es::Label const&)::{lambda(CaloTPGRecord const&)#1}, std::unique_ptr<CaloTPGTranscoder, std::default_delete<CaloTPGTranscoder> >, CaloTPGRecord, edm::eventsetup::CallbackSimpleDecorator<CaloTPGRecord> >, CaloTPGRecord, std::unique_ptr<CaloTPGTranscoder, std::default_delete<CaloTPGTranscoder> > >::invalidateCache() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/pluginCalibCalorimetryCaloTPGPlugins.so
#22 0x0000400020bfd1ec in edm::eventsetup::EventSetupRecordImpl::invalidateProxies() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#23 0x0000400020bfd24c in edm::FunctorWaitingTask<edm::eventsetup::EventSetupRecordIOVQueue::startNewIOVAsync(edm::WaitingTaskHolder const&, edm::WaitingTaskList&)::{lambda(edm::LimitedTaskQueue::Resumer)#1}::operator()(edm::LimitedTaskQueue::Resumer)::{lambda(std::__exception_ptr::exception_ptr const*)#1}>::execute() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#24 0x0000400020b89dc0 in tbb::detail::d1::function_task<edm::WaitingTaskHolder::doneWaiting(std::__exception_ptr::exception_ptr)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#25 0x00004000225f2a64 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x4000d5aca100, waiter=<synthetic pointer>..., this=0x400023423e80) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/task_dispatcher.h:322
#26 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (waiter=<synthetic pointer>..., t=0x0, this=0x400023423e80) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/task_dispatcher.h:458
#27 tbb::detail::r1::arena::process (tls=..., this=0x400023423780) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/arena.cpp:137
#28 tbb::detail::r1::market::process (this=0x40002344b080, j=...) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/market.cpp:599
#29 0x00004000225fac5c in tbb::detail::r1::rml::private_worker::run (this=0x400023b8c080) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/private_server.cpp:271
#30 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x400023b8c080) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-d1db7ed5e1d50722d8d27a149fd6e0b9/tbb-v2021.8.0/src/tbb/private_server.cpp:221
#31 0x0000400022b078b8 in start_thread () from /lib64/libpthread.so.0
#32 0x0000400022b63afc in thread_start () from /lib64/libc.so.6

Thread 1 (Thread 0x40002277ca30 (LWP 909123) "cmsRun"):
#0  0x0000400022be5934 in nanosleep () from /lib64/libc.so.6
#1  0x0000400022be57d8 in sleep () from /lib64/libc.so.6
#2  0x00004000281469cc in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x0000400020a18dbc in do_lookup_x (undef_name=undef_name@entry=0x40006ef4293f "_ZN6HepPDT18ResonanceStructureD1Ev", new_hash=new_hash@entry=1954549647, old_hash=old_hash@entry=0xffffe2beb918, ref=0x40006ef407e0, result=result@entry=0xffffe2beb928, scope=<optimized out>, i=463, version=version@entry=0x0, flags=flags@entry=5, skip=<optimized out>, skip@entry=0x0, type_class=type_class@entry=1, undef_map=undef_map@entry=0x400036a80000) at dl-lookup.c:384
#5  0x0000400020a19748 in _dl_lookup_symbol_x (undef_name=0x40006ef4293f "_ZN6HepPDT18ResonanceStructureD1Ev", undef_map=undef_map@entry=0x400036a80000, ref=ref@entry=0xffffe2beb9b0, symbol_scope=0x400036a80398, version=0x0, type_class=type_class@entry=1, flags=5, skip_map=skip_map@entry=0x0) at dl-lookup.c:855
#6  0x0000400020a1f130 in _dl_fixup (l=0x400036a80000, reloc_arg=816) at dl-runtime.c:94
#7  0x0000400020a110c4 in _dl_runtime_resolve () at ../sysdeps/aarch64/dl-trampoline.S:99
#8  0x000040006ef4dfec in std::default_delete<HepPDT::ParticleDataTable>::operator()(HepPDT::ParticleDataTable*) const [clone .part.0] () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/pluginSimGeneralHepPDTESSource.so
#9  0x000040006ef4dfec in std::default_delete<HepPDT::ParticleDataTable>::operator()(HepPDT::ParticleDataTable*) const [clone .part.0] () from /cvmfs/cms-ib.cern.ch/sw/aarch64/week0/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-03-12-2300/lib/el8_aarch64_gcc11/pluginSimGeneralHepPDTESSource.so
#10 0x00004000d5c07d60 in ?? ()

https://cmssdt.cern.ch/SDT/cgi-bin/logreader/el8_aarch64_gcc11/CMSSW_13_1_X_2023-03-12-2300/pyRelValMatrixLogs/run/11605.0_SingleGammaPt35+2021/step3_SingleGammaPt35+2021.log#/

hinting more towards a problem in TensorFlow's nsync mutex (although I find this crash to be a rather weird way for it to manifest).

makortel commented 1 year ago

Crash in CMSSW_13_1_X_2023-04-18-2300 workflow 136.859 step 3

19-Apr-2023 03:53:14 CEST  Closed file file:step2.root

Thread 5 (Thread 0x40007dfc9250 (LWP 2368260) "cmsRun"):
#4  0x00004000571ab544 in _ZN5boost11multi_index6detail18ordered_index_implINS0_6memberIN22EcalElectronicsMapping7MapItemE5DetIdXadL_ZNS5_4cellEEEEESt4lessIS6_ENS1_9nth_layerILi1ES5_NS0_10indexed_byINS0_14ordered_uniqueIS7_N4mpl_2naESE_EENSC_INS3_IS5_17EcalElectronicsIdXadL_ZNS5_4elidEEEEESE_SE_EENSC_INS3_IS5_24EcalTriggerElectronicsIdXadL_ZNS5_6trelidEEEEESE_SE_EENS0_18ordered_non_uniqueINS0_13const_mem_funIS5_iXadL_ZNKS5_5dccIdEvEEEESE_SE_EENSM_INS0_13composite_keyIS5_SO_NSN_IS5_iXadL_ZNKS5_7towerIdEvEEEENS_6tuples9null_typeEST_ST_ST_ST_ST_ST_ST_EESE_SE_EENSM_INSQ_IS5_SO_SR_NSN_IS5_iXadL_ZNKS5_7stripIdEvEEEEST_ST_ST_ST_ST_ST_ST_EESE_SE_EENSM_INSN_IS5_iXadL_ZNKS5_5tccIdEvEEEESE_SE_EENSM_INSQ_IS5_SZ_NSN_IS5_iXadL_ZNKS5_4ttIdEvEEEEST_ST_ST_ST_ST_ST_ST_ST_EESE_SE_EENSM_INSQ_IS5_SZ_S11_NSN_IS5_iXadL_ZNKS5_13pseudoStripIdEvEEEEST_ST_ST_ST_ST_ST_ST_EESE_SE_EESE_SE_SE_SE_SE_SE_SE_SE_SE_SE_SE_EESaIS5_EEENS_3mpl7vector0ISE_EENS1_18ordered_unique_tagENS1_19null_augment_policyEE16delete_all_nodesEPNS1_18ordered_index_nodeIS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1_15index_node_baseIS5_S18_EEEEEEEEEEEEEEEEEEEE.isra.0 () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/pluginGeometryEcalMappingPlugins.so
#5  0x00004000571ab668 in _ZN5boost11multi_index6detail18ordered_index_implINS0_6memberIN22EcalElectronicsMapping7MapItemE5DetIdXadL_ZNS5_4cellEEEEESt4lessIS6_ENS1_9nth_layerILi1ES5_NS0_10indexed_byINS0_14ordered_uniqueIS7_N4mpl_2naESE_EENSC_INS3_IS5_17EcalElectronicsIdXadL_ZNS5_4elidEEEEESE_SE_EENSC_INS3_IS5_24EcalTriggerElectronicsIdXadL_ZNS5_6trelidEEEEESE_SE_EENS0_18ordered_non_uniqueINS0_13const_mem_funIS5_iXadL_ZNKS5_5dccIdEvEEEESE_SE_EENSM_INS0_13composite_keyIS5_SO_NSN_IS5_iXadL_ZNKS5_7towerIdEvEEEENS_6tuples9null_typeEST_ST_ST_ST_ST_ST_ST_EESE_SE_EENSM_INSQ_IS5_SO_SR_NSN_IS5_iXadL_ZNKS5_7stripIdEvEEEEST_ST_ST_ST_ST_ST_ST_EESE_SE_EENSM_INSN_IS5_iXadL_ZNKS5_5tccIdEvEEEESE_SE_EENSM_INSQ_IS5_SZ_NSN_IS5_iXadL_ZNKS5_4ttIdEvEEEEST_ST_ST_ST_ST_ST_ST_ST_EESE_SE_EENSM_INSQ_IS5_SZ_S11_NSN_IS5_iXadL_ZNKS5_13pseudoStripIdEvEEEEST_ST_ST_ST_ST_ST_ST_EESE_SE_EESE_SE_SE_SE_SE_SE_SE_SE_SE_SE_SE_EESaIS5_EEENS_3mpl7vector0ISE_EENS1_18ordered_unique_tagENS1_19null_augment_policyEE16delete_all_nodesEPNS1_18ordered_index_nodeIS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1G_IS1E_NS1_15index_node_baseIS5_S18_EEEEEEEEEEEEEEEEEEEE.isra.0 () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/pluginGeometryEcalMappingPlugins.so
#6  0x00004000571adb60 in edm::eventsetup::CallbackProxy<edm::eventsetup::Callback<edm::ESProducer, edm::ESProducer::setWhatProduced<EcalElectronicsMappingBuilder, std::unique_ptr<EcalElectronicsMapping, std::default_delete<EcalElectronicsMapping> >, EcalMappingRcd, edm::eventsetup::CallbackSimpleDecorator<EcalMappingRcd> >(EcalElectronicsMappingBuilder*, std::unique_ptr<EcalElectronicsMapping, std::default_delete<EcalElectronicsMapping> > (EcalElectronicsMappingBuilder::*)(EcalMappingRcd const&), edm::eventsetup::CallbackSimpleDecorator<EcalMappingRcd> const&, edm::es::Label const&)::{lambda(EcalMappingRcd const&)#1}, std::unique_ptr<EcalElectronicsMapping, std::default_delete<EcalElectronicsMapping> >, EcalMappingRcd, edm::eventsetup::CallbackSimpleDecorator<EcalMappingRcd> >, EcalMappingRcd, std::unique_ptr<EcalElectronicsMapping, std::default_delete<EcalElectronicsMapping> > >::invalidateCache() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/pluginGeometryEcalMappingPlugins.so
#7  0x000040002b74d68c in edm::eventsetup::EventSetupRecordImpl::invalidateProxies() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#8  0x000040002b74d6ec in edm::FunctorWaitingTask<edm::eventsetup::EventSetupRecordIOVQueue::startNewIOVAsync(edm::WaitingTaskHolder const&, edm::WaitingTaskList&)::{lambda(edm::LimitedTaskQueue::Resumer)#1}::operator()(edm::LimitedTaskQueue::Resumer)::{lambda(std::__exception_ptr::exception_ptr const*)#1}>::execute() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so

Thread 4 (Thread 0x40007d5b9250 (LWP 2368259) "cmsRun"):
#2  0x0000400034482f88 in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x000040002d6b37e4 in syscall () from /lib64/libc.so.6
#5  0x000040002d1397b8 in tbb::detail::r1::futex_wait (comparand=2, futex=0x40003331c124) at /data/cmsbuild/jenkins_c/workspace/auto-builds/CMSSW_13_1_0_pre3-el8_aarch64_gcc11/build/CMSSW_13_1_0_pre3-build/BUILD/el8_aarch64_gcc11/external/tbb/v2021.8.0-8f30f4fc8c5b3860b0ce8f2b70736d15/tbb-v2021.8.0/src/tbb/semaphore.h:103

Thread 3 (Thread 0x40007cba9250 (LWP 2368258) "cmsRun"):
#3  0x0000400034487888 in sig_dostack_then_abort () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x0000400061ffec98 in nsync::nsync_dll_splice_after_(nsync::nsync_dll_element_s_*, nsync::nsync_dll_element_s_*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#6  0x0000400061ffeccc in nsync::nsync_dll_make_first_in_list_(nsync::nsync_dll_element_s_*, nsync::nsync_dll_element_s_*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#7  0x0000400061ffed0c in nsync::nsync_dll_make_last_in_list_(nsync::nsync_dll_element_s_*, nsync::nsync_dll_element_s_*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#8  0x0000400061ffeefc in nsync::nsync_mu_lock_slow_(nsync::nsync_mu_s_*, nsync::waiter*, unsigned int, nsync::lock_type_s*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#9  0x0000400061ffefec in nsync::nsync_mu_lock(nsync::nsync_mu_s_*) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#10 0x0000400072ab61c8 in tensorflow::CancellationManager::StartCancel() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-2300/external/el8_aarch64_gcc11/lib/libtensorflow_framework.so.2
#11 0x0000400072ab64d8 in tensorflow::CancellationManager::StartCancel() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-2300/external/el8_aarch64_gcc11/lib/libtensorflow_framework.so.2
#12 0x00004000619dec24 in tensorflow::DirectSession::Close() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_1_X_2023-04-18-2300/external/el8_aarch64_gcc11/lib/libtensorflow_cc.so.2
#13 0x000040005800a4e0 in tensorflow::closeSession(tensorflow::Session*&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/libPhysicsToolsTensorFlow.so
#14 0x000040005800d328 in TfGraphDefWrapper::~TfGraphDefWrapper() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/libPhysicsToolsTensorFlow.so
#15 0x00004000770eb2f8 in edm::eventsetup::CallbackProxy<edm::eventsetup::Callback<edm::ESProducer, edm::ESProducer::setWhatProduced<TfGraphDefProducer, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> >, TfGraphRecord, edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> >(TfGraphDefProducer*, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> > (TfGraphDefProducer::*)(TfGraphRecord const&), edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> const&, edm::es::Label const&)::{lambda(TfGraphRecord const&)#1}, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> >, TfGraphRecord, edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> >, TfGraphRecord, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> > >::invalidateCache() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/pluginPhysicsToolsTensorFlowPlugins.so
#16 0x000040002b74d68c in edm::eventsetup::EventSetupRecordImpl::invalidateProxies() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#17 0x000040002b74d6ec in edm::FunctorWaitingTask<edm::eventsetup::EventSetupRecordIOVQueue::startNewIOVAsync(edm::WaitingTaskHolder const&, edm::WaitingTaskList&)::{lambda(edm::LimitedTaskQueue::Resumer)#1}::operator()(edm::LimitedTaskQueue::Resumer)::{lambda(std::__exception_ptr::exception_ptr const*)#1}>::execute() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so

Thread 1 (Thread 0x40002d29d090 (LWP 2365421) "cmsRun"):
#2  0x0000400034482f88 in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  free_fastpath (size_hint=true, size=24, ptr=0x40017204fde0) at src/jemalloc.c:3097
#5  je_je_sdallocx_noflags (ptr=0x40017204fde0, size=24) at src/jemalloc.c:3950
#6  0x000040002cc91b90 in sizedDeleteImpl (size=<optimized out>, ptr=<optimized out>) at src/jemalloc_cpp.cpp:195
#7  operator delete (ptr=<optimized out>, size=<optimized out>) at src/jemalloc_cpp.cpp:200
#8  0x0000400076b99cf0 in edm::eventsetup::CallbackProxy<edm::eventsetup::Callback<edm::ESProducer, edm::ESProducer::setWhatProduced<SiStripClusterizerConditionsESProducer, std::unique_ptr<SiStripClusterizerConditions, std::default_delete<SiStripClusterizerConditions> >, SiStripClusterizerConditionsRcd, edm::eventsetup::CallbackSimpleDecorator<SiStripClusterizerConditionsRcd> >(SiStripClusterizerConditionsESProducer*, std::unique_ptr<SiStripClusterizerConditions, std::default_delete<SiStripClusterizerConditions> > (SiStripClusterizerConditionsESProducer::*)(SiStripClusterizerConditionsRcd const&), edm::eventsetup::CallbackSimpleDecorator<SiStripClusterizerConditionsRcd> const&, edm::es::Label const&)::{lambda(SiStripClusterizerConditionsRcd const&)#1}, std::unique_ptr<SiStripClusterizerConditions, std::default_delete<SiStripClusterizerConditions> >, SiStripClusterizerConditionsRcd, edm::eventsetup::CallbackSimpleDecorator<SiStripClusterizerConditionsRcd> >, SiStripClusterizerConditionsRcd, std::unique_ptr<SiStripClusterizerConditions, std::default_delete<SiStripClusterizerConditions> > >::invalidateCache() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/pluginRecoLocalTrackerSiStripClusterizerPlugins.so
#9  0x000040002b74d68c in edm::eventsetup::EventSetupRecordImpl::invalidateProxies() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so
#10 0x000040002b74d6ec in edm::FunctorWaitingTask<edm::eventsetup::EventSetupRecordIOVQueue::startNewIOVAsync(edm::WaitingTaskHolder const&, edm::WaitingTaskList&)::{lambda(edm::LimitedTaskQueue::Resumer)#1}::operator()(edm::LimitedTaskQueue::Resumer)::{lambda(std::__exception_ptr::exception_ptr const*)#1}>::execute() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02781/el8_aarch64_gcc11/cms/cmssw/CMSSW_13_1_X_2023-04-17-2300/lib/el8_aarch64_gcc11/libFWCoreFramework.so

Current Modules:
Module: none (crashed)
Module: none
Module: none
Module: none

https://cmssdt.cern.ch/SDT/cgi-bin/logreader/el8_aarch64_gcc11/CMSSW_13_1_X_2023-04-18-2300/pyRelValMatrixLogs/run/136.859_RunDisplacedJet2018A/step3_RunDisplacedJet2018A.log#/

The job was shutting down because of an exception:

----- Begin Fatal Exception 19-Apr-2023 03:52:43 CEST-----------------------
An exception of category 'PixelCPEClusterRepair::localError' occurred while
   [0] Processing  Event run: 315489 lumi: 1 event: 494169 stream: 1
   [1] Running path 'dqmoffline_7_step'
   [2] Prefetching for module SMPDQM/'SMPDQM'
   [3] Prefetching for module MuonProducer/'muons'
   [4] Prefetching for module MuonIdProducer/'muons1stStep'
   [5] Prefetching for module HBHEIsolatedNoiseReflagger/'hbhereco@cpu'
   [6] Prefetching for module TrackExtrapolator/'trackExtrapolator'
   [7] Prefetching for module DuplicateListMerger/'generalTracks'
   [8] Prefetching for module TrackProducer/'mergedDuplicateTracks'
   [9] Prefetching for module DuplicateTrackMerger/'duplicateTrackCandidates'
   [10] Prefetching for module TrackCollectionMerger/'preDuplicateMergingGeneralTracks'
   [11] Prefetching for module TrackCollectionMerger/'earlyGeneralTracks'
   [12] Calling method for module TrackProducer/'lowPtTripletStepTracks'
Exception Message:

ERROR: Negative pixel error yerr = -53690.9

----- End Fatal Exception -------------------------------------------------
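As an aside for anyone reading these pruned dumps: in the trace above, the thread marked "(crashed)" is the one whose frames include `sig_dostack_then_abort`, while the other threads sit in `sig_pause_for_stacktrace`. A minimal sketch for picking that thread out of a log follows. It assumes the gdb-style layout shown in this thread (a `Thread N (Thread 0x... (LWP ...)` header followed by `#`-prefixed frames, with blank lines between threads); the helper names `split_threads` and `crashing_threads` are hypothetical and not part of CMSSW.

```python
import re

# Header format as it appears in the dumps quoted in this issue (assumption:
# other gdb versions or log formats may differ).
THREAD_HEADER = re.compile(r"^Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)")


def split_threads(log_text):
    """Group '#'-prefixed frame lines under their 'Thread N (...)' header."""
    threads = {}
    current = None
    for line in log_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            current = int(match.group(1))
            threads[current] = []
        elif current is not None and line.startswith("#"):
            threads[current].append(line)
        elif not line.strip():
            # A blank line ends the current thread block.
            current = None
    return threads


def crashing_threads(threads, marker="sig_dostack_then_abort"):
    """Return the sorted thread numbers whose frames mention the marker."""
    return sorted(
        tid for tid, frames in threads.items()
        if any(marker in frame for frame in frames)
    )
```

Calling `crashing_threads(split_threads(open("step3.log").read()))` on a saved log (the file name is only an example) should list the thread or threads that took the fatal signal, which narrows down which stack to read first.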

iarspider commented 1 year ago

Crash in CMSSW_13_3_X_2023-08-17-2300, RelVal 14234.0 step 2:

Thread 4 (Thread 0x4000858d9bc0 (LWP 536647) "cmsRun"):
#0  0x000040003b01f768 in poll () from /lib64/libc.so.6
#1  0x000040003cb7da6c in full_read.constprop () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/lib/el9_aarch64_gcc11/pluginFWCoreServicesPlugins.so
#2  0x000040003cb4c708 in edm::service::InitRootHandlers::stacktraceFromThread() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/lib/el9_aarch64_gcc11/pluginFWCoreServicesPlugins.so
#3  0x000040003cb47ec8 in sig_dostack_then_abort () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/lib/el9_aarch64_gcc11/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x000040005b66fcc4 in tsl::CancellationManager::StartCancelWithStatus(tsl::Status const&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/external/el9_aarch64_gcc11/lib/libtensorflow_cc.so.2
#6  0x000040005b670228 in tsl::CancellationManager::StartCancel() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/external/el9_aarch64_gcc11/lib/libtensorflow_cc.so.2
#7  0x00004000631efa64 in tensorflow::DirectSession::Close() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/external/el9_aarch64_gcc11/lib/libtensorflow_cc.so.2
#8  0x00004000579ea0bc in tensorflow::closeSession(tensorflow::Session*&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/lib/el9_aarch64_gcc11/libPhysicsToolsTensorFlow.so
#9  0x00004000579ec058 in TfGraphDefWrapper::~TfGraphDefWrapper() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/lib/el9_aarch64_gcc11/libPhysicsToolsTensorFlow.so
#10 0x000040008262b7c4 in edm::eventsetup::CallbackProductResolver<edm::eventsetup::Callback<edm::ESProducer, edm::ESProducer::setWhatProduced<TfGraphDefProducer, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> >, TfGraphRecord, edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> >(TfGraphDefProducer*, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> > (TfGraphDefProducer::*)(TfGraphRecord const&), edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> const&, edm::es::Label const&)::{lambda(TfGraphRecord const&)#1}, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> >, TfGraphRecord, edm::eventsetup::CallbackSimpleDecorator<TfGraphRecord> >, TfGraphRecord, std::unique_ptr<TfGraphDefWrapper, std::default_delete<TfGraphDefWrapper> > >::invalidateCache() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/lib/el9_aarch64_gcc11/pluginPhysicsToolsTensorFlowPlugins.so
#11 0x0000400038e7ab4c in edm::eventsetup::EventSetupRecordImpl::invalidateProxies() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/lib/el9_aarch64_gcc11/libFWCoreFramework.so
#12 0x0000400038e7abac in edm::FunctorWaitingTask<edm::eventsetup::EventSetupRecordIOVQueue::startNewIOVAsync(edm::WaitingTaskHolder const&, edm::WaitingTaskList&)::{lambda(edm::LimitedTaskQueue::Resumer)#1}::operator()(edm::LimitedTaskQueue::Resumer)::{lambda(std::__exception_ptr::exception_ptr const*)#1}>::execute() () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/lib/el9_aarch64_gcc11/libFWCoreFramework.so
#13 0x0000400038e0967c in tbb::detail::d1::function_task<edm::WaitingTaskHolder::doneWaiting(std::__exception_ptr::exception_ptr)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/aarch64/nweek-02798/el9_aarch64_gcc11/cms/cmssw-patch/CMSSW_13_3_X_2023-08-17-2300/lib/el9_aarch64_gcc11/libFWCoreFramework.so
#14 0x000040003a832b88 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x4000aab7aa00, waiter=<synthetic pointer>..., this=0x40003b983e00) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el9_aarch64_gcc11/external/tbb/v2021.9.0-73c9534380ca142d041902611d608a2c/tbb-v2021.9.0/src/tbb/task_dispatcher.h:322
#15 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (waiter=<synthetic pointer>..., t=0x0, this=0x40003b983e00) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el9_aarch64_gcc11/external/tbb/v2021.9.0-73c9534380ca142d041902611d608a2c/tbb-v2021.9.0/src/tbb/task_dispatcher.h:458
#16 tbb::detail::r1::arena::process (tls=..., this=0x40003b983780) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el9_aarch64_gcc11/external/tbb/v2021.9.0-73c9534380ca142d041902611d608a2c/tbb-v2021.9.0/src/tbb/arena.cpp:137
#17 tbb::detail::r1::market::process (this=0x40003b9ab080, j=...) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el9_aarch64_gcc11/external/tbb/v2021.9.0-73c9534380ca142d041902611d608a2c/tbb-v2021.9.0/src/tbb/market.cpp:599
#18 0x000040003a83aec8 in tbb::detail::r1::rml::private_worker::run (this=0x40003c0ec100) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el9_aarch64_gcc11/external/tbb/v2021.9.0-73c9534380ca142d041902611d608a2c/tbb-v2021.9.0/src/tbb/private_server.cpp:271
#19 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x40003c0ec100) at /data/cmsbld/jenkins_a/workspace/build-any-ib/w/BUILD/el9_aarch64_gcc11/external/tbb/v2021.9.0-73c9534380ca142d041902611d608a2c/tbb-v2021.9.0/src/tbb/private_server.cpp:221
#20 0x000040003afc2a28 in start_thread () from /lib64/libc.so.6
#21 0x000040003af6bb9c in thread_start () from /lib64/libc.so.6