cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

testFWCoreConcurrencyCatch2 failing in MULTIARCH_X, ROOT632_X, ROOT6_X #45194

Open iarspider opened 3 weeks ago

iarspider commented 3 weeks ago

In CMSSW_14_1_{MULTIARCH_X, ROOT632_X, ROOT6_X}, the test testFWCoreConcurrencyCatch2 is failing:

src/FWCore/Concurrency/test/test_catch2_WaitingThreadPool.cc:118: FAILED:
  {Unknown expression after the reported line}
due to a fatal error condition:
  SIGABRT - Abort (abnormal termination) signal

iarspider commented 3 weeks ago

assign FWCore/Concurrency

cmsbuild commented 3 weeks ago

New categories assigned: core

@Dr15Jones,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild commented 3 weeks ago

cms-bot internal usage

cmsbuild commented 3 weeks ago

A new Issue was created by @iarspider.

@smuzaffar, @rappoccio, @Dr15Jones, @sextonkennedy, @makortel, @antoniovilela can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

iarspider commented 3 weeks ago

First occurrence: CMSSW_14_1_ROOT632_X_2024-06-09-2300

Dr15Jones commented 2 weeks ago

The problem does not appear to be reproducible (as is true for many threading tests). Was the machine running the tests under a high load?

Dr15Jones commented 2 weeks ago

The actual failure in the log is

testFWCoreConcurrencyCatch2: src/FWCore/Utilities/interface/ReusableObjectHolder.h:92: edm::ReusableObjectHolder<T, Deleter>::~ReusableObjectHolder() [with T = edm::impl::WaitingThread; Deleter = std::default_delete<edm::impl::WaitingThread>]: Assertion `0 == m_outstandingObjects' failed.

Dr15Jones commented 2 weeks ago

I believe there is a race condition in edm::asyncRun: when the WaitingThreadPool is being destroyed, there is no guarantee that the threads it spawned (via WaitingThread) have actually finished being returned to the ReusableObjectHolder so that they can be properly deleted.
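
To illustrate the kind of race described above, here is a minimal standalone sketch. It is not the CMSSW code: `TrackedHolder`, `checkout`, `giveBack`, `waitUntilAllReturned` and `runTasksAndDrain` are hypothetical names invented for this example. The holder only counts outstanding objects and asserts in its destructor that the count is zero, mirroring the `0 == m_outstandingObjects` check. The sketch shows one possible way to avoid tripping it, assuming the owner can block until every worker has handed its object back.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a holder that hands out objects and expects
// every one of them to be returned before it is destroyed.
class TrackedHolder {
public:
  // Record that one object has been handed out.
  void checkout() {
    std::lock_guard<std::mutex> lock(mutex_);
    ++outstanding_;
  }

  // Record that one object has been returned; wake any waiter when none remain.
  void giveBack() {
    std::lock_guard<std::mutex> lock(mutex_);
    --outstanding_;
    if (outstanding_ == 0) {
      cv_.notify_all();
    }
  }

  // Block until every checked-out object has been returned.
  void waitUntilAllReturned() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return outstanding_ == 0; });
  }

  std::size_t outstanding() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return outstanding_;
  }

  // Same style of check as the assertion quoted in the log above.
  ~TrackedHolder() { assert(0 == outstanding()); }

private:
  mutable std::mutex mutex_;
  std::condition_variable cv_;
  std::size_t outstanding_ = 0;
};

// Start nTasks workers that return their object after a short delay, then
// drain the holder. Returns the number of objects still outstanding after
// the drain (expected to be zero).
std::size_t runTasksAndDrain(int nTasks) {
  TrackedHolder holder;
  std::vector<std::thread> workers;
  for (int i = 0; i < nTasks; ++i) {
    holder.checkout();
    workers.emplace_back([&holder] {
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
      holder.giveBack();  // the return happens asynchronously
    });
  }
  // Without this wait, inspecting (or destroying) the holder here could
  // observe a nonzero count, because the workers may not have returned yet.
  holder.waitUntilAllReturned();
  const std::size_t remaining = holder.outstanding();
  for (auto& w : workers) {
    w.join();
  }
  return remaining;
}
```

Calling `runTasksAndDrain(8)` is expected to return 0, since the explicit wait forces all returns to complete first. Whether an equivalent wait is the right fix for WaitingThreadPool is a separate question that this sketch does not answer.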