JeffersonLab / halld_recon

Reconstruction for the GlueX Detector

ReactionFilter plugin crashes on rhel8 with DTreeInterface::Fill error #613

Closed nsjarvis closed 8 months ago

nsjarvis commented 2 years ago

The ReactionFilter crashes frequently on os8 and rhel8, and usually after processing most of the events. I first noticed this at CMU with version set 5.1.0 and so ran a test on jlabl5 (rhel8) using the nightly build labelled Jan 6, and it crashes there too.

So far I have not seen the crashes in jobs using PNTHREADS=1.

`gxenv /u/scratch/gluex/nightly/2022-01-06/Linux_RHEL8-x86_64-gcc8.5.0/version_2022-01-06.xml`

`hd_root /cache/halld/RunPeriod-2017-01/recon/ver03/REST/030283/dana_rest_030283_095.hddm -PPLUGINS=ReactionFilter -PReaction1=1_14__1_1_11_12_14 -PReaction1:Flags=B4 -PReaction2=1_14__1_1_8_9_14 -PReaction2:Flags=B4 -PNTHREADS=16`

#5  DTreeInterface::Fill (this=0x7f40388396b0, locTreeFillData=...) at libraries/ANALYSIS/DTreeInterface.cc:292
#6  0x00000000007fbf3d in DEventWriterROOT::Fill_DataTree (this=0x7f40385101e0, locEventLoop=0x7f4038000b60, locReaction=<optimized out>, locParticleCombos=std::deque with 19 elements = {...}) at libraries/ANALYSIS/DEventWriterROOT.cc:1408
#7  0x00000000007fc8bd in DEventWriterROOT::Fill_DataTrees (this=this@entry=0x7f40385101e0, locEventLoop=locEventLoop@entry=0x7f4038000b60, locDReactionTag="ReactionFilter") at libraries/ANALYSIS/DEventWriterROOT.cc:1173
#8  0x00007f404f57b9d5 in DEventProcessor_ReactionFilter::evnt (this=<optimized out>, locEventLoop=0x7f4038000b60, locEventNumber=<optimized out>) at /usr/include/c++/8/ext/new_allocator.h:79
#9  0x000000000118cedf in jana::JEventLoop::OneEvent (this=0x7f4038000b60) at src/JANA/JEventLoop.cc:693
#10 0x000000000118d4f4 in jana::JEventLoop::Loop (this=this@entry=0x7f4038000b60) at src/JANA/JEventLoop.cc:496
#11 0x000000000116a6a9 in LaunchThread (arg=0x7ffc4b5781a0) at src/JANA/JApplication.cc:1382
#12 0x00007f40684ce17a in start_thread () from /lib64/libpthread.so.0
#13 0x00007f40681fddc3 in clone () from /lib64/libc.so.6

===========================================================
nsjarvis commented 1 year ago

This is still the case with version set 5.9.0.

aaust commented 9 months ago

This problem is still present on Alma9.

rjones30 commented 8 months ago

Hello, I am now running the test described below inside the alma9 container. So far, after 1000 repeats over 10000-event simulated REST input files, I cannot get it to crash. Is it only on raw data? Only on specific runs? Only on the CMU cluster? I am running hd_root from the standard alma9 container build that Alex maintains for us, which I access via cvmfs. -Richard Jones

Simulation: run 50986, software version 5.14.2

nsjarvis commented 8 months ago

Try jlabl5 (if that is still os8) - I verified that it crashed there before posting it as an issue.

I found the problem using real data. I don't recall if I tried it with simulated.

Naomi.

On Tue, Jan 30, 2024 at 9:35 AM Richard Jones wrote:

Hello, I am now running a test as described below within the alma9 container. So far 1000 repeats over 10000 event input (simulation) rest files, and I cannot get it to crash. Is it only on raw data? Only on specific runs? Only on CMU cluster? I am running hd_root from the standard alma9 container build that Alex maintains for us, which I access via cvmfs. -Richard Jones

simulation Run 50986 software version 5.14.2

  • PNTHREADS=8
  • PLUGINS monitoring_hists,ReactionFilter
  • COMBO:MAX_NEUTRALS 15
  • Reaction1 1_14__7_8_9_14
  • Reaction1:Flags B0_M7
  • Reaction2 1_14__1_7_14
  • Reaction2:Flags B0_M7
  • -PEVENTS_TO_SKIP=0 -PEVENTS_TO_KEEP=100000000 -PTHREAD_TIMEOUT_FIRST_EVENT=3600 -PTHREAD_TIMEOUT=600 --nthreads=8


rjones30 commented 8 months ago

Naomi, it would be best if you could specify a container. Specifying a machine leaves too much variable, because everyone has a different environment. For me, the first thing I do on any machine is pop into a container, so my experience is platform-agnostic. -Richard Jones


nsjarvis commented 8 months ago

Could you try the exact same command & file that I mentioned in the original post?

nsjarvis commented 8 months ago

I cannot specify a container, sorry, I haven't been using them. Maybe someone else could find one matching the environment mentioned in my original post.

rjones30 commented 8 months ago

To a broader audience: has anyone besides Naomi encountered this ReactionFilter crash, on any platform? I am unable to reproduce it in a controlled environment like a container. -Richard Jones


rjones30 commented 8 months ago

Good news, after many attempts, I finally saw a crash. It might take a while to diagnose it, being as rare as it seems to be in this alma9 environment. But it is definitely a thing. Here is the full crash dump. You can see that it happens after the last event has been read from the input file, so it is a program termination bug. -Richard

JANA >> --- Configuration Parameters --
JANA >> COMBO:MAX_NEUTRALS = 15
JANA >> EVENTS_TO_KEEP = 1000000000
JANA >> JANA:RESOURCE_DEFAULT_PATH =
JANA >> NTHREADS = 8
JANA >> PLUGINS = monitoring_hists,ReactionFilter
JANA >> Reaction1 = 1_14__7_8_9_14
JANA >> Reaction1:Flags = B0_M7
JANA >> Reaction2 = 1_14__1_7_14
JANA >> Reaction2:Flags = B0_M7
JANA >> THREAD_TIMEOUT = 600
JANA >> THREAD_TIMEOUT_FIRST_EVENT = 3600
JANA >> -------------------------------
JANA >>events processed (9.7k events read) 764.0Hz (avg.: 474.5Hz)
JANA >>
JANA >>No more event sources
JANA >>No more event sources
JANA >>
JANA >>No more event sources
JANA >>Thread 0x7f565f7fe640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>Thread 0x7f5671189640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>Thread 0x7f565ffff640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>
JANA >>No more event sources
JANA >>Thread 0x7f565dffb640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>
JANA >>No more event sources
JANA >>Thread 0x7f565effd640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>
JANA >>No more event sources
JANA >>Thread 0x7f5664d8e640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>
JANA >>No more event sources
JANA >>Thread 0x7f5670988640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>Merging thread 0 (0x7f565f7fe640) ...) 530.0Hz (avg.: 475.8Hz)
JANA >>Merging thread 1 (0x7f5671189640) ...
JANA >>Merging thread 2 (0x7f565ffff640) ...
JANA >>Merging thread 3 (0x7f565dffb640) ...
JANA >>Merging thread 4 (0x7f565effd640) ...

===========================================================
There was a crash.
This is the entire stack trace of all threads:

Thread 6 (Thread 0x7f565e7fc640 (LWP 1838177) "hd_root"):
#0  0x00007f5688cfa30f in wait4 () from /lib64/libc.so.6
#1  0x00007f5688c43953 in do_system () from /lib64/libc.so.6
#2  0x00007f568b993cbc in TUnixSystem::StackTrace() () from /group/halld/Software/builds/Linux_Alma9-x86_64-gcc11.4.1-cntr/root/root-6.24.04/lib/libCore.so
#3  0x00007f568b9912f5 in TUnixSystem::DispatchSignals(ESignals) () from /group/halld/Software/builds/Linux_Alma9-x86_64-gcc11.4.1-cntr/root/root-6.24.04/lib/libCore.so
#4  <signal handler called>
#5  0x0000000000a3051a in DTreeInterface::Fill(DTreeFillData&) ()
#6  0x0000000000827e0e in DEventWriterROOT::Fill_DataTree(jana::JEventLoop*, DAnalysis::DReaction const*, std::deque<DAnalysis::DParticleCombo const*, std::allocator<DAnalysis::DParticleCombo const*> >&) const ()
#7  0x0000000000828fda in DEventWriterROOT::Fill_DataTrees(jana::JEventLoop*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) const ()
#8  0x00007f5678478d93 in DEventProcessor_ReactionFilter::evnt(jana::JEventLoop*, unsigned long) () from /group/halld/Software/builds/Linux_Alma9-x86_64-gcc11.4.1-cntr/halld_recon/halld_recon-4.43.1^ccdb1610/Linux_Alma9-x86_64-gcc11.4.1-cntr/plugins/ReactionFilter.so
#9  0x0000000001486e62 in jana::JEventLoop::OneEvent (this=0x7f5640000b60) at src/JANA/JEventLoop.cc:693
#10 0x0000000001487484 in jana::JEventLoop::Loop (this=this@entry=0x7f5640000b60) at src/JANA/JEventLoop.cc:496
#11 0x000000000145c3b5 in LaunchThread (arg=0x7ffc7a5f93d0) at src/JANA/JApplication.cc:1382
#12 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#13 0x00007f5688c21314 in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7f565effd640 (LWP 1838176) "hd_root"):
#0  0x00007f5688c7e319 in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1  0x00007f5688c87d8f in pthread_rwlock_wrlock@GLIBC_2.2.5 () from /lib64/libc.so.6
#2  0x0000000000a2f647 in DTreeInterface::~DTreeInterface() ()
#3  0x0000000000800899 in DEventWriterROOT::~DEventWriterROOT() ()
#4  0x0000000000800e69 in DEventWriterROOT::~DEventWriterROOT() ()
#5  0x00000000007424c6 in DEventWriterROOT_factory::fini() ()
#6  0x00000000014867e9 in jana::JEventLoop::~JEventLoop (this=0x7f564c000b60, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/stl_vector.h:1043
#7  0x0000000001486a99 in jana::JEventLoop::~JEventLoop (this=0x7f564c000b60, __in_chrg=<optimized out>) at src/JANA/JEventLoop.cc:152
#8  0x000000000145c442 in pthread_cleanup_class::~pthread_cleanup_class (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/pthread.h:578
#9  pthread_cleanup_class::~pthread_cleanup_class (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/pthread.h:578
#10 LaunchThread (arg=0x7ffc7a5f93d0) at src/JANA/JApplication.cc:1391
#11 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#12 0x00007f5688c21314 in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f5664d8e640 (LWP 1838173) "hd_root"):
#0  0x00007f5688c7e319 in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1  0x00007f5688c87d8f in pthread_rwlock_wrlock@GLIBC_2.2.5 () from /lib64/libc.so.6
#2  0x0000000000a2f647 in DTreeInterface::~DTreeInterface() ()
#3  0x0000000000800899 in DEventWriterROOT::~DEventWriterROOT() ()
#4  0x0000000000800e69 in DEventWriterROOT::~DEventWriterROOT() ()
#5  0x00000000007424c6 in DEventWriterROOT_factory::fini() ()
#6  0x00000000014867e9 in jana::JEventLoop::~JEventLoop (this=0x7f5650000b60, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/stl_vector.h:1043
#7  0x0000000001486a99 in jana::JEventLoop::~JEventLoop (this=0x7f5650000b60, __in_chrg=<optimized out>) at src/JANA/JEventLoop.cc:152
#8  0x000000000145c442 in pthread_cleanup_class::~pthread_cleanup_class (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/pthread.h:578
#9  pthread_cleanup_class::~pthread_cleanup_class (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/pthread.h:578
#10 LaunchThread (arg=0x7ffc7a5f93d0) at src/JANA/JApplication.cc:1391
#11 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#12 0x00007f5688c21314 in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f5670988640 (LWP 1838172) "hd_root"):
#0  0x00007f5688c7e319 in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1  0x00007f5688c87d8f in pthread_rwlock_wrlock@GLIBC_2.2.5 () from /lib64/libc.so.6
#2  0x0000000000a2f647 in DTreeInterface::~DTreeInterface() ()
#3  0x0000000000800899 in DEventWriterROOT::~DEventWriterROOT() ()
#4  0x0000000000800e69 in DEventWriterROOT::~DEventWriterROOT() ()
#5  0x00000000007424c6 in DEventWriterROOT_factory::fini() ()
#6  0x00000000014867e9 in jana::JEventLoop::~JEventLoop (this=0x7f5658000b60, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/stl_vector.h:1043
#7  0x0000000001486a99 in jana::JEventLoop::~JEventLoop (this=0x7f5658000b60, __in_chrg=<optimized out>) at src/JANA/JEventLoop.cc:152
#8  0x000000000145c442 in pthread_cleanup_class::~pthread_cleanup_class (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/pthread.h:578
#9  pthread_cleanup_class::~pthread_cleanup_class (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/pthread.h:578
#10 LaunchThread (arg=0x7ffc7a5f93d0) at src/JANA/JApplication.cc:1391
#11 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#12 0x00007f5688c21314 in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f568581f640 (LWP 1838140) "hd_root"):
#0  0x00007f5688c7e39a in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1  0x00007f5688c89838 in __new_sem_wait_slow64.constprop.0 () from /lib64/libc.so.6
#2  0x00007f5688bcc5a0 in XrdPosixFile::DelayedDestroy(void*) () from /lib64/libXrdPosix.so.3
#3  0x00007f56860796a8 in XrdSysThread_Xeq () from /lib64/libXrdUtils.so.3
#4  0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#5  0x00007f5688c21314 in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f5685893400 (LWP 1838139) "hd_root"):
#0  0x00007f5688c7e39a in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1  0x00007f5688c832d3 in __pthread_clockjoin_ex () from /lib64/libc.so.6
#2  0x000000000146a38a in jana::JApplication::Run (this=0x7ffc7a5f93d0, proc=<optimized out>, Nthreads=<optimized out>) at /usr/include/c++/11/bits/stl_vector.h:1043
#3  0x0000000000717e2d in main ()

===========================================================

The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum https://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a report at https://root.cern.ch/bugs
Please post the ENTIRE stack trace from above as an attachment in addition to anything else that might help us fixing this issue.

#5  0x0000000000a3051a in DTreeInterface::Fill(DTreeFillData&) ()
#6  0x0000000000827e0e in DEventWriterROOT::Fill_DataTree(jana::JEventLoop*, DAnalysis::DReaction const*, std::deque<DAnalysis::DParticleCombo const*, std::allocator<DAnalysis::DParticleCombo const*> >&) const ()
#7  0x0000000000828fda in DEventWriterROOT::Fill_DataTrees(jana::JEventLoop*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) const ()
#8  0x00007f5678478d93 in DEventProcessor_ReactionFilter::evnt(jana::JEventLoop*, unsigned long) () from /group/halld/Software/builds/Linux_Alma9-x86_64-gcc11.4.1-cntr/halld_recon/halld_recon-4.43.1^ccdb1610/Linux_Alma9-x86_64-gcc11.4.1-cntr/plugins/ReactionFilter.so
#9  0x0000000001486e62 in jana::JEventLoop::OneEvent (this=0x7f5640000b60) at src/JANA/JEventLoop.cc:693
#10 0x0000000001487484 in jana::JEventLoop::Loop (this=this@entry=0x7f5640000b60) at src/JANA/JEventLoop.cc:496
#11 0x000000000145c3b5 in LaunchThread (arg=0x7ffc7a5f93d0) at src/JANA/JApplication.cc:1382
#12 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#13 0x00007f5688c21314 in clone () from /lib64/libc.so.6

===========================================================

Quitting after error, code=139 Reason: hd_root crashed total 1784

nsjarvis commented 8 months ago

In reply to your earlier question, Alex reproduced it at JLab months ago, using the os8 node. I believe that CMU is the only GlueX group using os8.

aaust commented 8 months ago

I remember vaguely that the EVENTS_TO_KEEP option was suppressing the issue, but this could be anecdotal. I will also try to reproduce it again.

zihlmann commented 8 months ago

I suspect that if you increase the number of threads, the probability of a crash will increase.


Thread 3 (Thread 0x7f5670988640 (LWP 1838172) "hd_root"):

#0 0x00007f5688c7e319 in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1 0x00007f5688c87d8f in pthread_rwlock_wrlock GLIBC_2.2.5 () from /lib64/libc.so.6
#2 0x0000000000a2f647 in DTreeInterface::~DTreeInterface() ()
#3 0x0000000000800899 in DEventWriterROOT::~DEventWriterROOT() ()
#4 0x0000000000800e69 in DEventWriterROOT::~DEventWriterROOT() ()
#5 0x00000000007424c6 in DEventWriterROOT_factory::fini() ()
#6 0x00000000014867e9 in jana::JEventLoop::JEventLoop (this=0x7f5658000b60, __in_chrg=) at /usr/include/c++/11/bits/stl_vector.h:1043
#7 0x0000000001486a99 in jana::JEventLoop::JEventLoop (this=0x7f5658000b60, __in_chrg=) at src/JANA/JEventLoop.cc:152
#8 0x000000000145c442 in __pthread_cleanup_class::__pthread_cleanup_class (this=, __in_chrg=) at /usr/include/pthread.h:578
#9 __pthread_cleanup_class::__pthread_cleanup_class (this=, __in_chrg=) at /usr/include/pthread.h:578
#10 LaunchThread (arg=0x7ffc7a5f93d0) at src/JANA/JApplication.cc:1391
#11 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#12 0x00007f5688c21314 in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f568581f640 (LWP 1838140) "hd_root"):

#0 0x00007f5688c7e39a in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1 0x00007f5688c89838 in __new_sem_wait_slow64.constprop.0 () from /lib64/libc.so.6
#2 0x00007f5688bcc5a0 in XrdPosixFile::DelayedDestroy(void*) () from /lib64/libXrdPosix.so.3
#3 0x00007f56860796a8 in XrdSysThread_Xeq () from /lib64/libXrdUtils.so.3
#4 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#5 0x00007f5688c21314 in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f5685893400 (LWP 1838139) "hd_root"):

#0 0x00007f5688c7e39a in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1 0x00007f5688c832d3 in __pthread_clockjoin_ex () from /lib64/libc.so.6
#2 0x000000000146a38a in jana::JApplication::Run (this=0x7ffc7a5f93d0, proc=, Nthreads=) at /usr/include/c++/11/bits/stl_vector.h:1043
#3 0x0000000000717e2d in main ()

The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum https://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a report at https://root.cern.ch/bugs
Please post the ENTIRE stack trace from above as an attachment in addition to anything else that might help us fixing this issue.

#5 0x0000000000a3051a in DTreeInterface::Fill(DTreeFillData&) ()
#6 0x0000000000827e0e in DEventWriterROOT::Fill_DataTree(jana::JEventLoop*, DAnalysis::DReaction const*, std::deque<DAnalysis::DParticleCombo const*, std::allocator<DAnalysis::DParticleCombo const*> >&) const ()
#7 0x0000000000828fda in DEventWriterROOT::Fill_DataTrees(jana::JEventLoop*, std::__cxx11::basic_string<char, std::char_traits, std::allocator >) const ()
#8 0x00007f5678478d93 in DEventProcessor_ReactionFilter::evnt(jana::JEventLoop*, unsigned long) () from /group/halld/Software/builds/Linux_Alma9-x86_64-gcc11.4.1-cntr/halld_recon/halld_recon-4.43.1^ccdb1610/Linux_Alma9-x86_64-gcc11.4.1-cntr/plugins/ReactionFilter.so
#9 0x0000000001486e62 in jana::JEventLoop::OneEvent (this=0x7f5640000b60) at src/JANA/JEventLoop.cc:693
#10 0x0000000001487484 in jana::JEventLoop::Loop (this=this entry=0x7f5640000b60) at src/JANA/JEventLoop.cc:496
#11 0x000000000145c3b5 in LaunchThread (arg=0x7ffc7a5f93d0) at src/JANA/JApplication.cc:1382
#12 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#13 0x00007f5688c21314 in clone () from /lib64/libc.so.6

===========================================================

Quitting after error, code=139 Reason: hd_root crashed total 1784


I suspect that if you increase the number of threads the probability to fail will increase.

On 1/30/24 10:49, Richard Jones wrote:

Good news, after many attempts, I finally saw a crash. It might take a while to diagnose it, being as rare as it seems to be in this alma9 environment. But it is definitely a thing. Here is the full crash dump. You can see that it happens after the last event has been read from the input file, so it is a program termination bug. -Richard
JANA >> --- Configuration Parameters --
JANA >> COMBO:MAX_NEUTRALS = 15
JANA >> EVENTS_TO_KEEP = 1000000000
JANA >> JANA:RESOURCE_DEFAULT_PATH =
JANA >> NTHREADS = 8
JANA >> PLUGINS = monitoring_hists,ReactionFilter
JANA >> Reaction1 = 1_14__7_8_9_14
JANA >> Reaction1:Flags = B0_M7
JANA >> Reaction2 = 1_14__1_7_14
JANA >> Reaction2:Flags = B0_M7
JANA >> THREAD_TIMEOUT = 600
JANA >> THREAD_TIMEOUT_FIRST_EVENT = 3600
JANA >> -------------------------------
JANA >>events processed (9.7k events read) 764.0Hz (avg.: 474.5Hz)
JANA >>
JANA >>No more event sources
JANA >>No more event sources
JANA >>
JANA >>No more event sources
JANA >>Thread 0x7f565f7fe640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>Thread 0x7f5671189640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>Thread 0x7f565ffff640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>
JANA >>No more event sources
JANA >>Thread 0x7f565dffb640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>
JANA >>No more event sources
JANA >>Thread 0x7f565effd640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>
JANA >>No more event sources
JANA >>Thread 0x7f5664d8e640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>
JANA >>No more event sources
JANA >>Thread 0x7f5670988640 completed gracefully: Tue Jan 30 10:43:33 2024
JANA >>Merging thread 0 (0x7f565f7fe640) ...) 530.0Hz (avg.: 475.8Hz)
JANA >>Merging thread 1 (0x7f5671189640) ...
JANA >>Merging thread 2 (0x7f565ffff640) ...
JANA >>Merging thread 3 (0x7f565dffb640) ...
JANA >>Merging thread 4 (0x7f565effd640) ...

===========================================================
There was a crash.
This is the entire stack trace of all threads:

Thread 6 (Thread 0x7f565e7fc640 (LWP 1838177) "hd_root"):
#0 0x00007f5688cfa30f in wait4 () from /lib64/libc.so.6
#1 0x00007f5688c43953 in do_system () from /lib64/libc.so.6
#2 0x00007f568b993cbc in TUnixSystem::StackTrace() () from /group/halld/Software/builds/Linux_Alma9-x86_64-gcc11.4.1-cntr/root/root-6.24.04/lib/libCore.so
#3 0x00007f568b9912f5 in TUnixSystem::DispatchSignals(ESignals) () from /group/halld/Software/builds/Linux_Alma9-x86_64-gcc11.4.1-cntr/root/root-6.24.04/lib/libCore.so
#4
#5 0x0000000000a3051a in DTreeInterface::Fill(DTreeFillData&) ()
#6 0x0000000000827e0e in DEventWriterROOT::Fill_DataTree(jana::JEventLoop*, DAnalysis::DReaction const*, std::deque<DAnalysis::DParticleCombo const*, std::allocator<DAnalysis::DParticleCombo const*> >&) const ()
#7 0x0000000000828fda in DEventWriterROOT::Fill_DataTrees(jana::JEventLoop*, std::__cxx11::basic_string<char, std::char_traits, std::allocator >) const ()
#8 0x00007f5678478d93 in DEventProcessor_ReactionFilter::evnt(jana::JEventLoop*, unsigned long) () from /group/halld/Software/builds/Linux_Alma9-x86_64-gcc11.4.1-cntr/halld_recon/halld_recon-4.43.1^ccdb1610/Linux_Alma9-x86_64-gcc11.4.1-cntr/plugins/ReactionFilter.so
#9 0x0000000001486e62 in jana::JEventLoop::OneEvent (this=0x7f5640000b60) at src/JANA/JEventLoop.cc:693
#10 0x0000000001487484 in jana::JEventLoop::Loop (this=this entry=0x7f5640000b60) at src/JANA/JEventLoop.cc:496
#11 0x000000000145c3b5 in LaunchThread (arg=0x7ffc7a5f93d0) at src/JANA/JApplication.cc:1382
#12 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#13 0x00007f5688c21314 in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7f565effd640 (LWP 1838176) "hd_root"):
#0 0x00007f5688c7e319 in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1 0x00007f5688c87d8f in pthread_rwlock_wrlock GLIBC_2.2.5 () from /lib64/libc.so.6
#2 0x0000000000a2f647 in DTreeInterface::~DTreeInterface() ()
#3 0x0000000000800899 in DEventWriterROOT::~DEventWriterROOT() ()
#4 0x0000000000800e69 in DEventWriterROOT::~DEventWriterROOT() ()
#5 0x00000000007424c6 in DEventWriterROOT_factory::fini() ()
#6 0x00000000014867e9 in jana::JEventLoop::JEventLoop (this=0x7f564c000b60, __in_chrg=) at /usr/include/c++/11/bits/stl_vector.h:1043
#7 0x0000000001486a99 in jana::JEventLoop::JEventLoop (this=0x7f564c000b60, __in_chrg=) at src/JANA/JEventLoop.cc:152
#8 0x000000000145c442 in __pthread_cleanup_class::__pthread_cleanup_class (this=, __in_chrg=) at /usr/include/pthread.h:578
#9 __pthread_cleanup_class::__pthread_cleanup_class (this=, __in_chrg=) at /usr/include/pthread.h:578
#10 LaunchThread (arg=0x7ffc7a5f93d0) at src/JANA/JApplication.cc:1391
#11 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#12 0x00007f5688c21314 in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f5664d8e640 (LWP 1838173) "hd_root"):
#0 0x00007f5688c7e319 in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1 0x00007f5688c87d8f in pthread_rwlock_wrlock GLIBC_2.2.5 () from /lib64/libc.so.6
#2 0x0000000000a2f647 in DTreeInterface::~DTreeInterface() ()
#3 0x0000000000800899 in DEventWriterROOT::~DEventWriterROOT() ()
#4 0x0000000000800e69 in DEventWriterROOT::~DEventWriterROOT() ()
#5 0x00000000007424c6 in DEventWriterROOT_factory::fini() ()
#6 0x00000000014867e9 in jana::JEventLoop::JEventLoop (this=0x7f5650000b60, __in_chrg=) at /usr/include/c++/11/bits/stl_vector.h:1043
#7 0x0000000001486a99 in jana::JEventLoop::JEventLoop (this=0x7f5650000b60, __in_chrg=) at src/JANA/JEventLoop.cc:152
#8 0x000000000145c442 in __pthread_cleanup_class::__pthread_cleanup_class (this=, __in_chrg=) at /usr/include/pthread.h:578
#9 __pthread_cleanup_class::__pthread_cleanup_class (this=, __in_chrg=) at /usr/include/pthread.h:578
#10 LaunchThread (arg=0x7ffc7a5f93d0) at src/JANA/JApplication.cc:1391
#11 0x00007f5688c81802 in start_thread () from /lib64/libc.so.6
#12 0x00007f5688c21314 in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f5670988640 (LWP 1838172) "hd_root"):
#0 0x00007f5688c7e319 in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1 0x00007f5688c87d8f in pthread_rwlock_wrlock GLIBC_2.2.5 () from /lib64/libc.so.6

rjones30 commented 8 months ago

I have found and fixed this bug. It is an implementation error in the libraries/ANALYSIS code, not a problem with ROOT or other external libraries. There was a serious blunder in the way storage for TClonesArray branches of the reactions tree was being allocated. The original scheme assumed that once a std::map entry was created for a given key, the storage for that element never moves. That is not guaranteed by the std::map standard, and in fact is not respected by any of the g++ compiler versions in the last 5 years. The fact that we have not seen segfaults when running under RHEL7 builds does NOT mean that the results are free of the effects of this bug. In fact, it was probably there at some level all along, and in principle it affected both single-threaded and multi-threaded running. The fact that segfaults do not occur with RHEL7 builds is not proof that the bug was not affecting the output.

I have submitted a PR that I claim fixes the problem. I have only tested it on alma9, but it should work on RHEL8 as well. Please check and confirm. I also included another bug fix to DEventWriterROOT.cc in the same PR. These fixes together are needed to run safely, including but not limited to platforms with gcc v.6 or greater.

nsjarvis commented 8 months ago

Awesome. I will check it out. Thank you.

nsjarvis commented 8 months ago

I confirm that the code completes correctly with Richard's PR (but does not with version set 5.11.0). Will merge it. I tested using 32 threads: about 1/64 of the events were missing from the buggy trees, and hd_root.root was a few kB and unreadable instead of MB in size.

mashephe commented 8 months ago

On Feb 9, 2024, at 8:53 PM, Richard Jones wrote:

I have found and fixed this bug. It is an implementation error in the libraries/ANALYSIS code, not a problem with ROOT or other external libraries. There was a serious blunder in the way storage for TClonesArray branches of the reactions tree was being allocated. The original scheme assumed that once a std::map entry was created for a given key, the storage for that element never moves. That is not guaranteed by the std::map standard, and in fact is not respected by any of the g++ compiler versions in the last 5 years. The fact that we have not seen segfaults when running under RHEL7 builds does NOT mean that the results are free of the effects of this bug.

Richard: that is a great catch and a very subtle bug! Indeed this is a recipe to get garbage output without warning.

I once made the same mistake but with a std::vector instead of a std::map. The vector is much more prone to this problem because, while the STL standard does not require the memory location to be fixed, for a vector the storage must be contiguous. As the vector grows, this often necessitates a move in memory to a larger free block in order to maintain the contiguity requirement. The STL is great and flexible, but indeed sometimes one has to think carefully about what is happening behind the scenes.

Thanks so much for tracking this down.

Matt

aaust commented 8 months ago

PS: Fixed by PR #785. The total number of events in a test tree case did not change.

aaust commented 8 months ago

PPS: Comparison of the above-mentioned test case

The differences have the same order of magnitude as previously observed reproducibility issues; they are neither caused nor solved by this fix.

| Reaction | 4.43.1 | master with PR #785 |
|---|---|---|
| `tree_ggkpkm__B4` | 16427 | 16429 |
| `tree_ggpippim__B4` | 54361 | 54357 |
| `tree_pippim__B4` | 46760 | 46759 |

-PREST:JANACALIBCONTEXT=calibtime=2020-07-24-00-00-01 was added to the command to obtain the correct tagger energies

-PReaction3=1_14__8_9_14 -PReaction3:Flags=B4 was added to check high-statistics reaction