Open bryngemark opened 2 years ago
This may be related to the recent patch to Framework that removed an extra copy which was causing memory to be mismanaged: https://github.com/LDMX-Software/Framework/pull/59
I was not originally able to reproduce this error, but it may be time to try again to see whether that patch resolved the issue.
On and off, there are `terminate called after throwing an instance of 'std::bad_alloc'` errors. I've seen this when trying to run the ecal veto, and most recently when trying to run a `v3.0.0` re-digi of `EcalSimHits` in files produced with `v2.3.0`. It doesn't happen on all files, but on maybe 75% of them in my tests so far.

The weird thing is that if the terminal printout verbosity increases (going from the default `p.termLogLevel = 2` to 0 or even 1), the problem disappears. But only if there is a log file specified too, with log level 0 or 1 (at 2, crashes reappear).

To reproduce: run the ecal digi parts of this template config on e.g. this input file:
`/nfs/slac/g/ldmx/data/mc20/v12/4.0GeV/v2.3.0-batch24/mc_v12-4GeV-1e-ecal_photonuclear_run230005_t1608608718.root`
with the pro_v3.0.0 singularity image:
`/nfs/slac/g/ldmx/production/singularityImages/ldmx-pro_v3.0.0-gLDMX.10.2.3_v0.4-r6.22.00-onnx1.3.0-xerces3.2.3-ubuntu18.04.sif`
using singularity version 3.8.6-1.el7 at SLAC.

Curiosity: Tom was not able to reproduce this locally: jobs ran fine regardless of verbosity. I have been able to reproduce it with LDCS on a number of files from `-batch24` above.
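
For anyone trying to reproduce the verbosity dependence, here is a minimal sketch of how the logging combinations described above could be toggled in a config. Only `p.termLogLevel` is taken from the report. The attribute names `logFileName` and `fileLogLevel`, the helper `apply_logging`, and the file name `redigi.log` are assumptions made for illustration and should be checked against the Framework version in the image. A `SimpleNamespace` stands in for the real process object so that the snippet runs outside the container.

```python
from types import SimpleNamespace


def apply_logging(p, term_level, file_level=None, log_file="redigi.log"):
    """Set the logging knobs on a process-like object `p` and return it.

    `termLogLevel` is the setting named in the report. `logFileName` and
    `fileLogLevel` are ASSUMED attribute names for the log-file settings.
    """
    p.termLogLevel = term_level
    if file_level is not None:
        p.logFileName = log_file      # assumed attribute name
        p.fileLogLevel = file_level   # assumed attribute name
    return p


# Combination reported to run cleanly: verbose terminal plus verbose log file.
p = apply_logging(SimpleNamespace(), term_level=0, file_level=0)

# Default terminal level and no log file: the case reported to crash on
# roughly 75% of the tested files.
q = apply_logging(SimpleNamespace(), term_level=2)
```

In a real config, `p` would be the process object that the template config already builds, and the stand-in would be dropped.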