Open nsoblath opened 7 years ago
I saw it suggested that one potential cause of the "pure virtual method called" error is a virtual method being called from a destructor: by the time the call is made, the part of the object that implements the function that's supposed to be called may already have been destroyed.
I checked through the classes in the control and daq libraries, in midge/core, and in monarch3, and I didn't find any suspicious destructors.
It happened again on Monday, 1:53pm EST. Recorded error message in the elog https://maxwell.npl.washington.edu/elog/project8/Project+8/1683 (not copying it here because it's the same thing, no need to keep repeating it)
Here's what I know so far:
The "infinite loop closing library" issue is probably the result of HDF5 commands being called after the global HDF5 cleanup has already been called. This is a secondary problem, caused by something else going wrong.
The root cause, I believe, is described in this section:
HDF5-DIAG: Error detected in HDF5 (1.8.16) thread 139886732498688:
#000: ../../../src/H5D.c line 993 in H5Dset_extent(): not a dataset
major: Invalid arguments to routine
minor: Inappropriate type
19:08:19 [ERROR] /core/diptera.cc(292): non-node exception thrown: HDF5 error while writing a record:
H5Dset_extent failed (function: DataSet::extend)
In the C++ library, DataSet::extend() calls function H5Dset_extent() in the C library. The latter function fails in this bit of code:
if(NULL == (dset = (H5D_t *)H5I_object_verify(dset_id, H5I_DATASET)))
HGOTO_ERROR(H5E_ARGS, H5E_BADTYPE, FAIL, "not a dataset")
This checks whether one of the arguments, dset_id, which should be the ID number of the dataset, is in fact a dataset. The dataset ID comes from member variable id of the DataSet C++ object.
The only place that DataSet::extend() is called in Monarch is in M3Stream.cc, at line 453, in function M3Stream::WriteRecord(). The extend function is called on fH5CurrentAcqDataSet, which is a pointer to a DataSet object. I assume the pointer is valid, because if not we would have a segfault instead of the crash that we have. Perhaps the DataSet object isn't initialized correctly. In the default constructor, the id is initialized to 0.
I've added some diagnostic printing to the exception catching in M3Stream::WriteRecord() (starting at line 468):
LWARN( mlog, "DIAGNOSTIC: id of fH5CurrentAcqDataSet: " << fH5CurrentAcqDataSet->getId() );
LWARN( mlog, "DIAGNOSTIC: class name: " << fH5CurrentAcqDataSet->fromClass() );
H5D_space_status_t t_status;
fH5CurrentAcqDataSet->getSpaceStatus( t_status );
LWARN( mlog, "DIAGNOSTIC: offset: " << fH5CurrentAcqDataSet->getOffset() << " space status: " << t_status << " storage size: " << fH5CurrentAcqDataSet->getStorageSize() << " in mem data size: " << fH5CurrentAcqDataSet->getInMemDataSize() );
These should tell us how the DataSet object is configured, to some extent.
For the record, during writing, fH5CurrentAcqDataSet is initialized on line 447:
fH5CurrentAcqDataSet = new H5::DataSet( fH5AcqLoc->createDataSet( fAcqNameBuffer, fDataTypeInFile, H5::DataSpace( N_DATA_DIMS, fStrDataDims, fStrMaxDataDims ), tPropList ) );