Closed filefolder closed 2 years ago
Let me ask you few details:
scrtdd
are you running?scrtdd
that worked in the past, is the same version that now crashes? That is, you only updated seiscomp code, not scrtdd, right? I wonder if the issue is with the master branch of seiscomp. That normally works but it is not as stable as the releases branch (currently v4
).
However, we can see where the code crashes using the debugger gdb
and reporting the output here.
How to run the debugger gdb
?
gdb --args
before your command that is gdb --args scrtdd --reloc-catalog $csv --ep $xml --profile $profile ....[etc]
r
(for run)bt
(backtrace) which will print where the code crashed. You can copy paste that output here so that I can evaluate it.Important: before running the debugger, you need to make sure the debugging information are present in the compiled code, which means you have to set CMAKE_BUILD_TYPE
to RelWithDebInfo
in ccmake.
I do suspect that maybe it is an issue with the current seiscomp as scrtdd was working very recently and I just went over a somewhat similar crash issue with boost: https://github.com/SeisComP/main/issues/34
Otherwise I am using the most recent scrtdd here.
I will rebuild with the new CMAKE_BUILD_TYPE set and try again tomorrow.
Have you recompiled scrtdd
after you upgraded SeisComP?
Have you recompiled
scrtdd
after you upgraded SeisComP?
I believe so, assuming it compiles with the rest of SC. The scrtdd source is kept in build/extras and the age of the scrtdd binary is the same as all the ones I recompiled yesterday.
edit: here is the relevant output
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/__/3rd-party/lsqr-lsmr/lsmr.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/__/3rd-party/lsqr-lsmr/lsqr.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/utils.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/solver.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/csvreader.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/catalog.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/sccatalog.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/waveform.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/clustering.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/ttt.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/scttt.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/nllttt.cpp.o
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/CMakeFiles/seiscomp_hdd.dir/hypodd.cpp.o
[ 99%] Linking CXX shared library ../../../../../lib/libseiscomp_hdd.so
[ 99%] Built target seiscomp_hdd
[ 99%] Building CXX object src/extras/scrtdd/libs/hdd/test/CMakeFiles/test_hdd_waveform.dir/waveform.cpp.o
[ 99%] Linking CXX executable ../../../../../../bin/test_hdd_waveform
[ 99%] Built target test_hdd_waveform
[100%] Building CXX object src/extras/scrtdd/libs/hdd/test/CMakeFiles/test_hdd_ellipsoid.dir/ellipsoid.cpp.o
[100%] Linking CXX executable ../../../../../../bin/test_hdd_ellipsoid
[100%] Built target test_hdd_ellipsoid
[100%] Building CXX object src/extras/scrtdd/libs/hdd/test/CMakeFiles/test_hdd_ttt.dir/ttt.cpp.o
[100%] Linking CXX executable ../../../../../../bin/test_hdd_ttt
[100%] Built target test_hdd_ttt
[100%] Building CXX object src/extras/scrtdd/libs/hdd/test/CMakeFiles/test_hdd_clustering.dir/clustering.cpp.o
In file included from /data/seiscomp/SRC/src/extras/scrtdd/libs/hdd/test/clustering.cpp:8:
/data/seiscomp/SRC/src/extras/scrtdd/../../base/common/libs/seiscomp/logging/log.h:26:6: warning: #warning SEISCOMP_COMPONENT not defined -> setting to "Seiscomp" [-Wcpp]
26 | # warning SEISCOMP_COMPONENT not defined -> setting to "Seiscomp"
| ^~~~~~~
[100%] Linking CXX executable ../../../../../../bin/test_hdd_clustering
[100%] Built target test_hdd_clustering
[100%] Building CXX object src/extras/scrtdd/libs/hdd/test/CMakeFiles/test_hdd_dd.dir/dd.cpp.o
In file included from /data/seiscomp/SRC/src/extras/scrtdd/libs/hdd/test/dd.cpp:10:
/data/seiscomp/SRC/src/extras/scrtdd/../../base/common/libs/seiscomp/logging/log.h:26:6: warning: #warning SEISCOMP_COMPONENT not defined -> setting to "Seiscomp" [-Wcpp]
26 | # warning SEISCOMP_COMPONENT not defined -> setting to "Seiscomp"
| ^~~~~~~
[100%] Linking CXX executable ../../../../../../bin/test_hdd_dd
[100%] Built target test_hdd_dd
[100%] Building CXX object src/extras/scrtdd/libs/rtddmsg/CMakeFiles/seiscomp_rtddmsg.dir/rtddmsg.cpp.o
[100%] Linking CXX shared library ../../../../../lib/libseiscomp_rtddmsg.so
[100%] Built target seiscomp_rtddmsg
[100%] Building CXX object src/extras/scrtdd/apps/scrtdd/CMakeFiles/scrtdd.dir/app.cpp.o
[100%] Building CXX object src/extras/scrtdd/apps/scrtdd/CMakeFiles/scrtdd.dir/rtdd.cpp.o
[100%] Building CXX object src/extras/scrtdd/apps/scrtdd/CMakeFiles/scrtdd.dir/main.cpp.o
[100%] Linking CXX executable ../../../../../bin/scrtdd
[100%] Built target scrtdd
[100%] Building CXX object src/extras/scrtdd/plugins/locator/rtddloc/CMakeFiles/rtddloc.dir/rtddloc.cpp.o
[100%] Linking CXX shared module ../../../../../../lib/rtddloc.so
[100%] Built target rtddloc
and here is the gdb output
Thread 1 "scrtdd" received signal SIGSEGV, Segmentation fault.
Seiscomp::DataModel::DatabaseArchive::close (this=this@entry=0x555555682dd0) at /data/seiscomp/SRC/src/base/common/libs/seiscomp/datamodel/databasearchive.cpp:645
645 _db = nullptr;
and backtrace
(gdb) bt
#0 Seiscomp::DataModel::DatabaseArchive::close (this=this@entry=0x555555682dd0) at /data/seiscomp/SRC/src/base/common/libs/seiscomp/datamodel/databasearchive.cpp:645
#1 0x00007ffff769c725 in Seiscomp::DataModel::DatabaseReader::~DatabaseReader (this=0x555555682dd0, __in_chrg=<optimized out>) at /data/seiscomp/SRC/src/base/common/libs/seiscomp/datamodel/databasereader.cpp:52
#2 0x00007ffff76a703d in Seiscomp::DataModel::DatabaseQuery::~DatabaseQuery (this=0x555555682dd0, __in_chrg=<optimized out>) at /data/seiscomp/SRC/src/base/common/libs/seiscomp/datamodel/databasequery.cpp:48
#3 0x00007ffff76dd37a in Seiscomp::Core::BaseObject::decrementReferenceCount (this=<optimized out>) at /data/seiscomp/SRC/src/base/common/libs/seiscomp/core/baseobject.inl:51
#4 Seiscomp::Core::intrusive_ptr_release (p=<optimized out>) at /data/seiscomp/SRC/src/base/common/libs/seiscomp/core/baseobject.inl:32
#5 boost::intrusive_ptr<Seiscomp::DataModel::DatabaseArchive>::~intrusive_ptr (this=0x7fffffffddc0, __in_chrg=<optimized out>) at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:98
#6 Seiscomp::DataModel::PublicObjectCache::~PublicObjectCache (this=0x7fffffffddb0, __in_chrg=<optimized out>) at /data/seiscomp/SRC/src/base/common/libs/seiscomp/datamodel/publicobjectcache.cpp:148
#7 0x000055555559be62 in Seiscomp::DataModel::PublicObjectTimeSpanBuffer::~PublicObjectTimeSpanBuffer (this=0x7fffffffddb0, __in_chrg=<optimized out>)
at /data/seiscomp/SRC/src/extras/scrtdd/../../base/common/libs/seiscomp3/datamodel/publicobjectcache.h:232
#8 Seiscomp::RTDD::~RTDD (this=0x7fffffffd3b0, __in_chrg=<optimized out>) at /data/seiscomp/SRC/src/extras/scrtdd/apps/scrtdd/rtdd.cpp:365
#9 0x00005555555881a4 in main (argc=<optimized out>, argv=<optimized out>) at /data/seiscomp/SRC/src/extras/scrtdd/apps/scrtdd/main.cpp:28
I hope this helps? Somewhat of interest, but when running gdb i did not pipe the output xml to a file and just let it print, in doing so the xml output seemed complete without cutting off.
edit: here is the final output without piping the xml print to file
.....
</EventParameters>
</seiscomp>
22:36:18 [info] Disable timer
22:36:18 [debug] Unload plugin 'MySQL database driver'
22:36:18 [debug] Unload plugin 'Locator implementation using NonLinLoc by Anthony Lomax (http://alomax.free.fr/nlloc/)'
22:36:18 [debug] Unload plugin 'MLa magnitude. Calculates magnitude based on universal formulae MLa=c0_log10(Amp)+c1*log10(delta*c3+c4)+c5*(delta+c6), where coefficients c1...6 vary based on epicentral location.'
22:36:18 [debug] Leaving ::done
22:36:18 [info] Unloading profile my_profile
Segmentation fault (core dumped)
$
So the segfault is happening within seiscomp lib. That doesn't mean scrtdd is not responsible for the bug, though. To restrict my search for the segfault cause, I have to ask you to switch seiscomp to branch v4
, recompile and run scrtdd again. If the problem doesn't show up anymore I know where to look for the issue.
Somewhat of interest, but when running gdb i did not pipe the output xml to a file and just let it print, in doing so the xml output seemed complete without cutting off.
From the debugger I can see the segfault happens at the end of everything, after the processing, after the xml file is generated. In the pipe case, the xml file is truncated probably because the application crashes before the xml content is totally written to file, although it was already generated.
It may be a while before I can get to that, but I can confirm that the code worked in early December and I tend to refresh SC on this machine fairly often. It could not have been older than 4.7.2.
This might be related to https://github.com/SeisComP/common/pull/49 which now resets everything in Application::done
. According to the output posted there is still something unloading after done
was called. Can you check your application code? After done
all interfaces and plugins are wiped. Accessing pointers still pointing to e.g. the MYSQL interface will result in a segmentation fault.
@filefolder, you don't need necessarily build against V4. There were some deep changes necessary in how an application is cleaned up and that might break something now with scrtdd. That is not desirable but maybe a good opportunity to fix the clean up procedure.
@gempa-jabe I will have a look at the scrtdd Application
code and let you know
RTDD
destcrutor callRTDD
class member DataModel::PublicObjectTimeSpanBuffer _cache
is being destroyedPublicObjectTimeSpanBuffer
holds an intrusive pointer DatabaseArchivePtr _archive
PublicObjectTimeSpanBuffer
destructor is called, the DatabaseArchivePtr _archive
member destructor is called as well and that causes the segfault
Looking at the changes of Application::done
from branch v4
to master
I can see that now a call to System::Application::done()
is performed, which in turn calls PluginRegistry::Reset()
.
Since the database is a plugin based interface, I suspect the new call to System::Application::done()
makes the DatabaseArchivePtr
, held by PublicObjectTimeSpanBuffer
, invalid. If this is true I can simply reset RTDD::_cache
member before calling the super class Application::done
.
However, when I wrote that code I was not very proficient with seiscomp and I copied the approach from existing code (scwfparam
) so I suspect there are other such cases around the code base.
@gempa-jabe thoughts?
That is true. So the best would be to call PluginRegistry::Reset()
in the application's destructor to avoid further pitfalls with other modules. Can you reproduce the crash?
Can you reproduce the crash?
not yet, I'll let you know soon
I pushed https://github.com/SeisComP/common/commit/a850e6c95e2e3af15cc9bbb04c4c7dedbb01f9f1 which should fix the issue.
@filefolder please give it another try.
@gempa-jabe I cannot replicate the segfault on my side, so I cannot verify my hypothesis
seems to be working! thanks both for getting to this so quickly. i leave it to you to close
(excuse typo in title *while not whole)
Hi, recently I have found that scrtdd segfaults near the very end of its process, while writing the output xml
code is ran like this:
scrtdd --reloc-catalog $csv --ep $xml --profile $profile --author="scrtdd" --verbosity=4 --console=1 -d $conn --xmlout > relocated_picks.$profile.scxml
here's an example of the final xml file, abruptly stopped
It was working fine until somewhat recently, after a few system and seiscomp upgrades. Currently on the latest seiscomp master branch, ubuntu 20.04.