Open MatthijsBurgh opened 2 years ago
As I wasn't able to get gdb working to debug the plugin as rqt runs in python.
Use gdb --args ${python_executable} ${path_to_python_script} ${other_args}
, where:
python_executable
is python
or python3
.path_to_python_script
is the path to rqt in your workspace.other_args
if you were passing other arguments to rqt.With that, you should be able to get a traceback. Thanks!
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7db1859 in __GI_abort () at abort.c:79
#2 0x00007fffeec0a911 in () at /lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007fffeec1638c in () at /lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007fffeec163f7 in () at /lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007fffeec166a9 in () at /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fffc31d8701 in void boost::throw_exception<boost::lock_error>(boost::lock_error const&)
() at /opt/ros/noetic/lib/libqt_gui_cpp.so
#7 0x00007fffc232d624 in boost::mutex::lock() (this=<optimized out>)
at /usr/include/c++/9/bits/char_traits.h:300
#8 boost::unique_lock<boost::mutex>::lock() (this=<synthetic pointer>)
at /usr/include/boost/thread/lock_types.hpp:346
#9 boost::unique_lock<boost::mutex>::unique_lock(boost::mutex&) (m_=..., this=<synthetic pointer>)
at /usr/include/boost/thread/lock_types.hpp:124
#10 ros::Time::now() () at ./src/time.cpp:230
#11 ros::Time::now() () at ./src/time.cpp:221
#12 0x00007fffc1f60101 in ros::Subscription::handleMessage(ros::SerializedMessage const&, bool, bool, boost::shared_ptr<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > const&, boost::shared_ptr<ros::PublisherLink> const&) () at /opt/ros/noetic/lib/libroscpp.so
#13 0x00007fffc1f404d5 in ros::TransportPublisherLink::handleMessage(ros::SerializedMessage const&, bool, bool) () at /opt/ros/noetic/lib/libroscpp.so
#14 0x00007fffc1f40196 in ros::TransportPublisherLink::onMessage(boost::shared_ptr<ros::Connection> const&, boost::shared_array<unsigned char> const&, unsigned int, bool) ()
at /opt/ros/noetic/lib/libroscpp.so
#15 0x00007fffc1ec9baf in ros::Connection::readTransport() () at /opt/ros/noetic/lib/libroscpp.so
#16 0x00007fffc1f3bc65 in ros::TransportTCP::socketUpdate(int) ()
at /opt/ros/noetic/lib/libroscpp.so
#17 0x00007fffc1f6ff67 in ros::PollSet::update(int) () at /opt/ros/noetic/lib/libroscpp.so
#18 0x00007fffc1f01dbd in ros::PollManager::threadFunc() () at /opt/ros/noetic/lib/libroscpp.so
#19 0x00007fffc1dae43b in () at /lib/x86_64-linux-gnu/libboost_thread.so.1.71.0
#20 0x00007ffff7d74609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#21 0x00007ffff7eae133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
So the segfault is comming from https://github.com/ros/roscpp_core/blob/72ce04f8b2849e0e4587ea4d598be6ec5d24d8ca/rostime/src/time.cpp#L230.
Though the rostime code, doesn't look that weird too me. So I think the real problem is somewhere else IMO.
Though the rostime code, doesn't look that weird too me.
Global variables with non-trivial construction or destruction are harmful. Construction order issues can be easily solved with the "initialize on first use" pattern (which is not being used here). Destruction order issues can sometimes be solved, but it's more complex.
So I think the real problem is somewhere else IMO.
I think you're right.
There seems to be still an executor running in a thread, we should stop it before closing.
And for some reason that executor has a pointer to a subscription with a timer that was already destroyed.
I haven't looked at the qt_gui_core
, but if you find a thread spinning, that should be stopped and the thread joined before closing (or destroying other things).
Running on noetic
When closing rqt, while a cpp plugin is still active, I get a segmentation fault or the following aborted.
The problem only occurs when the plugin is still active on closure. When you close the plugin manually before closing rqt, the problem doesn't occur.
I tried some more debugging on
rqt_image_view
. (I also experience the issue withmapviz
, but I didn't do any debugging on that plugin.)I figured out that the destructor is not being called when closing rqt with the plugin active. Or at least not being called before the segmentation fault occurs.
Debugging
As I wasn't able to get gdb working to debug the plugin as rqt runs in python. I added some print statements to debug what happens. I just drop it here as it might help you to solve the issue.
When closing the cpp plugin(
rqt_image_view
) manually:When closing rqt with an active cpp plugin(
rqt_image_view
):