ros-visualization / rviz

ROS 3D Robot Visualizer
BSD 3-Clause "New" or "Revised" License

PointCloud crashes when switching .rviz files #1753

Closed ysl-design closed 2 years ago

ysl-design commented 2 years ago


The description is fairly long, so please bear with me : )

Your environment

My scenario: I added a dozen display plugins to rviz, including PointCloud2, Marker, and MarkerArray, and saved the settings to an xxx.rviz file. Later, I reloaded the xxx.rviz file several times through 'File -> Open Config' while the corresponding data was still being published. I found that rviz occasionally crashed.

I found three causes for the crashes, all related to the PointCloudCommon class:

(1) Based on the backtrace and code analysis, PointCloudCommon::processMessage sends an emitTimeSignal signal, which passes a pointer to the PointCloud2 display plugin to TimePanel::onTimeSignal. In some cases, the PointCloud2 plugin is destroyed before TimePanel::onTimeSignal is executed. The display pointer passed to TimePanel::onTimeSignal is then dangling, and accessing the invalid memory causes a segmentation fault. Could this be avoided by adding a check at the top of TimePanel::onTimeSignal that tests whether sender() is a null pointer?

The backtrace shows:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `rviz -d /home/ysl/Download/critical/RViz'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f1403a289b7 in rviz::TimePanel::onTimeSignal(rviz::Display*, ros::Time) () from /opt/ros/melodic/lib/librviz.so
[Current thread is 1 (Thread 0x7f1403e94cc0 (LWP 4579))]
(gdb) bt
#0  0x00007f1403a289b7 in rviz::TimePanel::onTimeSignal(rviz::Display*, ros::Time) () at /opt/ros/melodic/lib/librviz.so
#1  0x00007f1403a71a33 in  () at /opt/ros/melodic/lib/librviz.so
#2  0x00007f1402bc5092 in QObject::event(QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#3  0x00007f14031e974b in QWidget::event(QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#4  0x00007f14031aa83c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#5  0x00007f14031b2104 in QApplication::notify(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#6  0x00007f1402b958a8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#7  0x00007f1402b9801d in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#8  0x00007f1402bef233 in  () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#9  0x00007f13fbf2d537 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#10 0x00007f13fbf2d770 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x00007f13fbf2d7fc in g_main_context_iteration () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#12 0x00007f1402bee85f in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#13 0x00007f1402bc4525 in QMetaObject::activate(QObject*, int, int, void**) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#14 0x00007f1403a6ff02 in rviz::VisualizationFrame::statusUpdate(QString const&) () at /opt/ros/melodic/lib/librviz.so
#15 0x00007f1402bc4525 in QMetaObject::activate(QObject*, int, int, void**) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#16 0x00007f1403a70035 in rviz::VisualizationManager::statusUpdate(QString const&) () at /opt/ros/melodic/lib/librviz.so
#17 0x00007f1403a56ca1 in rviz::VisualizationManager::load(rviz::Config const&) () at /opt/ros/melodic/lib/librviz.so
#18 0x00007f1403a5226b in rviz::VisualizationFrame::load(rviz::Config const&) () at /opt/ros/melodic/lib/librviz.so
#19 0x00007f1403a53091 in rviz::VisualizationFrame::loadDisplayConfigHelper(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () at /opt/ros/melodic/lib/librviz.so
#20 0x00007f1403a5325c in rviz::VisualizationFrame::loadDisplayConfig(QString const&) () at /opt/ros/melodic/lib/librviz.so
#21 0x00007f1403a540d9 in rviz::VisualizationFrame::onOpen() () at /opt/ros/melodic/lib/librviz.so
#22 0x00007f1402bc4525 in QMetaObject::activate(QObject*, int, int, void**) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#23 0x00007f14031a4122 in QAction::triggered(bool) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#24 0x00007f14031a680c in QAction::activate(QAction::ActionEvent) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#25 0x00007f140332305c in  () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#26 0x00007f140332a50b in  () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#27 0x00007f140332b333 in QMenu::mouseReleaseEvent(QMouseEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#28 0x00007f14031e9038 in QWidget::event(QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#29 0x00007f140332d65b in QMenu::event(QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#30 0x00007f14031aa83c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#31 0x00007f14031b265f in QApplication::notify(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#32 0x00007f1402b958a8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#33 0x00007f14031b1632 in QApplicationPrivate::sendMouseEvent(QWidget*, QMouseEvent*, QWidget*, QWidget*, QWidget**, QPointer<QWidget>&, bool) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#34 0x00007f1403203e95 in  () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#35 0x00007f14032067ca in  () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#36 0x00007f14031aa83c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#37 0x00007f14031b2104 in QApplication::notify(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#38 0x00007f1402b958a8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#39 0x00007f13fd5225eb in QGuiApplicationPrivate::processMouseEvent(QWindowSystemInterfacePrivate::MouseEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#40 0x00007f13fd5240b5 in QGuiApplicationPrivate::processWindowSystemEvent(QWindowSystemInterfacePrivate::WindowSystemEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#41 0x00007f13fd4fb33b in QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#42 0x00007f13ebf56260 in  () at /usr/lib/x86_64-linux-gnu/libQt5XcbQpa.so.5
#43 0x00007f13fbf2d537 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#44 0x00007f13fbf2d770 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#45 0x00007f13fbf2d7fc in g_main_context_iteration () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#46 0x00007f1402bee85f in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#47 0x00007f1402b938da in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#48 0x00007f1402b9c984 in QCoreApplication::exec() () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#49 0x0000563c918e6dcf in main ()
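
The hazard in (1) is a deferred call that carries a raw pointer to an object which may be gone by the time the call runs. The sketch below illustrates one common guard in plain C++: the queued callback holds a std::weak_ptr and only acts if the object is still alive. FakeDisplay, EventQueue, postTimeSignal, and demoDeferredSignals are hypothetical names for illustration, not rviz or Qt API, and this is not necessarily the change made in rviz. In Qt code, QPointer plays a comparable role for QObject-derived targets.

```cpp
#include <functional>
#include <memory>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for a display plugin; not rviz::Display.
struct FakeDisplay {
  std::string name;
};

// Hypothetical event queue that runs callbacks later, similar in spirit
// to a queued signal/slot connection.
class EventQueue {
public:
  void post(std::function<void()> fn) { pending_.push(std::move(fn)); }

  void drain() {
    while (!pending_.empty()) {
      auto fn = std::move(pending_.front());
      pending_.pop();
      fn();
    }
  }

private:
  std::queue<std::function<void()>> pending_;
};

// Queues a "time signal" that touches the display only if it still exists.
// `handled` counts the callbacks that found a live display.
void postTimeSignal(EventQueue& queue,
                    const std::shared_ptr<FakeDisplay>& display,
                    int& handled) {
  std::weak_ptr<FakeDisplay> weak = display;
  queue.post([weak, &handled] {
    if (auto alive = weak.lock()) {  // empty if the display was destroyed
      ++handled;
    }
  });
}

// One display survives until the queue runs, the other is destroyed first.
int demoDeferredSignals() {
  EventQueue queue;
  int handled = 0;
  auto kept = std::make_shared<FakeDisplay>(FakeDisplay{"kept"});
  auto removed = std::make_shared<FakeDisplay>(FakeDisplay{"removed"});
  postTimeSignal(queue, kept, handled);
  postTimeSignal(queue, removed, handled);
  removed.reset();  // destroyed while its callback is still queued
  queue.drain();
  return handled;
}
```

Here demoDeferredSignals() returns 1: the callback for the destroyed display is skipped instead of dereferencing a dangling pointer.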

(2) Another possible cause of the crash is that PointCloudCommon has already been destructed, and the mutex new_clouds_mutex_ destroyed with it, while PointCloudCommon::processMessage still tries to lock that mutex, leading to the crash.

The backtrace shows:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `rviz'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51  ../sysdeps/unix/sysv/linux/raise.c: no such file or directory.
[Current thread is 1 (Thread 0x7f058920f440 (LWP 9654))]
(gdb) bt
#0  0x00007f0587080e87 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f05870827f1 in __GI_abort () at abort.c:79
#2  0x00007f05870723fa in __assert_fail_base (fmt=0x7f05891dd76b <error: Cannot access memory at address 0x7f05891dd76b>, assertion=assertion@entry=0x7f0588d77b5c "/boost/thread/pthread/mutex.hpp", file=file@entry=0x7f0588d77b30 "r failed in pthread_mutex_init", line=line@entry=111, function=function@entry=0x7f0588d78070 <void boost::shared_ptr<tf::TransformListener>::reset<tf::TransformListener>(tf::TransformListener*)::__PRETTY_FUNCTION__+80> "ansformListener]") at assert.c:92
#3  0x00007f0587072472 in __GI___assert_fail (assertion=0x7f0588d77b5c "/boost/thread/pthread/mutex.hpp", file=0x7f0588d77b30 "r failed in pthread_mutex_init", line=111, function=0x7f0588d78070 <void boost::shared_ptr<tf::TransformListener>::reset<tf::TransformListener>(tf::TransformListener*)::__PRETTY_FUNCTION__+80> "ansformListener]") at assert.c:101
#4  0x00007f0588c41f7e in boost::mutex::~mutex() (this=0x55dbce411748, __in_chrg=<optimized out>) at /usr/include/boost/thread/pthread/mutex.hpp:111
#5  0x00007f04ae8b3b24 in message_filters::Signal1<sensor_msgs::PointCloud2_<std::allocator<void> > >::~Signal1() (this=0x55dbce411748, __in_chrg=<optimized out>)
    at /opt/ros/melodic/include/message_filters/signal1.h:84
#6  0x00007f04ae8b3b7c in message_filters::SimpleFilter<sensor_msgs::PointCloud2_<std::allocator<void> > >::~SimpleFilter() (this=0x55dbce411748, __in_chrg=<optimized out>)
    at /opt/ros/melodic/include/message_filters/simple_filter.h:60
#7  0x00007f04ae8b41c4 in tf2_ros::MessageFilter<sensor_msgs::PointCloud2_<std::allocator<void> > >::~MessageFilter() (this=0x55dbce411740, __in_chrg=<optimized out>)
    at /opt/ros/melodic/include/tf2_ros/message_filter.h:226
#8  0x00007f04ae8b4210 in tf2_ros::MessageFilter<sensor_msgs::PointCloud2_<std::allocator<void> > >::~MessageFilter() (this=0x55dbce411740, __in_chrg=<optimized out>)
    at /opt/ros/melodic/include/tf2_ros/message_filter.h:226
#9  0x00007f04ae8b30e1 in rviz::MessageFilterDisplay<sensor_msgs::PointCloud2_<std::allocator<void> > >::~MessageFilterDisplay() (this=0x55dbd119a2a0, __in_chrg=<optimized out>)
    at /home/ysl/Downloads/rviz/rviz/src/rviz/message_filter_display.h:105
#10 0x00007f04ae8b1e1f in rviz::PointCloud2Display::~PointCloud2Display() (this=0x55dbd119a2a0, __in_chrg=<optimized out>)
    at /home/ysl/Downloads/rviz/rviz/src/rviz/default_plugin/point_cloud2_display.cpp:57
#11 0x00007f04ae8b1e3a in rviz::PointCloud2Display::~PointCloud2Display() (this=0x55dbd119a2a0, __in_chrg=<optimized out>)
    at /home/ysl/Downloads/rviz/rviz/src/rviz/default_plugin/point_cloud2_display.cpp:60
#12 0x00007f0588c3972c in rviz::DisplayGroup::removeAllDisplays() (this=0x7f0558008c10) at /home/ysl/Downloads/rviz/rviz/src/rviz/display_group.cpp:172
#13 0x00007f0588c38ee6 in rviz::DisplayGroup::load(rviz::Config const&) (this=0x7f0558008c10, config=...) at /home/ysl/Downloads/rviz/rviz/src/rviz/display_group.cpp:59
#14 0x00007f0588d507a2 in rviz::VisualizationManager::load(rviz::Config const&) (this=0x55dbcf0f7ff0, config=...) at /home/ysl/Downloads/rviz/rviz/src/rviz/visualization_manager.cpp:517
#15 0x00007f0588d408f8 in rviz::VisualizationFrame::load(rviz::Config const&) (this=0x55dbce34d500, config=...) at /home/ysl/Downloads/rviz/rviz/src/rviz/visualization_frame.cpp:868
#16 0x00007f0588d3fdf4 in rviz::VisualizationFrame::loadDisplayConfigHelper(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (this=0x55dbce34d500, full_path="/home/ysl/Downloads/critical/custom.rviz") at /home/ysl/Downloads/rviz/rviz/src/rviz/visualization_frame.cpp:784
#17 0x00007f0588d3fb11 in rviz::VisualizationFrame::loadDisplayConfig(QString const&) (this=0x55dbce34d500, qpath=...) at /home/ysl/Downloads/rviz/rviz/src/rviz/visualization_frame.cpp:751
#18 0x00007f0588d42659 in rviz::VisualizationFrame::onOpen() (this=0x55dbce34d500) at /home/ysl/Downloads/rviz/rviz/src/rviz/visualization_frame.cpp:1085
#19 0x00007f0588c0cab6 in rviz::VisualizationFrame::qt_static_metacall(QObject*, QMetaObject::Call, int, void**) (_o=0x55dbce34d500, _c=QMetaObject::InvokeMetaMethod, _id=7, _a=0x7fff6df61f40)
    at /home/ysl/Downloads/rviz/rviz/cmake-build-debug/src/rviz/rviz_autogen/EWIEGA46WW/moc_visualization_frame.cpp:224
#20 0x00007f0587c8d525 in QMetaObject::activate(QObject*, int, int, void**) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#21 0x00007f058826d122 in QAction::triggered(bool) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#22 0x00007f058826f80c in QAction::activate(QAction::ActionEvent) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#23 0x00007f05883ec05c in  () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#24 0x00007f05883f350b in  () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#25 0x00007f05883f4333 in QMenu::mouseReleaseEvent(QMouseEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#26 0x00007f05882b2038 in QWidget::event(QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#27 0x00007f05883f665b in QMenu::event(QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#28 0x00007f058827383c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#29 0x00007f058827b65f in QApplication::notify(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#30 0x00007f0587c5e8a8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#31 0x00007f058827a632 in QApplicationPrivate::sendMouseEvent(QWidget*, QMouseEvent*, QWidget*, QWidget*, QWidget**, QPointer<QWidget>&, bool) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#32 0x00007f05882cce95 in  () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#33 0x00007f05882cf7ca in  () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#34 0x00007f058827383c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#35 0x00007f058827b104 in QApplication::notify(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#36 0x00007f0587c5e8a8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#37 0x00007f05825ee5eb in QGuiApplicationPrivate::processMouseEvent(QWindowSystemInterfacePrivate::MouseEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#38 0x00007f05825f00b5 in QGuiApplicationPrivate::processWindowSystemEvent(QWindowSystemInterfacePrivate::WindowSystemEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#39 0x00007f05825c733b in QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#40 0x00007f0571022260 in  () at /usr/lib/x86_64-linux-gnu/libQt5XcbQpa.so.5
#41 0x00007f0580ff9537 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#42 0x00007f0580ff9770 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#43 0x00007f0580ff97fc in g_main_context_iteration () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#44 0x00007f0587cb785f in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#45 0x00007f0587c5c8da in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#46 0x00007f0587c65984 in QCoreApplication::exec() () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#47 0x000055dbcc845dcf in main ()

(3) The last possible cause of the crash is that the mutex transformers_mutex_ is still locked inside PointCloudCommon::transformCloud when PointCloudCommon is destructed, which causes the crash.

The backtrace shows:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `rviz'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51  ../sysdeps/unix/sysv/linux/raise.c: no such file or directory.
[Current thread is 1 (Thread 0x7f4904592440 (LWP 10985))]
(gdb) bt
#0  0x00007f4902403e87 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f49024057f1 in __GI_abort () at abort.c:79
#2  0x00007f49023f53fa in __assert_fail_base (fmt=0x7f490456076b <error: Cannot access memory at address 0x7f490456076b>, assertion=assertion@entry=0x7f4904117d01 "!pthread_mutex_destroy(&m)", file=file@entry=0x7f4904117c18 "/usr/include/boost/thread/pthread/recursive_mutex.hpp", line=line@entry=104, function=function@entry=0x7f4904119440 <boost::recursive_mutex::~recursive_mutex()::__PRETTY_FUNCTION__> "boost::recursive_mutex::~recursive_mutex()") at assert.c:92
#3  0x00007f49023f5472 in __GI___assert_fail (assertion=0x7f4904117d01 "!pthread_mutex_destroy(&m)", file=0x7f4904117c18 "/usr/include/boost/thread/pthread/recursive_mutex.hpp", line=104, function=0x7f4904119440 <boost::recursive_mutex::~recursive_mutex()::__PRETTY_FUNCTION__> "boost::recursive_mutex::~recursive_mutex()") at assert.c:101
#4  0x00007f490408c6dd in boost::recursive_mutex::~recursive_mutex() (this=0x55e6e16705c8, __in_chrg=<optimized out>) at /usr/include/boost/thread/pthread/recursive_mutex.hpp:104
#5  0x00007f48368cb9ef in rviz::PointCloudCommon::~PointCloudCommon() (this=0x55e6e16704c0, __in_chrg=<optimized out>) at /home/Downloads/Downloads/rviz/rviz/src/rviz/default_plugin/point_cloud_common.cpp:381
#6  0x00007f48368cba5c in rviz::PointCloudCommon::~PointCloudCommon() (this=0x55e6e16704c0, __in_chrg=<optimized out>) at /home/Downloads/Downloads/rviz/rviz/src/rviz/default_plugin/point_cloud_common.cpp:384
#7  0x00007f48368b1e13 in rviz::PointCloud2Display::~PointCloud2Display() (this=0x55e6e0f07850, __in_chrg=<optimized out>)
    at /home/Downloads/Downloads/rviz/rviz/src/rviz/default_plugin/point_cloud2_display.cpp:59
#8  0x00007f48368b1e3a in rviz::PointCloud2Display::~PointCloud2Display() (this=0x55e6e0f07850, __in_chrg=<optimized out>)
    at /home/Downloads/Downloads/rviz/rviz/src/rviz/default_plugin/point_cloud2_display.cpp:60
#9  0x00007f4903fbc72c in rviz::DisplayGroup::removeAllDisplays() (this=0x55e6df96c200) at /home/Downloads/Downloads/rviz/rviz/src/rviz/display_group.cpp:172
#10 0x00007f4903fbbee6 in rviz::DisplayGroup::load(rviz::Config const&) (this=0x55e6df96c200, config=...) at /home/Downloads/Downloads/rviz/rviz/src/rviz/display_group.cpp:59
#11 0x00007f49040d37c6 in rviz::VisualizationManager::load(rviz::Config const&) (this=0x55e6e04f2d20, config=...) at /home/Downloads/Downloads/rviz/rviz/src/rviz/visualization_manager.cpp:517
#12 0x00007f49040c391c in rviz::VisualizationFrame::load(rviz::Config const&) (this=0x55e6dfa6bc00, config=...) at /home/Downloads/Downloads/rviz/rviz/src/rviz/visualization_frame.cpp:868
#13 0x00007f49040c2e18 in rviz::VisualizationFrame::loadDisplayConfigHelper(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (this=0x55e6dfa6bc00, full_path="/home/ysl/Downloads/critical/custom.rviz") at /home/Downloads/Downloads/rviz/rviz/src/rviz/visualization_frame.cpp:784
#14 0x00007f49040c2b35 in rviz::VisualizationFrame::loadDisplayConfig(QString const&) (this=0x55e6dfa6bc00, qpath=...) at /home/Downloads/Downloads/rviz/rviz/src/rviz/visualization_frame.cpp:754
#15 0x00007f49040c567d in rviz::VisualizationFrame::onOpen() (this=0x55e6dfa6bc00) at /home/Downloads/Downloads/rviz/rviz/src/rviz/visualization_frame.cpp:1090
#16 0x00007f4903f8fab6 in rviz::VisualizationFrame::qt_static_metacall(QObject*, QMetaObject::Call, int, void**) (_o=0x55e6dfa6bc00, _c=QMetaObject::InvokeMetaMethod, _id=7, _a=0x7ffd5b9d2ff0)
    at /home/Downloads/Downloads/rviz/rviz/cmake-build-debug/src/rviz/rviz_autogen/EWIEGA46WW/moc_visualization_frame.cpp:224
#17 0x00007f4903010525 in QMetaObject::activate(QObject*, int, int, void**) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#18 0x00007f49035f0122 in QAction::triggered(bool) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#19 0x00007f49035f280c in QAction::activate(QAction::ActionEvent) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#20 0x00007f49035f30d5 in QAction::event(QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#21 0x00007f49035f683c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#22 0x00007f49035fe104 in QApplication::notify(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#23 0x00007f4902fe18a8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#24 0x00007f48fd99f4b7 in QShortcutMap::dispatchEvent(QKeyEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#25 0x00007f48fd99f58a in QShortcutMap::tryShortcut(QKeyEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#26 0x00007f48fd94dcb3 in QWindowSystemInterface::handleShortcutEvent(QWindow*, unsigned long, int, QFlags<Qt::KeyboardModifier>, unsigned int, unsigned int, unsigned int, QString const&, bool, unsigned short) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#27 0x00007f48fd96e047 in QGuiApplicationPrivate::processKeyEvent(QWindowSystemInterfacePrivate::KeyEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#28 0x00007f48fd973095 in QGuiApplicationPrivate::processWindowSystemEvent(QWindowSystemInterfacePrivate::WindowSystemEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#29 0x00007f48fd94a33b in QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5
#30 0x00007f48ec3a5260 in  () at /usr/lib/x86_64-linux-gnu/libQt5XcbQpa.so.5
#31 0x00007f48fc37c537 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#32 0x00007f48fc37c770 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#33 0x00007f48fc37c7fc in g_main_context_iteration () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#34 0x00007f490303a85f in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#35 0x00007f4902fdf8da in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#36 0x00007f4902fe8984 in QCoreApplication::exec() () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#37 0x000055e6de4bedcf in main ()
rhaschke commented 2 years ago

Thanks for this detailed error report. I think the clean solution would be to stop calling processMessage / waiting for a running call to finish before destructing the point cloud.
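
As a rough illustration of "waiting for a running call to finish before destructing", the sketch below uses an in-flight counter and a condition variable in standard C++. CloudProcessor, its members, and demoWaitForInFlight are hypothetical names, and the sleep stands in for real message processing; this is one possible shape of the idea, not the code in rviz.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical processor; names are illustrative, not rviz API.
class CloudProcessor {
public:
  explicit CloudProcessor(std::atomic<int>& finished) : finished_(finished) {}

  ~CloudProcessor() {
    std::unique_lock<std::mutex> lock(state_mutex_);
    stopping_ = true;  // reject calls that arrive from now on
    idle_cv_.wait(lock, [this] { return in_flight_ == 0; });
    // `lock` is released at the end of this body, before members are destroyed.
  }

  // Returns false if the processor is shutting down and the message is dropped.
  bool processMessage(std::atomic<bool>& started) {
    {
      std::lock_guard<std::mutex> lock(state_mutex_);
      if (stopping_) return false;
      ++in_flight_;
    }
    started = true;
    std::this_thread::sleep_for(std::chrono::milliseconds(50));  // fake work
    ++finished_;
    std::lock_guard<std::mutex> lock(state_mutex_);
    --in_flight_;
    idle_cv_.notify_all();  // notify under the lock; no member access afterwards
    return true;
  }

private:
  std::atomic<int>& finished_;
  std::mutex state_mutex_;
  std::condition_variable idle_cv_;
  int in_flight_ = 0;
  bool stopping_ = false;
};

// Destroys the processor while a call is running and reports whether the
// call had completed by the time the destructor returned.
bool demoWaitForInFlight() {
  std::atomic<int> finished{0};
  std::atomic<bool> started{false};
  std::thread worker;
  {
    CloudProcessor processor(finished);
    worker = std::thread([&] { processor.processMessage(started); });
    while (!started) std::this_thread::yield();
  }  // the destructor blocks here until the running call has finished
  const bool finished_before_destruction = (finished == 1);
  worker.join();
  return finished_before_destruction;
}
```

The stopping flag only rejects calls that arrive while the object still exists. The owner must still make sure that no new call can begin once destruction has completed, for example by unsubscribing first.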

ysl-design commented 2 years ago

Thank you for your response and for correcting the issues in my description. I compiled RViz with the modifications from PR #1754, and problem (1) in my description has been fixed. However, problems (2) and (3) still occur during testing. I think that even though the display stops subscribing to new data before destruction, threads that have already entered processMessage keep running after PointCloudCommon has been destructed, and I suspect this may be the cause of the crash during mutex destruction.

rhaschke commented 2 years ago

> I think that even though it stops subscribing to new data before destruction, threads that have entered processMessage will continue after the PointCloudCommon destruction is complete, and I suspect this may be the cause of the crash during mutex destruction.

Yeah, that might be an issue. Quoting the doc: "Attempting to destroy a locked mutex results in undefined behavior." I have pushed another commit to ensure the mutexes are held by the destructor...
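
One plausible reading of "the mutexes are held by the destructor" is sketched below in standard C++: the destructor body acquires the member mutex, which blocks until any thread currently holding it has released it, and releases it again before the mutex itself is destroyed. Transformer, work_mutex_, and demoDestructorTakesLock are hypothetical names, and std::mutex is used here in place of the Boost mutexes mentioned in the thread.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical class; not the actual rviz::PointCloudCommon.
class Transformer {
public:
  ~Transformer() {
    // Blocks until a running transformCloud() call has released the mutex.
    // The guard unlocks at the end of this body, so the mutex is no longer
    // locked when the member itself is destroyed afterwards.
    std::lock_guard<std::mutex> lock(work_mutex_);
  }

  void transformCloud(std::atomic<bool>& locked, std::atomic<bool>& done) {
    std::lock_guard<std::mutex> lock(work_mutex_);
    locked = true;
    std::this_thread::sleep_for(std::chrono::milliseconds(50));  // fake work
    done = true;
  }

private:
  std::mutex work_mutex_;
};

// Destroys the object while the worker holds the mutex and reports whether
// the worker had finished by the time the destructor returned.
bool demoDestructorTakesLock() {
  std::atomic<bool> locked{false};
  std::atomic<bool> done{false};
  std::thread worker;
  {
    Transformer transformer;
    worker = std::thread([&] { transformer.transformCloud(locked, done); });
    while (!locked) std::this_thread::yield();
  }  // ~Transformer() waits here for the worker to unlock
  const bool finished_first = done.load();
  worker.join();
  return finished_first;
}
```

This protects only against a thread that already holds the mutex. A thread that is about to lock it after the destructor has finished would still touch a destroyed object, so callbacks have to be stopped as well.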

ysl-design commented 2 years ago

Thank you for your reply and the changes. I'm sorry I didn't get back to you sooner. I tried the new modifications, and it looks like problem (3) in the description has been fixed as well, but problem (2) still exists. After analyzing the core dump again, problem (2) occurs as follows: after PointCloud2Display is destructed, its parent class MessageFilterDisplay is destroyed, and the crash occurs while executing delete tf_filter_ in ~MessageFilterDisplay(). The backtrace ultimately points to the Signal1 class in the ROS message_filters package: the mutex in Signal1 was destroyed while still locked, resulting in the crash. Can this be avoided when MessageFilterDisplay is destructed, or does the ROS code need to be modified?

rhaschke commented 2 years ago

Thanks for the feedback. The MessageFilter destructor correctly disconnects as expected:

~MessageFilter()
{
    message_connection_.disconnect();
    MessageFilter::clear();
}

Could you try to build with these cmake flags and post the resulting backtrace(s) when just running rviz: -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS="-fsanitize=address -fno-omit-frame-pointer -O1"

This enables the address sanitizer, which tracks allocated and freed memory in detail.

ysl-design commented 2 years ago

I built rviz with the cmake flags you gave and, surprisingly, could not reproduce problem (2): I tried several times and it never crashed. After I shut down rviz normally, the terminal displays "ERROR: LeakSanitizer: detected memory leaks", followed by a lot of output. I don't know which parts to post, and the information doesn't seem very relevant to my question here.

rhaschke commented 2 years ago

The memory leaks reported are not related to your issue. If you want to report some, only consider those related to rviz. There are many low-level libraries having leaks, which we cannot fix anyway.

That you can't reproduce the issue (2) anymore might be related to the slower execution with asan.

I don't understand why the mutex_ in MessageFilter/Signal1 can still become locked after disconnecting the message_connection_ (which should stop pushing new messages) and clearing the message buffer. I was hoping for more insight with asan... Could you build with -DCMAKE_BUILD_TYPE=RelWithDebInfo again and check which thread is holding the lock at crash time? Please paste the backtrace of this thread (which should include message_filters::Signal1::call()).

ysl-design commented 2 years ago

I built rviz with -DCMAKE_BUILD_TYPE=RelWithDebInfo and, again, it no longer crashes, so I'm afraid I can't provide the information you asked for. When I remove these build parameters and rebuild, rviz crashes again. I added print statements before and after the mutex is locked and unlocked in the Signal1 class. According to the output, mutex_ is locked in void Signal1::call(const ros::MessageEvent<M const>& event) and is not unlocked before rviz crashes. According to the core dump file, the crash occurred during the destruction of the locked mutex. The cout output follows. The "call unlock" statement at the end of Signal1::call is never printed, indicating that mutex_ was not unlocked.

call lock 0x55ce07f77a68
...
~PointCloudCommon()
~PointCloud2Display() 0x55ce0754d100
~MessageFilter() start
removeCallback lock 0x55ce0754d330
removeCallback unlock 0x55ce0754d330
~MessageFilter() end
rviz: /usr/include/boost/thread/pthread/mutex.hpp:111:boost::mutex::~mutex(): Assertion '!res' failed
Aborted (core dumped)

According to other printed information, mutex_ is locked in Signal1::call, the program then reaches helper->call(event, nonconst_force_copy), and crashes.

According to the core dump file, the address of the object when the crash occurs is 0x55ce07f77a68, which is the same as the address of the object that invokes the Signal1::call function in the cout information.

...
#4  0x00007fb805091160 in boost::mutex::~mutex() (this=0x55ce07f77a68, __in_chrg=<optimized out>) at /usr/include/boost/thread/pthread/mutex.hpp:111
...

This should indicate that the crash occurred in the Signal1::call function and that mutex_ was not unlocked.
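
The print-statement technique described above can be approximated with a small RAII guard that logs each lock and unlock together with the mutex address. This is a sketch in standard C++ under assumed names (TracingLock and demoTrace are hypothetical); it is not the instrumentation actually added to Signal1.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical RAII guard that logs lock/unlock events with the mutex address.
class TracingLock {
public:
  TracingLock(std::mutex& mutex, const char* label, std::ostream& out)
      : mutex_(mutex), label_(label), out_(out) {
    mutex_.lock();
    out_ << label_ << " lock " << static_cast<const void*>(&mutex_) << '\n';
  }

  ~TracingLock() {
    out_ << label_ << " unlock " << static_cast<const void*>(&mutex_) << '\n';
    mutex_.unlock();
  }

  TracingLock(const TracingLock&) = delete;
  TracingLock& operator=(const TracingLock&) = delete;

private:
  std::mutex& mutex_;
  const char* label_;
  std::ostream& out_;
};

// Returns the trace produced by one guarded section.
std::string demoTrace() {
  std::mutex mutex;
  std::ostringstream log;
  {
    TracingLock guard(mutex, "call", log);
    log << "callback running\n";  // stand-in for invoking the callbacks
  }
  return log.str();
}
```

A "lock" line without a matching "unlock" line for the same address at crash time means the guarded section had not finished. When tracing a real crash, writing to std::cerr rather than a buffered stream helps ensure the last lines are not lost on abort.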

rhaschke commented 2 years ago

Thanks for your investigation. I continued as well and traced the issue down to tf2_ros::MessageFilter. I filed a PR https://github.com/ros/geometry2/pull/538.

rhaschke commented 2 years ago

> I used -DCMAKE_BUILD_TYPE=RelWithDebInfo to build rviz, and again, it no longer crashes.

A release build disables all assertions. Hence, it is not aborting anymore (due to failing assertions).

rhaschke commented 2 years ago

Fixed via #1754