Closed GoesM closed 6 months ago
HOWEVER, it's noticed that here's already a join()
for waiting the thread shutdown
So it's strange for me to understand why the function is still working after "resource of slam-toolbox freed".
To make sure Thread may shutdown later than the destructor of SlamToolbox , i did more experiments as following:
we can insert a positive-delay as following, to prove that the thread may be still working after destructor works:
/*****************************************************************************/
void SlamToolbox::publishVisualizations()
/*****************************************************************************/
{
...
...
while (rclcpp::ok()) {
// goes insert // ==> SEGV
std::this_thread::sleep_for(std::chrono::seconds(5));
// insert end
updateMap();
if (!isPaused(VISUALIZING_GRAPH)) {
boost::mutex::scoped_lock lock(smapper_mutex_);
closure_assistant_->publishGraph();
}
r.sleep();
}
}
Doing so, we could stably get a SEGV asan report (during shutdown period) as following , which proves that the thread may be still working after destructor works:
=================================================================
==603189==ERROR: AddressSanitizer: SEGV on unknown address 0x7619597fe288 (pc 0x76196809d332 bp 0x7619535fced0 sp 0x7619535fb560 T22)
==603189==The signal is caused by a READ memory access.
#0 0x76196809d332 in loop_closure_assistant::LoopClosureAssistant::publishGraph() (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x89d332) (BuildId: 14eced23e7350992863c4c87d7e9e58c4a36ef4b)
#1 0x761967df0714 in slam_toolbox::SlamToolbox::publishVisualizations() (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x5f0714) (BuildId: 14eced23e7350992863c4c87d7e9e58c4a36ef4b)
#2 0x761962cd70ca (/lib/x86_64-linux-gnu/libboost_thread.so.1.74.0+0x150ca) (BuildId: 3ee87007d3e209494407f6bf5137a7c297f69f51)
#3 0x761961e94ac2 in start_thread nptl/./nptl/pthread_create.c:442:8
#4 0x761961f2684f misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x89d332) (BuildId: 14eced23e7350992863c4c87d7e9e58c4a36ef4b) in loop_closure_assistant::LoopClosureAssistant::publishGraph()
Thread T22 created by T0 here:
#0 0x5807bd1999ac in __interceptor_pthread_create (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/slam_toolbox/async_slam_toolbox_node+0x12d9ac) (BuildId: ecb9a0b428ce82221c2cab0c1311438a98dff306)
#1 0x761962cd31f3 in boost::thread::start_thread_noexcept() (/lib/x86_64-linux-gnu/libboost_thread.so.1.74.0+0x111f3) (BuildId: 3ee87007d3e209494407f6bf5137a7c297f69f51)
==603189==ABORTING
In another insert way (as following), it would not bring any asan-report:
/*****************************************************************************/
void SlamToolbox::publishVisualizations()
/*****************************************************************************/
{
//goes insert // not effect
std::this_thread::sleep_for(std::chrono::seconds(5));
//insert end
nav_msgs::msg::OccupancyGrid & og = map_.map;
og.info.resolution = resolution_;
og.info.origin.position.x = 0.0;
og.info.origin.position.y = 0.0;
og.info.origin.position.z = 0.0;
og.info.origin.orientation.x = 0.0;
og.info.origin.orientation.y = 0.0;
og.info.origin.orientation.z = 0.0;
og.info.origin.orientation.w = 1.0;
og.header.frame_id = map_frame_;
double map_update_interval = 10;
map_update_interval = this->declare_parameter("map_update_interval",
map_update_interval);
rclcpp::Rate r(1.0 / map_update_interval);
while (rclcpp::ok()) {
updateMap();
if (!isPaused(VISUALIZING_GRAPH)) {
boost::mutex::scoped_lock lock(smapper_mutex_);
closure_assistant_->publishGraph();
}
r.sleep();
}
}
So I came to a preliminary conclusion that join()
itself can ensure thread exit, but due to the presence of while(rclcpp::ok())
, certain conditions are disrupted, leading to exceptions.
However, since I'm not entirely sure how it's being disrupted by rclcpp::ok()
, the main purpose of this issue is to share my findings. If possible, I also welcome discussion and further solutions to this problem~
We could set an atomic bool to signal to the thread that it should exit before join is called so its while (rclcpp::ok() && !stop_visualization /*atomic*/)
. Rather than debugging why in detail if its unclear, we can easily work around it I think
while(rclcpp::ok()&&!stop_visuliazation /*atomic*/)
shall I remove rclcpp::ok()
? I think it might be redundant if we use the atomic flag here.
Yeah agreed since this is a lifecycle now
Required Info:
Steps to reproduce issue
I use slam-toolbox (async) by following command :
Running Slam-Toolbox within AddressSanitizer , I always faced to such UAF report during shutdown-period
Expected behavior
No UAF occurs
Actual behavior
most of UAF report is like following:
Additional information
As
void SlamToolbox::publishVisualizations()
is working in a thread:https://github.com/SteveMacenski/slam_toolbox/blob/94cec982a7f850818187c81295d1212f145efe37/src/slam_toolbox_common.cpp#L78-L82
So, the UAF should be caused by: Thread may shutdown later than the destructor of
SlamToolbox
.