SteveMacenski / slam_toolbox

Slam Toolbox for lifelong mapping and localization in potentially massive maps with ROS
GNU Lesser General Public License v2.1
1.67k stars 525 forks source link

UAF bug caused by `void SlamToolbox::publishVisualizations()` #691

Closed GoesM closed 6 months ago

GoesM commented 7 months ago

Required Info:

Steps to reproduce issue

I use slam-toolbox (async) by following command :

#!/bin/bash
export ASAN_OPTIONS=halt_on_error=0:new_delete_type_mismatch=0:detect_leaks=0:log_path=asan
source install/setup.bash
ros2 launch slam_toolbox online_async_launch.py

Running Slam-Toolbox within AddressSanitizer , I always faced to such UAF report during shutdown-period

Expected behavior

No UAF occurs

Actual behavior

most of UAF report is like following:

=================================================================
==2070917==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600006c878 at pc 0x765d5a68fe35 bp 0x765d45d4c6c0 sp 0x765d45d4c6b8
READ of size 8 at 0x60600006c878 thread T22
    #0 0x765d5a68fe34 in std::_Rb_tree<karto::Name, std::pair<karto::Name const, karto::Sensor*>, std::_Select1st<std::pair<karto::Name const, karto::Sensor*> >, std::less<karto::Name>, std::allocator<std::pair<karto::Name const, karto::Sensor*> > >::find(karto::Name const&) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x68fe34) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #1 0x765d5a68f976 in karto::SensorManager::GetSensorByName(karto::Name const&) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x68f976) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #2 0x765d5a98e4b5 in karto::OccupancyGrid::AddScan(karto::LocalizedRangeScan*, bool) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x98e4b5) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #3 0x765d5a98e1d3 in karto::OccupancyGrid::CreateFromScans(std::vector<karto::LocalizedRangeScan*, std::allocator<karto::LocalizedRangeScan*> > const&) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x98e1d3) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #4 0x765d5a98a848 in karto::OccupancyGrid::CreateFromScans(std::vector<karto::LocalizedRangeScan*, std::allocator<karto::LocalizedRangeScan*> > const&, double) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x98a848) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #5 0x765d5a96dfd8 in mapper_utils::SMapper::getOccupancyGrid(double const&) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x96dfd8) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #6 0x765d5a61f93f in slam_toolbox::SlamToolbox::updateMap() (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x61f93f) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #7 0x765d5a61d857 in slam_toolbox::SlamToolbox::publishVisualizations() (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x61d857) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #8 0x765d556170ca  (/lib/x86_64-linux-gnu/libboost_thread.so.1.74.0+0x150ca) (BuildId: 3ee87007d3e209494407f6bf5137a7c297f69f51)
    #9 0x765d54894ac2 in start_thread nptl/./nptl/pthread_create.c:442:8
    #10 0x765d5492684f  misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

0x60600006c878 is located 24 bytes inside of 56-byte region [0x60600006c860,0x60600006c898)
freed by thread T0 here:
    #0 0x5a0882b98b6d in operator delete(void*) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/slam_toolbox/async_slam_toolbox_node+0x180b6d) (BuildId: 665ae7c0cf44b4a2030f356d111c1fa0c395b3f9)
    #1 0x765d54845494 in __run_exit_handlers stdlib/./stdlib/exit.c:113:8

previously allocated by thread T20 here:
    #0 0x5a0882b9830d in operator new(unsigned long) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/slam_toolbox/async_slam_toolbox_node+0x18030d) (BuildId: 665ae7c0cf44b4a2030f356d111c1fa0c395b3f9)
    #1 0x765d55dc348b in karto::SensorManager::GetInstance() (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libkartoSlamToolbox.so+0x3c348b) (BuildId: 1d5b8c39460f3efacfdd9b09596ff4c8c42993c7)
    #2 0x765d5a622f8f in slam_toolbox::SlamToolbox::getLaser(std::shared_ptr<sensor_msgs::msg::LaserScan_<std::allocator<void> > const> const&) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x622f8f) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #3 0x765d5b2eca68 in slam_toolbox::AsynchronousSlamToolbox::laserCallback(std::shared_ptr<sensor_msgs::msg::LaserScan_<std::allocator<void> > const>) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libasync_slam_toolbox.so+0xb4a68) (BuildId: 95814265feb557adff78a6cb4f4d7f046f136e55)
    #4 0x765d5a8456c9 in void std::__invoke_impl<void, void (slam_toolbox::SlamToolbox::*&)(std::shared_ptr<sensor_msgs::msg::LaserScan_<std::allocator<void> > const>), slam_toolbox::SlamToolbox*&, std::shared_ptr<sensor_msgs::msg::LaserScan_<std::allocator<void> > const> const&>(std::__invoke_memfun_deref, void (slam_toolbox::SlamToolbox::*&)(std::shared_ptr<sensor_msgs::msg::LaserScan_<std::allocator<void> > const>), slam_toolbox::SlamToolbox*&, std::shared_ptr<sensor_msgs::msg::LaserScan_<std::allocator<void> > const> const&) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x8456c9) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #5 0x765d5a844e55 in message_filters::CallbackHelper1T<std::shared_ptr<sensor_msgs::msg::LaserScan_<std::allocator<void> > const> const&, sensor_msgs::msg::LaserScan_<std::allocator<void> > >::call(message_filters::MessageEvent<sensor_msgs::msg::LaserScan_<std::allocator<void> > const> const&, bool) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x844e55) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #6 0x765d5a7dbb2b in message_filters::Signal1<sensor_msgs::msg::LaserScan_<std::allocator<void> > >::call(message_filters::MessageEvent<sensor_msgs::msg::LaserScan_<std::allocator<void> > const> const&) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x7dbb2b) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #7 0x765d5a83ca32 in tf2_ros::MessageFilter<sensor_msgs::msg::LaserScan_<std::allocator<void> >, tf2_ros::Buffer>::transformReadyCallback(tf2_ros::TransformStampedFuture const&, unsigned long) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x83ca32) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #8 0x765d5a84123e in std::_Function_handler<void (tf2_ros::TransformStampedFuture const&), std::_Bind<void (tf2_ros::MessageFilter<sensor_msgs::msg::LaserScan_<std::allocator<void> >, tf2_ros::Buffer>::* (tf2_ros::MessageFilter<sensor_msgs::msg::LaserScan_<std::allocator<void> >, tf2_ros::Buffer>*, std::_Placeholder<1>, unsigned long))(tf2_ros::TransformStampedFuture const&, unsigned long)> >::_M_invoke(std::_Any_data const&, tf2_ros::TransformStampedFuture const&) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x84123e) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #9 0x765d5b18ee8d  (/opt/ros/humble/lib/libtf2_ros.so+0x4fe8d) (BuildId: dceacb25e05c8a82678784802b23fb16ba98d172)
    #10 0x765d482ea91f  (<unknown module>)

Thread T22 created by T0 here:
    #0 0x5a0882b469bc in __interceptor_pthread_create (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/slam_toolbox/async_slam_toolbox_node+0x12e9bc) (BuildId: 665ae7c0cf44b4a2030f356d111c1fa0c395b3f9)
    #1 0x765d556131f3 in boost::thread::start_thread_noexcept() (/lib/x86_64-linux-gnu/libboost_thread.so.1.74.0+0x111f3) (BuildId: 3ee87007d3e209494407f6bf5137a7c297f69f51)

Thread T20 created by T0 here:
    #0 0x5a0882b469bc in __interceptor_pthread_create (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/slam_toolbox/async_slam_toolbox_node+0x12e9bc) (BuildId: 665ae7c0cf44b4a2030f356d111c1fa0c395b3f9)
    #1 0x765d54cdc328 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc328) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2)
    #2 0x765d5a670ddb in std::_MakeUniq<tf2_ros::TransformListener>::__single_object std::make_unique<tf2_ros::TransformListener, tf2_ros::Buffer&>(tf2_ros::Buffer&) (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x670ddb) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #3 0x765d5a610dd5 in slam_toolbox::SlamToolbox::configure() (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x610dd5) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3)
    #4 0x5a0882b9b3c1 in main (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/slam_toolbox/async_slam_toolbox_node+0x1833c1) (BuildId: 665ae7c0cf44b4a2030f356d111c1fa0c395b3f9)
    #5 0x765d54829d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

SUMMARY: AddressSanitizer: heap-use-after-free (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x68fe34) (BuildId: c8f26049d6a58c55f3283c4cac8d28e45dea89d3) in std::_Rb_tree<karto::Name, std::pair<karto::Name const, karto::Sensor*>, std::_Select1st<std::pair<karto::Name const, karto::Sensor*> >, std::less<karto::Name>, std::allocator<std::pair<karto::Name const, karto::Sensor*> > >::find(karto::Name const&)
Shadow bytes around the buggy address:
  0x0c0c800058b0: 00 00 00 fa fa fa fa fa 00 00 00 00 00 00 00 fa
  0x0c0c800058c0: fa fa fa fa 00 00 00 00 00 00 00 fa fa fa fa fa
  0x0c0c800058d0: 00 00 00 00 00 00 00 fa fa fa fa fa 00 00 00 00
  0x0c0c800058e0: 00 00 00 fa fa fa fa fa 00 00 00 00 00 00 00 fa
  0x0c0c800058f0: fa fa fa fa fd fd fd fd fd fd fd fd fa fa fa fa
=>0x0c0c80005900: fd fd fd fd fd fd fd fd fa fa fa fa fd fd fd[fd]
  0x0c0c80005910: fd fd fd fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c0c80005920: fa fa fa fa 00 00 00 00 00 00 00 fa fa fa fa fa
  0x0c0c80005930: 00 00 00 00 00 00 00 fa fa fa fa fa fd fd fd fd
  0x0c0c80005940: fd fd fd fd fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c0c80005950: fa fa fa fa fd fd fd fd fd fd fd fd fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==2070917==ABORTING

Additional information

As void SlamToolbox::publishVisualizations() is working in a thread:

https://github.com/SteveMacenski/slam_toolbox/blob/94cec982a7f850818187c81295d1212f145efe37/src/slam_toolbox_common.cpp#L78-L82

So, the UAF should be caused by: Thread may shutdown later than the destructor of SlamToolbox.

GoesM commented 7 months ago

futhermore Additional information

HOWEVER, it's noticed that here's already a join() for waiting the thread shutdown

https://github.com/SteveMacenski/slam_toolbox/blob/94cec982a7f850818187c81295d1212f145efe37/src/slam_toolbox_common.cpp#L86-L91

So it's strange for me to understand why the function is still working after "resource of slam-toolbox freed".

To make sure Thread may shutdown later than the destructor of SlamToolbox , i did more experiments as following:

some experiments

<1>

we can insert a positive-delay as following, to prove that the thread may be still working after destructor works:

/*****************************************************************************/
void SlamToolbox::publishVisualizations()
/*****************************************************************************/
{
...
...
  while (rclcpp::ok()) {
// goes insert // ==> SEGV
 std::this_thread::sleep_for(std::chrono::seconds(5));
// insert end
    updateMap();
    if (!isPaused(VISUALIZING_GRAPH)) {
      boost::mutex::scoped_lock lock(smapper_mutex_);
      closure_assistant_->publishGraph();
    }
    r.sleep();
  }
}

Doing so, we could stably get a SEGV asan report (during shutdown period) as following , which proves that the thread may be still working after destructor works:

=================================================================
==603189==ERROR: AddressSanitizer: SEGV on unknown address 0x7619597fe288 (pc 0x76196809d332 bp 0x7619535fced0 sp 0x7619535fb560 T22)
==603189==The signal is caused by a READ memory access.
    #0 0x76196809d332 in loop_closure_assistant::LoopClosureAssistant::publishGraph() (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x89d332) (BuildId: 14eced23e7350992863c4c87d7e9e58c4a36ef4b)
    #1 0x761967df0714 in slam_toolbox::SlamToolbox::publishVisualizations() (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x5f0714) (BuildId: 14eced23e7350992863c4c87d7e9e58c4a36ef4b)
    #2 0x761962cd70ca  (/lib/x86_64-linux-gnu/libboost_thread.so.1.74.0+0x150ca) (BuildId: 3ee87007d3e209494407f6bf5137a7c297f69f51)
    #3 0x761961e94ac2 in start_thread nptl/./nptl/pthread_create.c:442:8
    #4 0x761961f2684f  misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/libtoolbox_common.so+0x89d332) (BuildId: 14eced23e7350992863c4c87d7e9e58c4a36ef4b) in loop_closure_assistant::LoopClosureAssistant::publishGraph()
Thread T22 created by T0 here:
    #0 0x5807bd1999ac in __interceptor_pthread_create (/home/****/slam_toolbox_humble_fork/install/slam_toolbox/lib/slam_toolbox/async_slam_toolbox_node+0x12d9ac) (BuildId: ecb9a0b428ce82221c2cab0c1311438a98dff306)
    #1 0x761962cd31f3 in boost::thread::start_thread_noexcept() (/lib/x86_64-linux-gnu/libboost_thread.so.1.74.0+0x111f3) (BuildId: 3ee87007d3e209494407f6bf5137a7c297f69f51)

==603189==ABORTING

<2>

In another insert way (as following), it would not bring any asan-report:

/*****************************************************************************/
void SlamToolbox::publishVisualizations()
/*****************************************************************************/
{
//goes insert // not effect
std::this_thread::sleep_for(std::chrono::seconds(5));
//insert end
  nav_msgs::msg::OccupancyGrid & og = map_.map;
  og.info.resolution = resolution_;
  og.info.origin.position.x = 0.0;
  og.info.origin.position.y = 0.0;
  og.info.origin.position.z = 0.0;
  og.info.origin.orientation.x = 0.0;
  og.info.origin.orientation.y = 0.0;
  og.info.origin.orientation.z = 0.0;
  og.info.origin.orientation.w = 1.0;
  og.header.frame_id = map_frame_;

  double map_update_interval = 10;
  map_update_interval = this->declare_parameter("map_update_interval",
      map_update_interval);
  rclcpp::Rate r(1.0 / map_update_interval);
  while (rclcpp::ok()) {
    updateMap();
    if (!isPaused(VISUALIZING_GRAPH)) {
      boost::mutex::scoped_lock lock(smapper_mutex_);
      closure_assistant_->publishGraph();
    }
    r.sleep();
  }
}

So I came to a preliminary conclusion that join() itself can ensure thread exit, but due to the presence of while(rclcpp::ok()), certain conditions are disrupted, leading to exceptions.

However, since I'm not entirely sure how it's being disrupted by rclcpp::ok(), the main purpose of this issue is to share my findings. If possible, I also welcome discussion and further solutions to this problem~

SteveMacenski commented 7 months ago

We could set an atomic bool to signal to the thread that it should exit before join is called so its while (rclcpp::ok() && !stop_visualization /*atomic*/). Rather than debugging why in detail if its unclear, we can easily work around it I think

GoesM commented 6 months ago

while(rclcpp::ok()&&!stop_visuliazation /*atomic*/)

shall I remove rclcpp::ok() ? I think it might be redundant if we use the atomic flag here.

SteveMacenski commented 6 months ago

Yeah agreed since this is a lifecycle now