ros-navigation / navigation2

ROS 2 Navigation Framework and System
https://nav2.org/
Other
2.3k stars 1.2k forks source link

[nav2_costmap] UAF bug caused by the exit mechanism flaw in class `ClearCostmapService` #4489

Closed GoesM closed 6 days ago

GoesM commented 1 week ago

Bug report

Required Info:

Steps to reproduce issue

Bug happened in my normal usage.

Launch the navigation2 as following steps:

#!/bin/bash
export ASAN_OPTIONS=halt_on_error=0:new_delete_type_mismatch=0:detect_leaks=0:log_pah=asan
source install/setup.bash
export TURTLEBOT3_MODEL=waffle
export GAZEBO_MODEL_PATH=$GAZEBO_MODEL_PATH:/opt/ros/humble/share/turtlebot3_gazebo/models
ros2 launch nav2_bringup tb3_simulation_launch.py headless:=True use_rviz:=False use_composition:=False 

Expected behavior

no bug occured.

Actual behavior

the Asan report of this bug is as following:

  ================================================================= 
 ==1243485==ERROR: AddressSanitizer: heap-use-after-free on address 0x6170000c1690 at pc 0x7966587ea5af bp 0x79664a7fb710 sp 0x79664a7fb708 
 READ of size 8 at 0x6170000c1690 thread T14 
     #0 0x7966587ea5ae in std::vector<int, std::allocator<int> >::operator=(std::vector<int, std::allocator<int> > const&) (/home/*****/nav2_humble/install/nav2_planner/lib/libplanner_server_core.so+0x1ea5ae) (BuildId: 1b8bf26805500b9cf783c2dd56285836f05b374f) 
     #1 0x79665837748a in std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >::operator=(std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > > const&) (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/liblayers.so+0x17748a) (BuildId: ae511a7dd186fe200f2fced591c67dc98004b8d8) 
     #2 0x79665837607f in nav2_costmap_2d::InflationLayer::generateIntegerDistances() (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/liblayers.so+0x17607f) (BuildId: ae511a7dd186fe200f2fced591c67dc98004b8d8) 
     #3 0x796658374c50 in nav2_costmap_2d::InflationLayer::computeCaches() (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/liblayers.so+0x174c50) (BuildId: ae511a7dd186fe200f2fced591c67dc98004b8d8) 
     #4 0x796658374903 in nav2_costmap_2d::InflationLayer::matchSize() (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/liblayers.so+0x174903) (BuildId: ae511a7dd186fe200f2fced591c67dc98004b8d8) 
     #5 0x796658377529 in nav2_costmap_2d::InflationLayer::reset() (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/liblayers.so+0x177529) (BuildId: ae511a7dd186fe200f2fced591c67dc98004b8d8) 
     #6 0x7966581349b9 in nav2_costmap_2d::Costmap2DROS::resetLayers() (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/libnav2_costmap_2d_core.so+0xd79b9) (BuildId: ae00021bac019132f382130b42d6275d543c55a4) 
     #7 0x7966581a93e5 in nav2_costmap_2d::ClearCostmapService::clearEntireCallback(std::shared_ptr<rmw_request_id_s>, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Request_<std::allocator<void> > >, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Response_<std::allocator<void> > >) (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/libnav2_costmap_2d_core.so+0x14c3e5) (BuildId: ae00021bac019132f382130b42d6275d543c55a4) 
     #8 0x7966581b2eb1 in void std::__invoke_impl<void, void (nav2_costmap_2d::ClearCostmapService::*&)(std::shared_ptr<rmw_request_id_s>, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Request_<std::allocator<void> > >, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Response_<std::allocator<void> > >), nav2_costmap_2d::ClearCostmapService*&, std::shared_ptr<rmw_request_id_s>, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Request_<std::allocator<void> > >, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Response_<std::allocator<void> > > >(std::__invoke_memfun_deref, void (nav2_costmap_2d::ClearCostmapService::*&)(std::shared_ptr<rmw_request_id_s>, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Request_<std::allocator<void> > >, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Response_<std::allocator<void> > >), nav2_costmap_2d::ClearCostmapService*&, std::shared_ptr<rmw_request_id_s>&&, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Request_<std::allocator<void> > >&&, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Response_<std::allocator<void> > >&&) (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/libnav2_costmap_2d_core.so+0x155eb1) (BuildId: ae00021bac019132f382130b42d6275d543c55a4) 
     #9 0x7966581b62c9 in rclcpp::AnyServiceCallback<nav2_msgs::srv::ClearEntireCostmap>::dispatch(std::shared_ptr<rclcpp::Service<nav2_msgs::srv::ClearEntireCostmap> > const&, std::shared_ptr<rmw_request_id_s> const&, std::shared_ptr<nav2_msgs::srv::ClearEntireCostmap_Request_<std::allocator<void> > >) (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/libnav2_costmap_2d_core.so+0x1592c9) (BuildId: ae00021bac019132f382130b42d6275d543c55a4) 
     #10 0x7966581b44bb in rclcpp::Service<nav2_msgs::srv::ClearEntireCostmap>::handle_request(std::shared_ptr<rmw_request_id_s>, std::shared_ptr<void>) (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/libnav2_costmap_2d_core.so+0x1574bb) (BuildId: ae00021bac019132f382130b42d6275d543c55a4) 
     #11 0x796657878375  (/opt/ros/humble/lib/librclcpp.so+0xe9375) (BuildId: bd69d2a87470f6b9fdc0534ef05dc334083a0b6c) 
     #12 0x796657875d59 in rclcpp::Executor::execute_service(std::shared_ptr<rclcpp::ServiceBase>) (/opt/ros/humble/lib/librclcpp.so+0xe6d59) (BuildId: bd69d2a87470f6b9fdc0534ef05dc334083a0b6c) 
     #13 0x7966578760c5 in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&) (/opt/ros/humble/lib/librclcpp.so+0xe70c5) (BuildId: bd69d2a87470f6b9fdc0534ef05dc334083a0b6c) 
     #14 0x79665787d97f in rclcpp::executors::SingleThreadedExecutor::spin() (/opt/ros/humble/lib/librclcpp.so+0xee97f) (BuildId: bd69d2a87470f6b9fdc0534ef05dc334083a0b6c) 
     #15 0x79665800ceb4 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<nav2_util::NodeThread::NodeThread(std::shared_ptr<rclcpp::node_interfaces::NodeBaseInterface>)::$_0> > >::_M_run() node_thread.cpp 
     #16 0x7966568dc252  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc252) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2) 
     #17 0x796656494ac2 in start_thread nptl/./nptl/pthread_create.c:442:8 
     #18 0x79665652684f  misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 

 0x6170000c1690 is located 16 bytes inside of 648-byte region [0x6170000c1680,0x6170000c1908) 
 freed by thread T1 here: 
     #0 0x5c4aa18b3a2d in operator delete(void*) (/home/*****/nav2_humble/install/nav2_planner/lib/nav2_planner/planner_server+0xe5a2d) (BuildId: 191f253724b34c41ec9522f9202cc91f782cabef) 
     #1 0x796658372633 in nav2_costmap_2d::InflationLayer::~InflationLayer() (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/liblayers.so+0x172633) (BuildId: ae511a7dd186fe200f2fced591c67dc98004b8d8) 

 previously allocated by thread T0 here: 
     #0 0x5c4aa18b31cd in operator new(unsigned long) (/home/*****/nav2_humble/install/nav2_planner/lib/nav2_planner/planner_server+0xe51cd) (BuildId: 191f253724b34c41ec9522f9202cc91f782cabef) 
     #1 0x79665837a27b in std::vector<int, std::allocator<int> >* std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > >::_M_allocate_and_copy<__gnu_cxx::__normal_iterator<std::vector<int, std::allocator<int> > const*, std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > > > >(unsigned long, __gnu_cxx::__normal_iterator<std::vector<int, std::allocator<int> > const*, std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > > >, __gnu_cxx::__normal_iterator<std::vector<int, std::allocator<int> > const*, std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > > >) (/home/*****/nav2_humble/install/nav2_costmap_2d/lib/liblayers.so+0x17a27b) (BuildId: ae511a7dd186fe200f2fced591c67dc98004b8d8) 

 Thread T14 created by T0 here: 
     #0 0x5c4aa186187c in __interceptor_pthread_create (/home/*****/nav2_humble/install/nav2_planner/lib/nav2_planner/planner_server+0x9387c) (BuildId: 191f253724b34c41ec9522f9202cc91f782cabef) 
     #1 0x7966568dc328 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc328) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2) 

 Thread T1 created by T0 here: 
     #0 0x5c4aa186187c in __interceptor_pthread_create (/home/*****/nav2_humble/install/nav2_planner/lib/nav2_planner/planner_server+0x9387c) (BuildId: 191f253724b34c41ec9522f9202cc91f782cabef) 
     #1 0x7966568dc328 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc328) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2) 

 SUMMARY: AddressSanitizer: heap-use-after-free (/home/*****/nav2_humble/install/nav2_planner/lib/libplanner_server_core.so+0x1ea5ae) (BuildId: 1b8bf26805500b9cf783c2dd56285836f05b374f) in std::vector<int, std::allocator<int> >::operator=(std::vector<int, std::allocator<int> > const&) 
 Shadow bytes around the buggy address: 
   0x0c2e80010280: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 
   0x0c2e80010290: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 
   0x0c2e800102a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 
   0x0c2e800102b0: fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 
   0x0c2e800102c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 
 =>0x0c2e800102d0: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd 
   0x0c2e800102e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 
   0x0c2e800102f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 
   0x0c2e80010300: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 
   0x0c2e80010310: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 
   0x0c2e80010320: fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 
 Shadow byte legend (one shadow byte represents 8 application bytes): 
   Addressable:           00 
   Partially addressable: 01 02 03 04 05 06 07  
   Heap left redzone:       fa 
   Freed heap region:       fd 
   Stack left redzone:      f1 
   Stack mid redzone:       f2 
   Stack right redzone:     f3 
   Stack after return:      f5 
   Stack use after scope:   f8 
   Global redzone:          f9 
   Global init order:       f6 
   Poisoned by user:        f7 
   Container overflow:      fc 
   Array cookie:            ac 
   Intra object redzone:    bb 
   ASan internal:           fe 
   Left alloca redzone:     ca 
   Right alloca redzone:    cb 
 ==1243485==ABORTING 

Additional information


It's a shutdown-issue

First, based on my execution logs, I can confirm this is a shutdown issue.

It is caused by the clear_entire_service_ in the costmap not being actively closed.
  1. The callback function clearEntireCallback() that triggers the bug is executed by the service clear_entire_service_:

    https://github.com/ros-navigation/navigation2/blob/9fbae3e66ecc3b7315ca7cbb070468a564f5e111/nav2_costmap_2d/src/clear_costmap_service.cpp#L55-L59

    This service is created in the ClearCostmapService class, but this class does not seem to have a destructor, meaning there is no active release or shutdown behavior for the service.

    https://github.com/ros-navigation/navigation2/blob/9fbae3e66ecc3b7315ca7cbb070468a564f5e111/nav2_costmap_2d/include/nav2_costmap_2d/clear_costmap_service.hpp#L38-L59

    [Notice.1.]: In ROS2, the service is a smart pointer, which means that if the callback function is still executing, the smart pointer of this service will not be automatically released, and its execution thread will not automatically exit but will continue to perform tasks.

  2. This function accesses the inflation_layer_, which is a plugin belonging to costmap_ros_.

    The call stack for the access is: clearEntireCallback() -> costmap_ros_->resetLayers(). At this point, pointers to plugins are obtained from costmap_ros_'s layered_costmap_ and read:

    https://github.com/ros-navigation/navigation2/blob/9fbae3e66ecc3b7315ca7cbb070468a564f5e111/nav2_costmap_2d/src/costmap_2d_ros.cpp#L667-L686

    inflation_layer_ is one of these plugins, which means access to the address of inflation_layer_ starts here.

  3. During the on_cleanup() process of the costmap_ros_ node, inflation_layer_ is released:

    As a member of layered_costmap_, inflation_layer_ is released along with layered_costmap_.reset():

    https://github.com/ros-navigation/navigation2/blob/9fbae3e66ecc3b7315ca7cbb070468a564f5e111/nav2_costmap_2d/src/costmap_2d_ros.cpp#L364

    It is worth noting that costmap_ros_->resetLayers() accesses pointers to the vector (non-smart pointers), and this vector is also released along with layered_costmap_.reset(), causing the reference count of the smart pointer pointing to inflation_layer_ to drop to zero, thereby releasing inflation_layer_.

  4. At this point, inflation_layer_ is released, but the clearEntireCallback() thread that accesses inflation_layer_ does not stop, leading to the use-after-free bug.

suggestion:

Shall we actively close the service created by ClearCostmapService in its destructor like following ?

ClearCostmapService::~ClearCostmapService(){
// shutdown services : clear_except_service_,  clear_around_service_,  clear_entire_service_
}

or provide a lifecyle for ClearCostmapService():

void ClearCostmapService::deactivate(){
// shutdown services : clear_except_service_,  clear_around_service_,  clear_entire_service_
}
SteveMacenski commented 1 week ago

Sure, that works for me! I think a destructor makes sense for this case!