ros-navigation / navigation2

ROS 2 Navigation Framework and System
https://nav2.org/
Other
2.3k stars 1.2k forks source link

NullPtr bug during the `getRobotPose()` calculation of `costmap_ros_` in `nav2_controller` #4437

Closed GoesM closed 1 week ago

GoesM commented 2 weeks ago

Bug report

Required Info:

Steps to reproduce issue

Launch the navigation2 normally, as following steps:

#!/bin/bash
export ASAN_OPTIONS=halt_on_error=0:new_delete_type_mismatch=0:detect_leaks=0:log_pah=asan
source install/setup.bash
export TURTLEBOT3_MODEL=waffle
export GAZEBO_MODEL_PATH=$GAZEBO_MODEL_PATH:/opt/ros/humble/share/turtlebot3_gazebo/models
ros2 launch nav2_bringup tb3_simulation_launch.py headless:=True use_rviz:=False use_composition:=False 

Expected behavior

no NullPtr bug occured.

Actual behavior

the Asan report of this NullPtr bug is as following:

 ================================================================= 
 ==661329==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x76800b96d0a7 bp 0x767ffa0f4f78 sp 0x767ffa0f4e80 T23) 
 ==661329==The signal is caused by a READ memory access. 
 ==661329==Hint: address points to the zero page. 
     #0 0x76800b96d0a7 in geometry_msgs::msg::PoseStamped_<std::allocator<void> >& tf2_ros::BufferInterface::transform<geometry_msgs::msg::PoseStamped_<std::allocator<void> > >(geometry_msgs::msg::PoseStamped_<std::allocator<void> > const&, geometry_msgs::msg::PoseStamped_<std::allocator<void> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) const (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav_2d_utils/lib/libtf_help.so+0x50a7) (BuildId: f41cf37be8ae9a489a0e1967e8f3d5855328550d) 
     #1 0x76800ad0010c in geometry_msgs::msg::PoseStamped_<std::allocator<void> > tf2_ros::BufferInterface::transform<geometry_msgs::msg::PoseStamped_<std::allocator<void> > >(geometry_msgs::msg::PoseStamped_<std::allocator<void> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) const (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_util/lib/libnav2_util_core.so+0x8a10c) (BuildId: 074014a12aa30b6de43159e7eb335b66f35e5343) 
     #2 0x76800acfe94d in nav2_util::transformPoseInTargetFrame(geometry_msgs::msg::PoseStamped_<std::allocator<void> > const&, geometry_msgs::msg::PoseStamped_<std::allocator<void> >&, tf2_ros::Buffer&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, double) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_util/lib/libnav2_util_core.so+0x8894d) (BuildId: 074014a12aa30b6de43159e7eb335b66f35e5343) 
     #3 0x76800acfe83f in nav2_util::getCurrentPose(geometry_msgs::msg::PoseStamped_<std::allocator<void> >&, tf2_ros::Buffer&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, double, rclcpp::Time) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_util/lib/libnav2_util_core.so+0x8883f) (BuildId: 074014a12aa30b6de43159e7eb335b66f35e5343) 
     #4 0x76800af340a8 in nav2_costmap_2d::Costmap2DROS::getRobotPose(geometry_msgs::msg::PoseStamped_<std::allocator<void> >&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_costmap_2d/lib/libnav2_costmap_2d_core.so+0xd70a8) (BuildId: ae00021bac019132f382130b42d6275d543c55a4) 
     #5 0x7680096a8b93 in nav2_controller::ControllerServer::getRobotPose(geometry_msgs::msg::PoseStamped_<std::allocator<void> >&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x2a8b93) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #6 0x7680096a4c22 in nav2_controller::ControllerServer::computeAndPublishVelocity() (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x2a4c22) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #7 0x768009696561 in nav2_controller::ControllerServer::computeControl() (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x296561) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #8 0x76800981eec3 in nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::work() (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41eec3) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #9 0x76800981e234 in std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>::operator()() const (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41e234) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #10 0x76800981df47 in std::enable_if<is_invocable_r_v<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>&>, std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> >::type std::__invoke_r<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>&>(std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41df47) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #11 0x76800981dd88 in std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void> >::_M_invoke(std::_Any_data const&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41dd88) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #12 0x76800acf5b46 in std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_util/lib/libnav2_util_core.so+0x7fb46) (BuildId: 074014a12aa30b6de43159e7eb335b66f35e5343) 
     #13 0x768008a99ee7 in __pthread_once_slow nptl/./nptl/pthread_once.c:116:7 
     #14 0x76800981b951 in std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>::_M_run() (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41b951) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #15 0x768008edc252  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc252) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2) 
     #16 0x768008a94ac2 in start_thread nptl/./nptl/pthread_create.c:442:8 
     #17 0x768008b2684f  misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 

 AddressSanitizer can not provide additional info. 
 SUMMARY: AddressSanitizer: SEGV (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav_2d_utils/lib/libtf_help.so+0x50a7) (BuildId: f41cf37be8ae9a489a0e1967e8f3d5855328550d) in geometry_msgs::msg::PoseStamped_<std::allocator<void> >& tf2_ros::BufferInterface::transform<geometry_msgs::msg::PoseStamped_<std::allocator<void> > >(geometry_msgs::msg::PoseStamped_<std::allocator<void> > const&, geometry_msgs::msg::PoseStamped_<std::allocator<void> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) const 
 Thread T23 created by T15 here: 
     #0 0x59f1c7ec487c in __interceptor_pthread_create (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/nav2_controller/controller_server+0x9387c) (BuildId: 3220191749087a73a7eee69dd741a727d5302202) 
     #1 0x768008edc328 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc328) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2) 
     #2 0x76800981b2b2 in std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>::_Async_state_impl<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()>(nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()&&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41b2b2) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #3 0x76800981aa58 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>, std::allocator<void>, nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()>(std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>*&, std::_Sp_alloc_shared_tag<std::allocator<void> >, nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()&&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41aa58) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #4 0x768009818d03 in std::future<std::__invoke_result<std::decay<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()>::type>::type> std::async<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()>(std::launch, nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()&&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x418d03) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #5 0x768009801e29 in nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x401e29) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #6 0x768009822ac7 in void std::__invoke_impl<void, void (nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::*&)(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >), nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>*&, std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> > >(std::__invoke_memfun_deref, void (nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::*&)(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >), nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>*&, std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >&&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x422ac7) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #7 0x768009809986 in rclcpp_action::Server<nav2_msgs::action::FollowPath>::call_goal_accepted_callback(std::shared_ptr<rcl_action_goal_handle_s>, std::array<unsigned char, 16ul>, std::shared_ptr<void>) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x409986) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #8 0x76800aa0c246 in rclcpp_action::ServerBase::execute_goal_request_received(std::shared_ptr<void>&) (/opt/ros/humble/lib/librclcpp_action.so+0x13246) (BuildId: 4dfcc4cee7010878193255b3a622d5194654caa8) 

 Thread T15 created by T0 here: 
     #0 0x59f1c7ec487c in __interceptor_pthread_create (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/nav2_controller/controller_server+0x9387c) (BuildId: 3220191749087a73a7eee69dd741a727d5302202) 
     #1 0x768008edc328 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc328) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2) 
     #2 0x7680097fb2a5 in nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::SimpleActionServer<std::shared_ptr<nav2_util::LifecycleNode> >(std::shared_ptr<nav2_util::LifecycleNode>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void ()>, std::function<void ()>, std::chrono::duration<long, std::ratio<1l, 1000l> >, bool, rcl_action_server_options_s const&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x3fb2a5) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #3 0x768009694cc8 in std::__detail::_MakeUniq<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath> >::__single_object std::make_unique<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>, std::shared_ptr<nav2_util::LifecycleNode>, char const (&) [12], std::_Bind<void (nav2_controller::ControllerServer::* (nav2_controller::ControllerServer*))()>, std::nullptr_t, std::chrono::duration<long, std::ratio<1l, 1000l> >, bool>(std::shared_ptr<nav2_util::LifecycleNode>&&, char const (&) [12], std::_Bind<void (nav2_controller::ControllerServer::* (nav2_controller::ControllerServer*))()>&&, std::nullptr_t&&, std::chrono::duration<long, std::ratio<1l, 1000l> >&&, bool&&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x294cc8) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #4 0x76800968a121 in nav2_controller::ControllerServer::on_configure(rclcpp_lifecycle::State const&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x28a121) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #5 0x76800a80b8ec  (/opt/ros/humble/lib/librclcpp_lifecycle.so+0x288ec) (BuildId: 97f6428dc1ee45fd402b522b3b8e6b4fcfeabe76) 

 ==661329==ABORTING 

Additional information


It's a shutdown-issue

First, based on my execution logs, I can confirm this is a shutdown issue.

the casue analysis

Below is an analysis of the cause of this bug:

The actionserver binds the computeControl() function as a callback function:

https://github.com/ros-navigation/navigation2/blob/635880d3d6ad1f23eb75d793409189c653797f58/nav2_controller/src/controller_server.cpp#L228-L234

Therefore, when the actionserver receives a task command, it will invoke the function path computeControl() -> computeAndPublishVelocity():

https://github.com/ros-navigation/navigation2/blob/635880d3d6ad1f23eb75d793409189c653797f58/nav2_controller/src/controller_server.cpp#L482

The computeAndPublishVelocity() function directly uses the function getRobotPose() of costmap_ros_ for calculations:

https://github.com/ros-navigation/navigation2/blob/635880d3d6ad1f23eb75d793409189c653797f58/nav2_controller/src/controller_server.cpp#L584-L591

This causes the action_server_ to access the internal memory of costmap_ros_ during the getRobotPose() execution. However, it should be noted that :

although nav2_costmap_ros is a plugin, it is also an independent ROS2-LifecycleNode that actively responds to exit signals.

Therefore, costmap_ros might respond to signals from rclcpp::ok() and exit earlier than thier parent-LifecycleNode, nav2_controller.

If this response sequence occurs, costmap_ros_ might have already exited and released its memory BUT nav2_controller's action_server_ is still executing calculations, leading to access of released memory and thus an error.

Additional information for the additional information:

Note: This should be a very similar (or even the same) issue occurred with following ISSUEs , in which here're our previous discussion and fix into main branch:

https://github.com/ros-navigation/navigation2/issues/3971#issue-2001528486

https://github.com/ros-navigation/navigation2/pull/3972#issuecomment-1820538411

I noticed that such fix is deployed into main but not into humble so that the similar bug occurs again.

suggestion

Shall we push such change for costmap_ros_ from main into humble?

SteveMacenski commented 2 weeks ago

Yeah! Can you submit the backport PR to Humble? Happy to review/merge!

GoesM commented 1 week ago

My pleasure ^_^

I also have a check on Iron, and Iron also have the same issue, shall I submit another backport PR to Iron ?

SteveMacenski commented 1 week ago

Merged!

SteveMacenski commented 1 week ago

shall I submit another backport PR to Iron ?

Sure!