ros-navigation / navigation2

ROS 2 Navigation Framework and System
https://nav2.org/
Other
2.5k stars 1.27k forks source link

NullPtr bug during the `computeVelocityCommands()` during shutdown #4436

Closed GoesM closed 3 months ago

GoesM commented 3 months ago

Bug report

Required Info:

Steps to reproduce issue

The bug occurs during normal use of nav2_mppi_controller. By changing the controller plugin to MPPI while keeping the rest of the configuration the same as the official default, start navigation2. The YAML configuration for changing the controller plugin to MPPI is:

controller_server:
  ros__parameters:
    FollowPath:
      AckermannConstraints:
        min_turning_r: 0.2
      ConstraintCritic:
        cost_power: 1
        cost_weight: 4.0
        enabled: true
      CostCritic:
        collision_cost: 1000000.0
        consider_footprint: true
        cost_power: 1
        cost_weight: 3.81
        critical_cost: 300.0
        enabled: true
        near_goal_distance: 1.0
        trajectory_point_step: 2
      GoalAngleCritic:
        cost_power: 1
        cost_weight: 3.0
        enabled: true
        threshold_to_consider: 0.5
      GoalCritic:
        cost_power: 1
        cost_weight: 5.0
        enabled: true
        threshold_to_consider: 1.4
      ObstaclesCritic:
        collision_cost: 10000.0
        collision_margin_distance: 0.1
        consider_footprint: false
        cost_power: 1
        cost_scaling_factor: 10.0
        critical_weight: 20.0
        enabled: true
        inflation_radius: 0.55
        near_goal_distance: 0.5
        repulsion_weight: 1.5
      PathAlignCritic:
        cost_power: 1
        cost_weight: 14.0
        enabled: true
        max_path_occupancy_ratio: 0.05
        offset_from_furthest: 20
        threshold_to_consider: 0.5
        trajectory_point_step: 4
        use_path_orientations: false
      PathAngleCritic:
        cost_power: 1
        cost_weight: 2.0
        enabled: true
        max_angle_to_furthest: 1.0
        mode: 0
        offset_from_furthest: 4
        threshold_to_consider: 0.5
      PathFollowCritic:
        cost_power: 1
        cost_weight: 5.0
        enabled: true
        offset_from_furthest: 5
        threshold_to_consider: 1.4
      PreferForwardCritic:
        cost_power: 1
        cost_weight: 5.0
        enabled: true
        threshold_to_consider: 0.5
      TrajectoryVisualizer:
        time_step: 3
        trajectory_step: 5
      TwirlingCritic:
        enabled: true
        twirling_cost_power: 1
        twirling_cost_weight: 10.0
      VelocityDeadbandCritic:
        cost_power: 1
        cost_weight: 35.0
        deadband_velocities:
          - 0.05
          - 0.05
          - 0.05
        enabled: true
      batch_size: 2000
      critics:
        - ConstraintCritic
        - CostCritic
        - GoalCritic
        - GoalAngleCritic
        - PathAlignCritic
        - PathFollowCritic
        - PathAngleCritic
        - PreferForwardCritic
      gamma: 0.015
      iteration_count: 1
      model_dt: 0.05
      motion_model: DiffDrive
      plugin: nav2_mppi_controller::MPPIController
      prune_distance: 1.7
      regenerate_noises: false
      reset_period: 1.0
      temperature: 0.3
      time_steps: 56
      transform_tolerance: 0.1
      visualize: false
      vx_max: 0.5
      vx_min: -0.35
      vx_std: 0.2
      vy_max: 0.5
      vy_std: 0.2
      wz_max: 1.9
      wz_std: 0.4
    controller_frequency: 20.0
    controller_plugins:
      - FollowPath
    failure_tolerance: 0.3
    general_goal_checker:
      plugin: nav2_controller::SimpleGoalChecker
      stateful: true
      xy_goal_tolerance: 0.25
      yaw_goal_tolerance: 0.25
    goal_checker_plugins:
      - general_goal_checker
    min_theta_velocity_threshold: 0.001
    min_x_velocity_threshold: 0.001
    min_y_velocity_threshold: 0.5
    progress_checker:
      movement_time_allowance: 10.0
      plugin: nav2_controller::SimpleProgressChecker
      required_movement_radius: 0.5
    progress_checker_plugin: progress_checker
    use_sim_time: true

Expected behavior

no NullPtr bug occured.

Actual behavior

the Asan report of this NullPtr bug is as following:

 ================================================================= 
 ==605052==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000060 (pc 0x741423520880 bp 0x61f0000ea4d8 sp 0x7414125f57b8 T23) 
 ==605052==The signal is caused by a READ memory access. 
 ==605052==Hint: address points to the zero page. 
     #0 0x741423520880 in nav2_costmap_2d::Costmap2D::worldToMap(double, double, unsigned int&, unsigned int&) const (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_costmap_2d/lib/libnav2_costmap_2d_core.so+0xc3880) (BuildId: ae00021bac019132f382130b42d6275d543c55a4) 
     #1 0x741414cd1704 in mppi::PathHandler::getGlobalPlanConsideringBoundsInCostmapFrame(geometry_msgs::msg::PoseStamped_<std::allocator<void> > const&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_mppi_controller/lib/libmppi_controller.so+0xec704) (BuildId: 6464b5c753e86bc8fa48c1a26a3f20eee21259fb) 
     #2 0x741414cd1d0e in mppi::PathHandler::transformPath(geometry_msgs::msg::PoseStamped_<std::allocator<void> > const&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_mppi_controller/lib/libmppi_controller.so+0xecd0e) (BuildId: 6464b5c753e86bc8fa48c1a26a3f20eee21259fb) 
     #3 0x741414c428c9 in nav2_mppi_controller::MPPIController::computeVelocityCommands(geometry_msgs::msg::PoseStamped_<std::allocator<void> > const&, geometry_msgs::msg::Twist_<std::allocator<void> > const&, nav2_core::GoalChecker*) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_mppi_controller/lib/libmppi_controller.so+0x5d8c9) (BuildId: 6464b5c753e86bc8fa48c1a26a3f20eee21259fb) 
     #4 0x741421ca506b in nav2_controller::ControllerServer::computeAndPublishVelocity() (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x2a506b) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #5 0x741421c96561 in nav2_controller::ControllerServer::computeControl() (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x296561) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #6 0x741421e1eec3 in nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::work() (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41eec3) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #7 0x741421e1e234 in std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>::operator()() const (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41e234) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #8 0x741421e1df47 in std::enable_if<is_invocable_r_v<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>&>, std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> >::type std::__invoke_r<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>&>(std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41df47) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #9 0x741421e1dd88 in std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void> >::_M_invoke(std::_Any_data const&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41dd88) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #10 0x7414232f5b46 in std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_util/lib/libnav2_util_core.so+0x7fb46) (BuildId: 074014a12aa30b6de43159e7eb335b66f35e5343) 
     #11 0x741421099ee7 in __pthread_once_slow nptl/./nptl/pthread_once.c:116:7 
     #12 0x741421e1b951 in std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>::_M_run() (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41b951) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #13 0x7414214dc252  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc252) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2) 
     #14 0x741421094ac2 in start_thread nptl/./nptl/pthread_create.c:442:8 
     #15 0x74142112684f  misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 

 AddressSanitizer can not provide additional info. 
 SUMMARY: AddressSanitizer: SEGV (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_costmap_2d/lib/libnav2_costmap_2d_core.so+0xc3880) (BuildId: ae00021bac019132f382130b42d6275d543c55a4) in nav2_costmap_2d::Costmap2D::worldToMap(double, double, unsigned int&, unsigned int&) const 
 Thread T23 created by T15 here: 
     #0 0x5a47bc68d87c in __interceptor_pthread_create (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/nav2_controller/controller_server+0x9387c) (BuildId: 3220191749087a73a7eee69dd741a727d5302202) 
     #1 0x7414214dc328 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc328) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2) 
     #2 0x741421e1b2b2 in std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>::_Async_state_impl<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()>(nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()&&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41b2b2) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #3 0x741421e1aa58 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>, std::allocator<void>, nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()>(std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()> >, void>*&, std::_Sp_alloc_shared_tag<std::allocator<void> >, nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()&&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x41aa58) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #4 0x741421e18d03 in std::future<std::__invoke_result<std::decay<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()>::type>::type> std::async<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()>(std::launch, nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >)::'lambda'()&&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x418d03) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #5 0x741421e01e29 in nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::handle_accepted(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x401e29) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #6 0x741421e22ac7 in void std::__invoke_impl<void, void (nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::*&)(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >), nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>*&, std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> > >(std::__invoke_memfun_deref, void (nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::*&)(std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >), nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>*&, std::shared_ptr<rclcpp_action::ServerGoalHandle<nav2_msgs::action::FollowPath> >&&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x422ac7) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #7 0x741421e09986 in rclcpp_action::Server<nav2_msgs::action::FollowPath>::call_goal_accepted_callback(std::shared_ptr<rcl_action_goal_handle_s>, std::array<unsigned char, 16ul>, std::shared_ptr<void>) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x409986) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #8 0x741422fe8246 in rclcpp_action::ServerBase::execute_goal_request_received(std::shared_ptr<void>&) (/opt/ros/humble/lib/librclcpp_action.so+0x13246) (BuildId: 4dfcc4cee7010878193255b3a622d5194654caa8) 

 Thread T15 created by T0 here: 
     #0 0x5a47bc68d87c in __interceptor_pthread_create (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/nav2_controller/controller_server+0x9387c) (BuildId: 3220191749087a73a7eee69dd741a727d5302202) 
     #1 0x7414214dc328 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc328) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2) 
     #2 0x741421dfb2a5 in nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>::SimpleActionServer<std::shared_ptr<nav2_util::LifecycleNode> >(std::shared_ptr<nav2_util::LifecycleNode>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void ()>, std::function<void ()>, std::chrono::duration<long, std::ratio<1l, 1000l> >, bool, rcl_action_server_options_s const&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x3fb2a5) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #3 0x741421c94cc8 in std::__detail::_MakeUniq<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath> >::__single_object std::make_unique<nav2_util::SimpleActionServer<nav2_msgs::action::FollowPath>, std::shared_ptr<nav2_util::LifecycleNode>, char const (&) [12], std::_Bind<void (nav2_controller::ControllerServer::* (nav2_controller::ControllerServer*))()>, std::nullptr_t, std::chrono::duration<long, std::ratio<1l, 1000l> >, bool>(std::shared_ptr<nav2_util::LifecycleNode>&&, char const (&) [12], std::_Bind<void (nav2_controller::ControllerServer::* (nav2_controller::ControllerServer*))()>&&, std::nullptr_t&&, std::chrono::duration<long, std::ratio<1l, 1000l> >&&, bool&&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x294cc8) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #4 0x741421c8a121 in nav2_controller::ControllerServer::on_configure(rclcpp_lifecycle::State const&) (/home/goes/ROS_Workstation/nav2_humble_experiment/install/nav2_controller/lib/libcontroller_server_core.so+0x28a121) (BuildId: f6be86e576eea5a4c9c0de506e472b69f7a05ca9) 
     #5 0x741422de58ec  (/opt/ros/humble/lib/librclcpp_lifecycle.so+0x288ec) (BuildId: 97f6428dc1ee45fd402b522b3b8e6b4fcfeabe76) 

 ==605052==ABORTING 

Additional information


It's a shutdown-issue

First, based on my execution logs, I can confirm this is a shutdown issue.

the casue analysis

Below is an analysis of the cause of this bug:

The actionserver binds the computeControl() function as a callback function:

https://github.com/ros-navigation/navigation2/blob/635880d3d6ad1f23eb75d793409189c653797f58/nav2_controller/src/controller_server.cpp#L228-L234

Therefore, when the action_server_ receives a task command, it will invoke the function path computeControl() -> computeAndPublishVelocity() -> mppi::PathHandler::transformPath() -> .......

NOTICE: from here, the action_server_ access the internal memory of nav2_mppi_controller , during the computeControl() execution.

So that, the bug seems to be caused because action_server_ is still executing but mppi has been freed.

GoesM commented 3 months ago

futhermore Additional information

So that, the bug seems to be caused because action_server_ is still executing but mppi has been freed.

But, I'm confused about such result because I notice there's already a action_server_->deactivate() to make sure the thread of action_server_ shutdown before cleanup.

https://github.com/ros-navigation/navigation2/blob/635880d3d6ad1f23eb75d793409189c653797f58/nav2_controller/src/controller_server.cpp#L280

So I check all related functions , and I noticed those lines:

https://github.com/ros-navigation/navigation2/blob/635880d3d6ad1f23eb75d793409189c653797f58/nav2_controller/src/controller_server.cpp#L454-L482

It's found that computeAndPublishVelocity() is called following while(rclcpp::ok())

But, due to our experiments before, we could know that rclcpp::ok() might bring some exception errors for waiting threads exit, here's the link to our discussion before:

https://github.com/SteveMacenski/slam_toolbox/issues/691

suggestion

Shall we also replace the rclcpp::ok() by a bool flag ?

padhupradheep commented 3 months ago

same happens while using RPP, see: https://github.com/ros-navigation/navigation2/issues/4438 (duplicate)

SteveMacenski commented 3 months ago

@GoesM do you see any of the following errors with deactivate? https://github.com/ros-navigation/navigation2/blob/main/nav2_util/include/nav2_util/simple_action_server.hpp#L305-L320

I assume you only see this one shutdown when the system is currently operating, not when the robot is staying still and you shutdown the system (otherwise that thread / action server wouldn't be active in the first place). Thus, I would think you'd see one of these 2 errors / warnings.

If we set the server active state to false in the deactive function, it should be accessible by the controller server here to be checking it, which runs every cycle. So, at most we should be 1 cycle away from checking that we're not active anymore and thus should exit the control loop. The action server should be waiting 500ms while trying to shut down (https://github.com/ros-navigation/navigation2/blob/main/nav2_util/include/nav2_util/simple_action_server.hpp#L315) which should be plenty for the controller's loop to end since the control rates should pretty much always be 10hz+.

So... this seems like we shouldn't be having an issue, regardless of rclcpp::ok and you should be seeing error / warning logs about some failures on deactivate() that we could use to debug further.

GoesM commented 3 months ago

Firstly, sorry for my previous misunderstanding regarding this bug.

To ensure the accuracy of my fix, I conducted several experiments to verify my hypothesis.

After these experiments, I invalidated my previous analysis, but fortunately, I now have a very definite answer.

Conclusion

This is still an issue caused by costmap_ros_ as an independent lifecycle node responding to the Ctrl+C signal before nav2_controller, resulting in a null pointer bug.

Therefore, this issue and #4438 are actually additional tickets to issue #4437, which will be resolved through a single PR.


Details of the experiments and analysis are as follows:

Experiment 1:

Adding a delay before rclcpp::ok() does not trigger the bug.

std::this_thread::sleep_for(std::chrono::seconds(2));
rclcpp::WallRate loop_rate(controller_frequency_);
while (rclcpp::ok()) {

Experiment 2:

Adding a delay after rclcpp::ok() but before computeVelocityCommands() almost always triggers the bug.

updateGlobalPath();
std::this_thread::sleep_for(std::chrono::seconds(2));
computeAndPublishVelocity();

Experiment 3:

Inserting a delay at the same position as in Experiment 2 but replacing rclcpp::ok() with a bool flag method still almost always triggers the bug.

This indicates that rclcpp::ok() is not the cause. My previous speculative analysis was incorrect.

Detailed Cause Analysis:

Inserting a delay before computeVelocityCommands() makes the deactivate process of action_server_ very slow, while costmap_ros_, as an independent ROS2-LifecycleNode, has already responded to the Ctrl+C signal;

At this time, nav2_controller is attempting to execute computeVelocityCommands() and will call different controller plugins. However, plugins like mppi_controller, regulated_pure_pursuit_controller, and the default dwb_controller all call functions of costmap_ros_->(), leading to null pointer access.

regulated_pure_pursuit_controller accesses costmap_ros_->getCostmap():

https://github.com/ros-navigation/navigation2/blob/9fbae3e66ecc3b7315ca7cbb070468a564f5e111/nav2_regulated_pure_pursuit_controller/src/regulated_pure_pursuit_controller.cpp#L164-L171

mppi_controller accesses costmap_ros_->worldToMap():

https://github.com/ros-navigation/navigation2/blob/9fbae3e66ecc3b7315ca7cbb070468a564f5e111/nav2_mppi_controller/src/path_handler.cpp#L91-L95

dwb_controller accesses costmap_ros_->getSizeInCellsX():

https://github.com/ros-navigation/navigation2/blob/9fbae3e66ecc3b7315ca7cbb070468a564f5e111/nav2_dwb_controller/dwb_core/src/dwb_local_planner.cpp#L484-L485

SteveMacenski commented 3 months ago

Merged!