ros-navigation / navigation2

ROS 2 Navigation Framework and System
https://nav2.org/
Other
2.45k stars 1.24k forks source link

some special msg onto `/initialpose` could lead to heap-buffer-overflow bug #4307

Closed GoesM closed 2 months ago

GoesM commented 3 months ago

Bug report

Required Info:

Steps to reproduce issue

Launch the navigation2 normally, as following steps:

#!/bin/bash
export ASAN_OPTIONS=halt_on_error=0:new_delete_type_mismatch=0:detect_leaks=0:log_pah=asan
source install/setup.bash
export TURTLEBOT3_MODEL=waffle
export GAZEBO_MODEL_PATH=$GAZEBO_MODEL_PATH:/opt/ros/humble/share/turtlebot3_gazebo/models
ros2 launch nav2_bringup tb3_simulation_launch.py headless:=True use_rviz:=False use_composition:=False 

Curious about how nav2 face to topic-interception, i sent validate /PoseWithCovarianceStamped msg onto topic /initialpose, which is like this:

ros2 topic pub /initialpose geometry_msgs/PoseWithCovarianceStamped " 
header:
  frame_id: map
  stamp:
    nanosec: 834647291
    sec: 1707737088
pose:
  covariance:
    - 0.25
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.25
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.06853891909122467
  pose:
    orientation:
      w: 0.0
      x: 0.0
      y: 0.0
      z: 1.0
    position:
      x: -751613824.000000
      y: 0.37812575697898865
      z: 0.0" -1

[notice] the value of x of the position is -751613824.000000, which leads to the heap-buffer-overflow.

Expected behavior

Actual behavior

The ASAN reporting a stack-buffer-overflow bug to me as following, and the nav2_amcl stop its work.

=================================================================
==424016==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x76a4a2e36758 at pc 0x76a4abfe3c61 bp 0x7ffcfd208690 sp 0x7ffcfd208688
READ of size 8 at 0x76a4a2e36758 thread T0
    #0 0x76a4abfe3c60 in pf_cluster_stats (/home/***/nav2_humble/install/nav2_amcl/lib/libpf_lib.so+0x7c60) (BuildId: 92f6f6da07c69e4e8fdb674a2fb2b39a58de9a00)
    #1 0x76a4abfe21f5 in pf_init (/home/***/nav2_humble/install/nav2_amcl/lib/libpf_lib.so+0x61f5) (BuildId: 92f6f6da07c69e4e8fdb674a2fb2b39a58de9a00)
    #2 0x76a4ac546df2 in nav2_amcl::AmclNode::handleInitialPose(geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >&) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x346df2) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #3 0x76a4ac5446c6 in nav2_amcl::AmclNode::initialPoseReceived(std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x3446c6) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #4 0x76a4ac726d57 in void std::__invoke_impl<void, void (nav2_amcl::AmclNode::*&)(std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >), nav2_amcl::AmclNode*&, std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > > >(std::__invoke_memfun_deref, void (nav2_amcl::AmclNode::*&)(std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >), nav2_amcl::AmclNode*&, std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >&&) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x526d57) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #5 0x76a4ac7561cb in auto rclcpp::AnySubscriptionCallback<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, std::allocator<void> >::dispatch(std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >, rclcpp::MessageInfo const&)::'lambda'(auto&&)::operator()<std::function<void (std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >)>&>(auto&&) const (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x5561cb) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #6 0x76a4ac753061 in rclcpp::AnySubscriptionCallback<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, std::allocator<void> >::dispatch(std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >, rclcpp::MessageInfo const&) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x553061) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #7 0x76a4ac72fc34 in rclcpp::Subscription<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, std::allocator<void>, geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, rclcpp::message_memory_strategy::MessageMemoryStrategy<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, std::allocator<void> > >::handle_message(std::shared_ptr<void>&, rclcpp::MessageInfo const&) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x52fc34) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #8 0x76a4ad6257bb in rclcpp::Executor::execute_subscription(std::shared_ptr<rclcpp::SubscriptionBase>) (/opt/ros/humble/lib/librclcpp.so+0xe77bb) (BuildId: 4cca8a387f3c93d38a0567a8efc7cba9106f5d9a)
    #9 0x76a4ad625fbe in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&) (/opt/ros/humble/lib/librclcpp.so+0xe7fbe) (BuildId: 4cca8a387f3c93d38a0567a8efc7cba9106f5d9a)
    #10 0x76a4ad62d8af in rclcpp::executors::SingleThreadedExecutor::spin() (/opt/ros/humble/lib/librclcpp.so+0xef8af) (BuildId: 4cca8a387f3c93d38a0567a8efc7cba9106f5d9a)
    #11 0x76a4ad62dac4 in rclcpp::spin(std::shared_ptr<rclcpp::node_interfaces::NodeBaseInterface>) (/opt/ros/humble/lib/librclcpp.so+0xefac4) (BuildId: 4cca8a387f3c93d38a0567a8efc7cba9106f5d9a)
    #12 0x64f0866154cd in main (/home/***/nav2_humble/install/nav2_amcl/lib/nav2_amcl/amcl+0xe84cd) (BuildId: 6374af6c6a02284720c5116aa2ef067ebdd75367)
    #13 0x76a4ab829d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #14 0x76a4ab829e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #15 0x64f0865545e4 in _start (/home/***/nav2_humble/install/nav2_amcl/lib/nav2_amcl/amcl+0x275e4) (BuildId: 6374af6c6a02284720c5116aa2ef067ebdd75367)

0x76a4a2e36758 is located 168 bytes to the left of 352000-byte region [0x76a4a2e36800,0x76a4a2e8c700)
allocated by thread T0 here:
    #0 0x64f0865d7618 in __interceptor_calloc (/home/***/nav2_humble/install/nav2_amcl/lib/nav2_amcl/amcl+0xaa618) (BuildId: 6374af6c6a02284720c5116aa2ef067ebdd75367)
    #1 0x76a4abfe1759 in pf_alloc (/home/***/nav2_humble/install/nav2_amcl/lib/libpf_lib.so+0x5759) (BuildId: 92f6f6da07c69e4e8fdb674a2fb2b39a58de9a00)
    #2 0x76a4ac532d69 in nav2_amcl::AmclNode::on_configure(rclcpp_lifecycle::State const&) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x332d69) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #3 0x76a4ad523b8c  (/opt/ros/humble/lib/librclcpp_lifecycle.so+0x28b8c) (BuildId: e9b8e454bf87aaab775667b79aefcab12c018de9)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/***/nav2_humble/install/nav2_amcl/lib/libpf_lib.so+0x7c60) (BuildId: 92f6f6da07c69e4e8fdb674a2fb2b39a58de9a00) in pf_cluster_stats
Shadow bytes around the buggy address:
  0x0ed5145bec90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ed5145beca0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ed5145becb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ed5145becc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ed5145becd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0ed5145bece0: fa fa fa fa fa fa fa fa fa fa fa[fa]fa fa fa fa
  0x0ed5145becf0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ed5145bed00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ed5145bed10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ed5145bed20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ed5145bed30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==424016==ABORTING

Additional information


it seems like that a high value of x could lead to such bug...

SteveMacenski commented 3 months ago

You know what to do :wink:

GoesM commented 3 months ago

Ah... I think the situation here might be a bit different and not suitable for validation through validageMsg().

In my understanding, nav2_util::validateMsg() is a generic check mainly for nan, inf, and similar requirements that all messages of the same msg-type should meet. However, the error here is caused by a large value of x in a specific calculation process of nav2_amcl.

And I believe it's more appropriate to perform the check in the callback process of /initialpose in nav2_amcl, rather than adding conditions to nav2_util::validateMsg(). There are two main reasons for this:

Overall, I think agreeing on an upper limit for x and implementing it in the callback process of nav2_amcl would be more appropriate.

However, I'm still checking the specific line of code where the overflow occurs and haven't confirmed it yet. I think here would be a better method to fix it after a further check.

SO, I'd further confirm the specific line of code causing the buffer overflow and investigate why JUST a large value (still in the range of integer) here could lead to a heap-buffer-overflow.

SteveMacenski commented 3 months ago

And I believe it's more appropriate to perform the check in the callback process of /initialpose in nav2_amcl

Ok! I trust your opinion :smile: Let me know what you think is best!

SteveMacenski commented 3 months ago

@GoesM any update?

GoesM commented 3 months ago

I'm so sorry for my late work about this issue because I'm struggled by other works. T_T

After the updating humble-branch, validateMsg() is deployed so the command to reproduce this issue needs some little change, as I'd edited it.

Additional Information:

**which code-line met to heap-buffer-overflow

I insert code for log as following:

void goes_debug(int cnt){
  fprintf(stderr, "[GOES|DEBUG]:---------- %d \n", cnt);
}
// Re-compute the cluster statistics for a sample set
void pf_cluster_stats(pf_t * pf, pf_sample_set_t * set)
{
...
...
  goes_debug(3);
  // Compute cluster stats
  for (i = 0; i < set->sample_count; i++) {
    ...
    ...
    goes_debug(4);
    cluster = set->clusters + cidx;
    goes_debug(5);
    cluster->weight += sample->weight;
    goes_debug(6);
    weight += sample->weight;
    goes_debug(7);
    ...
    ...

  }
...
}

And I catched the log during the crash:

[amcl-7] [INFO] [1717073213.332957908] [amcl]: Setting pose (5.019000): -751613824.000 0.378 3.142
[amcl-7] [GOES|DEBUG]:---------- 1 
[amcl-7] [GOES|DEBUG]:---------- 2 
[amcl-7] [GOES|DEBUG]:---------- 3 
[amcl-7] [GOES|DEBUG]:---------- 4 
[amcl-7] [GOES|DEBUG]:---------- 5 
[amcl-7] [GOES|DEBUG]:---------- 6 
[amcl-7] [GOES|DEBUG]:---------- 7 
[amcl-7] [GOES|DEBUG]:---------- 4 
[amcl-7] [GOES|DEBUG]:---------- 5 
[ERROR] [amcl-7]: process has died [pid 111426, exit code 1, cmd '/home/goes/ROS_Workstation/humble_fork/install/nav2_amcl/lib/nav2_amcl/amcl --ros-args --log-level info --ros-args -r __node:=amcl --params-file /tmp/launch_params_t5mx02k0 -r /tf:=tf -r /tf_static:=tf_static'].

It proves that the bug occurs in line:

https://github.com/ros-navigation/navigation2/blob/cf3dd553c2164c2a586f0f66628f22d43264d35f/nav2_amcl/src/pf/pf.c#L521

And it seems like cluster = set->clusters +cidx would let the cluster became an invalidate pointer.

GoesM commented 3 months ago

However, I am not an expert about the pf-calculation , and not clear whether such buffer-overflow is related to the size of map or not ?

If related to , I think we could add a check to ensure that the received pose is within the map range ?

SteveMacenski commented 3 months ago

Sure! That seems sensible

SteveMacenski commented 2 months ago

@GoesM can you submit a PR?

GoesM commented 2 months ago

oh no, I'm very very sorry for that, it's my missing.

I submit the PR #4416

SteveMacenski commented 2 months ago

No worries!