ros-navigation / navigation2

ROS 2 Navigation Framework and System
https://nav2.org/
Other
2.51k stars 1.27k forks source link

[BUG] Smac Hybrid Planner, vector::reserve exception due to precision #2547

Closed zoltan-lengyel closed 3 years ago

zoltan-lengyel commented 3 years ago

Bug report

Required Info:

Issue and analysis

I have an isolated unit test environment that is meant to test the algorithm used by the Smac hybrid astar planner. A unit test in this environment is always throwing a 'vector::reserve' std::exception.

The exception is thrown at line 139 in nav2_smac_planner/collision_checker.hpp:

int angle_bin = theta / bin_size_;
geometry_msgs::msg::Point new_pt;
const nav2_costmap_2d::Footprint & oriented_footprint = oriented_footprints_[angle_bin];
nav2_costmap_2d::Footprint current_footprint;
current_footprint.reserve(oriented_footprint.size());  // exception is thrown here

The reserve fails here, because the angle_bin ends up being equal to oriented_footprints_.size(), which means that oriented_footprint.size() is memory garbage due to over indexing. This happens because the passed in theta ends up being 72 due to an unfortunate chain of float precision errors. _I am using default settings with 72 angle_quantization_bins, so the passed in theta should never reach 72 [0, 72)._

The crash can be simply prevented by ensuring the index is not out of range:

const nav2_costmap_2d::Footprint & oriented_footprint = oriented_footprints_[angle_bin % oriented_footprints_.size()];

Further analysis

The following code in nav2_smac_planner/src/a_star.cpp is meant to ensure that the orientation is in the right range:

node->motion_table.state_space->interpolate(from(), to(), i / num_intervals, s());
reals = s.reals();
angle = reals[2] / node->motion_table.bin_size;
while (angle >= node->motion_table.num_angle_quantization_float) {
  angle -= node->motion_table.num_angle_quantization_float;
}
while (angle < 0.0) {
  angle += node->motion_table.num_angle_quantization_float;
}

Unfortunately with certain values this code fails to do its job. In my test case the reals[2] ends up being a very tiny negative value: -1.2830684958942129e-08 So angle will be also a tiny negative value: -1.47028814e-07 When this tiny negative value is increased by node->motion_table.num_angle_quantization_float (72) it ends up rolling up to 72 due to the limited precision, which is not valid.

I tried switching the related parts to double precision here, but I realized that would mean using double precision inside the node coordinates as well, which would have an impact on memory usage. Hopefully you will have a better idea at handling this, but you could use the index range check as a hotfix.

SteveMacenski commented 3 years ago

I wonder if this is simply handled implicitly if we change the blocks of code everywhere from

while (angle >= node->motion_table.num_angle_quantization_float) {
  angle -= node->motion_table.num_angle_quantization_float;
}
while (angle < 0.0) {
  angle += node->motion_table.num_angle_quantization_float;
}

to

while (angle < 0.0) {
  angle += node->motion_table.num_angle_quantization_float;
}
while (angle >= node->motion_table.num_angle_quantization_float) {
  angle -= node->motion_table.num_angle_quantization_float;
}

The special edge case you have found is when the number is very small but negative. Swapping the order would drive it up to 72 then back down to 0. I don't think there's a possible equivalent edge case in the opposite direction. If you agree, a PR would be appreciated!

zoltan-lengyel commented 3 years ago

I believe you are correct, there is no equivalent edge case in the opposite direction, so switching the order is a nice way of handling the problem. I sort of confirmed this with a lot of constructed values.

As far as I can see there is only two similar code blocks in smac_planner_hybrid.cpp, which could have the same issue:

double orientation_bin = tf2::getYaw(start.pose.orientation) / _angle_bin_size;
while (orientation_bin < 0.0) {
    orientation_bin += static_cast<float>(_angle_quantizations);
}

I might need a few days before I can get to it, but I will try making a PR.

SteveMacenski commented 3 years ago

Any update?

SteveMacenski commented 3 years ago

I'm planning on doing a sync this week so if you made that change, I could get new binaries out to you by the end of the month or sooner

zoltan-lengyel commented 3 years ago

I made a PR, sorry for the delay. I am not using the binaries right now, so it is not urgent for my case.

SteveMacenski commented 3 years ago

merging imminent