ros-navigation / navigation2

ROS 2 Navigation Framework and System
https://nav2.org/
Other
2.51k stars 1.27k forks source link

[Smac] Crash using footprint instead of radius :bug: #2808

Closed matijazigic closed 2 years ago

matijazigic commented 2 years ago

Bug report

Required Info:

Steps to reproduce issue


1. Use smack planner and default turtlebot3 world and map
2. Use footprint instead of radius
3. Send goal

Expected behavior

Actual behavior

Additional information

Crash report: [component_container_isolated-6] [WARN] [1644321675.406004450] [planner_server]: GridBased plugin failed to plan calculation to (0.43, 1.77): "vector::reserve" [component_container_isolated-6] [WARN] [1644321675.406055584] [planner_server]: [compute_path_to_pose] [ActionServer] Aborting handle.

#robot_radius: 0.22
footprint: "[[0.2, 0.1], [0.2, -0.1], [-0.2, -0.1], [-0.2, 0.1]]"

planner_server:
  ros__parameters:
    planner_plugins: ["GridBased"]
    use_sim_time: True
    GridBased:
      plugin: "nav2_smac_planner/SmacPlannerLattice"
      tolerance: 0.5                      # tolerance for planning if unable to reach exact pose, in meters, for 2D node
      downsample_costmap: false           # whether or not to downsample the map
      downsampling_factor: 1              # multiplier for the resolution of the costmap layer (e.g. 2 on a 5cm costmap would be 10cm)
      allow_unknown: false                # allow traveling in unknown space
      max_iterations: 1000000             # maximum total iterations to search for before failing (in case unreachable), set to -1 to disable
      max_on_approach_iterations: 1000    # maximum number of iterations to attempt to reach goal once in tolerance, 2D only
      max_planning_time: 3.5              # max time in s for planner to plan, smooth, and upsample. Will scale maximum smoothing and upsampling times based on remaining time after planning.
      motion_model_for_search: "DUBINS"    # 2D Moore, Von Neumann; Hybrid Dubin, Redds-Shepp; State Lattice set internally
      cost_travel_multiplier: 2.0         # For 2D: Cost multiplier to apply to search to steer away from high cost areas. Larger values will place in the center of aisles more exactly (if non-`FREE` cost potential field exists) but take slightly longer to compute. To optimize for speed, a value of 1.0 is reasonable. A reasonable tradeoff value is 2.0. A value of 0.0 effective disables steering away from obstacles and acts like a naive binary search A*.
      angle_quantization_bins: 64         # For Hybrid nodes: Number of angle bins for search, must be 1 for 2D node (no angle search)
      analytic_expansion_ratio: 3.5 #3.5       # For Hybrid/Lattice nodes: The ratio to attempt analytic expansions during search for final approach.
      analytic_expansion_max_length: 3.0    # For Hybrid/Lattice nodes: The maximum length of the analytic expansion to be considered valid to prevent unsafe shortcutting (in meters). This should be scaled with minimum turning radius and be no less than 4-5x the minimum radius
      minimum_turning_radius: 0.40        # For Hybrid/Lattice nodes: minimum turning radius in m of path / vehicle
      reverse_penalty: 2.1                # For Reeds-Shepp model: penalty to apply if motion is reversing, must be => 1
      change_penalty: 0.0                 # For Hybrid nodes: penalty to apply if motion is changing directions, must be >= 0
      non_straight_penalty: 1.20          # For Hybrid nodes: penalty to apply if motion is non-straight, must be => 1
      cost_penalty: 2.0                   # For Hybrid nodes: penalty to apply to higher cost areas when adding into the obstacle map dynamic programming distance expansion heuristic. This drives the robot more towards the center of passages. A value between 1.3 - 3.5 is reasonable.
      retrospective_penalty: 0.025        # For Hybrid/Lattice nodes: penalty to prefer later maneuvers before earlier along the path. Saves search time since earlier nodes are not expanded until it is necessary. Must be >= 0.0 and <= 1.0
      rotation_penalty: 5.0               # For Lattice node: Penalty to apply only to pure rotate in place commands when using minimum control sets containing rotate in place primitives. This should always be set sufficiently high to weight against this action unless strictly necessary for obstacle avoidance or there may be frequent discontinuities in the plan where it requests the robot to rotate in place to short-cut an otherwise smooth path for marginal path distance savings.
      lattice_filepath: "/home/matija/Desktop/nav2_ws/src/navigation2/nav2_smac_planner/lattice_primitives/sample_primitives/5cm_resolution/0.5m_turning_radius/ackermann/output.json" # The filepath to the state lattice graph
      lookup_table_size: 20.0              # For Hybrid nodes: Size of the dubin/reeds-sheep distance window to cache, in meters.
      cache_obstacle_heuristic: True      # For Hybrid nodes: Cache the obstacle map dynamic programming distance expansion heuristic between subsiquent replannings of the same goal location. Dramatically speeds up replanning performance (40x) if costmap is largely static.  
      allow_reverse_expansion: True #False      # For Lattice nodes: Whether to expand state lattice graph in forward primitives or reverse as well, will double the branching factor at each step.   
      smooth_path: True                   # For Lattice/Hybrid nodes: Whether or not to smooth the path, always true for 2D nodes.
      smoother:
        max_iterations: 1000
        w_smooth: 0.3
        w_data: 0.2
        tolerance: 0.0000000001
        do_refinement: true               # Whether to recursively run the smoother 3 times on the results from prior runs to refine the results further

Feature request

Feature description

Implementation considerations

matijazigic commented 2 years ago

Stacktrace:

Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0208 17:27:37.680630 383299 main.cpp:29] *** Aborted at 1644337657 (unix time) try "date -d @1644337657" if you are using GNU date ***
E0208 17:27:37.683888 383299 main.cpp:29] PC: @                0x0 (unknown)
E0208 17:27:37.684031 383299 main.cpp:29] *** SIGSEGV (@0x8) received by PID 383029 (TID 0x7f3390b0b700) from PID 8; stack trace: ***
E0208 17:27:37.685885 383299 main.cpp:29]     @     0x7f33c82693c0 (unknown)
E0208 17:27:37.689031 383299 main.cpp:29]     @     0x7f33c7e03975 nav2_costmap_2d::FootprintCollisionChecker<>::footprintCost()
E0208 17:27:37.689402 383299 main.cpp:29]     @     0x7f33c00f0e5f nav2_smac_planner::GridCollisionChecker::inCollision()
E0208 17:27:37.689700 383299 main.cpp:29]     @     0x7f33c00fae9a nav2_smac_planner::NodeLattice::isNodeValid()
E0208 17:27:37.689934 383299 main.cpp:29]     @     0x7f33c00fc0fc nav2_smac_planner::NodeLattice::getNeighbors()
E0208 17:27:37.690172 383299 main.cpp:29]     @     0x7f33c00e63f6 nav2_smac_planner::AStarAlgorithm<>::createPath()
E0208 17:27:37.690488 383299 main.cpp:29]     @     0x7f33c00ceee2 nav2_smac_planner::SmacPlannerLattice::createPlan()
E0208 17:27:37.692167 383299 main.cpp:29]     @     0x7f33c862169f nav2_planner::PlannerServer::getPlan()
E0208 17:27:37.693833 383299 main.cpp:29]     @     0x7f33c862710e nav2_planner::PlannerServer::computePlan()
E0208 17:27:37.695358 383299 main.cpp:29]     @     0x7f33c864bdeb nav2_util::SimpleActionServer<>::work()
SteveMacenski commented 2 years ago

Can you compile with -g so we can see exact lines / memory locations? That would narrow it down. I haven’t seen this before but looks like it should be a quick fix if I know exactly what variable to analyze

also smac, not smack 😉 changed the ticket name for later searchability

matijazigic commented 2 years ago

I tried compiling with the -g flag, and everything that is suggested here, with no luck. I added -g flag inside nav2_planner and nav2_smac_planner. Am I doing something wrong?

Only thing that I can get is:

[component_container_isolated-6] [WARN] [1644321675.406004450] [planner_server]: GridBased plugin failed to plan calculation to (0.43, 1.77): "vector::reserve"

So I removed try/catch block from planner_server and implemented glog crash dump and got this. Here you can find an exact line where SIGSEGV happened.

SteveMacenski commented 2 years ago

Wait, is this a crash or just an exception is thrown and caught? That log is just an exception thrown and caught

SteveMacenski commented 2 years ago

A couple of follow up questions then.

Also, I see that you're missing spaces between the parent footprint polygon array, if you add that, does it work? I think maybe there's an error you missed on ~setup~ initialization that is indicating a problem with your footprint potentially that this is just the run-time impact of

matijazigic commented 2 years ago

It is an exception caught. It would not plan because of that.

matijazigic commented 2 years ago

Hybrid also working. I used parameters from example.

SteveMacenski commented 2 years ago

Hybrid with the same footprint as you used in Lattice?

SteveMacenski commented 2 years ago

From that, I think this is the issue

https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/node_lattice.cpp#L214

Does changing

const double & angle = motion_table.getAngleFromBin(this->pose.theta) / bin_size;

to

const double & angle = floor(motion_table.getAngleFromBin(this->pose.theta) / bin_size);

fix it? If so, please submit a PR with that and I'll merge. I think what's happening is that it rounds up when its in the N-1'th bin size to N which doesn't exist (that would be 0 turning around the circle)

Unless that is that your footprint isn't being read in correctly and then from https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/collision_checker.cpp#L126 its empty so the checker fails because the size is 0

matijazigic commented 2 years ago

Hello @SteveMacenski, Maybe the problem lies here: https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/collision_checker.cpp#L117 orientedfootprints is a vector, and angle_bin is float?

I think everything is okay with the footprint because I can see it in the rviz (/global_costmap/published_footprint).

SteveMacenski commented 2 years ago

Hybrid-A* does the same thing, but computes the float angle differently, that's why I'm suspect of the line I linked to avoid that would floor the angle used in the collision checker rather than allowing it to round up when it is cast into an int for that line you linked to.

The error vector::reserve and that your trace shows footprintCost makes me suspicious that that's not actually the issue, its actually in your footprint itself, but its a good first thing to try.

Edit: though 2 lines down with current_footprint.reserve(oriented_footprint.size());... that could be causing it if its an invalid reference! So perhaps that floor() will fix it. That would be more clear with line numbers, but I'm pretty sure that's it.

matijazigic commented 2 years ago

I tried changing node_lattice.cpp, as you suggested, but the same problem occurs.

SteveMacenski commented 2 years ago

OK, I'll actually need a traceback with line numbers and memory locations then. Or you can add some prints and see if anything looks suspicious or determine what line it crashes at. That may be faster given we know roughly where its happening.

matijazigic commented 2 years ago

It fails when the bin has a negative value, and also theta value is negative, I put one std::cout inside GridCollisionChecker::inCollision to print the angle_bin value. I'll invest more time later, to debug it further and try to fix it.

[planner_server-11] angle_bin: -39.381 [planner_server-11] [ERROR] [1644427350.836848361] [planner_server]: Action server failed while executing action callback: "vector::reserve"

Should we transform such angles to values between 0 and 2pi?

SteveMacenski commented 2 years ago

That is odd. The angle should already be normalized from the control set generated. I don't see any reason that I can identify why that could be negative.

https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/node_lattice.cpp#L504

angle is set to the motion projection which is fresh off of the control set generator, so unless you see a negative value in your json file for the control set primitives, its not that.

If we reflect it for a backwards maneuver, we also normalize it again to be safe

https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/node_lattice.cpp#L507-L513

Then those values are set for use and calls the isNodeValid function where this all goes down. Plus, if it were negative, I don't think motion_table.getAngleFromBin(this->pose.theta) would return a valid solution.

https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/node_lattice.cpp#L516-L525

matijazigic commented 2 years ago

angle is set to the motion projection which is fresh off of the control set generator, so unless you see a negative value in your json file for the control set primitives, its not that.

But I'm using default json, and the backward flag is false, I need to dive deeper in code, I'm still not familiar with it.

https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/lattice_primitives/sample_primitives/5cm_resolution/0.5m%20_turning_radius/ackermann/output.json

I just added a dummy fix:

        if(prim_pose._theta < 0.0) {
          // fix
          prim_pose._theta = 2 * M_PI + prim_pose._theta;
        }

        if (collision_checker->inCollision(
            prim_pose._x,
            prim_pose._y,
            prim_pose._theta / bin_size /*bin in collision checker*/,
            traverse_unknown))
        {
          return false;
        }

And now it's working. I'll try to find the root cause tomorrow.

Also, I figure out that there is a problem with generated motion primitives: https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/lattice_primitives/sample_primitives/5cm_resolution/0.5m%20_turning_radius/ackermann/output.json#L2096

Adjacent poses have a large diff in the theta. From 4.185 to -2.034, is that something that is supported by the code?

SteveMacenski commented 2 years ago

If you do not specify a file, this is the default https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/test/default_model.json. Your configuration file shows that you're overriding the default.

Either way, I scanned them both and they both look good to me. Might be worth adding some prints along that trail I laid and see where the first negative value appears. They should always be between 0-15 for a 16 discretization control set.

SteveMacenski commented 2 years ago

@matijazigic how are things going?

mattbooker commented 2 years ago

Also, I figure out that there is a problem with generated motion primitives: https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/lattice_primitives/sample_primitives/5cm_resolution/0.5m%20_turning_radius/ackermann/output.json#L2096

Adjacent poses have a large diff in the theta. From 4.185 to -2.034, is that something that is supported by the code?

Took a look at this and fixed the bug causing it. It doesn't affect any of the downstream planning stuff as the values were representing the correct direction, they just were off by 2 * pi.

In any case, it may affect other peoples code so I have fixed it by normalizing all the angles in every trajectory. I've also added some test cases for the output to make sure there aren't other small issues like this.

SteveMacenski commented 2 years ago

I think it could cause an issue within the planner since we assume all of the angles are normalized from 0-2PI so that could be causing this issue as well (albeit, its also just as likely that we have a floating point round up issue).

SteveMacenski commented 2 years ago

@matijazigic how are things going? Was just normalizing prim_pose._theta the only problem? If so, I think @mattbooker's updated PR should fix the generator with that respect. Shortly, we should have some files generated using the new change that I would love if you could try out and give the :+1: that it fixes the problem so you don't need to maintain that patched fork.

matijazigic commented 2 years ago

@SteveMacenski I'll try it. Hopefully today, I was really busy in the last couple of days. Thanks.

mattbooker commented 2 years ago

@matijazigic I tested out my changes and it still wasn't working. I've found the problem and I need to update some stuff. So for now no need to try anything. Will update you again when its fixed.

SteveMacenski commented 2 years ago

The PR should fix it, if there are continued issues let me know and we can reopen