Closed matijazigic closed 2 years ago
Stacktrace:
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0208 17:27:37.680630 383299 main.cpp:29] *** Aborted at 1644337657 (unix time) try "date -d @1644337657" if you are using GNU date ***
E0208 17:27:37.683888 383299 main.cpp:29] PC: @ 0x0 (unknown)
E0208 17:27:37.684031 383299 main.cpp:29] *** SIGSEGV (@0x8) received by PID 383029 (TID 0x7f3390b0b700) from PID 8; stack trace: ***
E0208 17:27:37.685885 383299 main.cpp:29] @ 0x7f33c82693c0 (unknown)
E0208 17:27:37.689031 383299 main.cpp:29] @ 0x7f33c7e03975 nav2_costmap_2d::FootprintCollisionChecker<>::footprintCost()
E0208 17:27:37.689402 383299 main.cpp:29] @ 0x7f33c00f0e5f nav2_smac_planner::GridCollisionChecker::inCollision()
E0208 17:27:37.689700 383299 main.cpp:29] @ 0x7f33c00fae9a nav2_smac_planner::NodeLattice::isNodeValid()
E0208 17:27:37.689934 383299 main.cpp:29] @ 0x7f33c00fc0fc nav2_smac_planner::NodeLattice::getNeighbors()
E0208 17:27:37.690172 383299 main.cpp:29] @ 0x7f33c00e63f6 nav2_smac_planner::AStarAlgorithm<>::createPath()
E0208 17:27:37.690488 383299 main.cpp:29] @ 0x7f33c00ceee2 nav2_smac_planner::SmacPlannerLattice::createPlan()
E0208 17:27:37.692167 383299 main.cpp:29] @ 0x7f33c862169f nav2_planner::PlannerServer::getPlan()
E0208 17:27:37.693833 383299 main.cpp:29] @ 0x7f33c862710e nav2_planner::PlannerServer::computePlan()
E0208 17:27:37.695358 383299 main.cpp:29] @ 0x7f33c864bdeb nav2_util::SimpleActionServer<>::work()
Can you compile with -g so we can see exact lines / memory locations? That would narrow it down. I haven’t seen this before but looks like it should be a quick fix if I know exactly what variable to analyze
also smac, not smack 😉 changed the ticket name for later searchability
I tried compiling with the -g flag, and everything that is suggested here, with no luck. I added -g flag inside nav2_planner and nav2_smac_planner. Am I doing something wrong?
Only thing that I can get is:
[component_container_isolated-6] [WARN] [1644321675.406004450] [planner_server]: GridBased plugin failed to plan calculation to (0.43, 1.77): "vector::reserve"
So I removed try/catch block from planner_server and implemented glog crash dump and got this. Here you can find an exact line where SIGSEGV happened.
Wait, is this a crash or just an exception is thrown and caught? That log is just an exception thrown and caught
A couple of follow up questions then.
Also, I see that you're missing spaces between the parent footprint polygon array, if you add that, does it work? I think maybe there's an error you missed on ~setup~ initialization that is indicating a problem with your footprint potentially that this is just the run-time impact of
It is an exception caught. It would not plan because of that.
Hybrid also working. I used parameters from example.
Hybrid with the same footprint as you used in Lattice?
From that, I think this is the issue
https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/node_lattice.cpp#L214
Does changing
const double & angle = motion_table.getAngleFromBin(this->pose.theta) / bin_size;
to
const double & angle = floor(motion_table.getAngleFromBin(this->pose.theta) / bin_size);
fix it? If so, please submit a PR with that and I'll merge. I think what's happening is that it rounds up when its in the N-1'th bin size to N which doesn't exist (that would be 0 turning around the circle)
Unless that is that your footprint isn't being read in correctly and then from https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/collision_checker.cpp#L126 its empty so the checker fails because the size is 0
Hello @SteveMacenski, Maybe the problem lies here: https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/collision_checker.cpp#L117 orientedfootprints is a vector, and angle_bin is float?
I think everything is okay with the footprint because I can see it in the rviz (/global_costmap/published_footprint).
Hybrid-A* does the same thing, but computes the float angle differently, that's why I'm suspect of the line I linked to avoid that would floor the angle used in the collision checker rather than allowing it to round up when it is cast into an int
for that line you linked to.
The error vector::reserve
and that your trace shows footprintCost
makes me suspicious that that's not actually the issue, its actually in your footprint itself, but its a good first thing to try.
Edit: though 2 lines down with current_footprint.reserve(oriented_footprint.size());
... that could be causing it if its an invalid reference! So perhaps that floor()
will fix it. That would be more clear with line numbers, but I'm pretty sure that's it.
I tried changing node_lattice.cpp, as you suggested, but the same problem occurs.
OK, I'll actually need a traceback with line numbers and memory locations then. Or you can add some prints and see if anything looks suspicious or determine what line it crashes at. That may be faster given we know roughly where its happening.
It fails when the bin has a negative value, and also theta value is negative, I put one std::cout inside GridCollisionChecker::inCollision to print the angle_bin value. I'll invest more time later, to debug it further and try to fix it.
[planner_server-11] angle_bin: -39.381 [planner_server-11] [ERROR] [1644427350.836848361] [planner_server]: Action server failed while executing action callback: "vector::reserve"
Should we transform such angles to values between 0 and 2pi?
That is odd. The angle should already be normalized from the control set generated. I don't see any reason that I can identify why that could be negative.
https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/src/node_lattice.cpp#L504
angle is set to the motion projection which is fresh off of the control set generator, so unless you see a negative value in your json file for the control set primitives, its not that.
If we reflect it for a backwards maneuver, we also normalize it again to be safe
Then those values are set for use and calls the isNodeValid
function where this all goes down. Plus, if it were negative, I don't think motion_table.getAngleFromBin(this->pose.theta) would return a valid solution.
angle is set to the motion projection which is fresh off of the control set generator, so unless you see a negative value in your json file for the control set primitives, its not that.
But I'm using default json, and the backward flag is false, I need to dive deeper in code, I'm still not familiar with it.
I just added a dummy fix:
if(prim_pose._theta < 0.0) {
// fix
prim_pose._theta = 2 * M_PI + prim_pose._theta;
}
if (collision_checker->inCollision(
prim_pose._x,
prim_pose._y,
prim_pose._theta / bin_size /*bin in collision checker*/,
traverse_unknown))
{
return false;
}
And now it's working. I'll try to find the root cause tomorrow.
Also, I figure out that there is a problem with generated motion primitives: https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/lattice_primitives/sample_primitives/5cm_resolution/0.5m%20_turning_radius/ackermann/output.json#L2096
Adjacent poses have a large diff in the theta. From 4.185 to -2.034, is that something that is supported by the code?
If you do not specify a file, this is the default https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/test/default_model.json. Your configuration file shows that you're overriding the default.
Either way, I scanned them both and they both look good to me. Might be worth adding some prints along that trail I laid and see where the first negative value appears. They should always be between 0-15 for a 16 discretization control set.
@matijazigic how are things going?
Also, I figure out that there is a problem with generated motion primitives: https://github.com/ros-planning/navigation2/blob/main/nav2_smac_planner/lattice_primitives/sample_primitives/5cm_resolution/0.5m%20_turning_radius/ackermann/output.json#L2096
Adjacent poses have a large diff in the theta. From 4.185 to -2.034, is that something that is supported by the code?
Took a look at this and fixed the bug causing it. It doesn't affect any of the downstream planning stuff as the values were representing the correct direction, they just were off by 2 * pi.
In any case, it may affect other peoples code so I have fixed it by normalizing all the angles in every trajectory. I've also added some test cases for the output to make sure there aren't other small issues like this.
I think it could cause an issue within the planner since we assume all of the angles are normalized from 0-2PI so that could be causing this issue as well (albeit, its also just as likely that we have a floating point round up issue).
@matijazigic how are things going? Was just normalizing prim_pose._theta
the only problem? If so, I think @mattbooker's updated PR should fix the generator with that respect. Shortly, we should have some files generated using the new change that I would love if you could try out and give the :+1: that it fixes the problem so you don't need to maintain that patched fork.
@SteveMacenski I'll try it. Hopefully today, I was really busy in the last couple of days. Thanks.
@matijazigic I tested out my changes and it still wasn't working. I've found the problem and I need to update some stuff. So for now no need to try anything. Will update you again when its fixed.
The PR should fix it, if there are continued issues let me know and we can reopen
Bug report
Required Info:
Steps to reproduce issue
Expected behavior
Actual behavior
Additional information
Crash report: [component_container_isolated-6] [WARN] [1644321675.406004450] [planner_server]: GridBased plugin failed to plan calculation to (0.43, 1.77): "vector::reserve" [component_container_isolated-6] [WARN] [1644321675.406055584] [planner_server]: [compute_path_to_pose] [ActionServer] Aborting handle.
Feature request
Feature description
Implementation considerations