Zig-zag due to unfavorable angle increments and starting angle

mangoschorle commented 2 years ago

I am experiencing some interesting behavior when using SmacPlanner Hybrid. I am not quite sure whether this is a common mistake/behavior maybe caused by incorrect usage, or whether it is a known limitation. I currently have no idea how to approach this / solve this. If ideas pop up along this thread, I am more than willing to contribute.

So the issue is that depending on which angle-bin the start node falls into, I am getting either a straight line or a very messy zig-zag.

We have 64 angle bins. Here the start is in bin 32 (pi) Screenshot from 2022-09-05 21-53-44

In this image the starting angle is in bin 31 Screenshot from 2022-09-05 21-54-39

I also found that due to my choice of resolution (10cm) and min_turning_radius (50cm), the turning motions of the minimal primitive set (Dubins) cover four angle bins at a time, i.e., increments=4 https://github.com/ros-planning/navigation2/blob/634d2e3d9b6bde5558c90fe6583d52b3ed66cf55/nav2_smac_planner/src/node_hybrid.cpp#L104)

This is actually quite bad since you one will therefore never end up in odd angle bins. But only in every fourth bin that are multiples of four. This also causes the zig-zagging because just can't reach an angle bin that would directly go in the right direction. Also as can be seen in the image, the planner has to expand WAY more nodes to make progress.

This is a distribution of the angle bins. As you can see, we only reach every fourth angle bin. Screenshot from 2022-08-04 16-42-16

I wonder / am collecting ideas, on how we could find remedy for this zig-zagging issue. I know, we could double the resolution or use a larger turning radius (>2 meters if we want to cover only a single bin) but we are running into other issues if we do so. Maybe there are other ideas. I have tried adding additional turning primitives which turn much less than the min_turning_radius but didn't get good results. Again, if I understand the idea / solution, I'd be more than happy to implement, test, and submit a PR.

SteveMacenski commented 1 year ago

Having 72 bins but always skipping three at a time is just not so very sensible. Its better to just use 24 bins.

Of course.

This will cause much more A Star expansions. The 2D heuristic isn't a perfect guidance anymore. This is exactly what we see in several of the images that i posted before.

I'm on the same page, that's just not the behavior that I see. What I see is that because the cost-aware heuristic creates essentially a channel that the search is doing down, the additional primitives gives the system more granular options to try to follow that central channel. As such, then minor changes make the planner expand older visited nodes to try to see if a new branch can beat the existing longest branch. That creates thicker expansions in areas where there are options as to before. This is why I tried the quadratic cost functions to try to make the channel more steep such that we pruned more branches quicker. That does actually help - but also helps for the non-interpolated primitives version. It has slightly different behavior which I think is good to add, which is why I would like to make that an additional option for folks.

If you can tell me that you tried my branch and you see the behavior you're looking for / described, that would be enormously helpful. Because if that's the case, then its just my testing situations / maps that aren't showing those benefits but they are there and I can pack this stuff up and ship it. I'm trying to replicate your success and if I've already done it but I'm just testing different things than you, then this is all moot and I can move on! My test maps aren't quite the same as your hallway application you've shown in previous graphics.

The round-up to 1 essentially means that we cannot drive at the desired turning radius

It rounds up, but note that we use the angle to calculate the actual displacement of the motion primitive, so its not exactly $\sqrt(2)$ long, but longer. That is intentional because I know that we either are going to need to round up to 1 or round up to another integer so we can have the primitives be exact angular quantizations apart. Once we know what the angle is for what "would have" been a $\sqrt(2)$ primitive, then we calculate the actual $\Delta X$, $\Delta Y$ based on the actual rounded angle -- which is then guaranteed to be longer than $\sqrt(2)$

So I think that's already handled via

  angle = increments * bin_size;

  // find deflections
  // If we make a right triangle out of the chord in circle of radius
  // min turning angle, we can see that delta X = R * sin (angle)
  float delta_x = min_turning_radius * sin(angle);
  // Using that same right triangle, we can see that the complement
  // to delta Y is R * cos (angle). If we subtract R, we get the actual value
  float delta_y = min_turning_radius - (min_turning_radius * cos(angle));

This is using the actual increments, not the theoretical for $\sqrt(2)$. That should still be the exact turning radius due to using the ceil angle with the actual radius to find the primitive displacements.

Whether A* optimizes for the sum of squares of distances between nodes or the sum of distances should make no difference. This makes me think that if it makes a difference when we have cost factors for turning ect. then we are essentially just mimicking a different cost factor setting with this approach

If we have normalized costs between 0-1, then some cost 55/254 vs 200/254 are 0.21 and 0.78 respectively. If we consider normalized squared costs, those are now 0.04 and 0.62 respectively. That has a difference in the planning by making the system less responsive to smaller cost gradients as less important in low cost zones (and more responsive in higher cost zones). That has the benefit of smoother paths in low-cost central regions since its not trying to optimize as much for small variations -- but at the cost of slightly closer paths to obstacles when given the same settings of costmaps. Its a different behavior certainly, but the added smoothness in open spaces where it doesn't really matter is a nice option to have for many very open free-space applications.

This would be making the 2D heuristic inadmissible as the sum of squares is larger or equal to the square of the sum.

I update the heuristic when this setting is on to use the same quadratic formulation so its admissible :-). The handling of cost + distance is the same for both H and T to make sure we keep the same objective functions being used in both places to measure cost-to-travel and cost-of-search

SteveMacenski commented 1 year ago

@stevedanomodolor @mangoschorle I've pushed all the changes up to the PR branch for you to take a look / test and let me know what you think! It should be in its final form, minus any changes that @mangoschorle requests (if he can test this branch for the expected behavior on expansions!).

I'll plan to characterize the branch fully later this week and update the documentation with these new parameters and suggested # bins to use with them on. I'd love it if you both could do some evaluation on it and make sure you're satisfied.

mangoschorle commented 1 year ago

Regarding the turning radius, I am still convinced that the way your implementation is, you will have situations where you are unable to drive at large turning radii but I want to focus my time on the other topic.

I see, you only squared the obstacle part of your costs.

I have done some shallow analysis and independently compared the modifications to my baseline. I compared the features individually as we wouldn't be able to distinguish the impact on the result if all applied at once. Note that the baseline contains our version of the sub-sampled version of the primitives.

The seeds where the same and you only see a small excerpt of the map.

Baseline

ours

Baseline with Cost-Based-On-Arc Length Modification

arc_length_2 arc_length_1

Baseline with Squared-Cost-Modification

cost_squared

Baseline with your version of the sub-sampled primitives

steve steve2 steve3

Some comments:

I think the proposed version of sub-sampled primitives is good. It seems to solve the issue.
The arc length proposal seems good, but at least on my map, I do not see notable changes
The squared cost seems to have more effect but nothing where I would say its a game changer or clearly better than the previous results (But then I don't know what a game changer would be because the planner already behaves as I want it to behave). Maybe that's the case on smaller maps.

SteveMacenski commented 1 year ago

I am still convinced that the way your implementation is, you will have situations where you are unable to drive at large turning radii

I'm open to being convinced otherwise, but I think it would be best done with examples of inputs-in and the specific data showing that it is not following it properly as demonstration. If there's a problem that I'm not seeing, I'd love to be made aware of it.

I think the proposed version of sub-sampled primitives is good. It seems to solve the issue.

Sweet! I'll take that as an action item to do the documentation, retuning, and characterization of that change for users that would like to leverage it.

Though, the graph you show has much longer compute times than your baseline, which is the behavior I was trying to get away from since your other comments mentioned it was faster. Does that not bother you? Any thoughts as to why that's happening and not in your baseline implementation with the same feature? Is the baseline and that test using the same number of primitives / bins?

The arc length proposal seems good, but at least on my map, I do not see notable changes

If you turn off the turning / changing penalties, you should see it since there would be no other turning contribution other than the primitives'. If you already had a 5%-ish penalty on turning / change, that small adjustment will be negligible. But your hallway confined tests, I doubt it would make much of a difference, if at all.

The squared cost seems to have more effect but nothing where I would say its a game changer or clearly better than the previous results

This one I can show visually. On the left is the quadratic costs and the right the baseline before any changes. You can see that the path is "smoother" since its less sensitive to finite changes when far from obstacles (but then in exchange, its naturally then closer to obstacles). I think for this behavior to be notable, its actually more about being in open spaces than smaller maps. For your hallway tests, I would probably not expect you to note substantive difference.

Screenshot from 2023-08-30 09-45-13

stevedanomodolor commented 1 year ago

@stevedanomodolor @mangoschorle I've pushed all the changes up to the PR branch for you to take a look / test and let me know what you think! It should be in its final form, minus any changes that @mangoschorle requests (if he can test this branch for the expected behavior on expansions!).

I'll plan to characterize the branch fully later this week and update the documentation with these new parameters and suggested # bins to use with them on. I'd love it if you both could do some evaluation on it and make sure you're satisfied.

Are there more maps i can test with for example with smaller hallways to compare with the one i already did?

SteveMacenski commented 1 year ago

It might be worth looking for some online! The remaining of my maps I cannot distribute for a variety of reasons (mainly of which, they were privately given to me for personal use only).

I’ve seen the TB4 documentation using a good-sized space and the AWS warehouse world, hospital world both have maps in their github packages as well!

SteveMacenski commented 1 year ago

@mangoschorle I merged in the code, but please do follow up on the performance items. I think that's really quite important so we can potentially turn this off as an option and turn on as the default.

SteveMacenski commented 1 year ago

Well, I'd still really love that follow up on performance testing, but for now I think this can be closed since we solved the issue. I've shown for the same number of bins (eg 24 feature-on or 72 feature-off skipping 3rds) are the same performance so I'm convinced it works as expected and that the change in performance run-time is application / tuning based. If that's not the case and anyone can follow up with some metrics of how to improve and/or ideas, I'm all ears!

ros-navigation / navigation2