Closed SimonGiampy closed 9 months ago
The error consists in the TF breaking, because the link transform map->odom is not published anymore, therefore the localization fails.
Neither the choice of planning algorithm nor any setting within a planning algorithm will change that requirement or the presence of that transform. I believe you have some other problem, and it is simply a coincidence that changing that parameter triggers it (if it does, deterministically?). Clearly something more is wrong than your information shows: neither the planner server nor its internal planning plugins should be interacting in any way with localization.
Any update?
After many failed attempts at finding the wrong parameter configuration causing the problem with the localization, I was finally able to successfully generate reverse motion trajectories with the Smac Hybrid A* planner.
As I mentioned in my first message, I couldn't make the Reeds-Shepp parameter work within the simulated environment. But then I had the idea to test everything on the real-life robot in a real map, and everything magically worked. This is something that I didn't try before because I took for granted that if something doesn't work in simulation, it will never work on the real robot. Today that assumption proved invalid for the first time. I had never seen the "simulation-to-reality" gap reversed, so my bad for not trying it beforehand.
The NAV2 configuration parameters used in the simulated and real environments are practically identical; the only differences are the names of the topics used for the sensors. So I was able to make the configuration work with the real robot without any changes.
The TF tree breaks in the map->odom link, but the link breaking first seems to be actually odom->base_link. So it's still not clear to me whether the localization breaks, or the odometry breaks. During the tests I've conducted, the TF tree always breaks (so yes, it is deterministic), and since no clear errors arise, it's difficult to tell exactly what went wrong. My guess is that the problem is at odometry level. And the odometry is provided by Ignition Gazebo 6 via the differential drive odometry plugin.
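One way to narrow down which link goes stale first is to compare the newest timestamp seen on each link of the chain. The following is a minimal sketch in plain Python, not tied to any TF API: the `LinkStamp` type, the `first_stale_link` helper, the one-second threshold, and the timestamps are all hypothetical and only illustrate the comparison.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkStamp:
    parent: str
    child: str
    last_stamp: float  # seconds; time of the newest transform seen on this link


def first_stale_link(links: List[LinkStamp], now: float,
                     max_age: float = 1.0) -> Optional[LinkStamp]:
    """Return the stale link that stopped updating earliest, or None if all are fresh."""
    stale = [link for link in links if now - link.last_stamp > max_age]
    if not stale:
        return None
    # The link with the oldest "newest transform" is the one that broke first.
    return min(stale, key=lambda link: link.last_stamp)


# Hypothetical timestamps, for illustration only.
links = [
    LinkStamp("map", "odom", last_stamp=12.4),
    LinkStamp("odom", "base_link", last_stamp=11.9),
]
culprit = first_stale_link(links, now=15.0)
if culprit is not None:
    print(f"first stale link: {culprit.parent}->{culprit.child}")
```

With real data, the timestamps would have to come from whatever is recording the TF messages; the sketch only shows that the link with the oldest last update points at the component to inspect first.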
When loading the motion model "REEDS_SHEPP", NAV2 takes longer to load, and that may be correlated to the problem cause. So when I use that parameter in the simulation, there is something obscure that causes a faulty interaction between NAV2 and the odometry plugin in gazebo (along with the bridge), which in turn breaks the TF tree. If it's not this I don't know what it is.
I've done a lot of research, and I know very well that these modules must not affect each other, and that they must be completely not correlated, but this is what I found, and I'm totally sure of my findings. So I'm perfectly aware that it is very strange that the odometry from Gazebo could cause directly or indirectly this problem, but this is my best guess, with the knowledge that I have.
Summing up:
I will not mark this issue as solved since I technically didn't solve it yet, and because I only found a different case scenario where the problem doesn't arise. It is also not really important for me to have this issue solved in the simulated environment, because I actually care more about the real robot.
When loading the motion model "REEDS_SHEPP", NAV2 takes longer to load, and that may be correlated to the problem cause.
This is true, due to the lookup table calculation on initialization. Your application would be fragile if it is based on timing rather than on events like lifecycle transitions.
Closing since this isn't a bug in Nav2 and it has been shown to work fine. It's potentially rooted in your application software or the simulator (either way, to be taken up with the respective party). If it is something in the simulator, feel free to tag me in that new ticket to track and help over there.
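To illustrate the difference between timing-based and event-based startup, here is a minimal sketch in plain Python. The `wait_until_active` helper and the `get_state` callable are hypothetical; in a ROS 2 system `get_state` would presumably wrap a query of the node's lifecycle state, which is an assumption and not something shown in this thread.

```python
import time
from typing import Callable


def wait_until_active(get_state: Callable[[], str],
                      timeout_s: float = 30.0,
                      poll_s: float = 0.1) -> bool:
    """Poll a state getter until it reports "active" or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_state() == "active":
            return True
        time.sleep(poll_s)
    return False


# Stand-in for a planner that needs several polls to finish initializing
# (for example, while it builds its lookup tables).
fake_states = iter(["unconfigured", "inactive", "inactive", "active"])

if wait_until_active(lambda: next(fake_states), timeout_s=5.0, poll_s=0.01):
    print("planner is active, safe to start dependent components")
else:
    print("planner did not become active in time")
```

A fixed `time.sleep(5)` before starting dependent components would pass or fail depending on how long initialization happens to take, whereas the check above only proceeds once the reported state actually changes.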
[Smac Planner Hybrid A*] Reeds-Shepp motion model not working
Error description
I'm trying to do navigation with a skid-steering robot, working with both differential drive and Ackermann approximations for the local planner (controller server). What I want to achieve is to have the global planner generate trajectories in both forward and reverse directions, as for an Ackermann vehicle. Everything works smoothly with my configuration until I use the Reeds-Shepp motion model parameter for the Hybrid A* algorithm in the global planner. With the Dubin motion model the navigation works fine, but trajectories are always in forward motion.
The error source is the parameter motion_model_for_search: "REEDS_SHEPP", required for enabling both reverse and forward motion in the global path planning algorithm. When I use this parameter the navigation stack doesn't work anymore. There is no apparent crash of any node, and everything seems to be loaded correctly, according to the initialization logs. The error consists in the TF breaking, because the link transform map->odom is not published anymore, therefore the localization fails.
Setup description and Parameters used
Required Info and Setup:
ros-iron-rmw-cyclonedds-cpp
My configuration for NAV2:
I uploaded the yaml file renaming it as .txt (yaml file uploading not supported) to avoid information cluttering in this thread.
Configuration YAML file: simulation_reedssheep_not_working.yaml
This is the specific part of the parameters code that causes the error:
In particular, with motion_model_for_search: "REEDS_SHEPP", the localization doesn't work anymore. Using DUBIN as motion model, everything works fine.
Steps to reproduce issue
Use the parameters reported above to reproduce the bug. I'm not sure whether it's just that one motion model parameter causing the bug, or whether a specific combination of parameters stops the localization from working. I am also not 100% sure whether this situation represents an actual bug or whether I misunderstood the documentation about the Smac Planner for Hybrid A*.
This is the link to my repository containing the code for running NAV2 with both simulation environment and the real robot, in case it may be useful for replicating everything. My code repository
What I've already tried:
Since the error source seems to be in the global planner, I read the documentation thoroughly, including the documentation about the other NAV2 nodes, trying to find any possible parameter interfering with the global trajectory planner.
The bug is reported in the log as the missing transform map->odom, which is published by the localization node. I tried using both AMCL and SLAM_toolbox for localization in my tests, and the results were the same. So I can confirm that the actual source of the error is not the localization node, despite what the error log seems to show.
Expected behavior
Navigation working fine with correct TF tree and working localization.
Actual behavior
I report here a portion of the errors shown immediately after NAV2 finishes initializing all the nodes and starts everything. The errors start as soon as the NAV2 stack is ready for navigation, and the TF errors go on indefinitely.