Closed MarqRazz closed 1 year ago
To be honest, we have yet to evaluate an Ignition Gazebo setup due to other long-standing problems with it preventing us (e.g. binaries for ros2 control ign didn't work for a long time [maybe still?], missing plugins, etc) so that's not part of our officially supported setup yet. It is planned for later this year all things according to plan.
What version of Ignition Gazebo? I've heard some nasty things about some of the older versions and @bperseghetti suggests a particular version where things are working better. Could it be due to that?
I will say it just looks like you're having some run of the mill TF problems that's not really related to Nav2 itself -- but could either be due to the robot URDF/state publisher or due to your simulator not publishing what's expected.
There's alot you're changing all at once (simulator, custom robot, etc) so the generic but truthful advice would be to start with something more simple like the TB3 demo and slowly build up changes until you see where the problem lies. I'm guessing its with your robot, so using Gazebo classic like we have will at least help you isolate the issue to simulation vs urdf/sdf related to work through the problem. I know there are some users on new-Gazebo, though I haven't spoken to them in detail regarding it
I'm pretty comfortable with Ignition and the issues it has (and specific versions you need to make things work), but your statement does't line up with the fact that nav/slam-toolbox are using the system clock when I have set use_sim_time
to true which is why I opened the issue. If you have any suggestions on where I could start looking I would appreciate any pointers!
I will say it just looks like you're having some run of the mill TF problems that's not really related to Nav2 itself
Any specifics here would be helpful. Is the zero time with the static tf not true with GzClassic?
@MarqRazz On a call right now, but I can share a little later how we run ours with humble and gz-garden using sim time including for the ros_gz_bridge, I can share launch as well as examples on some simulated robots.
My pointers are what I said - simplify to something we know works and build from there. if the issue if TF2/robot specific, there's a hundred ways it can manifest and its a bit of a logical puzzle. If the issue is the simulator, it would be good to know that sooner than later so you don't go down a rabbit hole. And, if you still see the issue with the old simulator + a base robot we know works, then that's something actionable I can also start to look into. So being processed can help us know where we need to focus efforts. For now, I'd be simply guessing.
The static TF looks that way for gazebo too, but that's generated by the URDF & robot state publisher, not the simulator, so I'm not sure why we would expect that to change.
transforms:
- header:
stamp:
sec: 0
nanosec: 0
frame_id: base_footprint
child_frame_id: base_link
transform:
translation:
x: 0.0
y: 0.0
z: 0.01
rotation:
x: 0.0
y: 0.0
z: 0.0
w: 1.0
Might also want to see if you have something like this using sim_time depending on the plugins you are using in sim: https://github.com/CogniPilot/mrbuggy3_simulator/blob/8b26ae67e0003119a5b2a5b14acfcec1b4a46256/mrbuggy3_gz_bringup/launch/ros_gz_bridge_base.launch.py#L77-L92
Do you have anywhere that you can show your ros_gz_bridge settings?
Here is an example with our posix Cerebri (Zephyr-RTOS) connected to the cmd_vel output of NAV2 to control the vehicle with gz-garden and running full nav2 stack (including SLAM): https://github.com/ros-planning/navigation2/assets/10233412/a8a90426-7c50-4543-9100-abf185708cb9
Sorry @SteveMacenski I misread your post and though you said you could see an issue with tf
in my issue report. For now I think I will keep pushing forward with Ignition, I'm so close.
@bperseghetti you rock, THANKS for the tips and pointers!
Here are my bridge settings for the clock.
As for odom
to base_link
I'm using ros2_control's DiffDriveController and this is what I see if I use tf_echo
# ros2 run tf2_ros tf2_echo odom base_link
At time 25.134000000
- Translation: [0.000, 0.000, 0.000]
- Rotation: in Quaternion [0.000, 0.000, 0.000, 1.000]
- Rotation: in RPY (radian) [0.000, -0.000, 0.000]
- Rotation: in RPY (degree) [0.000, -0.000, 0.000]
- Matrix:
1.000 0.000 0.000 0.000
0.000 1.000 0.000 0.000
0.000 0.000 1.000 0.000
0.000 0.000 0.000 1.000
If I ask for map
to base_laser
while running SLAM I get
# ros2 run tf2_ros tf2_echo map base_laser
At time 14.574000000
- Translation: [0.000, 0.000, 0.394]
- Rotation: in Quaternion [0.000, 0.000, 0.000, 1.000]
- Rotation: in RPY (radian) [0.000, -0.000, 0.000]
- Rotation: in RPY (degree) [0.000, -0.000, 0.000]
- Matrix:
1.000 0.000 0.000 0.000
0.000 1.000 0.000 0.000
0.000 0.000 1.000 0.394
0.000 0.000 0.000 1.000
and here is the output from tf_view_frames
. You can see that the Most recent/Oldest transform
point to tf
using the sim clock.
If I query for subscribers of the clock everything looks good and all the nodes I expect are there (here's a gist of the output)
So what's left to check? That seems like most of it
I'm thinking my next step is to build Nav/SlamToolbox from source and confirm the value of use_sim_time
and the clock it's using. My guess is I'm missing setting it somewhere when launching.
[component_container_isolated-2] [transformPoseInTargetFrame 1684848588.293940594]:
Extrapolation Error looking up target frame: Lookup would require extrapolation into the future.
Requested time 1684848585.909372 but the latest data is at time 59.498000, when looking up
transform from frame [base_link] to frame [map]
It doesn't take that long to compile if you ignore nav2_system_tests
, that'll take you 20 minutes itself
I think we did have to recreate some of the lower level nav2 bringup script to make sure sim_time was passed throughout. Maybe I'll try to make a "native nav2" launch this weekend and see what all I had to figure out about it previously. Granted not sure if there are also any non-sim_time calls from the actuator arm side from moveit on your setup?
Sorry I got distracted playing with my Isaac simulation. I verified Nav2 is using sim time everywhere I expect but while inspecting the lookup frames sent to nav2_util::transformPoseInTargetFrame I figured out that Rviz is the source of those errors when sending Nav2 Goal
's with the plugin. I tried changing the launch file to use_sim_time
and that didn't fix it, but if I set the fixed_frame
to be map
the error goes away and the robot navigates. Unfortunately I don't know of an easy way to inspect the action goal message (like you could in ROS1) from the Rviz plugin but will keep digging.
As you can see I still have issues with the map smearing while SLAMing but progress!
https://github.com/ros-planning/navigation2/assets/25058794/4ecb515d-41a9-4870-8346-0f6d157b8fdf
https://github.com/ros-planning/navigation2/assets/25058794/0a130dc0-643c-485d-a67f-5060fffc94c4
I tried changing the launch file to use_sim_time and that didn't fix it, but if I set the fixed_frame to be map the error goes away and the robot navigates. Unfortunately I don't know of an easy way to inspect the action goal message (like you could in ROS1) from the Rviz plugin but will keep digging.
Got it - that was actually pertinent information that you were sending it in the non-global frame. That is actually a known artifact of the rviz plugin that you have to send the goals in the global frame. I honestly didn't know that was the specific error given, but that is a quirk I was aware of.
looking at the hidden topics ros2 topic echo /navigate_to_pose/_action/status
we can see the status
status_list:
- goal_info:
goal_id:
uuid:
- 152
- 146
- 130
- 181
- 123
- 17
- 33
- 151
- 165
- 220
- 233
- 200
- 122
- 9
- 16
- 223
stamp:
sec: 69
nanosec: 886000000
status: 2
Which appears to be in the simulation time. So I think the BT navigator is OK.
I took a look from the logs I see and:
[component_container_isolated-5] [ERROR] [1685751532.615056985] [transformPoseInTargetFrame]: Extrapolation Error looking up target frame: Lookup would require extrapolation into the future. Requested time 1685751532.389336 but the latest data is at time 70.167000, when looking up transform from frame [base_scan] to frame [map]
Leads me to look at transformPoseInTargetFrame
. That's only used in costmap2D and RPP, and I'm not currently using RPP.
Within costmap 2D its used in the function transformPoseToGlobalFrame
which is only called by the planner server in transformPosesToGlobalFrame
.
It appears that the goal is obtained directly from the action's request field geometry_msgs::msg::PoseStamped goal_pose = goal->goal
so that tells me that rviz panel isn't using sim time. The rviz panel sets the pose in navigation_goal_.pose = pose;
and if you track back where pose
is populated you find pose.header.stamp = rclcpp::Clock().now();
That explains that. We're just making a clock willy-nilly which is going to default to system time. Silly goose! I looked to see where else we do that in the stack and the answer is ..... no where else. So yeah, that's really suspect. I'm currently compiling a change to use the rviz panel's internal node instead to get the time, which I assume will fix this issue
Might also want to see if you have something like this using sim_time depending on the plugins you are using in sim: https://github.com/CogniPilot/mrbuggy3_simulator/blob/8b26ae67e0003119a5b2a5b14acfcec1b4a46256/mrbuggy3_gz_bringup/launch/ros_gz_bridge_base.launch.py#L77-L92
I saw in the launch files you are creating pose bridge but where gazebo is publishing to that pose topic. When I do gz topic-l there is no topic related to robot/pose. I am having /robot/odometry which I am sending to ROS using bridge. I am also using the same ackermann plugin from gazebo. Setting NAV2 and Gazebo garden having similar issues. Could you advice something.
Steps to reproduce issue
pr-add_navigation
branch (link to my open PR).Expected behavior
I have been experiencing several issues with the clock as I setup my robot to run in Gazebo. As you can see in my launch file I have
use_sim_time
set to True and below have verified through the parameter server that its set properly.Here is the output from a few of the nodes on the parameter server...
Here is what the beginning of my
/scan
message looks like:and
/tf_static
data for the base_laser frame:1) When I run the navigation stack with a dummy map that I have previously made I get errors about my LIDAR's frame_id data not being in sync with the system clock.
And a map is never published so I get tf errors:
2) If I try to run SLAM I get spammed with errors in the console saying the message queue is full and if I try to navigate the robot somewhere I can see that
transformPoseInTargetFrame
node is using the system clock.Actual behavior
I cant navigate or SLAM in sim.