evshary opened 1 week ago
Package | # Failing tests - Rolling | # Failing tests - dev/1.0.0 | IDs of failing Zenoh tests: (Rolling) - (dev/1.0.0) |
---|---|---|---|
costmap_queue | 0 | 0 | () - () |
dwb_core | 1 | 0 | (6) - (0) |
dwb_critics | 0 | 0 | () - () |
dwb_plugins | 0 | 0 | () - () |
nav_2d_utils | 0 | 0 | () - () |
nav2_amcl | 0 | 0 | () - () |
nav2_behavior_tree | 4 | 11 | (7,17,18,28) - (17,18,27,34,38,39,40,44,45,46,47) |
nav2_behaviors | 0 | 0 | () - () |
nav2_bringup | 0 | 0 | () - () |
nav2_bt_navigator | 0 | 0 | () - () |
nav2_collision_monitor | 1 | 3 | (7) - (7,11,12) |
nav2_constrained_smoother | 0 | 0 | () - () |
nav2_controller | 3 | 1 | (6,7,8)-(6) |
nav2_core | 0 | 0 | () - () |
nav2_costmap_2d | 14 | 14 | (8,9,10,15,17,19,20,21,22,23,24,25,26,28)-(8,9,10,15,17,18,19,20,21,22,23,24,25,26) |
nav2_graceful_controller | 1 | 0 | (6) - () |
nav2_lifecycle_manager | 0 | 1 | () - (10) |
nav2_loopback_sim | 0 | 0 | () - () |
nav2_map_server | 5 | 4 | (8,9,10,11,12) - (8,9,10,11) |
nav2_mppi_controller | 12 | 12 | (6,7,8,9,10,11,12,13,14,15,16,17)- (6,7,8,9,10,11,12,13,14,15,16,17) |
nav2_navfn_planner | 1 | 1 | (7) - (7) |
nav2_planner | 1 | 1 | (7) - (7) |
nav2_regulated_pure_pursuit_controller | 1 | 1 | (6) - (6) |
nav2_rotation_shim_controller | 1 | 1 | (6) - (6) |
nav2_rviz_plugins | 0 | 0 | () - () |
nav2_simple_commander | 0 | 0 | () - () |
nav2_smac_planner | 11 | 11 | (10,11,12,13,14,15,16,17,18,19,20)-(10,11,12,13,14,15,16,17,18,19,20) |
nav2_smoother | 2 | 2 | (7,8)-(7,8) |
nav2_system_tests | 13 | 16 | (12,14,15,17,18,19,20,21,22,23,29,30,31)-(9,12,14,15,18,19,20,21,22,23,29,30,31) |
nav2_theta_star_planner | 1 | 1 | (1) - (1) |
nav2_util | 6 | 5 | (7,10,13,15,16,17)-(7,10,13,15,17) |
nav2_velocity_smoother | 1 | 1 | (6) - (6) |
nav2_voxel_grid | 0 | 0 | () - () |
nav2_waypoint_follower | 2 | 2 | (7,8) - (7,8) |
opennav_docking | 7 | 7 | (9,10,11,12,13,14,16) -(9,10,11,12,13,14,16) |
opennav_docking_bt | 0 | 2 | () - (7,8) |
opennav_docking_core | 0 | 0 | () - () |
Hello @evshary, this is very helpful! May I ask which tests you are referring to?
I was trying to make a remote connection to my robot over wifi. Similarly, I also tried both ends of the 1.0.0 PR (https://github.com/ros2/rmw_zenoh/pull/276) i.e., Rolling and dev/1.0.0. However, both of them are unstable. With your results I can see clearly what's going wrong.
If I skip navigation, i.e., I only run ROS2 control for the motors and the lidar along with its filters, everything works fine. I can visualize the robot remotely very smoothly, and the odometry updates properly if I move it with a joystick. But as soon as I try to run the nav2 stack, things go wrong.
Hi @alireza-moayyedi. In fact, the table just shows the unit test results in the navigation2 repository. On our side, navigation2 works well despite not passing all the tests. Perhaps you could describe in more detail how you run it and what issues you face. BTW, you might need to ensure the version of nav2 you're using includes the fix here: https://github.com/ros-navigation/navigation2/pull/4725
Hi @evshary,
Well, that's a surprise, to be honest. This is the exact use case that I am trying to work out:
On the robot side:
- `ros2 run rmw_zenoh_cpp rmw_zenohd`
- `<include file="$(find-pkg-share nav2_bringup)/launch/bringup_launch.py"/>` on the robot with some launch arg overrides, such as a slightly modified param file relative to the default config of the nav2_bringup pkg (e.g., laser topics, robot footprint) and the input map; nothing special.

On a separate computer:
- Set `connect: { endpoints: [...` inside `DEFAULT_RMW_ZENOH_ROUTER_CONFIG.json5` to the robot's IP.

Expected behavior:

Actual behavior in rolling:

Actual behavior in dev/1.0.0:
I am certain that this is an rmw issue, because if I connect the separate computer directly to the robot with an ethernet cable and use CycloneDDS with an explicit peers address list and an explicit network interface, everything works very smoothly and I can easily initialize and control the robot remotely. Of course, the downside is that I then have to follow the robot around with my laptop in hand.
Regarding the release, I am using the latest apt release:
Package: ros-jazzy-nav2-bringup
Version: 1.3.2-1noble.20241015.123150
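For reference, the remote-computer step described above (pointing the router's `connect` endpoints at the robot) might look roughly like the fragment below. This is only a sketch: the IP address `192.168.1.10` is a hypothetical placeholder for the robot's address, and port `7447` assumes the robot's router is listening on Zenoh's default router port.

```json5
{
  // Fragment of the router config on the remote computer
  // (a copy of DEFAULT_RMW_ZENOH_ROUTER_CONFIG.json5).
  connect: {
    endpoints: [
      // Hypothetical robot address; replace with the robot's real IP.
      // 7447 is assumed to be the port the robot's router listens on.
      "tcp/192.168.1.10:7447",
    ],
  },
}
```

Only the `connect.endpoints` entry is shown here; the rest of the router configuration is assumed to stay at its defaults.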
Hi @alireza-moayyedi
Thank you for the detailed steps. I didn't see anything weird. I would suggest doing some experiments (with rmw_zenoh) to narrow the issue down.
For the dev/1.0.0 version, perhaps you could share the logs with us. I think the fix I mentioned before hasn't been included in the apt binary, but it's more related to the Rviz plugin crash, which is not the same as your description.
Hi @evshary,
As suggested, I tried to narrow it down further, and here are my findings (everything run with dev/1.0.0):
So I guess at this point we can conclude that something is going wrong with communication over wifi. Therefore, I tried to dig deeper. First, to rule out a faulty office wifi, I set up a separate router (2.4 GHz) to which only my computer and the robot were connected. But I still saw the same issues I reported originally. Here are some logs that might be relevant:
Next, I connected a display to the robot and tried running rviz simultaneously on both the robot and the remote computer to check for any difference in behavior. On the robot I managed to get the map loading in the robot's rviz while the remote computer was still not loading it (though not easily, for reasons I explain later). Surprisingly, I noticed that after giving the initial pose in the robot's rviz, amcl started to work properly, and in the remote rviz I could also see topics such as the costmaps in the map frame (still no map). I drove around a bit and it seemed stable. Here is the remote rviz showing some topics in the map frame after initializing the localization in the robot's rviz:
So then I got more suspicious of the map server and started digging deeper into it. As I mentioned earlier, it was difficult to get the map showing in the robot's rviz while also trying to visualize it in the remote rviz. I noticed some irregular behavior when I ran rviz first on the remote computer and then ran the nav2 stack on the robot. For some reason, this caused the map server not to load properly:
This kind of explained why I had to restart the launches so many times to get the simultaneous rviz loads working. Apparently the order of launching things (rviz remote -> rviz robot -> nav2 robot) was affecting the behavior.
So now in order to make it work, I need to first run nav2 on the robot, initialize the localization on the robot's rviz and only then run the rviz on the remote.
This got me wondering whether the /map topic needs some further tuning in the zenoh router's configuration to accommodate the topic's bandwidth. Or maybe this is actually related to the rviz plugin fix that you mentioned, in which case I should test building nav2 from source with that fix included.
Sorry for the long posts, and I very much appreciate your patience. Unfortunately, I have not yet found anyone around me who has successfully set up the Zenoh rmw in combination with nav2 for a remote connection. Therefore, I have decided to dig deeper into it myself and report it directly to you here.
Hi @alireza-moayyedi Thank you for the detailed description. It helps a lot. I will investigate it. Feel free to share with us if there is anything else you find.
@alireza-moayyedi you can try to tune the `/map` topic when using the `dev/1.0.0` branch via the `downsampling` configuration. See here for a guideline: https://github.com/ZettaScaleLabs/roscon2024_workshop/blob/main/exercises/ex-7.md
If you don't know the topic type name and hash, you can replace each with a `*` character in the `key_expr`, e.g.: `key_expr: "0/map/*/*"` (assuming `ROS_DOMAIN_ID=0` and no namespace is set).
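As a rough illustration of the suggestion above, a `downsampling` entry for the `/map` topic might look like the fragment below. This is a sketch under assumptions, not a verified configuration: the field names (`flow`, `rules`, `key_expr`, `freq`) follow the commented-out downsampling example in Zenoh's default configuration as I recall it for 1.0.0, and the `freq` value is a hypothetical choice. Check both against the linked guideline and the config file shipped with your version.

```json5
{
  // Fragment to merge into the Zenoh config used with the dev/1.0.0 branch.
  downsampling: [
    {
      // Apply the rule to outgoing messages (assumed field, see lead-in).
      flow: "egress",
      rules: [
        // "0/map/*/*": domain ID 0, topic /map, wildcards for the
        // type name and type hash (no namespace set).
        // freq is the maximum rate in Hz; 1.0 is a hypothetical value.
        { key_expr: "0/map/*/*", freq: 1.0 },
      ],
    },
  ],
}
```

If `ROS_DOMAIN_ID` is not 0 or a namespace is set, the leading part of `key_expr` would need to change accordingly.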