ros-navigation / navigation2

ROS 2 Navigation Framework and System
https://nav2.org/
Other
2.57k stars 1.29k forks source link

Port nav2 systems test to new gazebo #4455

Closed stevedanomodolor closed 1 month ago

stevedanomodolor commented 4 months ago

Bug report

Required Info:

Steps to reproduce issue

Expected behavior

Actual behavior

Additional information


Feature request

Feature description

stevedanomodolor commented 4 months ago

@SteveMacenski, so I am not going crazy, does the wait, backup, drive on heading behavior tests fail once in a while. It does not seem to be stable.

SteveMacenski commented 4 months ago

@SteveMacenski, so I am not going crazy, does the wait, backup, drive on heading behavior tests fail once in a while. It does not seem to be stable.

You're not crazy, they're poorly written and it drives me bananas. I want to rewrite them at some point (#4349) to not try to fake everything out in some crazy way. I'm not really sure why this is the way that it is, I suspect it may have been written before we had any GZ simulations setup in the systems tests to work from. It is my plan to have a new contributor rewrite them as a project, but certainly if anyone else was interested in just deleting them and restarting now with the new GZ and the Simple Commander API, I think it could be done much, much more simply.

But, they should generally speaking work - it shouldn't be like 50%, it would be like ~80% successful, except for the Spin test that I disabled because it was failing ~80% of the time. Wait basically never failed. Backup/Drive on Heading was like 9/10 successful.

I see you have a couple in draft (#4471 #4472), is that why?

stevedanomodolor commented 4 months ago

@SteveMacenski, so I am not going crazy, does the wait, backup, drive on heading behavior tests fail once in a while. It does not seem to be stable.

You're not crazy, they're poorly written and it drives me bananas. I want to rewrite them at some point (#4349) to not try to fake everything out in some crazy way. I'm not really sure why this is the way that it is, I suspect it may have been written before we had any GZ simulations setup in the systems tests to work from. It is my plan to have a new contributor rewrite them as a project, but certainly if anyone else was interested in just deleting them and restarting now with the new GZ and the Simple Commander API, I think it could be done much, much more simply.

But, they should generally speaking work - it shouldn't be like 50%, it would be like ~80% successful, except for the Spin test that I disabled because it was failing ~80% of the time. Wait basically never failed. Backup/Drive on Heading was like 9/10 successful.

I see you have a couple in draft (#4471 #4472), is that why?

Yes thats why they are on draft. I couldn't figure out why they fail.

SteveMacenski commented 4 months ago

Working on the costmap filters right now and the file cleanup

SteveMacenski commented 4 months ago

I completely rewrite the backup/spin/drive on heading tests to remove the flakiness and work with new GZ from your draft PRs. See https://github.com/ros-navigation/navigation2/pull/4515

stevedanomodolor commented 3 months ago

@SteveMacenski Working on the GPS tests, I have migrated it, I am just having an issue with robot localization, it crashes. I am trying to figure out why. It happens after we call the fromLL service with a out of range value

1: [controller_server-5] [INFO] [1721076228.573393372] [controller_server]: Cancellation was successful. Stopping the robot.
1: [controller_server-5] [WARN] [1721076228.573490911] [controller_server]: [follow_path] [ActionServer] Client requested to cancel the goal. Cancelling.
1: [bt_navigator-9] [INFO] [1721076228.573510760] [bt_navigator]: Goal canceled
1: [bt_navigator-9] [WARN] [1721076228.573635926] [bt_navigator]: [navigate_to_pose] [ActionServer] Client requested to cancel the goal. Cancelling.
1: [controller_server-5] [INFO] [1721076228.575679904] [controller_server]: Optimizer reset
1: [tester_node-18] [INFO] [1721076230.533746634] [nav2_gps_waypoint_tester]: Sending goal request...
1: [waypoint_follower-10] [INFO] [1721076230.534812656] [waypoint_follower]: Converting GPS waypoints to map Frame..
1: [navsat_transform_node-17] terminate called after throwing an instance of 'GeographicLib::GeographicErr'
1: [navsat_transform_node-17]   what():  Latitude 35, longitude -118 out of legal range for UTM zone 30
1: [tester_node-18] [INFO] [1721076230.597235718] [nav2_gps_waypoint_tester]: Goal accepted
1: [tester_node-18] [INFO] [1721076230.598689055] [nav2_gps_waypoint_tester]: Waiting for 'follow_gps_waypoints' action to complete
1: [ERROR] [navsat_transform_node-17]: process has died [pid 98044, exit code -6, cmd '/opt/ros/rolling/lib/robot_localization/navsat_transform_node --ros-args -r __node:=navsat_transform --params-file /opt/overlay_ws/src/navigation2/nav2_system_tests/src/gps_navigation/dual_ekf_navsat_params.yaml --params-file /tmp/launch_params_0y7ot46v -r imu/data:=imu/data -r gps/fix:=gps/fix -r gps/filtered:=gps/filtered -r odometry/gps:=odometry/gps -r odometry/filtered:=odometry/global'].
SteveMacenski commented 3 months ago
1: [waypoint_follower-10] [INFO] [1721076230.534812656] [waypoint_follower]: Converting GPS waypoints to map Frame..
1: [navsat_transform_node-17] terminate called after throwing an instance of 'GeographicLib::GeographicErr'
1: [navsat_transform_node-17]   what():  Latitude 35, longitude -118 out of legal range for UTM zone 30

I imagine that means that the GZ generated GPS is outside of what RL is expecting for the UTM zone. Did the other test set something for this like a datum to RL or the SDF for where to say the test is taking place w.r.t. GPS?

SteveMacenski commented 3 months ago

@stevedanomodolor any update on the GPS test :-) The last I heard you were thinking about that but not sure where that landed

stevedanomodolor commented 2 months ago

Lots of holidays, Just came back not long ago. I will continue on this.

stevedanomodolor commented 1 month ago
1: [waypoint_follower-10] [INFO] [1721076230.534812656] [waypoint_follower]: Converting GPS waypoints to map Frame..
1: [navsat_transform_node-17] terminate called after throwing an instance of 'GeographicLib::GeographicErr'
1: [navsat_transform_node-17]   what():  Latitude 35, longitude -118 out of legal range for UTM zone 30

I imagine that means that the GZ generated GPS is outside of what RL is expecting for the UTM zone. Did the other test set something for this like a datum to RL or the SDF for where to say the test is taking place w.r.t. GPS?

No, the goal we send in the test is outside of the UTM zone. The test that fail is this one

set waypoint outside of map

time.sleep(2)
test.setWaypoints([[35.0, -118.0]])
result = test.run(True, False)
assert not result
result = not result
assert (
    test.action_result.missed_waypoints[0].error_code
    == ComputePathToPose.Result().GOAL_OUTSIDE_MAP
)
waypoint_tester]: Sending goal request...
23: [waypoint_follower-10] [INFO] [1726598194.053635890] [waypoint_follower]: Converting GPS waypoints to map Frame..
23: [tester_node-18] [INFO] [1726598194.054799709] [nav2_gps_waypoint_tester]: Goal accepted
23: [waypoint_follower-10] [INFO] [1726598194.054883944] [waypoint_follower]: Converted all 3 GPS waypoint to map frame
23: [waypoint_follower-10] [INFO] [1726598194.054917546] [waypoint_follower]: Received follow waypoint request with 3 waypoints.
23: [tester_node-18] [INFO] [1726598194.055347863] [nav2_gps_waypoint_tester]: Waiting for 'follow_gps_waypoints' action to complete
23: [bt_navigator-9] [INFO] [1726598194.055691408] [bt_navigator]: Begin navigating from current location (0.00, -0.00) to (-0.05, -0.00)
23: [controller_server-5] [INFO] [1726598194.076576708] [controller_server]: Received a goal, begin computing control effort.
23: [controller_server-5] [WARN] [1726598194.076614514] [controller_server]: No goal checker was specified in parameter 'current_goal_checker'. Server will use only plugin loaded general_goal_checker . This warning will appear once.
23: [controller_server-5] [WARN] [1726598194.076625956] [controller_server]: No progress checker was specified in parameter 'current_progress_checker'. Server will use only plugin loaded progress_checker . This warning will appear once.
23: [controller_server-5] [INFO] [1726598194.091044267] [controller_server]: Reached the goal!
23: [controller_server-5] [INFO] [1726598194.092883752] [controller_server]: Optimizer reset
23: [bt_navigator-9] [INFO] [1726598194.116375488] [bt_navigator]: Goal succeeded
23: [parameter_bridge-2] [INFO] [1726598194.169744050] [ros_gz_bridge]: Passing message from ROS geometry_msgs/msg/Twist to Gazebo gz.msgs.Twist (showing msg only once per type)
23: [waypoint_follower-10] [INFO] [1726598194.555098693] [waypoint_follower]: Succeeded processing waypoint 0, processing waypoint task execution
23: [waypoint_follower-10] [INFO] [1726598194.555136075] [waypoint_follower]: Arrived at 0'th waypoint, sleeping for 200 milliseconds
23: [waypoint_follower-10] [INFO] [1726598194.755823117] [waypoint_follower]: Task execution at waypoint 0 succeeded
23: [waypoint_follower-10] [INFO] [1726598194.755887136] [waypoint_follower]: Handled task execution on waypoint 0, moving to next.
23: [bt_navigator-9] [INFO] [1726598194.756499342] [bt_navigator]: Begin navigating from current location (0.00, -0.00) to (-4.92, -1.24)
23: [controller_server-5] [INFO] [1726598194.777316325] [controller_server]: Received a goal, begin computing control effort.
23: [controller_server-5] [INFO] [1726598195.827478752] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598196.827447556] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598197.877443008] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598198.877443427] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598199.927461036] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598200.927454241] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598201.977476905] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598202.977445152] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598204.027445683] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598205.027454522] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598206.077445491] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598207.077452019] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598208.127445717] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598209.127449363] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598210.177444148] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598211.177449104] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598212.227447300] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598213.227448409] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598214.277459332] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598215.277451818] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598216.277446480] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598217.327458637] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598218.327502254] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598219.377446520] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598220.138863892] [controller_server]: Reached the goal!
23: [controller_server-5] [INFO] [1726598220.140583597] [controller_server]: Optimizer reset
23: [bt_navigator-9] [INFO] [1726598220.166755141] [bt_navigator]: Goal succeeded
23: [waypoint_follower-10] [INFO] [1726598220.856126841] [waypoint_follower]: Succeeded processing waypoint 1, processing waypoint task execution
23: [waypoint_follower-10] [INFO] [1726598220.856157011] [waypoint_follower]: Arrived at 1'th waypoint, sleeping for 200 milliseconds
23: [waypoint_follower-10] [INFO] [1726598221.059151889] [waypoint_follower]: Task execution at waypoint 1 succeeded
23: [waypoint_follower-10] [INFO] [1726598221.059196004] [waypoint_follower]: Handled task execution on waypoint 1, moving to next.
23: [bt_navigator-9] [INFO] [1726598221.059717633] [bt_navigator]: Begin navigating from current location (-4.83, -1.18) to (-4.15, -5.28)
23: [controller_server-5] [INFO] [1726598221.080368567] [controller_server]: Received a goal, begin computing control effort.
23: [controller_server-5] [INFO] [1726598222.130507585] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598223.130503712] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598224.180513698] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598225.180507485] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598226.230505963] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598227.230506893] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598228.280513541] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598229.280510268] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598230.330507210] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598231.330505773] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598232.380578643] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598233.380551186] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598234.430506319] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598235.430529290] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598236.480545651] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598237.480538415] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598238.530512050] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598239.530511819] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598240.241422875] [controller_server]: Reached the goal!
23: [controller_server-5] [INFO] [1726598240.242448252] [controller_server]: Optimizer reset
23: [bt_navigator-9] [INFO] [1726598240.269976580] [bt_navigator]: Goal succeeded
23: [waypoint_follower-10] [INFO] [1726598240.959362363] [waypoint_follower]: Succeeded processing waypoint 2, processing waypoint task execution
23: [waypoint_follower-10] [INFO] [1726598240.959393193] [waypoint_follower]: Arrived at 2'th waypoint, sleeping for 200 milliseconds
23: [waypoint_follower-10] [INFO] [1726598241.160896377] [waypoint_follower]: Task execution at waypoint 2 succeeded
23: [waypoint_follower-10] [INFO] [1726598241.160982761] [waypoint_follower]: Handled task execution on waypoint 2, moving to next.
23: [waypoint_follower-10] [INFO] [1726598241.160994432] [waypoint_follower]: Completed all 3 waypoints requested.
23: [tester_node-18] [INFO] [1726598241.167575144] [nav2_gps_waypoint_tester]: Goal succeeded!
23: [tester_node-18] [INFO] [1726598241.168256457] [nav2_gps_waypoint_tester]: Sending goal request...
23: [waypoint_follower-10] [INFO] [1726598241.168643667] [waypoint_follower]: Converting GPS waypoints to map Frame..
23: [waypoint_follower-10] [INFO] [1726598241.168996930] [waypoint_follower]: Converted all 1 GPS waypoint to map frame
23: [waypoint_follower-10] [INFO] [1726598241.169012802] [waypoint_follower]: Received follow waypoint request with 1 waypoints.
23: [bt_navigator-9] [INFO] [1726598241.169311644] [bt_navigator]: Begin navigating from current location (-4.10, -5.20) to (-0.05, -0.00)
23: [tester_node-18] [INFO] [1726598241.184820408] [nav2_gps_waypoint_tester]: Goal accepted
23: [controller_server-5] [INFO] [1726598241.190196778] [controller_server]: Received a goal, begin computing control effort.
23: [controller_server-5] [INFO] [1726598242.240329634] [controller_server]: Passing new path to controller.
23: [tester_node-18] [INFO] [1726598243.185808190] [nav2_gps_waypoint_tester]: Sending goal request...
23: [waypoint_follower-10] [INFO] [1726598243.219093619] [waypoint_follower]: Preempting the goal pose.
23: [waypoint_follower-10] [INFO] [1726598243.219238670] [waypoint_follower]: Converting GPS waypoints to map Frame..
23: [waypoint_follower-10] [INFO] [1726598243.220284021] [waypoint_follower]: Converted all 1 GPS waypoint to map frame
23: [bt_navigator-9] [INFO] [1726598243.220644659] [bt_navigator]: Received goal preemption request
23: [bt_navigator-9] [INFO] [1726598243.220822927] [bt_navigator]: Begin navigating from current location (-4.04, -5.17) to (-4.92, -1.24)
23: [tester_node-18] [INFO] [1726598243.222097874] [nav2_gps_waypoint_tester]: Goal accepted
23: [controller_server-5] [INFO] [1726598243.240597103] [controller_server]: Passing new path to controller.
23: [controller_server-5] [INFO] [1726598244.290329411] [controller_server]: Passing new path to controller.
23: [waypoint_follower-10] [WARN] [1726598245.269532364] [waypoint_follower]: [follow_gps_waypoints] [ActionServer] Client requested to cancel the goal. Cancelling.
23: [bt_navigator-9] [ERROR] [1726598245.289996794] [bt_navigator_navigate_to_pose_rclcpp_node]: Failed to get result for follow_path in node halt!
23: [bt_navigator-9] [WARN] [1726598245.290220688] [bt_navigator]: [navigate_to_pose] [ActionServer] Client requested to cancel the goal. Cancelling.
23: [bt_navigator-9] [INFO] [1726598245.290369384] [bt_navigator]: Goal canceled
23: [controller_server-5] [INFO] [1726598245.290324155] [controller_server]: Cancellation was successful. Stopping the robot.
23: [controller_server-5] [WARN] [1726598245.290365719] [controller_server]: [follow_path] [ActionServer] Client requested to cancel the goal. Cancelling.
23: [controller_server-5] [INFO] [1726598245.291095699] [controller_server]: Optimizer reset
23: [tester_node-18] [INFO] [1726598247.236768575] [nav2_gps_waypoint_tester]: Sending goal request...
23: [waypoint_follower-10] [INFO] [1726598247.238026552] [waypoint_follower]: Converting GPS waypoints to map Frame..
23: [navsat_transform_node-17] terminate called after throwing an instance of 'GeographicLib::GeographicErr'
23: [navsat_transform_node-17]   what():  Latitude 35, longitude -118 out of legal range for UTM zone 30
23: [tester_node-18] [INFO] [1726598247.254926124] [nav2_gps_waypoint_tester]: Goal accepted
23: [tester_node-18] [INFO] [1726598247.255442004] [nav2_gps_waypoint_tester]: Waiting for 'follow_gps_waypoints' action to complete
23: [ERROR] [navsat_transform_node-17]: process has died [pid 54129, exit code -6, cmd '/opt/ros/rolling/lib/robot_localization/navsat_transform_node --ros-args -r __node:=navsat_transform --params-file /opt/overlay_ws/src/navigation2/nav2_system_tests/src/gps_navigation/dual_ekf_navsat_params.yaml --params-file /tmp/launch_params_gf2s71vx -r imu/data:=imu/data -r gps/fix:=gps/fix -r gps/filtered:=gps/filtered -r odometry/gps:=odometry/gps -r odometry/filtered:=odometry/global'].
1/1 Test #23: test_gps_waypoint_follower .......***Timeout 180.13 sec
stevedanomodolor commented 1 month ago

I suspect this might actually be the fix for it https://github.com/cra-ros-pkg/robot_localization/pull/886. Has a new version of robot localization been release yet as a debian package? I would try with the latest ros2 and see if it solves the problem.

SteveMacenski commented 1 month ago

No, not recently. If you clone and use ros2 branch, does that fix the issue? If so, I can prioritize a release.

stevedanomodolor commented 1 month ago

I am trying it now. I will let you know if it solves it as soon i get a result.

stevedanomodolor commented 1 month ago

The current solution resolves the original issue, but it introduces a new problem, which I'm unsure is the intended behavior. When the "out of range" error occurs, the system returns the default values from the fromll ROS2 service (used for coordinate conversion). In this case, those default values are zero, causing the robot to navigate to coordinates (0,0). The main issue is that the fromll service doesn't differentiate between a successful conversion and an "out of range" scenario.

The root cause of the "out of range" error stems from the fact that when the fromll service is called, it assumes that the UTM zone is the same as the one initially set when the robot started the test. Since this UTM zone is never updated throughout the test, it leads to issues later on. Now that I reflect on it, this is essentially what you had pointed out earlier.

1: [waypoint_follower-10] [INFO] [1721076230.534812656] [waypoint_follower]: Converting GPS waypoints to map Frame..
1: [navsat_transform_node-17] terminate called after throwing an instance of 'GeographicLib::GeographicErr'
1: [navsat_transform_node-17]   what():  Latitude 35, longitude -118 out of legal range for UTM zone 30

I imagine that means that the GZ generated GPS is outside of what RL is expecting for the UTM zone. Did the other test set something for this like a datum to RL or the SDF for where to say the test is taking place w.r.t. GPS?

stevedanomodolor commented 1 month ago

To solve the problem, we need to set the datum every time we cross into a new UTM zone. I changed the test point for a point inside the utm zone but outside the map and it works. However, I find it a bit confusing that we need to handle this each time we call fromll because, in most cases, we don’t know when we’re switching UTM zones. Ideally, when calling fromll, the conversion should be independent of the UTM zone, much like how it functions within the gpsFixCallback method. I opened an issue in localization for this but I don't know how useful this would be given the fact that the repo will be deprecated soon but anyways.

stevedanomodolor commented 1 month ago

No, not recently. If you clone and use ros2 branch, does that fix the issue? If so, I can prioritize a release.

Not necessary, it is nice to have to though because if someone tries to call the fromll with a value outside the utm zone set initially by the robot, RL will crash. Setting a value inside the utm zone but outside the map solved the problem in this case.

stevedanomodolor commented 1 month ago

@SteveMacenski prs are open

SteveMacenski commented 1 month ago

@stevedanomodolor can we simply update the test so that all the points are within the same UTM zone? At the end of the day, this test is for Nav2's use of GPS waypoint navigation, not testing R_L's edge cases

stevedanomodolor commented 1 month ago

@SteveMacenski it is indeed. Without that it would not have worked hence my previous comment. All i had to do was that, put all the points in the same utm zone.