osrf / rmf_core

Provides the centralized functions of RMF: scheduling, etc.
Apache License 2.0

full_control fleet adapter crashes when adding new robot #315

Closed RValner closed 3 years ago

RValner commented 3 years ago

Hi,

I'm attempting to integrate rmf with free_fleet and initially I just want to get the basic pipeline working. Demos in rmf_demos and ff_examples individually build and work fine. But when I tried to combine them, i.e., control the simulated turtlebot (via free_fleet demos) instead of the tinyRobot from the rmf_demos package, the full_control fleet adapter crashes as soon as the robot is registered.

I'm on Ubuntu 20.04 and the configuration of the demo pipeline is (launch files in the links): turtlebot_sim (ROS Noetic) -> free_fleet_client (ROS Noetic) -> free_fleet_server (ROS Foxy) -> rmf_core (ROS Foxy)

If I run the full_control node in GDB, it reports that an uninitialized pointer is being accessed in the FleetUpdateHandle::Implementation::get_nearest_charger method (or more precisely, the pointer in here).

More detailed backtrace here.

It may well be that the configurations in some of the above-mentioned launch files simply do not make sense, but I'm out of ideas on what leads to this crash.

aaronchongth commented 3 years ago

Hello there @RValner!

Just a quick check, did you create and pass in a navigation graph of the turtlebot3 world for the fleet adapter? The examples in ff_examples do not come with their own navigation graph, which is required by the fleet adapters.

RValner commented 3 years ago

Hi, yes, I did generate a navigation graph via ros2 run building_map_tools building_map_generator nav tb3_world.building.yaml .. The source files and the resulting graph (0.yaml) can be seen here and here's the launch file that passes the graph to the fleet adapter.

mxgrey commented 3 years ago

Since it's crashing in the get_nearest_charger function, is it possible that your navigation graph is missing a charging waypoint, and that's leading to the crash?

If that's the issue, we should have the fleet adapter do something more sensible, like throw an exception with a clear explanation that the user is missing a charging point.

RValner commented 3 years ago

I have configured one of the navigation graph locations as a charger waypoint (shown here) but unfortunately that did not change the error. Just to be clear, the nav graph is also copied to the install space during colcon build, so ros2 launch is fetching the up-to-date version of the navgraph.

Yadunund commented 3 years ago

Hi @RValner,

I think the reason for your segmentation fault is an empty StartSet passed into FleetUpdateHandle::add_robot. As a result, accessing start[0] inside the get_nearest_charger() method results in a crash. The StartSet is computed in the full_control implementation here using the compute_plan_starts() method.

There are a couple of scenarios in which the compute_plan_starts() method would return an empty StartSet.

  1. The level_name (variable state.location.level_name) as reported in the RobotState/FleetState message does not match the level name in the supplied navigation graph.
  2. The reported location (variable {l.x, l.y, l.yaw}) of the robot in the published RobotState/FleetState message is nowhere near any of the waypoints/lanes in the navigation graph. This could happen if your robot is not spawned near a navigation graph or if your coordinate transformation between the turtlebot and RMF frames is incorrect.

So could you check that these two points are valid? Kindly perform a sanity check of the RobotState/FleetState message published by free_fleet.
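For illustration, the two scenarios can be told apart with a small standalone check like the one below. This is only a rough sketch: Waypoint, LocationCheck and check_reported_location are hypothetical names made up for this example, not rmf_core or free_fleet types, and the check only looks at waypoints (not lanes), so it is stricter than what compute_plan_starts() actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <string>
#include <vector>

// Hypothetical, simplified waypoint record (not the rmf_traffic graph type).
struct Waypoint
{
  std::string level_name;
  double x;
  double y;
};

enum class LocationCheck
{
  Ok,
  UnknownLevel,    // scenario 1: no waypoint uses the reported level name
  TooFarFromGraph  // scenario 2: nearest waypoint on that level is out of range
};

// Classifies a reported robot location against the waypoints of a graph.
// max_distance is an assumed merge radius in meters.
LocationCheck check_reported_location(
  const std::vector<Waypoint>& waypoints,
  const std::string& level_name,
  double x,
  double y,
  double max_distance)
{
  assert(max_distance >= 0.0);

  bool level_found = false;
  double nearest = std::numeric_limits<double>::infinity();
  for (const auto& wp : waypoints)
  {
    // Level names are compared exactly, so "L1" and "l1" do not match.
    if (wp.level_name != level_name)
      continue;

    level_found = true;
    nearest = std::min(nearest, std::hypot(wp.x - x, wp.y - y));
  }

  if (!level_found)
    return LocationCheck::UnknownLevel;

  return nearest <= max_distance
    ? LocationCheck::Ok
    : LocationCheck::TooFarFromGraph;
}
```

Feeding it the level name and the already transformed x/y from a RobotState message, together with the waypoints from the nav graph yaml, should show which of the two scenarios applies.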

Additionally could you try computing the StartSet outside the fleet->add_robot() call here? Then you can check whether the result is really empty. If so, we would need to modify the full_control implementation to throw an error when this is the case.
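Something along these lines is what I have in mind for the guard. It is a minimal sketch with stand-in types: Start, StartSet and first_start_or_throw are hypothetical and only mimic the shape of the real plan start type, so the actual change in full_control would have to use the result of compute_plan_starts() instead.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a single plan start (not the rmf_traffic type).
struct Start
{
  std::size_t waypoint;  // index of the graph waypoint the robot starts from
  double orientation;    // initial yaw in radians
};

using StartSet = std::vector<Start>;

// Returns the first start, or throws a descriptive error instead of letting
// a later starts[0] access read past the end of an empty vector.
const Start& first_start_or_throw(
  const StartSet& starts,
  const std::string& robot_name,
  const std::string& level_name)
{
  assert(!robot_name.empty());

  if (starts.empty())
  {
    throw std::runtime_error(
      "No plan start found for robot [" + robot_name + "] on level ["
      + level_name + "]: check the reported level_name and the "
      "coordinate transformation against the navigation graph");
  }

  return starts.front();
}
```

With a check like this in front of fleet->add_robot(), a misconfigured client would produce a readable error for that one robot rather than a segfault of the whole adapter.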

RValner commented 3 years ago

Thanks @Yadunund, you're right, both the level_name in the free_fleet client launch file and the transformation in the server launch file were incorrect. It seems that if either of these is ill-defined, the full_control fleet adapter will segfault. I guess this crash is still probably not the intended behaviour, as any faulty free_fleet client could potentially crash the adapter this way. But I got my problem solved and I'll close the issue.

Thanks a lot guys!