Ekumen-OS / beluga

A general implementation of Monte Carlo Localization (MCL) algorithms written in C++17, and a ROS package that can be used in ROS 1 and ROS 2.
https://ekumen-os.github.io/beluga/
Apache License 2.0

The nav2_amcl version present in the humble container crashes replaying the perfect odometry bagfile #253

Open glpuga opened 1 year ago

glpuga commented 1 year ago

Bug description

Running the perfect odometry bagfile against nav2 amcl in the Humble development container causes amcl to crash early in the bagfile.

This is an issue for us because it prevents performance measurements from being captured for this node, which leaves us without a meaningful comparison baseline for beluga.

Building nav2_amcl from source, using the current HEAD of navigation2's repository, runs just fine. The problem is therefore probably within amcl itself and already fixed upstream, but the specific change that fixes it has not been identified.

Platform (please complete the following information):

How to reproduce

  1. Go to the beluga working copy
  2. ./docker/run --build
  3. Once in the container: colcon build && . install/setup.bash && ros2 launch beluga_example perfect_odometry.launch.xml localization_package:=nav2_amcl localization_node:=amcl

Expected behavior

AMCL should be able to produce localization data from the bagfile replay.

Actual behavior

AMCL crashes. When capturing performance data, no data is captured at all.

Additional context

This issue has been present since early June or so, but it was initially thought to be caused by #232 and other problems in the bagfiles.

The stack dump of the crash can be obtained with:

sudo apt update && sudo apt install xterm -y
colcon build
. install/setup.bash
ros2 launch beluga_example perfect_odometry.launch.xml localization_package:=nav2_amcl localization_node:=amcl localization_prefix:="xterm -e gdb --args "

Once in xterm, start the amcl process with gdb's run command and wait until it crashes. Then use the backtrace command (bt for short) to print the stack.

Screenshot from 2023-09-02 18-36-45
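
For unattended runs, the same stack dump could in principle be captured without typing into gdb by hand. The sketch below only builds a launch prefix string from standard gdb options (-batch, -ex, --args). It assumes that localization_prefix accepts this string the same way it accepts the xterm variant above, and that gdb's output ends up in the launch log. Neither assumption has been verified here, and gdb_batch_prefix is a hypothetical helper name.

```shell
#!/bin/sh
# Build a launch prefix that runs the node under gdb without interaction:
#   -batch   exit once all commands have been processed
#   -ex run  start the program
#   -ex bt   print the backtrace after the program stops
gdb_batch_prefix() {
  printf '%s' 'gdb -batch -ex run -ex bt --args '
}

# Hypothetical usage, reusing the launch arguments from the report above:
# ros2 launch beluga_example perfect_odometry.launch.xml \
#   localization_package:=nav2_amcl localization_node:=amcl \
#   localization_prefix:="$(gdb_batch_prefix)"
```

The xterm wrapper is dropped on purpose: with -batch, gdb exits as soon as it has printed the backtrace, so a terminal window would close before the output could be read.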

glpuga commented 1 year ago

This is somehow related to the launchfiles being broken and loading the wrong param file, see #254. When the right param file is loaded, nav2_amcl works fine.

The failure may then be related to some value in the default param yaml file.

nahueespinosa commented 1 year ago

This might be related to https://github.com/Ekumen-OS/beluga/pull/238#issuecomment-1629892401. I think we agreed not to use Humble to benchmark.

Most likely, adaptive recovery is enabled in the default param yaml file, which causes the crash. We do need to fix the launch files.
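
A quick way to check that hypothesis against a given param file is sketched below. In nav2_amcl, adaptive recovery is controlled by the recovery_alpha_fast and recovery_alpha_slow parameters, and a value of 0.0 disables it. The helper name adaptive_recovery_enabled is hypothetical, and the script assumes plain "key: value" lines rather than parsing YAML properly.

```shell
#!/bin/sh
# Succeed (exit status 0) if the given params file sets a non-zero
# recovery_alpha_fast or recovery_alpha_slow value, fail otherwise.
# Assumes simple "key: value" lines; this is not a full YAML parser.
adaptive_recovery_enabled() {
  awk -F: '
    /^[ \t]*recovery_alpha_(fast|slow)[ \t]*:/ {
      value = $2
      sub(/#.*/, "", value)     # drop any trailing comment
      gsub(/[ \t]/, "", value)  # drop surrounding whitespace
      if (value + 0 != 0) found = 1
    }
    END { exit found ? 0 : 1 }
  ' "$1"
}

# Hypothetical usage (the path is a placeholder, not a file from this repo):
# if adaptive_recovery_enabled path/to/params.yaml; then
#   echo "adaptive recovery is enabled"
# fi
```

A file that does not mention either parameter is reported as disabled, which matches the nav2_amcl defaults of 0.0 for both.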

glpuga commented 1 year ago

> This might be related to https://github.com/Ekumen-OS/beluga/pull/238#issuecomment-1629892401. I think we agreed not to use Humble to benchmark.

Thanks for the pointer. I had totally forgotten about that. Fixing the launchfile should prevent this from happening, at least for the benchmark runs.

glpuga commented 1 year ago

For easier future reference, this is the root issue that needs fixing in Humble:

hidmic commented 1 year ago

Pinged the maintainer to backport https://github.com/ros-planning/navigation2/pull/3315#issuecomment-1790573075. We'll see when and if we get a response.

hidmic commented 12 months ago

Backported https://github.com/ros-planning/navigation2/pull/3938 ! We should be in the clear in the next package sync.

hidmic commented 8 months ago

Well, no, the Nav2 folks did not release in time for https://discourse.ros.org/t/preparing-for-humble-sync-2023-12-15/35091.