ros-swg / turtlebot3_demo

Repository to build and test Turtlebot3 packages
Apache License 2.0
20 stars 6 forks source link

QoS issues over scan topic with security enabled #12

Open ruffsl opened 5 years ago

ruffsl commented 5 years ago

It seems that amcl is still having issues with dropping scan messages when security is enabled. That, or the tf lookup is having to look into the future for transforms between scan and baselink:

``` [world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1 timeout was 1. [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 31.801 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 39.802 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 40.002 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 32.001 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 32.201 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 40.202 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 32.401 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 40.402 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 32.601 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 40.602 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 32.801 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 40.801 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.098 timeout was 1. [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 33.001 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 41.002 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 33.201 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 41.202 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 41.402 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 33.401 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 41.602 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 33.601 for reason(0) [amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 41.802 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 33.801 for reason(0) [world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.002 timeout was 1. ```
``` [dwb_controller-5] [INFO] [local_costmap.local_costmap]: Timed out waiting for transform from base_link to odom to become available, tf error: Could not find a connection between 'odom' and 'base_link' because they are not part of the same tree.Tf has two or more unconnected trees.. canTransform returned after 1.099 timeout was 1. [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 38.401 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 38.601 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 38.800 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 39.001 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 39.201 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap]: Timed out waiting for transform from base_link to odom to become available, tf error: Could not find a connection between 'odom' and 'base_link' because they are not part of the same tree.Tf has two or more unconnected trees.. canTransform returned after 1.098 timeout was 1. [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 39.401 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 39.601 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 39.801 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 40.001 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 40.201 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 40.401 for reason(0) [dwb_controller-5] [INFO] [local_costmap.local_costmap]: Timed out waiting for transform from base_link to odom to become available, tf error: Could not find a connection between 'odom' and 'base_link' because they are not part of the same tree.Tf has two or more unconnected trees.. canTransform returned after 1.099 timeout was 1. ```

As of b0a41cbb9693b158507fd5768787508af9420cc8 , I've updated the readme with some simple instructions, so please check if you can get the navigation planner to move when using the secure.conf. You can also revert 3a1562c8055dc7dbe47e319ed038a4adccd0e9c0 if you'd like rebuild the dockerfile using osrf/ros2:nightly, but I'm not sure if rviz2 is fixed yet: https://github.com/ros-planning/navigation2/issues/1015#issuecomment-525890257 .

crdelsey commented 5 years ago

We get the Drop message: frame 'base_scan' at time 39.802 for reason(0) messages even without security, although, too many of them is correlated with failure.

Here's what I know.

This message comes from the message filter code in tf. It receives the scan messages, but ends up dropping them out of the buffer before it can transform them and pass them on to the stack. I think we'd see this message if amcl and costmap were not getting the right transforms or were not able to process them fast enough.

There also were some race conditions in the most recent implementation of this code that @jacobperron fixed. There might still be some other problems

ruffsl commented 4 years ago

I've opened a PR https://github.com/ros-swg/turtlebot3_demo/pull/13 to switch to using dashing release for everything but the newer SROS2 CLI, and if I spam the set initial pose using the script, it seems to work eventually. So I'm thinking this might be a discovery issue, where the subscribers for initial pose don't immediately find the publisher given the initial burst of discovery traffic, missing that adversized by rviz. So by using the ros2 cli to topic publish the pose, eventually upon enough tries in creating/destroying the cli node, the initial pose message finds its mark.

ruffsl commented 4 years ago

Looks like the amcl and costmap nodes are not subscribing correctly, looking at it from the graph API: (Cropping nodes with no unhidden topics from screenshots in node view in rqt https://github.com/ros-swg/turtlebot3_demo/pull/22)

image

Working Unsecure demo

Unlocalized

image

Localized

image

Flaky Secure demo

Unlocalized

image