SteveMacenski / spatio_temporal_voxel_layer

A new voxel layer leveraging modern 3D graphics tools to modernize navigation environmental representations
http://wiki.ros.org/spatio_temporal_voxel_layer
GNU Lesser General Public License v2.1
640 stars 189 forks source link

[ROS2] costmap does not clear when voxels are cleared #187

Open tal-grossman opened 3 years ago

tal-grossman commented 3 years ago

Bug report

I'm new in using the the stvl layer (b.t.w - another great job!), and might be doing something wrong. But according to the tutorials and to the answer given here Spatio temporal voxel layer: Clearing voxels not seen but included in fov - which explicitly saying that if a voxel is removed than the costmap should be updated as well , I think there might be a sync issue - when voxels are cleared after decaying but the costmap is not.

please look at the video - voxels are green, taken from depth pcl. The 2D local costmap layer is pink.

https://user-images.githubusercontent.com/54407092/103073891-1b187080-45d1-11eb-8a0e-a9d6b47d882f.mp4

Required Info:

Steps to reproduce issue

Expected behavior

costmap is synced to stvl layer. when voxeled are cleared the costmap should be updated accordingly

Actual behavior

After voxels decayed and cleared from layer the costmap is not totally cleared

Additional information

SteveMacenski commented 3 years ago

voxel_min_points: 8

this is a new user-contributed feature, so that’s suspect to me to consider the source of the issue. I haven’t seen that behavior before with a similar configuration on my robots. If you try the default file for standard env, that’s exactly what I use, I’d be interested if you see it there.

If the internal voxel grid decays them all, the 2D costmap should definitely fully-clear. I’d venture to guess that the new feature addition wasn’t fully thought through and I didn’t catch it.

tal-grossman commented 3 years ago

I tried both.

got the same issue

https://user-images.githubusercontent.com/54407092/103175303-a6179600-4871-11eb-9fee-ed106a9ca537.mp4

SteveMacenski commented 3 years ago

I'd recommend trying on a commit hash after the voxel min was added to validate that that is indeed the issue

tal-grossman commented 3 years ago

I'd recommend trying on a commit hash after the voxel min was added to validate that that is indeed the issue

I am not sure which commit hash did you mean but I tried

They all got the same undesired behavior, So I can't say that this new user-contributed feature is the source of the issue

SteveMacenski commented 3 years ago

Ok. I’m back to work on Monday and I can take a look at this sometime next week. A minimum reproduction example would be helpful if you have a short dataset you can share with that config file.

tal-grossman commented 3 years ago

After some more digging trying to find better possible causes for that behavior we think we might find something.

local_costmap:
  local_costmap:
    ros__parameters:
      update_frequency: 5.0
      publish_frequency: 2.0
      global_frame: odom
      robot_base_frame: base_link
      use_sim_time: True
      rolling_window: true
      width: 3
      height: 3
      resolution: 0.05
      robot_radius: 0.22
      plugins: ["rgbd_obstacle_layer", "inflation_layer"]
      inflation_layer:
        plugin: "nav2_costmap_2d::InflationLayer"
        cost_scaling_factor: 3.0
        inflation_radius: 0.08
      rgbd_obstacle_layer:
        plugin: "spatio_temporal_voxel_layer/SpatioTemporalVoxelLayer"
        enabled:                  true
        voxel_decay:              10.0  # seconds if linear, e^n if exponential
        decay_model:              0     # 0=linear, 1=exponential, -1=persistent
        voxel_size:               0.1  # meters
        track_unknown_space:      true  # default space is known
        max_obstacle_height:      2.0   # meters
        unknown_threshold:        15    # voxel height
        mark_threshold:           0     # voxel height
        update_footprint_enabled: true
        combination_method:       1     # 1=max, 0=override
        origin_z:                 0.0   # meters
        publish_voxel_map:        true # default off
        transform_tolerance:      0.2   # seconds
        mapping_mode:             false # default off, saves map not for navigation
        map_save_duration:        60.0  # default 60s, how often to autosave
        observation_sources:      rgbd1_mark
        rgbd1_mark:
          data_type: PointCloud2
          topic: /intel_realsense_r200_depth/points
          marking: true
          clearing: false
          obstacle_range: 3.0          # meters
          min_obstacle_height: 0.3     # default 0, meters
          max_obstacle_height: 2.0     # default 3, meters
          expected_update_rate: 0.0    # default 0, if not updating at this rate at least, remove from buffer
          observation_persistence: 0.0 # default 0, use all measurements taken during now-value, 0=latest
          inf_is_valid: false          # default false, for laser scans
          filter: "passthrough"        # default passthrough, apply "voxel", "passthrough", or no filter to sensor data, recommend on
          voxel_min_points: 0          # default 0, minimum points per voxel for voxel filter
          clear_after_reading: true    # default false, clear the buffer after the layer gets readings from it
        rgbd1_clear:
          data_type: PointCloud2
          topic: /intel_realsense_r200_depth/points
          marking: false
          clearing: true
          max_z: 7.0                  # default 0, meters
          min_z: 0.1                  # default 10, meters
          vertical_fov_angle: 0.8745  # default 0.7, radians
          horizontal_fov_angle: 1.048 # default 1.04, radians
          decay_acceleration: 5.0     # default 0, 1/s^2. If laser scanner MUST be 0
          model_type: 0                # default 0, model type for frustum. 0=depth camera, 1=3d lidar like VLP16 or similar
  local_costmap_client:
    ros__parameters:
      use_sim_time: True
  local_costmap_rclcpp_node:
    ros__parameters:
      use_sim_time: True

https://user-images.githubusercontent.com/54407092/104180976-afc6ff00-5416-11eb-9ca2-f7d1bb7f95ad.mp4 rviz configuration: rviz_config_stvl_tb3.zip

The bottom line is that we are not sure it is a STVL issue but rather a nav2_costmap issue. any thoughts?

SteveMacenski commented 3 years ago

The inflation stuff should have nothing to do with STVL. The layers in the costmap don't impact each other except for the final pixel assemblies for which to take: max or last, which you're using combination method max so that's not an issue. I think what you're seeing isn't an actual trend that generalizes.

voxel_size: 0.1 # meters

Have you tried making this match your actual costmap resolution? Maybe your actual issue is raytracing-based. It's really bizarre to me that your STVL resolution doesn't match the master costmap's resolution. I see the lethal-marked voxels on the costmap with space between them like its a multiple of the costmap resolution.

That video is strange because I think that light blue color left over isn't actually from STVL. STVL doesn't do "some cost" obstacles, its either lethal or its not. So I don't think that's actually (maybe?) cause by STVL if I'm seeing that correctly as a non-lethal cost. Maybe a weird interaction in the inflation layer not really handling layer data exported that don't match the costmap resolution?

tal-grossman commented 3 years ago

I tried running the tb3 simulation again but with matching the voxel_size to the costmap resolution (both 0.05) and had no inflation layer and got the same bad result, see params and attached video:

local_costmap:
  local_costmap:
    ros__parameters:
      always_send_full_costmap: True
      update_frequency: 5.0
      publish_frequency: 2.0
      global_frame: odom
      robot_base_frame: base_link
      use_sim_time: True
      rolling_window: true
      width: 3
      height: 3
      resolution: 0.05
      robot_radius: 0.22
      plugins: ["rgbd_obstacle_layer"]
      inflation_layer:
        plugin: "nav2_costmap_2d::InflationLayer"
        cost_scaling_factor: 3.0
        inflation_radius: 0.1
      rgbd_obstacle_layer:
        plugin: "spatio_temporal_voxel_layer/SpatioTemporalVoxelLayer"
        enabled:                  true
        voxel_decay:              10.0  # seconds if linear, e^n if exponential
        decay_model:              0     # 0=linear, 1=exponential, -1=persistent
        voxel_size:               0.05  # meters
        track_unknown_space:      true  # default space is known
        max_obstacle_height:      2.0   # meters
        unknown_threshold:        15    # voxel height
        mark_threshold:           0     # voxel height
        update_footprint_enabled: true
        combination_method:       1     # 1=max, 0=override
        origin_z:                 0.0   # meters
        publish_voxel_map:        true # default off
        transform_tolerance:      0.2   # seconds
        mapping_mode:             false # default off, saves map not for navigation
        map_save_duration:        60.0  # default 60s, how often to autosave
        observation_sources:      rgbd1_mark
        rgbd1_mark:
          data_type: PointCloud2
          topic: /intel_realsense_r200_depth/points
          marking: true
          clearing: false
          obstacle_range: 3.0          # meters
          min_obstacle_height: 0.3     # default 0, meters
          max_obstacle_height: 2.0     # default 3, meters
          expected_update_rate: 0.0    # default 0, if not updating at this rate at least, remove from buffer
          observation_persistence: 0.0 # default 0, use all measurements taken during now-value, 0=latest
          inf_is_valid: false          # default false, for laser scans
          filter: "passthrough"        # default passthrough, apply "voxel", "passthrough", or no filter to sensor data, recommend on
          voxel_min_points: 0          # default 0, minimum points per voxel for voxel filter
          clear_after_reading: true    # default false, clear the buffer after the layer gets readings from it
        rgbd1_clear:
          data_type: PointCloud2
          topic: /intel_realsense_r200_depth/points
          marking: false
          clearing: true
          max_z: 7.0                  # default 0, meters
          min_z: 0.1                  # default 10, meters
          vertical_fov_angle: 0.8745  # default 0.7, radians
          horizontal_fov_angle: 1.048 # default 1.04, radians
          decay_acceleration: 5.0     # default 0, 1/s^2. If laser scanner MUST be 0
          model_type: 0                # default 0, model type for frustum. 0=depth camera, 1=3d lidar like VLP16 or similar
  local_costmap_client:
    ros__parameters:
      use_sim_time: True
  local_costmap_rclcpp_node:
    ros__parameters:
      use_sim_time: True

https://user-images.githubusercontent.com/54407092/104302576-a2227f80-54d1-11eb-96d5-2d1877570f5d.mp4

than, I added the inflation layer with inflation_radius: 0.1 (so its costmap resolution by factor of 2) and got GOOD expected result, see params and attached video:

local_costmap:
  local_costmap:
    ros__parameters:
      always_send_full_costmap: True
      update_frequency: 5.0
      publish_frequency: 2.0
      global_frame: odom
      robot_base_frame: base_link
      use_sim_time: True
      rolling_window: true
      width: 3
      height: 3
      resolution: 0.05
      robot_radius: 0.22
      plugins: ["rgbd_obstacle_layer", "inflation_layer"]
      inflation_layer:
        plugin: "nav2_costmap_2d::InflationLayer"
        cost_scaling_factor: 3.0
        inflation_radius: 0.1
      rgbd_obstacle_layer:
        plugin: "spatio_temporal_voxel_layer/SpatioTemporalVoxelLayer"
        enabled:                  true
        voxel_decay:              10.0  # seconds if linear, e^n if exponential
        decay_model:              0     # 0=linear, 1=exponential, -1=persistent
        voxel_size:               0.05  # meters
        track_unknown_space:      true  # default space is known
        max_obstacle_height:      2.0   # meters
        unknown_threshold:        15    # voxel height
        mark_threshold:           0     # voxel height
        update_footprint_enabled: true
        combination_method:       1     # 1=max, 0=override
        origin_z:                 0.0   # meters
        publish_voxel_map:        true # default off
        transform_tolerance:      0.2   # seconds
        mapping_mode:             false # default off, saves map not for navigation
        map_save_duration:        60.0  # default 60s, how often to autosave
        observation_sources:      rgbd1_mark
        rgbd1_mark:
          data_type: PointCloud2
          topic: /intel_realsense_r200_depth/points
          marking: true
          clearing: false
          obstacle_range: 3.0          # meters
          min_obstacle_height: 0.3     # default 0, meters
          max_obstacle_height: 2.0     # default 3, meters
          expected_update_rate: 0.0    # default 0, if not updating at this rate at least, remove from buffer
          observation_persistence: 0.0 # default 0, use all measurements taken during now-value, 0=latest
          inf_is_valid: false          # default false, for laser scans
          filter: "passthrough"        # default passthrough, apply "voxel", "passthrough", or no filter to sensor data, recommend on
          voxel_min_points: 0          # default 0, minimum points per voxel for voxel filter
          clear_after_reading: true    # default false, clear the buffer after the layer gets readings from it
        rgbd1_clear:
          data_type: PointCloud2
          topic: /intel_realsense_r200_depth/points
          marking: false
          clearing: true
          max_z: 7.0                  # default 0, meters
          min_z: 0.1                  # default 10, meters
          vertical_fov_angle: 0.8745  # default 0.7, radians
          horizontal_fov_angle: 1.048 # default 1.04, radians
          decay_acceleration: 5.0     # default 0, 1/s^2. If laser scanner MUST be 0
          model_type: 0                # default 0, model type for frustum. 0=depth camera, 1=3d lidar like VLP16 or similar
  local_costmap_client:
    ros__parameters:
      use_sim_time: True
  local_costmap_rclcpp_node:
    ros__parameters:
      use_sim_time: True

https://user-images.githubusercontent.com/54407092/104303530-dcd8e780-54d2-11eb-89e7-7e43ee1f56b4.mp4

Since we have the same issue without the inflation layer but got expected result with inflation layer with a certain inflation radius I still suspect an issue with the nav2_costmap.

SteveMacenski commented 3 years ago

I cannot emphasize enough how much I don't think the layers interact at all unless its an issue with the master costmap itself. This is possible, but doesn't seem like the most likely candidate.

I think maybe a good next step is to publish the STVL 2d costmap to visualize so you can compare against the master costmap here and that one. That way when you see things left over, you can see if STVL state is wrong or the costmap layer isn't being updated correctly. Maybe a bounds issue?

I'm also vaguely wondering if this is an rviz issue too where the /updates topic is dropping a message so the diffs are wrong. What happens if you send as https://github.com/ros-planning/navigation2/blob/053d0c8cc96553f951d21be843f86099dba627d5/nav2_costmap_2d/src/costmap_2d_ros.cpp#L286 send full costmap as true?

This just seems like such a bizarre issue to not actually be caused by STVL (or if STVL, where it came from out of the blue if not caused by the only recent change with the min voxel count)

tal-grossman commented 3 years ago

I cannot emphasize enough how much I don't think the layers interact at all unless its an issue with the master costmap itself. This is possible, but doesn't seem like the most likely candidate.

This just seems like such a bizarre issue to not actually be caused by STVL (or if STVL, where it came from out of the blue if not caused by the only recent change with the min voxel count)

As I said, I do not think it is necessarily a STVL issue and it seems more sense that it is a "master" nav2_costmap issue, due to the impact the costmap's resolution and other layers (inflation) have on this bug. regarding the recent change with the min voxel count, tried it already with 0 and with commits before this change.

I think maybe a good next step is to publish the STVL 2d costmap to visualize so you can compare against the master costmap here and that one. That way when you see things left over, you can see if STVL state is wrong or the costmap layer isn't being updated correctly. Maybe a bounds issue?

Can you elaborate on how should we do this? should be possible as I can easily reprocuce it with the tb3 simulation

I'm also vaguely wondering if this is an rviz issue too where the /updates topic is dropping a message so the diffs are wrong.

I dont believe it is an rviz issue since my actual robot would stuck in these ghost obstacles and the nav_to_goal would fail.

What happens if you send as https://github.com/ros-planning/navigation2/blob/053d0c8cc96553f951d21be843f86099dba627d5/nav2_costmap_2d/src/costmap_2d_ros.cpp#L286 send full costmap as true?

as you can see from the params we already had this one set to "true". did you mean something else?

SteveMacenski commented 3 years ago

Can you elaborate on how should we do this? should be possible as I can easily reprocuce it with the tb3 simulation

When we populate the master grid (this area of the code) https://github.com/SteveMacenski/spatio_temporal_voxel_layer/blob/foxy-devel/src/spatio_temporal_voxel_layer.cpp#L642-L661, create a new occupancy grid object and then publish it so you can see what STVL sees -- or just translate costmap_ itself at the end. If you don't see it there, then we know its a costmap master issue we can go sorting down. If its there, then its an STVL / configuration / porting issue.

I dont believe it is an rviz issue since my actual robot would stuck in these ghost obstacles and the nav_to_goal would fail.

Good to know

as you can see from the params we already had this one set to "true". did you mean something else?

:+1:

SteveMacenski commented 3 years ago

Any update on this? I haven't heard any other reports

Edit: whoops, hit the wrong button

tal-grossman commented 3 years ago

sorry, I have not found the time to make the adjustments and find the root cause. If I do, I let you know

SteveMacenski commented 3 years ago

Its my plan to look at this later today to see if I can reproduce at all using ROS2 master or the Foxy branch if I can get to some other maintenance tasks first. Any other info you can share on this or if you fixed your issue?

tal-grossman commented 3 years ago

Good to hear! I think I have shared with you all the information I could get. Currently as a workaround we're using the STVL with another 'inflation layer' with inflation_radius=0.1 which is double the size of the costmap's resolution- it seems to help most of the times. Other than than all the information + videos are listed in the ticket history.

SteveMacenski commented 3 years ago

@tal-grossman I'm really not seeing this behavior in ROS2 Foxy or on the new ros2 branch. The only thing I can think of is the fact that you may be using somewhat strange selections of voxel sizes in the costmap / specific layers / inflation that are extremely atypical and none of these systems were designed with that in mind (in all cases, it is the base expectation that these match). Selecting an inflation radius smaller than a single voxel is nonsensical (since it would do nothing at all) and not matching the costmap resolution for STVL should be OK, but I could see the inflation being too small might impact clearing in some strange unforeseeable way.

For example, for the basic turtlebot3 nav2 demo, I use the following for the local costmap (processing the laser)

local_costmap:
  local_costmap:
    ros__parameters:
      update_frequency: 5.0
      publish_frequency: 2.0
      global_frame: odom
      robot_base_frame: base_link
      use_sim_time: True
      rolling_window: true
      width: 3
      height: 3
      resolution: 0.05
      robot_radius: 0.22
      plugins: ["rgbd_obstacle_layer", "inflation_layer"]
      inflation_layer:
        plugin: "nav2_costmap_2d::InflationLayer"
        cost_scaling_factor: 3.0
        inflation_radius: 0.55
      rgbd_obstacle_layer:
        plugin: "spatio_temporal_voxel_layer/SpatioTemporalVoxelLayer"
        enabled:                  true
        voxel_decay:              15.0  # seconds if linear, e^n if exponential
        decay_model:              0     # 0=linear, 1=exponential, -1=persistent
        voxel_size:               0.05  # meters
        track_unknown_space:      true  # default space is known
        max_obstacle_height:      2.0   # meters
        unknown_threshold:        15    # voxel height
        mark_threshold:           0     # voxel height
        update_footprint_enabled: true
        combination_method:       1     # 1=max, 0=override
        origin_z:                 0.0   # meters
        publish_voxel_map:        true # default off
        transform_tolerance:      0.2   # seconds
        mapping_mode:             false # default off, saves map not for navigation
        map_save_duration:        60.0  # default 60s, how often to autosave
        observation_sources:      rgbd1_mark rgbd1_clear
        rgbd1_mark:
          data_type: LaserScan
          topic: /scan
          marking: true
          clearing: false
          obstacle_range: 30.0          # meters
          min_obstacle_height: -1.0     # default 0, meters
          max_obstacle_height: 2.0     # default 3, meters
          expected_update_rate: 0.0    # default 0, if not updating at this rate at least, remove from buffer
          observation_persistence: 0.0 # default 0, use all measurements taken during now-value, 0=latest
          inf_is_valid: false          # default false, for laser scans
          filter: "passthrough"        # default passthrough, apply "voxel", "passthrough", or no filter to sensor data, recommend on 
          voxel_min_points: 0          # default 0, minimum points per voxel for voxel filter
          clear_after_reading: true    # default false, clear the buffer after the layer gets readings from it
        rgbd1_clear:
          data_type: LaserScan
          topic: /scan
          marking: false
          clearing: true
          max_z: 70.0                  # default 0, meters
          min_z: 0.1                  # default 10, meters
          vertical_fov_angle: 0.523  # default 0.7, radians
          horizontal_fov_angle: 6.29  # default 1.04, radians
          decay_acceleration: 5.0    # default 0, 1/s^2. If laser scanner MUST be 0
          model_type: 1                # default 0, model type for frustum. 0=depth camera, 1=3d lidar like VLP16 or similar

You can see in black/white the local/global costmap published voxel maps (both with same params). In it, as we enter a new area, we see the points populate in the voxel grid as well as on the 2D costmap. As we pause on the other side of the map, we see the original points decay on the other side and eventually go away. Then we drive back to the original side of the map and see the same behavior again. You may note towards the end you can see a couple of voxels hanging around, but you can also see the laser scan data in those voxels are still visible so they're still rightly marked. I don't see any wrongly marked voxels.

This all leads me to believe that you may have found yourself in a strange position while messing with parameters. I'd recommend highly to try to go back to some reasonable defaults and start again changing them to see where this odd situation has come from. I don't completely rule out a bug here, but I really can't see where it is and I don't see the behavior you're seeing on your end no matter what I test :shrug:

https://user-images.githubusercontent.com/14944147/110557374-549f4900-80f5-11eb-9670-bfe07a0bdcae.mp4

nickvaras commented 3 years ago

In case it hasn't become obvious yet, the PCL min_points feature is broken. It only works with Pointcloud1 type messages but not with the pointcloud2 format.

tal-grossman commented 3 years ago

In case it hasn't become obvious yet, the PCL min_points feature is broken. It only works with Pointcloud1 type messages but not with the pointcloud2 format.

yeah we realized it when debugging this issue. but I don't think it has to do with the issue above. As mentioned, this issue has been tested before and after the min points feature was added.

SteveMacenski commented 3 years ago

@nickovaras @tal-grossman I'd be happy to merge a PR reverting this across the board then

Edit: re-reading, maybe isn't the issue for this ticket then because I and @tal-grossman didn't see a change in behavior

nickvaras commented 3 years ago

Based on what I know about this, I don't see the need to revert it, I believe that at worst it just doesn't do anything. I've had deep looks at the PCL code and done some standalone experiments as well, and when using Pointcloud2 messages the min_points parameters just get completely ignored, at least on the PCL version used by melodic. I can't think of any way in which the point cloud filtering would cause the costmap not to clear.

SteveMacenski commented 3 years ago

Agreed- though something not working I could argue should be removed because its deceptive.

@tal-grossman I honestly have no idea how you have the problem that you're seeing if I can't reproduce. The best I can do is ask you to debug a bit further to see where the disconnect is and with more information I could jump in and help work out a solution.

tonynajjar commented 2 years ago

Agreed- though something not working I could argue should be removed because its deceptive.

small feedback: It did turn out to be deceptive for me, spent some time debugging until I could find this issue. Will build latest pcl from source, unless you can advise of a better solution