Robot jumps sometimes | slam

abylikhsanov commented 4 years ago

Hi Carlos,

Sometimes during the localization the robot jumps very far away. Is that because I need to restrict the box search further down to avoid that?

Also, is it possible to update the map along the navigation? Because I have a very dynamic obstacle and I think I need to always refresh my map

carlosmccosta commented 4 years ago

Good afternoon,

The ICP registration algorithms try to find the least squares minimization between the distances of the corresponding points [sensor data -> map]. This means that objects that are not on the map, but in the sensor data are close to the map geometry, may cause the pose estimation to shift a small amount when the number of outliers is very significant.

To mitigate this issue, I developed the localization pipeline for running with a tracking configuration and a recovery configuration. This way, if you set the max_correspondence_distance to a low value (0.5 meters for example), then all the dynamic objects that appear in the sensor data that are farther than [max_correspondence_distance] in relation to the closest point in the map will be discarded, allowing the registration algorithms to rely on (hopefully) sensor data that belongs to the objects that are on the map.

The drawback of setting [max_correspondence_distance] to a value low, is that you need reliable and accurate odometry. If that is not the case, then [max_correspondence_distance] needs to be high enough for allowing ICP to find enough correspondences [sensor data -> map], otherwise it will not be able to estimate the robot pose.

If the odometry is not reliable only sporadically (wheel slippage for example), then you can set [max_correspondence_distance] to a lower value for normal operation and rely on the tracking_recovery_matchers for recovering from these cases (because a higher [max_correspondence_distance] is used).

Here are some examples:

If the robot is jumping very, very far away (several meters), then it is likely that the transformation_validators along with the tracking configurations that you are using detected that the robot was lost and tried to activate the recovery algorithms. If these failed to perform a point cloud registration that respects the transformation_validators_tracking_recovery, then the initial pose algorithms were triggered for finding the initial pose of the robot on the map (which means drl can place the robot anywhere on the map that best fits sensor data -> map).

If you do not want the robot to jump very far and don't mind providing its initial pose using rviz when the system starts, then I recommend disabling the initial pose estimation algorithms (just comment the associated yaml file) and confirm that it is not being loaded into the parameter server.

Otherwise, you can tune transformation_validators to accept point cloud registrations with more outliers and also tune the tracking and recovery configurations to only activate the initial pose estimation algorithms after a longer time and after more attempts were made to find the robot pose.

If you need dynamic map update just for navigation in the form of a occupancy grid, you can use octomap_server. If you need SLAM, you can use drl, ethzasl_icp_mapper, hector_mapping, gmapping, among others.

abylikhsanov commented 4 years ago

Hi. Thank you. I recently started to using the drl on Intel Atom x7 z8700 and getting a very high cpu consumption. Do you have an example config file for such CPUs?

Also the octomap server, does it save the dynamically changed map for the next iteration? I could not find that

carlosmccosta commented 4 years ago

Good afternoon,

I tested the guardian_config configurations in a ML300 Rend Lake NUC, equiped with a 2.80 GHz Intel Core i5-3427U, which had better performance when compared with the Intel Atom x7 z8700.

Overall, for reducing the CPU usage I would recommend to tune the drl configurations depending on your localization accuracy requirements, which will be a trade off between accuracy and computational requirements:

Reducing the number of points for registration by:
- Increasing the cell size of the map (reduces the number of points in the reference point cloud) and / or increasing the look up table cell size
- Increasing the leaf size of the voxel grid
- Reducing the number of points in the random sample filter
Reducing the max_correspondence_distance for making ICP rely on less points and optimize less / better correspondences (beware that reducing too much requires good odometry)
Adjusting the termination criteria of the registration algorithms (the definition of "good enough" registration to stop ICP from iterating again, defined the pcl::registration::DefaultConvergenceCriteria::hasConverged function (default_convergence_criteria.h))

In the guardian_localization.launch, dynamic_robot_localization_system.launch and octo_map.launch you can see how I integrated my fork of octomap_mapping for using octomap with drl. This way, the initial map comes from octomap_server to drl and drl sends to octomap_server new registered sensor data, which octomap integrates into the map using ray tracing, sending back to drl an updated map after 1 second. drl can build its map internaly without octomap, but I did not add yet the export of the point cloud to an ros nav_msgs/OccupancyGrid because I was using octomap due to its ray tracing capabilities for adding and removing geometry to the map.

I also added the option of updating just the navigation map. This way drl operates much more efficiently, because it uses a static map, avoiding the need to rebuild the look up table for new updated maps.

You can try to use the oficial version of octomap. You just need to:

Configure octomap_server to receive the point cloud registered by drl (or the raw sensor data)
Configure drl to receive the updated map in the reference_costmap_topic topic.

Have a nice day :)

abylikhsanov commented 4 years ago

Hi, I have used your suggestion to use the update_navigation_map parameter and I made the octomap_server to subscribe to the ambient_pointcloud because the aligned_pointcloud existed but no messages were published. I have also changed the parameter:

reference_costmap_topic = /map_navigation and now the DRL started to use that map as a reference. However, I am not too sure on how to use it. Here is my example map (original) with map_navigation on top of it: WhatsApp Image 2019-12-11 at 1 48 40 PM

So if the map_navigation is totally new, how does DRL localize against it?

abylikhsanov commented 4 years ago

I am also a bit puzzled with the dynamic map navigation. So let's say I have created an inital map and DRL localizes against that map with incoming lidar measurements. Then I turn the octomap_server which basically recreates the map with additional new obstacles (or removes it) and now, the DRL localizes against the new map created by the Octomap, so far so good.

Now, how can DRL localize incoming LiDAR data with the newly made map from Octomap? Does it mean that DRL updates its hash table (correspondance table) right after the Octomap server updates the map? If so, does it mean that the localization during the map update should be very precise otherwise over some period of time, the covariance will be too large? And if so, how does this affect the initial pose estimation if the DRL uses the map created by the Octomap is much smaller than the initial map (given the screenshot above)? Because I had a look at your fork of the octomap_mapping and I could not find the parameter occupancy_grid_in

EDIT: I found the occupancy_grid_in param: https://github.com/carlosmccosta/octomap_mapping/blob/e102f8b53511c90f5405bc650e1cc20fa1f35b9b/octomap_server/src/OctomapServer.cpp#L172

carlosmccosta commented 4 years ago

Hello :)

The update_navigation_map in guardian_config was setup to update only the map for navigation. The goal was to use a static and clean map (without garbage / dynamic objects) for localization, in order to allow drl to operate efficiently (if the map is dynamic, every time the map changes, the lookup table needs to be rebuilt). This way, the navigation system could perform path planning with a continuously updated map and the localization system could perform efficiently with a static and clean map (dynamic objects negatively affect the accuracy of the localization performed by the map matching algorithms).

Both the initial occupancy grid and the octomap octree have the origin of the map in the same place, so the maps from localization and navigation should / must overlap, but they can have different range and resolution (for example, the localization map could have the entire view of a warehouse and the navigation map could / should have only the zones in which the robot could move -> giving better performance to path planning algorithms). For example, we usually used a localization map with 2 cm cells for the occupancy grid and a navigation map with 6 cm cells. For path planning of the mobile platform, 6 cm were enough, and allowed to accelerate the path planing algorithm (when compared with using 2 cm occupancy grid cells).

For drl, the map update requires the rebuild of the lookup table (or kd-tree), but it does not require initial pose estimation, since both the new and old map have the same origin, and as such, the pose of the robot should be in the same place in both old and new maps (with minor offsets due to robot motion).

If you really need to update the map used for localization, then you should change the argument use_slam in guardian_localization.launch.

For the navigation map update to work as expected, the octomap should receive the occupancy grid loaded by the map_server for building the initial map. Then, you can configured octomap to update the navigation map using new raw data from the lidars (use_original_laser_scans_in_slam).

Have a nice day :)

abylikhsanov commented 4 years ago

Hi,

Ok, I get it, the only purpose is for the navigation.
If I turn on the use_slam option, will it start to build an entirely new map or will it simply update the existing map that I have? Also, Do I need to manually save the map after each turn or should I implement some simple node which calls the service to save the map for the future use?
The last 2 questions are more about the DRL itself. We have set the max_outliers_percentage to 0.9 as with lower values DLR simply rejected at specific spots. Sometimes, the percentage goes to even bigger that 0.9, did you experience that before? We are using the default parameters for the tracking.
We have quite a powerful machine and the cpu consumption is below 60%, however, sometimes DRL rejects the lidar measurements because they are too old (we already changed the TF patience from 0.15 to 0.4) which makes the delay in pose estimation and finally when DRL comes back it rejects the new pose as the robot travelled quite a distance which is above the max_translational_fix which in turn DRL goes to the recovery mode. What is the reason behind that? As I said, overall consumption of the CPU is below 60%, could it be in lidar delays as the LiDAR is attached to a different machine and send to DLR via WiFi?

carlosmccosta commented 4 years ago

In slam mode drl starts with no map and builds it over time when new sensor data arrives. But you can give drl an occupancy grid in the beginning for initializing from a known map.

You just need to be careful how you configure drl for registering the occupancy grid subscriber: https://github.com/carlosmccosta/dynamic_robot_localization/blob/melodic-devel/yaml/schema/drl_configs.yaml#L187 https://github.com/carlosmccosta/dynamic_robot_localization/blob/melodic-devel/include/dynamic_robot_localization/localization/impl/localization.hpp#L1365

The saving of the map is currently manual, but there are already tools for that:

We used drl in dynamic environments but in which the outliers were usually below 75%. Depending on the environment, sensor data and configuration, drl can estimate the robot pose with 90% of outliers. But in that case, you should look in rviz to the overlap between the map and the aligned point cloud for confirming that drl is performing correctly.

I would recommend to have all your robot hardware connected using ethernet. Wifi can add a lot of unnecessary latency that will affect both the localization system and motion controller when it reaches high values.

You can adjust the transformation_validators for letting drl tolerate higher offsets when tracking the robot pose: https://github.com/carlosmccosta/dynamic_robot_localization/blob/melodic-devel/yaml/schema/drl_configs.yaml#L752

abylikhsanov commented 4 years ago

Thanks and what does this parameter do? reference_pointcloud_update_mode: 'NoIntegration'

So if I understood correctly, if I want to reuse the map and update it, I need to:

<arg name="reference_pointcloud_available" default="false" if="$(arg use_slam)" />
        <arg name="reference_pointcloud_available" default="true" unless="$(arg use_slam)" />
        <arg name="reference_pointcloud_type_3d" default="false" />
        <arg name="reference_pointcloud_update_mode" default="OutliersIntegration" if="$(arg use_slam)" /> <!-- NoIntegration | FullIntegration | InliersIntegration | OutliersIntegration -->
        <arg name="reference_pointcloud_update_mode" default="NoIntegration" unless="$(arg use_slam)" />
        <arg name="initial_2d_map_file" default="$(arg initial_2d_map_file_slam)" />

And :

<arg name="reference_costmap_topic" default="/map" />

So that the DRL will provide the initial_2d_map_file and /map will be updated with OutliersIntergation and the DRL will localize against that new reference_costmap_topic, correct?

Also, I have noticed that if I use use_slam then octomap_frame is empty: <arg name="octomap_sensor_frame_id" default="" if="$(arg use_original_laser_scans_in_slam)" />

And DRL launches laser_to_pcl , is that correct?

carlosmccosta commented 4 years ago

reference_pointcloud_update_mode specifies the modes of slam:

NoIntegration -> static map
FullIntegration -> all aligned data is used for updating the map
InliersIntegration -> only points close to the map are used for updating the map
OutliersIntegration -> only points far from the map are used for updating the map https://www.youtube.com/watch?v=GKDG7c61OLo

For using an initial map and having the occupancy grid subscriber created, you need to: https://github.com/carlosmccosta/dynamic_robot_localization/blob/melodic-devel/include/dynamic_robot_localization/localization/impl/localization.hpp#L1365

<arg name="reference_pointcloud_available" default="true" />
<arg name="reference_pointcloud_required" default="true" />
<arg name="reference_pointcloud_type_3d" default="false" />
<arg name="reference_pointcloud_update_mode" default="OutliersIntegration" /> <!-- FullIntegration | InliersIntegration | OutliersIntegration -->
<arg name="initial_2d_map_file" default="$(arg initial_2d_map_file_slam)" />
<arg name="reference_costmap_topic" default="/map" />

For octomap performing accurate ray tracking, it needs to process the sensor data in their original frame. In my tests I had a robot with 2 lidars, one in the front and another in the back, that were merged and sent to drl for alignment. Octomap could either integrate the merged (both lidars) and aligned point cloud with an approximate point of view in the center of the robot or integrate each scan separately in their original frame for more accurate ray tracing (the ray tracing algorithms require a point of view for the sensor data).

abylikhsanov commented 4 years ago

Ok, I understood, but in drl you also used: <arg name="publish_pose_tf_rate" default="-1.0" if="$(arg use_slam)" />

and:

<node name="map_server_localization" pkg="map_server" type="map_server" args="$(arg map_file)" respawn="$(arg nodes_respawn)" unless="$(arg use_slam)" >
        <param name="frame_id" value="map" />
    </node>

I guess I need to activate them as I plan to SLAM and Localize at the same time?

carlosmccosta commented 4 years ago

Yes, when publish_pose_tf_rate is <= 0, the pose_to_tf_publisher will only publish tf after receiving a PoseStamped msg from drl. This is important, for octomap correctly transform the laser data in their original frame to the map frame.

This is more related with our motion controller implementation that was using the tf base_link -> map, and since we had +- good odometry, I could configure pose_to_tf_publisher to republish the last odom -> map correction with an updated timestamp so that the lookup transform within the motion controller would not block until a new pose estimation was given after processing a new lidar scan. This is due to the fact that our lidars operated at around 10 hz, and a motion controller usually runs at 50 hz or more.

By having the pose_to_tf_publisher republish the odom -> map with timestamp ros::Time::now(), the motion controller would be getting at high frame rate the base_link -> map, at the expense of higher error (because it was using odometry estimation between the time stamps in the middle of the lidars time stamps).

Depending on your use case, you can set publish_pose_tf_rate to be always -1.

The map server is necessary to read the pgm / png file and publish the occupancy grid msg.

carlosmccosta commented 4 years ago

If you are planning on using slam, then you should probably use kd-trees instead of lookup tables, since a kd tree is usually faster to initialize.

abylikhsanov commented 4 years ago

OK get it and should I keep including the tracking, initial_estimate and recovery yaml files too? As I still need to localize

carlosmccosta commented 4 years ago

Yes, you should just adjust the part relative to the correspondence_estimation_approach: https://github.com/carlosmccosta/dynamic_robot_localization/blob/melodic-devel/yaml/schema/drl_configs.yaml#L506 Just change it to: CorrespondenceEstimation.

carlosmccosta / dynamic_robot_localization

Robot jumps sometimes | slam #14