Visual Proximity detection parameterization

adityarasam commented 4 years ago

Hi

I am trying to understand visual proximity link detection performance in more details to understand its use case for our project and based on my experiments following are my questions

In your paper “Long-Term Online Multi-Session Graph-Based SPLAM with Memory Management (2017)” section 3.3 you mentioned about different thresholds to narrow down the “close by nodes” for proximity detection, like radius R, number of paths and radius L. Among these, the parameter for radius ‘R’ is the “RGBD/ProximityPathFilteringRadius”, the number of path is RGBD/ProximityMaxPaths = "3". However what is the parameter for L(inner radius)?
In another paper “RTAB-Map as an Open-Source Lidar and Visual SLAM Library for Large-Scale and Long-Term Online Operation” in section 3.4, you talked about “RGBD/ProximityMaxGraphDepth”. Are there any other changes related to Visual Proximity detection since your 2017 paper?
Regarding the ProximityMaxGraphDepth threshold, what is the issue with large odometry drift? My understanding is that if a proximity detection link is detected, the motion estimation will correct the odometry drift just like a global loop closure? I understand that detection is less likely with large odometry drift but if successful, why shouldn’t we take advantage of it?
Also just to see proximity detection performance, is there a way to disable the global loop closure (green frames) so that the system could only perform the visual proximity detection (yellow frames).

adityarasam commented 4 years ago

One follow up question: In your 2017 paper you clearly show the benefit of Proximity detection in Fig. 9. But I saw on your website (https://github.com/introlab/rtabmap/wiki/ICP) that you recommend not using ICP in rtabmap because of not being robust enough. In the example of the 2017 paper, wouldn’t the benefits of ICP out way the negatives? Do you have any experience in which case ICP caused some major problems?

matlabbe commented 4 years ago

Hi, Here are some answers.

That paper tried to be general, so variables used were not the same than the parameter name in the libary. Here are the correspondences about the proximity detection module:
- R = RGBD/LocalRadius (default 10 meters)
- L = RGBD/ProximityPathFilteringRadius (default 1 meter)
- Maximum paths searched: RGBD/ProximityMaxPaths (default 3 paths)
- RGBD/ProximityMaxGraphDepth (default 50 nodes, which means the last 50 seconds if Rtabmap/DetectionRate=1 hz): in the paper there was no limit. This has been afterwards to avoid proximity detection on parts of the graph seen to much time ago (of current node and last loop closure) because there is highly chance that odometry drifted too much.
- RGBD/ProximityPathMaxNeighbors (default 10 for 2d scans, 1 for 3d scans): in the paper there was no limit on how many nodes were used in the local radius when merging scans.
See previous answer.
If we set RGBD/ProximityMaxGraphDepth to 0 (no limit), for each new frame, we will try to find a transformation with the closest node in the graph, independently if we saw that node 1 minute or 1 hour ago. It is very unlikely that the node seen 1 hour ago, if we didn't find loop closures in that area for a while, would be very close of the current node if we traverse the same area. It is likely that the closest node would be a node from another place. Proximity detection based on laser scans may create a wrong loop closure with another part of the map, because if ICP converges, we assume the transformation is correct. If you don't use a lidar, just a camera, the transformation would be a little more robust as same visual features should be found in both frames. However, with the camera, the global loop closure detection approach would likely already find a closure with the closest node if we are actually close the correct node. In highly repetitive environments, the loop closure hypothesis may not be high enough for common places, so there, proximity detection could be useful.
- RGBD/ProximityMaxGraphDepth can be increased if odometry drift is very small over time or if Rtabmap/DetectionRate is increased.
- RGBD/ProximityMaxGraphDepth would be lowered if odometry drifts a lot.
Yes, set Rtabmap/LoopThr to 1. For efficiency, if you want to do only lidar proximity detection, set Kp/MaxFeatures to -1 to disable feature extraction.
The article was about using ICP for RGB-D cameras, not lidar. I updated the article to make it more clear. The general problem with ICP is when the geometry of the captured cloud or scan is not very complex. In indoor environments, it is likely that the camera will see flat walls filling its full field of view (unless we use fisheye stereo cameras or RICOH THETA 360 cameras, but with those we use visual feature-based approaches, not ICP). It is less likely with long-range 360 or >180 degs degrees 2D lidars (like RPLIDAR, hokuyo, SICK) or 3D lidars (like Velodyne, Ouster or RobotSense) that we will capture enough geometry in most common environments, making ICP quite accurate for such cases. However, there could be some cases where ICP would work poorly with 2D lidar indoor, in particular in long corridors. In paper RTAB-Map as an Open-Source Lidar and VisualSLAM Library for Large-Scale and Long-Term OnlineOperation, an approach in Section 3.1.2 (third bullet point) has been presented (parameter Icp/PointToPlaneMinComplexity) to limit this problem if we have access to external odometry (e.g., wheel encoders in this case). It is primarly why HectorSLAM failed the short-range lidar test in figure 13, as it is the only one in this comparison not using external odometry to help ICP.

Regards, Mathieu

introlab / rtabmap

Visual Proximity detection parameterization #508