naitiknakrani-eic opened this issue 2 months ago
After you have created a database, you can extract all the timing statistics and export them to MATLAB or Octave to produce a graph like the one above. To do so, you can list the available statistics in the database with:
rtabmap-report --stats rtabmap.db
To show the statistics of the figure above (the names may differ slightly from the legend):
rtabmap-report \
Timing/Memory_update/ms \
Timing/Proximity_by_space/ms \
Timing/Posterior_computation/ms \
Timing/Map_optimization/ms \
RtabmapROS/TimeUpdatingMaps/ms \
Timing/Reactivation/ms \
rtabmap.db
In the figure that opens, you will already see something similar to the figure above. To export the data, right-click on the legend area and click "Copy all curve data to clipboard".
Paste the data into a text file, then import it into any data-analysis software you want.
x Timing/Memory_update/ms Timing/Proximity_by_space/ms Timing/Posterior_computation/ms Timing/Map_optimization/ms RtabmapROS/TimeUpdatingMaps/ms Timing/Reactivation/ms
0.000000 6.505010 0.001907 0.012875 0.025034 NaN 0.000000
1.004863 11.481000 0.000954 0.005960 0.011921 NaN 0.005960
2.071385 8.010860 0.000000 0.006199 0.012875 NaN 0.001907
3.072435 7.767920 0.000000 0.005960 4.143950 NaN 1.037120
4.072935 10.537100 0.000954 0.008106 0.799894 NaN 0.039101
5.073380 10.127100 0.001192 0.005007 0.937939 NaN 0.032902
6.073429 10.460100 0.000954 0.007868 1.488920 NaN 0.041962
7.139824 10.862100 0.000954 0.005960 1.379010 NaN 0.034094
8.139961 7.934090 0.000954 0.005960 1.073120 NaN 0.035048
9.140281 8.811000 0.000954 0.006914 1.636030 NaN 0.039101
10.140464 7.201910 0.000954 0.007868 1.781940 NaN 0.049830
11.140506 8.593800 0.000954 0.010014 2.104040 NaN 0.061989
12.206952 8.467200 0.000954 0.020981 1.698020 NaN 0.044823
13.273479 10.787000 0.000954 0.025034 2.268080 NaN 0.051975
...
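For example, the pasted text can be parsed directly with NumPy in Python (a minimal sketch; MATLAB's readmatrix or Octave's dlmread handle the same whitespace-separated format, and the column order follows the header line above):

```python
import io
import numpy as np

# A few rows of the exported curve data shown above (whitespace-separated,
# "NaN" where a statistic was not recorded for that update).
sample = """\
0.000000 6.505010 0.001907 0.012875 0.025034 NaN 0.000000
1.004863 11.481000 0.000954 0.005960 0.011921 NaN 0.005960
2.071385 8.010860 0.000000 0.006199 0.012875 NaN 0.001907
"""

# genfromtxt converts the "NaN" tokens into np.nan automatically.
data = np.genfromtxt(io.StringIO(sample))

t = data[:, 0]              # x column: elapsed time (s)
memory_update = data[:, 1]  # Timing/Memory_update/ms

print("mean Memory_update: %.2f ms" % memory_update.mean())
```

In practice you would replace the inline string with `np.genfromtxt("stats.txt", skip_header=1)` on the file you pasted the clipboard contents into (the file name is just an example).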
Note that in this case the map didn't have "Map Assembling Time" (or RtabmapROS/TimeUpdatingMaps/ms); I took a random database, which may not have been recorded on a robot with map assembling enabled.
@matlabbe Thanks. This helps a lot. For our database, we are getting a graph like this,
We do have the map assembling time, but it is very small and hence invisible in the graph. Below is a separate graph of the map assembling time.
From this graph, it seems global map assembling is not a time-consuming process. Is that expected? You mentioned in another thread, as well as in the paper, that map assembling is a computationally heavy process. Are we missing something?
Also, we did a detailed time profile of the STM and other modules, but we could not find which function does the global map assembling. Is it part of Rtabmap.cpp or CoreWrapper.cpp? Can you point us to that function?
You mentioned in another thread as well as in the paper that map assembling is a computationally heavy process
... (and more precisely) when there is a loop closure and the graph deformation is over GridGlobal/UpdateError (default=0.01 m).
The global map assembling is called from the MapsManager, called from here, but the actual implementation is in these files: https://github.com/introlab/rtabmap/tree/master/corelib/src/global_map (depending on which topic we are subscribing to). The global maps are not generated if no one is subscribed to any of them.
The biggest update spikes will happen when the map has to be fully updated because of a loop closure, or because memory management retrieved past data back into the current WM (or more explicitly when this is true). Depending on the GlobalMap flavor, the full update may take more or less time (e.g., 2D maps are relatively fast to regenerate). In this paper, Table 10, the last two columns show a comparison depending on the map type used and whether a loop closure happened. As you can see, when there is no loop closure, appending new local grids to the global map is relatively fast (maybe what you are seeing).
Hi @matlabbe, continuing with the above issues, we tried different decimation values on the point cloud coming from the pointcloud_xyz node, with decimation set to 1, 2 and 4. This can be seen in the graphs below.
For decimation 1 and decimation 2, the neighbor counts are 105 and 292 respectively. Shouldn't more data points lead to more neighbors, compared to the decimation 2 and decimation 4 cases? Can you help us understand why this behavior occurs?
The number of neighbor links is roughly the number of nodes added to the graph. I don't know what Rtabmap/DetectionRate you are using or the actual processing time per node, but if the processing time was over the update rate with decimation 1, it could make sense that with decimation 2 there are more nodes, because the processing time would be lower (higher rate to add nodes). For the difference between decimation 2 and decimation 4, I cannot really say without seeing the actual processing time. If you can share the three databases, that could be useful.
@matlabbe Thanks. Is Rtabmap/UpdateRate the same as Rtabmap/DetectionRate, which is 30 in all three cases? For the processing time, below is the attached zip of the three databases; hope this is useful: https://file.io/iUN8rPnCqU3S
@matlabbe If memory management is disabled and the actual processing time exceeds Rtabmap/DetectionRate, then what happens?
@samarth-eic
Yes, Rtabmap/DetectionRate is the update rate I referred to. I corrected the typo in my previous post.
As I thought, the processing time is high, so it cannot process in real time at a rate of 30 Hz (Rtabmap/DetectionRate). Here is a breakdown of the computation on the first database (from your computer):
Rtabmap/DetectionRate is not meant to be used at 30 Hz, as updating the map is likely to take longer than 33 ms. The default is 1 Hz. If you want more poses without increasing Rtabmap/DetectionRate, you can set Rtabmap/CreateIntermediateNodes to true (there is no benefit to saving more poses; that parameter is mostly used to compare against benchmark datasets providing poses for every frame).
A lot of time is used to create the local grids; that could be optimized like this:
Grid/ClusterRadius=0.1 instead of 1 (for node 39, 600 ms -> 20 ms)
Grid/NormalsSegmentation=false instead of true (20 ms -> 10 ms) (with Grid/MaxObstacleHeight=1 and Grid/MaxGroundHeight=0.1)
Grid/NoiseFilteringRadius=0 instead of 0.07 (10 ms -> <1 ms)
Example by reprocessing on my computer with these values:
rtabmap-reprocess \
--Grid/ClusterRadius 0.1 \
--Grid/NormalsSegmentation false \
--Grid/MaxObstacleHeight 1 \
--Grid/MaxGroundHeight 0.1 \
--Grid/NoiseFilteringRadius 0 \
radius_decimation1.db output.db
To improve the proximity detection time, if you don't mind not keeping the raw laser scans, you can downsample the scans with --Mem/LaserScanVoxelSize 0.05 and precompute the normals (--Mem/LaserScanNormalK 20). Also, in general for 3D point clouds I use --Icp/Iterations 10. If computing resources are limited, also set --RGBD/ProximityMaxPaths 1 and disable one-to-many detection by setting --RGBD/ProximityPathMaxNeighbors 0:
rtabmap-reprocess \
--Grid/ClusterRadius 0.1 \
--Grid/NormalsSegmentation false \
--Grid/MaxObstacleHeight 1 \
--Grid/MaxGroundHeight 0.1 \
--Grid/NoiseFilteringRadius 0 \
--Mem/LaserScanVoxelSize 0.05 \
--Mem/LaserScanNormalK 20 \
--Icp/Iterations 10 \
--RGBD/ProximityMaxPaths 1 \
--RGBD/ProximityPathMaxNeighbors 0 \
radius_decimation1.db output.db
@naitiknakrani-eic
With memory management disabled or not, input data are always dropped between rtabmap updates (i.e., the rate is limited either by Rtabmap/DetectionRate or by the actual maximum rate rtabmap can process on the machine). When memory management is disabled, if the processing time increases over time, there will be more and more space between nodes in the graph. With memory management enabled, nodes in the graph will be uniformly spaced based on Rtabmap/DetectionRate, but some nodes may be transferred to LTM and not be available in the current local map.
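The capping behavior described here can be summarized with a toy model (my simplification, not RTAB-Map code; the effective_update_rate function below is hypothetical and ignores input topic rates and scheduling overhead):

```python
def effective_update_rate(detection_rate_hz, processing_time_s):
    """The rate at which nodes are added is capped both by
    Rtabmap/DetectionRate and by how fast the machine can actually
    process one update (simplified model)."""
    max_machine_rate_hz = 1.0 / processing_time_s
    return min(detection_rate_hz, max_machine_rate_hz)

# Requesting 30 Hz while each update takes ~200 ms yields only ~5 Hz;
# at the default 1 Hz, the same machine keeps up easily.
print(effective_update_rate(30.0, 0.2))  # 5.0
print(effective_update_rate(1.0, 0.2))   # 1.0
```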
@matlabbe That's fantastic to know how these parameters really help in optimizing computation time.
"The rate is limited either by Rtabmap/DetectionRate or the actual maximum rate rtabmap can process based on the machine" — this is the reason why we kept it at 30, in order to achieve as much throughput as possible. Keeping the detection rate at 1 actually underutilizes the processing capabilities of NVIDIA boards.
Here are some reasons why we set Rtabmap/DetectionRate to 1 Hz by default:
The lower Rtabmap/DetectionRate is, the bigger the map (in space) that can be processed online without any memory management.
Where you would want to put more computation resources first is the visual or lidar odometry, for which a higher frequency usually means better accuracy and lower drift. Then, equally important, the robot controller and obstacle detection with low latency. Then, lastly, re-localization/mapping.
cheers, Mathieu
Hi @matlabbe Thanks for your explanation and sorry for the delayed reply. We were studying the code and logic for NormalsSegmentation. It was a surprise to see that RTAB-Map works and creates the occupancy grid with NormalsSegmentation off. It is a process that takes a lot of CPU resources and contributes ~60% of the total timing (refer to the first figure of the earlier response: Timing/Total avg 1266 ms, Timing/Occupancy 781 ms).
This parameter skips the following code in "rtabmap/corelib/include/rtabmap/core/impl/util3d_mapping.hpp":
if(segmentFlatObstacles && flatSurfaces->size())
{
    int biggestFlatSurfaceIndex;
    std::vector<pcl::IndicesPtr> clusteredFlatSurfaces = extractClusters(
        cloud,
        flatSurfaces,
        clusterRadius,
        minClusterSize,
        std::numeric_limits<int>::max(),
        &biggestFlatSurfaceIndex);
    ...
Can you please tell us in which scenarios the segmentation is required? What was the original thought process behind implementing it?
The normal segmentation is based on this experiment: http://wiki.ros.org/rtabmap_ros/Tutorials/StereoOutdoorNavigation#Global_costmap :
Since the ground is not even, the ground is segmented by normal filtering: all points with a normal in the +z direction (± a fixed angle) are labelled as ground; all the others are labelled as obstacles. The images below show an example.
We enable it by default because it is safer in general and easier to use to detect small obstacles. When we disable it, we do a simple passthrough filtering at some height (e.g., 10 cm) over the ground, which is very fast but requires more tuning to detect small obstacles and cannot handle a ground that is not perfectly flat (e.g., ramps or slopes). Here is another example from this paper:
For less powerful computers and/or robots always moving on a flat floor, disabling it can give a good performance boost.
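The normal-filtering idea can be sketched in a few lines (an illustration with NumPy only; this is not RTAB-Map's implementation, which lives in util3d_mapping.hpp and additionally clusters the flat surfaces):

```python
import numpy as np

def segment_ground_by_normals(normals, max_ground_angle_deg=45.0):
    """Label points as ground when their surface normal is within a fixed
    angle of the +z axis; everything else is an obstacle. Illustration of
    the idea described above, not RTAB-Map's actual code."""
    normals = np.asarray(normals, dtype=float)
    # cos(angle to +z) is the (normalized) z component of the normal.
    cos_max = np.cos(np.radians(max_ground_angle_deg))
    nz = np.abs(normals[:, 2]) / np.linalg.norm(normals, axis=1)
    return nz >= cos_max  # True = ground, False = obstacle

# Flat floor (normal straight up), a wall (normal horizontal),
# and a gentle slope (normal slightly tilted from +z):
labels = segment_ground_by_normals([[0, 0, 1], [1, 0, 0], [0.1, 0, 0.99]])
print(labels)
```

This is why it can handle ramps or slopes: a tilted but still mostly upward-facing surface stays within the angular threshold, whereas a height-based passthrough filter would cut it off.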
@matlabbe Interesting point to note: with your given parameter settings, the average Timing/Total/ms is just ~25 ms, but it does not translate into a higher publishing rate for the /mapData or /map topics. We hardly get 4 to 5 Hz when checking "ros2 topic hz /mapData". Is this an inherent limitation of DDS, or is there something else we are missing? Our efforts in reducing the computation time inside RTAB-Map are not translating into a higher /mapData publish rate.
We benchmarked mapping on two devices, AGX Orin and Orin Nano, with the same recorded bag file. Here we converted depth into a point cloud with zero ROI and decimation 2 from the pointcloud_xyz node. One thing we couldn't understand is the deviation in occupancy map generation. Please see the two attached snapshots:
1. On AGX Orin. Find its db file: https://file.io/L6ku7EaRiUNG
2. On Orin Nano. Find its db file: https://file.io/vsT9dT3pFFrI
The Orin Nano has more filled grid cells compared to the AGX Orin. What could be the possible reason? Note that both were given the same input data.
Hi @francoisferland @matlabbe,
Can you help me regenerate the processing time logs to create a graph like the one shown in the figure below?
Are any tools available for this? If not, can you tell me the methodology used to derive these graphs?
Thanks