cyn-liu opened 2 months ago
Easily learn about the pipeline latency up to any node you like by subscribing to its node_name/pipeline_latency_ms topic.
https://github.com/orgs/autowarefoundation/discussions/2822
Takes more time to prepare and requires post-processing to analyze the timings.
For checking the latency of the whole system: an object is spawned at a predetermined time, and the time at which the velocity changes is measured.
@cyn-liu I updated this post.
Node Relationship Graph of Autoware's Lidar-Mode Perception Pipeline
We have calculated the average running time of each node in the lower half of the node relationship graph above, as shown in the following:
We have calculated the average running time of each node in the upper half of the node relationship graph above, as shown in the following:
We use the pipeline-timings statistical method recommended by @xmfcx above to perform latency statistics on Autoware's perception pipeline (lidar-only mode).
The Autoware perception module publishes the following pipeline-latency debug topics. We subscribe to these topics and analyze how their values change through rqt to evaluate the time consumption of the perception pipeline (a minimal subscriber sketch follows the list of topics below).
/perception/obstacle_segmentation/crop_box_filter/debug/pipeline_latency_ms
/perception/occupancy_grid_map/raw_pc_downsample_filter/debug/pipeline_latency_ms
/perception/occupancy_grid_map/obstacle_pc_downsample_filter/debug/pipeline_latency_ms
/perception/occupancy_grid_map/pointcloud_based_occupancy_grid_map/debug/pipeline_latency_ms
/perception/object_recognition/detection/clustering/voxel_grid_based_euclidean_cluster/debug/pipeline_latency_ms
/perception/object_recognition/detection/centerpoint/lidar_centerpoint/debug/pipeline_latency_ms
/perception/object_recognition/detection/obstacle_pointcloud_based_validator/debug/pipeline_latency_ms
/perception/object_recognition/detection/object_lanelet_filter/debug/pipeline_latency_ms
/perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms
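For reference, here is a minimal subscriber sketch for logging one of these topics outside of rqt. It assumes the debug topics carry tier4_debug_msgs/msg/Float64Stamped with the latency in its data field; please verify the actual message type with ros2 topic info -v, as it may differ between Autoware versions.

```cpp
// Minimal latency-logging node: subscribes to one pipeline_latency_ms debug
// topic and prints each sample. The message type is an assumption (see above).
#include <rclcpp/rclcpp.hpp>
#include <tier4_debug_msgs/msg/float64_stamped.hpp>

#include <memory>
#include <string>

class PipelineLatencyLogger : public rclcpp::Node
{
public:
  PipelineLatencyLogger() : Node("pipeline_latency_logger")
  {
    const std::string topic =
      "/perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms";
    sub_ = create_subscription<tier4_debug_msgs::msg::Float64Stamped>(
      topic, rclcpp::QoS(10),
      [this](const tier4_debug_msgs::msg::Float64Stamped & msg) {
        RCLCPP_INFO(get_logger(), "pipeline_latency_ms: %.1f", msg.data);
      });
  }

private:
  rclcpp::Subscription<tier4_debug_msgs::msg::Float64Stamped>::SharedPtr sub_;
};

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<PipelineLatencyLogger>());
  rclcpp::shutdown();
  return 0;
}
```

The same node can be pointed at any of the topics above by changing the topic string, or extended to write the samples to a file for offline statistics.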
We found that the multi_object_tracker node is very time-consuming, so we analyzed the modules inside it. A lot of debug information is already published within the node to help us analyze the time consumption:
/perception/object_recognition/tracking/multi_object_tracker/debug/input_latency_ms
Indicates the delay from receiving the data to the start of data processing.
/perception/object_recognition/tracking/multi_object_tracker/debug/processing_latency_ms
Indicates the delay from the start of data processing to data publishing.
/perception/object_recognition/tracking/multi_object_tracker/debug/processing_time_ms
The time from the start of publishing to the completion of publishing + the time from the start of processing to the end of processing.
But I don't know why the value of this topic is equal to 0.
/perception/object_recognition/tracking/multi_object_tracker/debug/meas_to_tracked_object_ms
Indicates the delay from receiving the data to the triggering of the timer (before data publishing).
/perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms
Indicates the delay from receiving the data to publishing.
/perception/object_recognition/tracking/multi_object_tracker/debug/cyclic_time_ms
Indicates the time interval between two consecutive publications.
From the above figure, it can be seen that /perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms takes approximately 125 ms, and it can be roughly assumed that pipeline_latency_ms = input_latency_ms + processing_latency_ms. Of these, processing_latency_ms is relatively stable at about 75 ms, while input_latency_ms fluctuates greatly and is usually around 50 ms (50 ms + 75 ms ≈ 125 ms, consistent with the observed total).
https://github.com/user-attachments/assets/d3fffa90-ec82-42cf-8df6-8b8b1076c2c8
After analysis, the reason why node multi_object_tracker takes so long has been identified:
From the above analysis, it can be seen that /perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms takes approximately 125 ms, which is mainly composed of input_latency_ms and processing_latency_ms.
As for input_latency_ms, it refers to the time measured over the data-preprocessing stage and is calculated as follows:
double input_latency_ms = (now - last_input_stamp_).seconds() * 1e3;
Here, now refers to the time taken right after data preprocessing, and last_input_stamp_ refers to the timestamp of the earliest data received in the input queue.
After analysis, it was found that the data-preprocessing function itself is not time-consuming, so the reason input_latency_ms is large is that the data in the input queue is too old.
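To make the mechanism concrete, here is a hypothetical sketch of the measurement pattern described above (not the actual multi_object_tracker source; the queue type and function are placeholders, and the variable names simply mirror the quoted snippet):

```cpp
// Hypothetical sketch of the measurement pattern described above; variable
// names mirror the quoted snippet, but this is not the actual tracker source.
#include <rclcpp/rclcpp.hpp>

#include <deque>

struct InputFrame
{
  rclcpp::Time stamp;  // header stamp of the received detections
  // ... detected objects, etc. ...
};

double measure_input_latency_ms(const std::deque<InputFrame> & input_queue, rclcpp::Clock & clock)
{
  // last_input_stamp_: timestamp of the earliest (oldest) entry in the queue.
  const rclcpp::Time last_input_stamp_ = input_queue.front().stamp;

  // ... data preprocessing runs here; it was found to be cheap ...

  // `now` is taken after preprocessing, so if the front of the queue is old,
  // input_latency_ms grows even though preprocessing itself is fast.
  const rclcpp::Time now = clock.now();
  return (now - last_input_stamp_).seconds() * 1e3;
}
```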
As for processing_latency_ms, it refers to the time between the start of the object-tracking algorithm and the publishing of the results, calculated as follows:
double processing_latency_ms = ((now - stamp_process_start_).seconds()) * 1e3;
Here, now refers to the time taken right after publishing the results, and stamp_process_start_ refers to the time at which the tracking algorithm starts executing.
After analysis, it was found that each function in the tracking algorithm takes very little time. The main source of processing_latency_ms is that the publishing thread is triggered by a timer while the tracking-algorithm thread is triggered by incoming data; the two are not synchronized, which results in high latency (see the sketch below).
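A minimal sketch of that interaction (hypothetical node, topic, and message type; not the actual tracker code): the tracking step is driven by incoming detections, while publishing runs on an independent wall timer, so a finished result can wait up to one timer period before it is published, and that waiting time shows up in processing_latency_ms.

```cpp
// Hypothetical illustration of the unsynchronized callbacks described above;
// node/topic names and the message type are placeholders, not the tracker code.
#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/float64.hpp>

#include <chrono>
#include <memory>

class TrackerTimingSketch : public rclcpp::Node
{
public:
  TrackerTimingSketch() : Node("tracker_timing_sketch")
  {
    // Data-triggered: the tracking step runs whenever a detection message arrives.
    sub_ = create_subscription<std_msgs::msg::Float64>(
      "input/detections", 10, [this](const std_msgs::msg::Float64 & msg) {
        stamp_process_start_ = now();
        latest_result_ = msg.data;  // stand-in for running the tracking algorithm
        result_ready_ = true;
      });

    // Timer-triggered: publishing runs on its own fixed cycle (~100 ms here),
    // so a finished result may wait up to one period before being published.
    timer_ = create_wall_timer(std::chrono::milliseconds(100), [this]() {
      if (!result_ready_) {
        return;
      }
      const double processing_latency_ms = (now() - stamp_process_start_).seconds() * 1e3;
      RCLCPP_INFO(get_logger(), "processing_latency_ms: %.1f", processing_latency_ms);
      result_ready_ = false;
    });
  }

private:
  rclcpp::Subscription<std_msgs::msg::Float64>::SharedPtr sub_;
  rclcpp::TimerBase::SharedPtr timer_;
  rclcpp::Time stamp_process_start_;
  double latest_result_{0.0};
  bool result_ready_{false};
};

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<TrackerTimingSketch>());
  rclcpp::shutdown();
  return 0;
}
```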
pipeline_latency_ms represents the time taken by the entire pipeline, from point cloud publishing to the completion of execution at the current node.
/perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms refers to the time from the earliest data in the input data queue to the completion of node execution.
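In code terms this amounts to something like the following hypothetical helper, where oldest_input_header_stamp stands for the sensor-side timestamp of the earliest queued input:

```cpp
#include <rclcpp/rclcpp.hpp>

// Hypothetical sketch: pipeline latency is measured against the sensor-side
// timestamp of the oldest queued input, not against when this node received it.
double pipeline_latency_ms(
  const rclcpp::Time & node_now, const rclcpp::Time & oldest_input_header_stamp)
{
  return (node_now - oldest_input_header_stamp).seconds() * 1e3;
}
```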
From the above statistical results, it was found that node multi_object_tracker's /perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms is very high and fluctuates greatly.
After analysis, it was found that this is because the data received by node multi_object_tracker is placed in a queue: when there are frame drops in the bag, the fluctuation becomes significant.
When testing instead with a bag without frame loss, the fluctuation of /perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms was very small. When there were two messages in the input queue of node multi_object_tracker, pipeline_latency_ms was about 200 ms; when there was only one message in the queue, it was about 100 ms. Therefore, the entire perception pipeline takes about 100 ms from point cloud publishing to tracking completion.
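A rough back-of-the-envelope model is consistent with these observations, assuming the LiDAR and the tracker cycle both run at roughly 10 Hz (a ~100 ms period). This is only an approximation for intuition, not Autoware code:

```cpp
// Approximate model: each message waiting in the tracker's input queue adds
// roughly one period to the age of the oldest (front) message.
#include <cstdio>

int main()
{
  const double period_ms = 100.0;  // assumed ~10 Hz LiDAR / tracker cycle
  for (int queue_depth = 1; queue_depth <= 2; ++queue_depth) {
    const double approx_pipeline_latency_ms = queue_depth * period_ms;
    std::printf(
      "queue depth %d -> ~%.0f ms pipeline latency\n", queue_depth, approx_pipeline_latency_ms);
  }
  return 0;  // prints ~100 ms and ~200 ms, matching the measurements above
}
```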
@cyn-liu Thank you for the in-depth analysis! :sparkles:
I cannot speak to the latency of the later stages in the pipeline, but we can soon get rid of CropBoxFilter completely as this is being implemented more efficiently in the LiDAR driver (Nebula):
CropBoxFilter will remain as a legacy option, as not all LiDAR vendors are supported by the new feature yet.
Checklist
Description
Measuring the running time of the perception module (lidar-only mode) during real-vehicle operation.
Environment:
Device: Dell Precision 3581
OS: Ubuntu 22.04
GPU: NVIDIA RTX A1000 6GB Laptop
CPU: Intel Core i7
Purpose
To get a detailed view of the time consumption of each node and of the overall perception pipeline, and to help developers improve the performance of the perception module.
Possible approaches
We will investigate a reasonable method to calculate the running time of the perception pipeline. @mitsudome-r @xmfcx Do you have any suggestions for statistical tools and methods?
Definition of done
TODO