ros2 / rosbag2

Show size contribution of each topic with ros2 bag info #1601

Open tonynajjar opened 3 months ago

tonynajjar commented 3 months ago

Description

I would like to know how much each topic contributes to the size of the bag (e.g. to decide what to throttle, compress, downsample, etc.).

On Humble, this is the current output of ros2 bag info:

Files:             rosbag2_2024_04_03-13_48_20/rosbag2_2024_04_03-13_48_20_0.mcap
Bag size:          50.1 MiB
Storage id:        mcap
Duration:          18.891s
Start:             Apr  3 2024 13:48:21.76 (1712144901.76)
End:               Apr  3 2024 13:48:39.967 (1712144919.967)
Messages:          3802
Topic information: Topic: /plan | Type: nav_msgs/msg/Path | Count: 0 | Serialization Format: cdr
                   Topic: /mission/progress_markers | Type: visualization_msgs/msg/MarkerArray | Count: 0 | Serialization Format: cdr
                   Topic: /global_costmap/costmap | Type: nav_msgs/msg/OccupancyGrid | Count: 50 | Serialization Format: cdr
                   Topic: /mission/state | Type: test_interfaces/msg/MissionState | Count: 13 | Serialization Format: cdr
                   Topic: /mission/path | Type: nav_msgs/msg/Path | Count: 0 | Serialization Format: cdr
                   Topic: /diagnostics | Type: diagnostic_msgs/msg/DiagnosticArray | Count: 233 | Serialization Format: cdr
                   Topic: /motors/states | Type: test_interfaces/msg/MotorStates | Count: 295 | Serialization Format: cdr
                   Topic: /map_origin_setter/map_origin | Type: geographic_msgs/msg/GeoPose | Count: 1 | Serialization Format: cdr
                   Topic: /odometry/gps | Type: nav_msgs/msg/Odometry | Count: 14 | Serialization Format: cdr
                   Topic: /local_costmap/costmap | Type: nav_msgs/msg/OccupancyGrid | Count: 40 | Serialization Format: cdr
                   Topic: /rosout | Type: rcl_interfaces/msg/Log | Count: 426 | Serialization Format: cdr
                   Topic: /remote/cmd_vel | Type: geometry_msgs/msg/Twist | Count: 0 | Serialization Format: cdr
                   Topic: /picker/state | Type: test_interfaces/msg/PickerStatus | Count: 15 | Serialization Format: cdr
                   Topic: /robot_description | Type: std_msgs/msg/String | Count: 1 | Serialization Format: cdr
                   Topic: /navigation/cmd_vel | Type: geometry_msgs/msg/Twist | Count: 0 | Serialization Format: cdr
                   Topic: /watchdog/capabilities | Type: test_interfaces/msg/Capabilities | Count: 30 | Serialization Format: cdr
                   Topic: /tf_static | Type: tf2_msgs/msg/TFMessage | Count: 32 | Serialization Format: cdr
                   Topic: /tf | Type: tf2_msgs/msg/TFMessage | Count: 1462 | Serialization Format: cdr
                   Topic: /teb_poses | Type: geometry_msgs/msg/PoseArray | Count: 0 | Serialization Format: cdr
                   Topic: /cmd_vel | Type: geometry_msgs/msg/Twist | Count: 0 | Serialization Format: cdr
                   Topic: /gps/fix_raw | Type: sensor_msgs/msg/NavSatFix | Count: 15 | Serialization Format: cdr
                   Topic: /imu/data | Type: sensor_msgs/msg/Imu | Count: 740 | Serialization Format: cdr
                   Topic: /odometry/encoders | Type: nav_msgs/msg/Odometry | Count: 435 | Serialization Format: cdr

Count is already useful, but it is not enough. I have no idea how the size would be calculated; is it difficult?

tonynajjar commented 2 months ago

Cross-linking a related question: https://answers.ros.org/question/318667/using-rosbag-to-get-size-of-each-topic/

nicolaloi commented 1 month ago

I could work on a PR to address this issue.

tonynajjar commented 1 month ago

Nice @nicolaloi! I recently stumbled upon the topic throttle node, which can throttle by bytes published. Looking at its implementation, rclcpp::SerializedMessage has a size() function; maybe this could be used somehow (see also capacity()).

@MichaelOrlov do you think it's feasible? Do you have some implementation directions?

MichaelOrlov commented 1 month ago

@tonynajjar @nicolaloi, the scope of the changes is not clear to me. Do you want to crawl over each message in the bag file and gather statistics about its size during ros2 bag info, or gather these statistics during recording and save them to the metadata? If the former, it could take some time, and it would be better to put it under a --verbose CLI option for ros2 bag info. The second option requires more structural changes across multiple layers of rosbag2, since we would need to bump the metadata version and make sure that we keep backward compatibility.
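
For illustration only, the first option (crawling every message at info time) might look roughly like the sketch below. It assumes the rosbag2_py Python bindings and their SequentialReader; the helper names topic_sizes and iter_bag are hypothetical and not part of rosbag2. The ROS import is deferred so that the accumulation logic can be tried without a ROS 2 installation.

```python
from collections import defaultdict


def topic_sizes(messages):
    """Sum the serialized payload bytes per topic.

    `messages` is any iterable of (topic, serialized_bytes, timestamp)
    tuples, which is the shape SequentialReader.read_next() returns.
    """
    sizes = defaultdict(int)
    for topic, data, _timestamp in messages:
        sizes[topic] += len(data)
    return dict(sizes)


def iter_bag(uri, storage_id="mcap"):
    """Yield (topic, data, timestamp) for every message in a bag.

    Hypothetical helper; needs a sourced ROS 2 environment.
    """
    import rosbag2_py  # imported lazily so topic_sizes() works without ROS

    reader = rosbag2_py.SequentialReader()
    reader.open(
        rosbag2_py.StorageOptions(uri=uri, storage_id=storage_id),
        rosbag2_py.ConverterOptions(
            input_serialization_format="cdr",
            output_serialization_format="cdr",
        ),
    )
    while reader.has_next():
        yield reader.read_next()


# Example usage (requires ROS 2 and an existing bag directory):
# for topic, size in sorted(topic_sizes(iter_bag("my_bag")).items()):
#     print(f"{topic}: {size} bytes")
```

Note that these totals count serialized message payloads only, so they would not be expected to add up exactly to the reported bag size, which also includes storage overhead such as indexes and schemas and may reflect compression.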

tonynajjar commented 1 month ago

Since you're discussing implementation details, I'll assume that this information is gatherable in the first place. I don't have a strong opinion about the implementation, but I would choose the second option because I think it doesn't add much overhead while recording (?). But I'd love to hear an expert's opinion on the tradeoffs.

nicolaloi commented 1 month ago

My opinion as a ROS 2 user:

For both options, I think you only need to crawl over messages of variable size (i.e. those containing unbounded arrays, like PointCloud2), while for Odometry, for example, you only need to multiply the fixed message size by the number of messages.

As for the overhead of getting the size of variable-size messages under either option: the size() method of rclcpp::SerializedMessage is a simple return serialized_message_.buffer_length;, so it is probably negligible (plus a very likely negligible addition when updating the running total of message sizes on the fly).

However, with the first option the bag also has to be opened and processed, so I think it will have extra overhead. With the second option, I think the only overhead should be roughly the size() call and the addition.

As a user, I would probably prefer to have these size statistics in the metadata (second option). However, if its overhead is problematic, natively getting the statistics from ros2 bag info with the first option will still be a nice feature.

tonynajjar commented 1 month ago

Another piece of code to take inspiration from for msg size: https://github.com/ros2/ros2cli/blob/rolling/ros2topic/ros2topic/verb/bw.py

MichaelOrlov commented 1 month ago

@tonynajjar The first option is relatively easy to implement. The second option, as mentioned above, will not introduce too much overhead in gathering such statistics. However, it will require significantly more changes. Also, since this is a sort of "statistics", it would be nice to design it as a separate class responsible for gathering statistics during recording. Besides message sizes per topic, it would be great to gather, in the future, statistics about the number of messages lost per topic on the "wire" (transport layer) and in the rosbag2 layers.
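
As a rough sketch of what such a separate statistics class could look like: the rosbag2 recorder itself is C++, so Python is used here only to show the shape of the idea. The names TopicStats, RecordingStats, on_message and to_metadata, as well as the metadata keys, are all hypothetical and do not exist in rosbag2.

```python
from dataclasses import dataclass, field


@dataclass
class TopicStats:
    """Running totals for one topic (hypothetical structure)."""
    message_count: int = 0
    total_bytes: int = 0


@dataclass
class RecordingStats:
    """Collects per-topic statistics while messages are being recorded."""
    topics: dict = field(default_factory=dict)

    def on_message(self, topic, serialized):
        # One length lookup and two additions per message, mirroring the
        # "size() call plus an addition" cost discussed above.
        stats = self.topics.setdefault(topic, TopicStats())
        stats.message_count += 1
        stats.total_bytes += len(serialized)

    def to_metadata(self):
        # Plain dict that could be written out next to the bag metadata.
        return {
            topic: {
                "message_count": s.message_count,
                "size_bytes": s.total_bytes,
            }
            for topic, s in self.topics.items()
        }
```

Keeping the counters in their own class would leave room for further per-topic fields, such as lost-message counts, without touching the recording path again.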

I would suggest going with the first option to begin with, as a first stage.