ros2 / rosbag2

Apache License 2.0

Undefined behaviors when exceeding the maximum recording capability #354

Open weihunko opened 4 years ago

weihunko commented 4 years ago

Description

I'm not sure whether this should be a feature request or a bug report, but we have seen undefined behavior (a segfault, or a failure to subscribe to all topics) when testing the recording limits of rosbag2.

Expected Behavior

  1. Gracefully exit if the bandwidth or the number of topics exceeds the maximum load rosbag2 can handle.
  2. Ideally, also improve the performance of rosbag2 recording.

Actual Behavior

ros2 bag record either segfaults or discovers only a subset of the topics.

To Reproduce

  1. Prepare ROS 2 Eloquent
    1. https://index.ros.org/doc/ros2/Installation/Eloquent/Linux-Development-Setup/
    2. https://github.com/ros2/ros2/tree/release-eloquent-20200124
  2. Clone and build ros2_bandwidth_tester
    1. mkdir -p ~/ros2_bandwidth_tester_ws/src
    2. cd ~/ros2_bandwidth_tester_ws/src && git clone https://github.com/weihunko/ros2_bandwidth_tester
    3. cd ~/ros2_bandwidth_tester_ws
    4. colcon build
  3. In terminal 1: ros2 bag record -a
  4. In terminal 2: ros2 run ros2_bandwidth_tester bandwidth_tester --number 50 --size 200 --rate 100
    1. This command starts 50 publishers, each publishing 200kb strings at a rate of 100 Hz.
  5. At this point ros2 bag record will segfault.
  6. We also tried recording with different numbers of publishers, rates, and message sizes. Sometimes ros2 bag can still record, but it discovers only a subset of the existing publishers.
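
For a sense of scale, the load generated in step 4 can be estimated with a quick calculation. This is only a rough sketch: it assumes "200kb" means 200 kilobytes (1 kB = 1000 bytes) of payload per message and ignores serialization and transport overhead. The function name `aggregate_load` is made up for illustration and is not part of rosbag2 or ros2_bandwidth_tester.

```python
# Back-of-the-envelope aggregate load for the bandwidth_tester command in step 4.
# Assumption: "200kb" means 200 kilobytes (1 kB = 1000 bytes) per message.
NUM_PUBLISHERS = 50              # --number 50
MESSAGE_SIZE_BYTES = 200 * 1000  # --size 200 (assumed to be kilobytes)
RATE_HZ = 100                    # --rate 100


def aggregate_load(num_publishers, message_size_bytes, rate_hz):
    """Return (messages per second, payload bytes per second) across all publishers."""
    messages_per_second = num_publishers * rate_hz
    bytes_per_second = messages_per_second * message_size_bytes
    return messages_per_second, bytes_per_second


msgs, bps = aggregate_load(NUM_PUBLISHERS, MESSAGE_SIZE_BYTES, RATE_HZ)
print(f"{msgs} messages/s, {bps / 1e9:.1f} GB/s")  # prints "5000 messages/s, 1.0 GB/s"
```

Under that assumption the recorder would need to receive and write roughly 1 GB of payload per second, which is likely beyond what a single disk can sustain.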

System (please complete the following information)

Additional context

N/A

dejanpan commented 4 years ago

@Karsten1987 @emersonknapp LMK if this can be put on your board and reviewed in the Tooling WG meeting on April 10.

Karsten1987 commented 4 years ago

@dejanpan it will be part of the issue triaging. So yes, we'll discuss this during the meeting. Or is there anything more to it you'd like to discuss as a separate agenda item?

dejanpan commented 4 years ago

@Karsten1987 nope, issue triaging is enough for it to get on the radar.

emersonknapp commented 3 years ago

There is still going to be some degradation at the limit of performance, though we can make what happens in that case better defined. Did the performance improvements since Foxy resolve this to a satisfactory state, or are there specific features we can implement to close this out?