Image publish subscribe very slow on default ROS2 Humble steup.

vik748 commented 6 months ago

Bug report

Required Info:

Operating System:
- Ubuntu 22.04
Installation type:
- ros2-humble from binaries
Version or commit hash:
- N/A
DDS implementation:
- Default ROS2 RMW which I believe is Fast-RTPS
Client library (if applicable):
- rclpy

Steps to reproduce issue

I noticed the issue when using the official Intel Realsense ROS drivers, but it can be reproduced with this simple minimal example.

# Create test_image_publisher.py with:
import numpy as np
import rclpy
from cv_bridge import CvBridge
from rclpy.node import Node
from rclpy.publisher import Publisher
from sensor_msgs.msg import Image

class TestImagePublisher(Node):
    """Test Node to publish image data."""

    cv_bridge: CvBridge
    rgb_pub = Publisher

    def __init__(self, *args, **kwargs):
        super().__init__(node_name="test_image_publisher", *args, **kwargs)
        self.cv_bridge = CvBridge()

        # Create publishers
        self.rgb_pub = self.create_publisher(Image, "/rgb", qos_profile=2)

        self.timer = self.create_timer(1 / 15, self._timer_callback)

    def _timer_callback(self):
        rgb_cv = np.round(np.random.random((480, 848, 3)) * 255).astype(np.uint8)
        rgb_msg = self.cv_bridge.cv2_to_imgmsg(rgb_cv)
        self.rgb_pub.publish(rgb_msg)

def main(args=None):
    rclpy.init(args=args)

    node = TestImagePublisher()

    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        print("\nKeyboard interrupt, shutting down.\n")
        node.destroy_node()

if __name__ == "__main__":
    main()

Test with:

# Launch the node
 python3 test_image_publisher.py

# On second terminal test subscriber frequency
ros2 topic hz /rgb
# Output
average rate: 4.283
    min: 0.065s max: 1.269s std dev: 0.39247s window: 8
average rate: 4.758
    min: 0.065s max: 1.269s std dev: 0.33082s window: 14
average rate: 3.743
    min: 0.065s max: 1.269s std dev: 0.40459s window: 16
average rate: 2.039
    min: 0.065s max: 4.062s std dev: 0.97536s window: 17
average rate: 1.847
    min: 0.065s max: 4.062s std dev: 0.97084s window: 18
average rate: 1.805
    min: 0.065s max: 4.062s std dev: 0.92204s window: 20

Expected behavior

The subscriber should receive images at 15hz.

Actual behavior

As seen above the images start at 4hz and drop down to almost 1hz on my 13th Gen Intel i7 laptop.

With Cyclone DDS I see a similar issue out of the box, but I can fix it by setting: sudo sysctl -w net.core.rmem_max=2147483647

None of the suggested settings from https://docs.ros.org/en/humble/How-To-Guides/DDS-tuning.html helped address this.

Additional information

A simple thing like image publishing should work out of the box. Am I missing something?

clalancette commented 6 months ago

A simple thing like image publishing should work out of the box. Am I missing something?

There are likely 2 things going on here; slowness in rclpy for publishing, and slowness of ros2 topic hz.

For the first one, you can try running with Python optimizations enabled:

python3 -O test_image_publisher.py

(this has been fixed in newer versions of ROS 2 automatically)

For the second one, I would try measuring the rate using another tool for now, preferably one written in C++.

Also, it might be helpful to try the whole thing in C++, as it is far more performant than Python at the moment.

vik748 commented 6 months ago

@clalancette given that by using cyclone with the specified settings, this issue goes away. I highly doubt that this is related to python. But for completeness I will try this with cpp too.

audrow commented 6 months ago

FYI @MiguelCompany and @EduPonz.

MiguelCompany commented 6 months ago

@vik748

With Cyclone DDS I see a similar issue out of the box, but I can fix it by setting: sudo sysctl -w net.core.rmem_max=2147483647

If setting the socket buffer sizes solves the issue with Cyclone DDS, I would try the same thing with Fast DDS.

You could also set the requested buffer sizes in the XML configuration of Fast DDS as explained here

vik748 commented 5 months ago

@clalancette @MiguelCompany here are something things I have tested:

I wrote a C++ publisher which had similar issues while measuring with ros2 topic hz.
I then wrote a C++ subscriber which performed much better and received the messages at 15hz, without any drop in rate.
I found this discussion - topic hz provides wrong rate for larger msgs which suggests that its the default QOS profile of ros2 topic hz which results in the difference. So when I changed my C++ subscriber to use rmw_qos_profile_sensor_data, I do observe rate drop similar to that from ros2 topic hz.
@MiguelCompany in the link you provided, it says:

In high rate scenarios or large data scenarios, network packages can be dropped because the transmitted amount of data fills the socket buffer before it can be processed. Using RELIABLE_RELIABILITY_QOS mode, Fast DDS will try to recover lost samples, but with the penalty of retransmission. With BEST_EFFORT_RELIABILITY_QOS mode, samples will be definitely lost.

This lines up with what I am observing, however increasing the socket buffer sizes both the read and the write socket, doesn't to solve the issue. I have tried increasing it with with the sysctl commands and with the XML file, but no luck. Have you been able to reproduce on your setup with the code I have above.

alaurenzi commented 2 months ago

I am experiencing the same issue on Ubuntu 24 + Jazzy. Raw images from usb cam are received (by, e.g., rqt_image_view around 2 Hz, whereas compressed theora runs at 30 Hz.

ros2 / rmw_fastrtps