ros2 / rmw_fastrtps

Implementation of the ROS Middleware (rmw) Interface using eProsima's Fast RTPS.
Apache License 2.0
147 stars 116 forks source link

Image publish subscribe very slow on default ROS2 Humble steup. #757

Open vik748 opened 1 month ago

vik748 commented 1 month ago

Bug report

Required Info:

Steps to reproduce issue

I noticed the issue when using the official Intel Realsense ROS drivers, but it can be reproduced with this simple minimal example.

# Create test_image_publisher.py with:
import numpy as np
import rclpy
from cv_bridge import CvBridge
from rclpy.node import Node
from rclpy.publisher import Publisher
from sensor_msgs.msg import Image

class TestImagePublisher(Node):
    """Test Node to publish image data."""

    cv_bridge: CvBridge
    rgb_pub = Publisher

    def __init__(self, *args, **kwargs):
        super().__init__(node_name="test_image_publisher", *args, **kwargs)
        self.cv_bridge = CvBridge()

        # Create publishers
        self.rgb_pub = self.create_publisher(Image, "/rgb", qos_profile=2)

        self.timer = self.create_timer(1 / 15, self._timer_callback)

    def _timer_callback(self):
        rgb_cv = np.round(np.random.random((480, 848, 3)) * 255).astype(np.uint8)
        rgb_msg = self.cv_bridge.cv2_to_imgmsg(rgb_cv)
        self.rgb_pub.publish(rgb_msg)

def main(args=None):
    rclpy.init(args=args)

    node = TestImagePublisher()

    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        print("\nKeyboard interrupt, shutting down.\n")
        node.destroy_node()

if __name__ == "__main__":
    main()

Test with:

# Launch the node
 python3 test_image_publisher.py

# On second terminal test subscriber frequency
ros2 topic hz /rgb
# Output
average rate: 4.283
    min: 0.065s max: 1.269s std dev: 0.39247s window: 8
average rate: 4.758
    min: 0.065s max: 1.269s std dev: 0.33082s window: 14
average rate: 3.743
    min: 0.065s max: 1.269s std dev: 0.40459s window: 16
average rate: 2.039
    min: 0.065s max: 4.062s std dev: 0.97536s window: 17
average rate: 1.847
    min: 0.065s max: 4.062s std dev: 0.97084s window: 18
average rate: 1.805
    min: 0.065s max: 4.062s std dev: 0.92204s window: 20

Expected behavior

The subscriber should receive images at 15hz.

Actual behavior

As seen above the images start at 4hz and drop down to almost 1hz on my 13th Gen Intel i7 laptop.

With Cyclone DDS I see a similar issue out of the box, but I can fix it by setting: sudo sysctl -w net.core.rmem_max=2147483647

None of the suggested settings from https://docs.ros.org/en/humble/How-To-Guides/DDS-tuning.html helped address this.

Additional information

A simple thing like image publishing should work out of the box. Am I missing something?

clalancette commented 1 month ago

A simple thing like image publishing should work out of the box. Am I missing something?

There are likely 2 things going on here; slowness in rclpy for publishing, and slowness of ros2 topic hz.

For the first one, you can try running with Python optimizations enabled:

python3 -O test_image_publisher.py

(this has been fixed in newer versions of ROS 2 automatically)

For the second one, I would try measuring the rate using another tool for now, preferably one written in C++.

Also, it might be helpful to try the whole thing in C++, as it is far more performant than Python at the moment.

vik748 commented 1 month ago

@clalancette given that by using cyclone with the specified settings, this issue goes away. I highly doubt that this is related to python. But for completeness I will try this with cpp too.

audrow commented 1 month ago

FYI @MiguelCompany and @EduPonz.

MiguelCompany commented 1 month ago

@vik748

With Cyclone DDS I see a similar issue out of the box, but I can fix it by setting: sudo sysctl -w net.core.rmem_max=2147483647

If setting the socket buffer sizes solves the issue with Cyclone DDS, I would try the same thing with Fast DDS.

You could also set the requested buffer sizes in the XML configuration of Fast DDS as explained here

vik748 commented 4 weeks ago

@clalancette @MiguelCompany here are something things I have tested: