ros2 / rclcpp

rclcpp (ROS Client Library for C++)
Apache License 2.0

Avoid copies when combining intra-process communication and shared memory transports #2203

Open alsora opened 1 year ago

alsora commented 1 year ago

Feature request

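ROS 2 supports multiple communication modes. Consider the following scenario with two processes: a publisher A and a subscriber X live in the same process, while a subscriber Y lives in a different process.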

Intra-process communication allows the publisher to send the message to subscriber X without copies. Similarly, if a zero-copy shared memory transport is enabled, the publisher should be able to send the message to subscriber Y without performing any copy.

However, this breaks down when both modes are used at the same time: inter-process and intra-process delivery currently each need their own copy of the message.

Note that disabling intra-process communication is a suboptimal solution. Intra-process communication is faster and uses less CPU than the shared memory transport, even when no copies are involved. Depending on the size of the message, the performance penalty from not using intra-process communication may be larger than the overhead of the copy that is needed when combining the two transport modes.

However, it should be possible to get the best of both worlds when all subscribers are only interested in read-only access to the message.

This would essentially require using the same loaned message for both inter-process and intra-process delivery.
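The difference between the current behavior and the requested one can be illustrated with a toy model in plain C++. This is only a sketch: `Message`, `IntraProcessSubscriber`, `SharedMemoryTransport`, `publish_with_copy` and `publish_shared` are hypothetical names invented for illustration and are not part of rclcpp or any rmw implementation. A `std::shared_ptr<const Message>` stands in for a read-only loaned buffer, and a global counter records how many payload copies each approach makes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Counts payload copies so the two publish strategies can be compared.
static std::size_t g_copies = 0;

// Toy message type (hypothetical, NOT a ROS message).
struct Message {
  std::vector<unsigned char> payload;
  explicit Message(std::size_t size) : payload(size) {}
  Message(const Message & other) : payload(other.payload) { ++g_copies; }
};

using ConstMessagePtr = std::shared_ptr<const Message>;

// Hypothetical stand-in for subscriber X (same process as the publisher).
struct IntraProcessSubscriber {
  ConstMessagePtr received;
  void deliver(ConstMessagePtr msg) { received = std::move(msg); }
};

// Hypothetical stand-in for the shared memory transport that reaches subscriber Y.
struct SharedMemoryTransport {
  ConstMessagePtr in_flight;
  void send(ConstMessagePtr msg) { in_flight = std::move(msg); }
};

// Situation described in the issue: each path needs its own copy of the message.
void publish_with_copy(
  ConstMessagePtr msg, IntraProcessSubscriber & x, SharedMemoryTransport & y)
{
  assert(msg != nullptr);
  ConstMessagePtr copy = std::make_shared<Message>(*msg);  // one payload copy
  y.send(std::move(copy));
  x.deliver(std::move(msg));
}

// Requested behavior: both paths share the same read-only buffer.
void publish_shared(
  ConstMessagePtr msg, IntraProcessSubscriber & x, SharedMemoryTransport & y)
{
  assert(msg != nullptr);
  y.send(msg);                // shares ownership, no payload copy
  x.deliver(std::move(msg));
}
```

Sharing is only safe here because both consumers hold a pointer to `const Message`, which mirrors the read-only restriction mentioned above. A real implementation would additionally have to deal with the lifetime of the loan inside the middleware, which this model ignores.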

fujitatomoya commented 1 year ago

@alsora thanks for the summary.

I think this would also get rid of double buffering of the message; it would be really nice to have this enhancement.

This would require to essentially use the same loaned message for both inter and intra process deliveries.

IMO, I guess we see a few options here,

alsora commented 1 year ago

When dealing with communication between two entities in the same process, we have multiple communication modes, ordered from the most efficient to the least efficient:

My understanding is that currently the only way to avoid the copy of the message is to use rmw shared memory between all pubs and subs. rmw intra-process would have the same limitation as rclcpp intra-process in this context. (This is different from the issue described in https://github.com/ros2/rclcpp/issues/2202, which would be solved by using rmw intra-process.)

fujitatomoya commented 1 year ago

My understanding is that currently the only way to avoid the copy of the message is to use rmw shared memory between all pubs and subs

AFAIK, Fast-DDS can already support this using LoanedMessage without the ROS 2 intra-process option? In this case, I think it can support QoS as well. What do you mean by "the same limitation" here?

CC: @MiguelCompany

alsora commented 1 year ago

The only problem is that using loaned messages between pubs and subs in the same process is less efficient than using intra-process optimization mechanisms.

Assuming that the communication A -> Y is done through shared memory transport, we have the following options:

  1. communication A -> X is done through rclcpp intra-process
  2. communication A -> X is done through rmw intra-process
  3. communication A -> X is done through shared memory transport

Options 1) and 2) result in an extra copy of the message, while option 3) has additional overhead due to the use of the loaned message APIs. You can see here https://github.com/ros2/rclcpp/issues/1642#issuecomment-900387789 that loaned messages have a latency approximately 4 times that of rclcpp intra-process.

IMO all 3 options above are currently suboptimal.
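To make the trade-off between the options above concrete, here is a deliberately simplified cost model for the A -> X delivery. It assumes that the copy cost grows linearly with the message size and that the loaned message latency is a fixed 4x multiple of the rclcpp intra-process latency (the ratio quoted above). `CostModel`, its fields and all functions are hypothetical and exist only for this illustration; real numbers have to come from measurements.

```cpp
#include <cassert>

// Ratio of loaned message latency to rclcpp intra-process latency,
// taken from the figure quoted in this thread (approximately 4x).
constexpr double kLoanedToIntraLatencyRatio = 4.0;

// Hypothetical parameters; values must be measured on the target system.
struct CostModel {
  double intra_latency_us;   // latency of one rclcpp intra-process delivery
  double copy_us_per_byte;   // assumed linear cost of copying the message
};

// Options 1) and 2): intra-process delivery plus one extra copy.
double cost_intra_plus_copy(const CostModel & m, double message_bytes)
{
  assert(message_bytes >= 0.0);
  return m.intra_latency_us + m.copy_us_per_byte * message_bytes;
}

// Option 3): no copy, but the loaned message path is used inside the process.
double cost_loaned_only(const CostModel & m)
{
  return kLoanedToIntraLatencyRatio * m.intra_latency_us;
}

// Message size at which both approaches cost the same under this model.
double break_even_bytes(const CostModel & m)
{
  assert(m.copy_us_per_byte > 0.0);
  return (kLoanedToIntraLatencyRatio - 1.0) * m.intra_latency_us / m.copy_us_per_byte;
}
```

Under these assumptions, messages smaller than `break_even_bytes` are cheaper with options 1) or 2), and larger ones are cheaper with option 3). Neither side of the break-even point is free of overhead, which is why all three options remain suboptimal.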

To improve the situation we can either:

  1. allow rclcpp intra-process and shared memory to coexist without extra copies
  2. improve the performance of the shared memory transport when used within a single process