ros2 / rclcpp

rclcpp (ROS Client Library for C++)
Apache License 2.0

Add ignore-local-endpoints functionality to avoid double delivery #2202

Open alsora opened 1 year ago

alsora commented 1 year ago

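ROS 2 currently has a problem with double delivery of messages when combining intra-process and inter-process communication. Consider for example a scenario with two processes: the first contains publisher A and subscriber X, both using intra-process comm, and the second contains subscriber Y.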

When publisher A publishes a message, we would expect the message to be delivered to subscriber X via intra-process comm and to subscriber Y via inter-process comm.

However, what actually happens is a double delivery: the inter-process publication also reaches subscriber X, which then discards the message, but a non-negligible amount of resources is still wasted.

To get around this problem, it's necessary to tell publisher A that it doesn't need to send inter-process messages to subscriber X, which is in the same process and is going to be serviced by intra-process comm.
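
To make the cost concrete, here is a minimal sketch of the scenario above. It is a toy model, not rclcpp code: `Subscription`, `DeliveryStats`, `publish_once` and the `skip_inter_for_intra_served` flag are hypothetical names made up for illustration, and the model assumes the middleware sends to every matched subscription unless told otherwise.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a subscription matched by a publisher.
struct Subscription {
  std::string name;
  int process_id;
  bool intra_process;  // subscription has intra-process comm enabled
};

// Counts what happens for a single publish() call.
struct DeliveryStats {
  int intra = 0;      // delivered via intra-process comm
  int inter = 0;      // delivered via inter-process comm and used
  int discarded = 0;  // sent via inter-process comm, then dropped as a duplicate
};

DeliveryStats publish_once(
  int pub_process, bool pub_intra_process, bool skip_inter_for_intra_served,
  const std::vector<Subscription> & subs)
{
  DeliveryStats stats;
  for (const auto & sub : subs) {
    const bool local = sub.process_id == pub_process;
    const bool served_by_intra = local && pub_intra_process && sub.intra_process;
    if (served_by_intra) {
      ++stats.intra;
    }
    // Assumed model: the inter-process send reaches every matched
    // subscription unless the publisher is told to skip this one.
    const bool sent_inter = !(skip_inter_for_intra_served && served_by_intra);
    if (sent_inter) {
      if (served_by_intra) {
        ++stats.discarded;  // duplicate: already received via intra-process
      } else {
        ++stats.inter;
      }
    }
  }
  return stats;
}

// Publisher A and subscriber X live in process 1, subscriber Y in process 2.
void check_double_delivery_example()
{
  const std::vector<Subscription> subs = {{"X", 1, true}, {"Y", 2, false}};
  const DeliveryStats today = publish_once(1, true, false, subs);
  assert(today.intra == 1 && today.inter == 1 && today.discarded == 1);
  const DeliveryStats wanted = publish_once(1, true, true, subs);
  assert(wanted.intra == 1 && wanted.inter == 1 && wanted.discarded == 0);
}
```

In this model the only difference between the two calls is the wasted copy sent to X; Y is served the same way in both.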

ROS already has the concept of "ignore local endpoints"; however, it is applied to the whole process. This can be problematic if a process contains both entities with intra-process comm enabled and entities with it disabled.

We are working with eProsima to allow more fine-grained control of the "ignore local endpoints" functionality, such that each ROS 2 entity can indicate it to the RMW individually. This will make it possible to solve the double-delivery problem.

To implement that, it looks like some changes are needed to the rclcpp logic that decides whether to publish inter-process or intra-process.

Right now we check:

bool inter_process_publish_needed = get_subscription_count() > get_intra_process_subscription_count();

However, if publisher A is "ignoring local endpoints", get_subscription_count() will only count the non-local subscriptions, so comparing it against the intra-process count no longer gives the right answer.

We think that the logic here should be changed into something like

bool inter_process_publish_needed = get_non_local_subscription_count() > 0;
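
To illustrate why the current check breaks, here is a small self-contained sketch comparing the two conditions. `PublisherCounts` is a hypothetical stand-in for the counters a publisher can query, not the rclcpp class; `get_non_local_subscription_count()` is the proposed accessor from this comment rather than an existing API, and the model assumes every local subscription uses intra-process comm.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the counters queried by the publish logic.
struct PublisherCounts {
  std::size_t local_intra_process_subs;  // same process, served by intra-process comm
  std::size_t non_local_subs;            // subscriptions in other processes
  bool ignore_local_endpoints;           // local endpoints are hidden from the RMW match count

  // Assumed behaviour: the RMW count excludes local endpoints when they are ignored.
  std::size_t get_subscription_count() const {
    return ignore_local_endpoints ? non_local_subs : non_local_subs + local_intra_process_subs;
  }
  std::size_t get_intra_process_subscription_count() const {
    return local_intra_process_subs;
  }
  // Proposed accessor (does not exist yet, per the discussion above).
  std::size_t get_non_local_subscription_count() const {
    return non_local_subs;
  }
};

// The check used today.
bool current_check(const PublisherCounts & p) {
  return p.get_subscription_count() > p.get_intra_process_subscription_count();
}

// The check proposed in this issue.
bool proposed_check(const PublisherCounts & p) {
  return p.get_non_local_subscription_count() > 0;
}

void check_publish_condition_example() {
  // One local intra-process subscriber (X), one remote subscriber (Y).
  const PublisherCounts not_ignoring{1, 1, false};
  assert(current_check(not_ignoring));   // 2 > 1: inter-process publish happens
  assert(proposed_check(not_ignoring));

  const PublisherCounts ignoring{1, 1, true};
  assert(!current_check(ignoring));      // 1 > 1 is false: Y would be missed
  assert(proposed_check(ignoring));      // 1 > 0: Y is still served

  // Only the local subscriber: neither check asks for an inter-process publish.
  const PublisherCounts local_only{1, 0, true};
  assert(!current_check(local_only));
  assert(!proposed_check(local_only));
}
```

With one local and one remote subscriber, the current condition evaluates to false as soon as local endpoints are ignored, whereas the proposed one keeps publishing inter-process for Y.
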

fujitatomoya commented 1 year ago

@alsora just checking my understanding.

So if we just rely on the RMW LoanedMessage for intra-process communication, without the rclcpp one, this should be no problem, since that can be managed by the RMW implementation? (I understand the performance is not the same as with rclcpp intra-process communication.)

thanks,

MiguelCompany commented 8 months ago

@alsora @fujitatomoya

I prepared a set of PRs for the following:

We think that the logic here should be changed into something like

bool inter_process_publish_needed = get_non_local_subscription_count() > 0;