New intra-process communication design

alsora commented 5 years ago

This PR adds a description of the current intra-process communication mechanism, together with a design proposal for improving its performances and supporting more QoS. The new design is supported by experimental evaluations.

An initial discussion about this design can be found here.

The implementation of this design can be found here.

@dgoel @raghaprasad

gbiggs commented 5 years ago

Please format with one sentence per line and no line breaks within a sentence so that changes made in response to comments are easier to track.

gbiggs commented 5 years ago

This is a tangential comment, but I wonder if we could achieve the same zero-copies-when-same-process result by reducing the number of copies requires for going into and out of the rmw layer to zero and using a DDS implementation that also supports zero copies (ignoring that there may not be any and that the standard API may not support this, both of which are solvable issues). One of the reasons for using DDS is to push all the communication issues down into an expert-vendor-supplied library, after all.

gbiggs commented 5 years ago

The current prototype implementation and the design focus on rclcpp. What about rclc? How are we going to ensure that rclpy has exactly the same capabilities and semantics?

raghaprasad commented 5 years ago

The current prototype implementation and the design focus on rclcpp. What about rclc? How are we going to ensure that rclpy has exactly the same capabilities and semantics?

I am not sure about the status of rclc in terms of feature parity, seems like the ros2 supported clients are rclcpp & rclpy.

While I agree about maintaining feature parity between the 2 supported clients, this design proposal is an improvement to the existing intra-process support, which currently is supported only in rclcpp.

If we continue to have the intra process manager live in rclcpp, we should have a separate discussion about what it means to do the same on rclpy (and if it is even required).

raghaprasad commented 5 years ago

This is a tangential comment, but I wonder if we could achieve the same zero-copies-when-same-process result by reducing the number of copies requires for going into and out of the rmw layer to zero and using a DDS implementation that also supports zero copies (ignoring that there may not be any and that the standard API may not support this, both of which are solvable issues). One of the reasons for using DDS is to push all the communication issues down into an expert-vendor-supplied library, after all.

How about moving the intra_process_management into an rmw ? This rmw could handle only intra_process communication and delegate inter-process communication to a any of the chosen DDS rmw implementations.

Support for zero copies is an important objective, but its not the only one. It has been observed that creating DDS participants is pretty resource heavy in terms of net memory required (atleast for FastRTPS & OpenSplice) and the discovery process is CPU intensive (due to multicast). This new rmw could drastically simplify the discovery process and most certainly reduce the memory footprint by needing only one participant per process to support inter_process communication.

I don't want to derail this PR as this is totally tangential topic and needs more thought and planning. Should we move this discussion to a discourse post ?

alsora commented 5 years ago

The aim of this PR is to improve the current intra-process communication, that is only supported in rclcpp at the moment.

I don't want to derail this PR as this is totally tangential topic and needs more thought and planning. Should we move this discussion to a discourse post ?

As @raghaprasad says, I think that moving the intra-process manager to a different place from rclcpp will involve many considerations not necessarily related with how the intra-process communication should actually take place, which is the focus of this PR.

However, I think that once it has been finalized and implemented there shouldn't be any major blockers avoiding to move it to a COMMON rmw layer. This can be used to have common behavior between C++ and Python applications.

While working to this PR I also explored the idea of implementing the intra-process manager logic at the RMW layer (I have a very dummy proof of concept for RMW DPS). One of the issues I had was that published messages have to go through the rclc layer, thus it's not possible to use smart pointers and message templates. Apart from this, the performances were comparable with the rclcpp implementation (slightly worst).

gbiggs commented 5 years ago

One of the issues I had was that published messages have to go through the rclc layer, thus it's not possible to use smart pointers and message templates.

But it is possible to do the rmw and rcl APIs and implementations such that they manage their raw pointers properly and provide a smart_ptr interface-compatible object in rclcpp. I'm not saying it would be easy, but this is how the STL is designed to be used and it would be the most powerful solution.

I feel that having inter-process only available in rclcpp is a further continuation of not putting things in rclc where they can be easily wrapped by other languages, which makes it harder to get back to what we were told at the start of ROS 2 development is a goal: to have a universal C client library provide all the functionality and other languages, including C++, just wrap it. I understand this policy was sidelined by a need to get features out faster but if we just keep doing things in C++ we will make it increasingly difficult to get back to that approach, and in the long run end up with a bunch of mostly-compatible client libraries again.

dgoel commented 5 years ago

I feel that having inter-process only available in rclcpp is a further continuation of not putting things in rclc where they can be easily wrapped by other languages, which makes it harder to get back to what we were told at the start of ROS 2 development is a goal: to have a universal C client library provide all the functionality and other languages, including C++, just wrap it. I understand this policy was sidelined by a need to get features out faster but if we just keep doing things in C++ we will make it increasingly difficult to get back to that approach, and in the long run end up with a bunch of mostly-compatible client libraries again.

ROS2 architecture [1] depicts very explicitly that intra-process communication (along with type masquerading optimizations) belongs in the language-specific client library. Also, intra-process communication is merely an optimization, and not a "feature" per say. All the features like services, parameters, etc. are rightly implemented in rcl layer already.

[1] http://docs.ros2.org/dashing/developer_overview.html#internal-ros-interfaces

ivanpauno commented 5 years ago

Some thoughts about moving intraprocess communication to rmw:

ROS2 architecture [1] depicts very explicitly that intra-process communication (along with type masquerading optimizations) belongs in the language-specific client library. Also, intra-process communication is merely an optimization, and not a "feature" per say. All the features like services, parameters, etc. are rightly implemented in rcl layer already.

[1] http://docs.ros2.org/dashing/developer_overview.html#internal-ros-interfaces

That's mostly "historical". I think that if we have a good rationale for implementing it at rmw level, we should.

But it is possible to do the rmw and rcl APIs and implementations such that they manage their raw pointers properly and provide a smart_ptr interface-compatible object in rclcpp. I'm not saying it would be easy, but this is how the STL is designed to be used and it would be the most powerful solution.

I think something like that could be possible, but quite complicated.

I would like to see something mimicking connext Zero Copy Transfer Over Shared Memory semantics (by default connext use shared memory, but it doesn't use zero copy transfer, which have an specific semantics). Basically, instead of creating a unique pointer and then publishing it:

auto msg = std::make_unique<MSG_TYPE>();
/* Fill the message here */
publisher->publish(std::move(msg))

You ask to the publisher a piece of memory, fill it, and then publish:

auto msg = publisher->new_message();
/* Fill the message here */
publisher->publish(std::move(msg)); // I'm using move semantics because the message will be undefined after calling publish. But how we wrap the msg for this is an implementation detail.

For dds vendors that have implemented zero copy transport, this could just wrap it. For others, we could have a default implementation that's used in those cases. That implementation could not use shared memory that allows INTERprocess zero copy transport, but just use a preallocated buffer in each publisher that allows INTRAprocess zero copy transport. This implementation is a good start for later doing something like this (if we want to do it).

I also think this idea will look idiomatic in other languages (for example, in python), and performance should be quite similar.

As @raghaprasad says, I think that moving the intra-process manager to a different place from rclcpp will involve many considerations not necessarily related with how the intra-process communication should actually take place, which is the focus of this PR.

I agree, I think this implementation will improve performance compared with the current intraprocess implementation. We could later decide if moving intraprocess comm to the rmw layer is a good idea or not.

emersonknapp commented 5 years ago

@ivanpauno I agree with the above discussion that we should evaluate this PR for what it is (an improvement to the existing scenario of Intra-Process existing only in rclcpp) - but I'm also very interested in making this feature more globally accessible. I was sold that one of ROS2's goals was to stop reimplementing features for each language, and to make language clients as thin as possible, therefore functionality like this should live in rcl or lower.

My question is, where can we start and maintain a focused discussion for this topic today, rather than just saying "we can decide later"? That way we can take this discussion off of this PR, but still know that we have actually begun to focus on the bigger-picture architectural discussion somewhere. Perhaps a new ros2/design PR would be most appropriate? (e.g. "Design for intra-process comms available to all ros2 language clients"?)

ivanpauno commented 5 years ago

@ivanpauno I agree with the above discussion that we should evaluate this PR for what it is (an improvement to the existing scenario of Intra-Process existing only in rclcpp) - but I'm also very interested in making this feature more globally accessible. I was sold that one of ROS2's goals was to stop reimplementing features for each language, and to make language clients as thin as possible, therefore functionality like this should live in rcl or lower.

I fully agree about making the feature globally available. Actually, it's currently available in rclcpp only for publish/subscribe communication, but for example, not for services.

My question is, where can we start and maintain a focused discussion for this topic today, rather than just saying "we can decide later"? That way we can take this discussion off of this PR, but still know that we have actually begun to focus on the bigger-picture architectural discussion somewhere. Perhaps a new ros2/design PR would be most appropriate? (e.g. "Design for intra-process comms available to all ros2 language clients"?)

Maybe a PR/issue here, and post the link in discourse too. If you already have an idea of the design, I think it's better to directly open a PR. If we just want to continue the discussion, I would rather open an issue.

As the topic is complex, if we want to move ahead with it fast, I think that it would be good to organize a meeting for discussing the topic. I would be interested in participate.

alsora commented 5 years ago

Github does not allow me to reply to the last comment https://github.com/ros2/design/pull/239#discussion_r326365860 ... However, I agree with you. When I tested this I also added new APIs for publish and callbacks that do not use pointers.

clalancette commented 5 years ago

@alsora Now that https://github.com/ros2/rclcpp/pull/778 is merged, is there anything we need to update here?

alsora commented 5 years ago

No I think this is fine. On the rclcpp side a follow up PR will be necessary to re introduce the use of intra process manager impl class, but it should not affect the design

clalancette commented 5 years ago

No I think this is fine. On the rclcpp side a follow up PR will be necessary to re introduce the use of intra process manager impl class, but it should not affect the design

All right, thanks. @wjwwood @ivanpauno You two had the earlier comments and still some unresolved conversations. Are you satisfied with this article now, or do we still need to resolve the open conversations?

alsora commented 4 years ago

Should we merge this?

ivanpauno commented 4 years ago

Merged. If there are any further comments, they can be addressed in a follow-up.

ros2 / design

New intra-process communication design #239