ros2 / design

Design documentation for ROS 2.0 effort
http://design.ros2.org/
Apache License 2.0

Reconsider Node to Participant mapping #250

Closed ivanpauno closed 4 years ago

ivanpauno commented 5 years ago

Adding a document analyzing the drawbacks of enforcing a one-to-one mapping between ROS Nodes and DDS Participants and proposing alternative approaches.

ivanpauno commented 5 years ago

We had a meeting with @artivis @hidmic @kyrofa @dirk-thomas @ruffsl @wjwwood and @drastx about how the DDS Security API would look after changing the Node-to-Participant mapping to one Participant per Context.

Here are the meeting notes. Please, correct me if something is wrong or missing.

Items

Proposals

drastx commented 5 years ago

@ivanpauno, my github id is drastx

mikaelarguedas commented 5 years ago

That looks like a great plan!

I have a few questions to help me picture how this will all fit together:

If the context-to-participant mapping becomes one-to-one, multiple participants may coexist in the same process. These may not share the same access control policies BUT do share the address space. It also means that full intra-process communication, which currently occurs on a per-context basis, may be precluded by having multiple participants.

Is it possible that context would span multiple participants in the future? If so a way to set and refer to participant names instead of context name would be valuable.

Introduce the concept of a context in security governance files.

Could you clarify what this would mean in terms of governance file's content?

Context can get a name, and fall back to “_default” or “” if none is provided.

Sorry the notion of context is still unclear to me. Where would that be done? Will the context name be set at the launchfile level or in code?

Then I assume that rcl would retrieve the context name, the corresponding security directory and pass it down to rmw to set up the participant's security plugins, is that right?

As a first step, we could only accept one access control policy file per Context. Combining policies of different nodes could be done in a second step.

This would be a great next step :+1:. As @kyrofa pointed out in the past, the node author is the most knowledgeable person to list all the resources the node needs access to, and offloading to each integrator to figure out the permissions of each node in the system can quickly result in a lot of extra work for each user.

ruffsl commented 5 years ago

Could you clarify what this would mean in terms of governance file's content?

In the notes, I think @ivanpauno is confusing the policy/permissions/governance files that we discussed. We worked towards an idea where we'll add a grouping of multiple profiles by, say, including a <context> tag just above or perhaps below the <profiles> tag, so that during templating multiple profiles from nodes that share the same context will be combined/flattened into a single corresponding permissions/governance file for the one participant.
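For illustration, a minimal sketch of what such a grouping might look like in an SROS2 policy file. The <profiles>/<profile>/<topics> structure follows the existing policy schema; the <context> wrapper, its path attribute, and its placement are exactly the hypothetical part being discussed here:

```xml
<!-- Hypothetical sketch: the <context> grouping and its "path" attribute do not
     exist in the current SROS2 policy schema; they illustrate the grouping idea
     discussed above. The <profiles>/<profile>/<topics> structure is the existing one. -->
<policy version="0.1.0">
  <context path="/my_context">
    <profiles>
      <profile ns="/" node="talker">
        <topics publish="ALLOW">
          <topic>chatter</topic>
        </topics>
      </profile>
      <profile ns="/" node="listener">
        <topics subscribe="ALLOW">
          <topic>chatter</topic>
        </topics>
      </profile>
    </profiles>
  </context>
</policy>
```

During templating, every profile under one context would then be flattened into a single permissions/governance grant for that context's one participant.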

Is it possible that context would span multiple participants in the future? If so a way to set and refer to participant names instead of context name would be valuable.

I haven't formed an opinion on that, but it seems to me that if a context were to span multiple participants, each participant could just load the same permissions/governance. I'm not sure what boundary a context would have outside of the DDS rmw, but I think we'd like to use the process boundary as the line, given the implications of secure memory access.

Sorry the notion of context is still unclear to me. Where would that be done? Will the context name be set at the launchfile level or in code?

I think the launch file level, as that is where the composition of nodes is configured/declared, yes?

Then I assume that rcl would retrieve the context name, the corresponding security directory and pass it down to rmw to set up the participant's security plugins, is that right?

On the money :dollar:. I think we'll have to expand upon the keystore file tree structure, similar to the policy document tree I mentioned with the tags above, so that paths to contexts are resolvable. I'm not sure how that all should be nested, say in conjunction with nodes that are not composed, but we should open a separate issue for these details. Perhaps it makes sense to always declare a context for consistency, but I'm not sure how that should fold in with node name uniqueness.

As a first step, we could only accept one access control policy file per Context. Combining policies of different nodes could be done in a second step.

Again, I think this per-context approach is applicable to access control permissions/governance files, but not to (SROS2) policy files, as we'd likely want a document tree that can reflect the intersection between multiple contexts. As composed node interactions are likely not going to be limited to only nodes of a common context, we likely want to verify the correctness/completeness of such orchestrations.

ivanpauno commented 5 years ago

In the notes, I think @ivanpauno is confusing the policy/permissions/governance files that we discussed. We worked towards an idea where we'll add a grouping of multiple profiles by, say, including a <context> tag just above or perhaps below the <profiles> tag, so that during templating multiple profiles from nodes that share the same context will be combined/flattened into a single corresponding permissions/governance file for the one participant.

Sorry for the error. I've edited the original comment.

Is it possible that context would span multiple participants in the future? If so a way to set and refer to participant names instead of context name would be valuable.

I think we don't want to introduce the concept of Participant to ROS, which is something DDS specific. So each Context will map one-to-one to a Participant.

Sorry the notion of context is still unclear to me. Where would that be done? Will the context name be set at the launchfile level or in code?

IMO, in code and with the possibility of remapping it. Usually, you don't need to explicitly create a Context. In that case, the default name would be used, and you could remap it from the launch file (or when executing it from the command line).
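For illustration, a minimal rclpy sketch of implicit vs. explicit contexts as they exist today; the idea of a context name with a "_default" fallback and launch/command-line remapping is the proposal here, not an existing API:

```python
# Minimal sketch (rclpy): implicit vs. explicit contexts as they exist today.
# Context names, the "_default" fallback, and remapping from a launch file or
# the command line are part of the proposal, not an existing rclpy API.
import rclpy
from rclpy.context import Context

# Implicit: init() uses the global default context. Under the proposal this is
# the context that would get the default name unless remapped at launch time.
rclpy.init()
node_a = rclpy.create_node('node_a')

# Explicit: a separately created context. Under the proposal it would map to its
# own DDS participant and could carry its own (remappable) name.
other_context = Context()
rclpy.init(context=other_context)
node_b = rclpy.create_node('node_b', context=other_context)

node_a.destroy_node()
node_b.destroy_node()
rclpy.shutdown()
rclpy.shutdown(context=other_context)
```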

Again, I think this per-context approach is applicable to access control permissions/governance files, but not to (SROS2) policy files, as we'd likely want a document tree that can reflect the intersection between multiple contexts. As composed node interactions are likely not going to be limited to only nodes of a common context, we likely want to verify the correctness/completeness of such orchestrations.

Explicitly or implicitly constructed, a Context is always there. With the ability to remap the Context name (regardless of whether it was implicitly or explicitly constructed), I think the orchestration can be verified.

mikaelarguedas commented 5 years ago

Great, thanks for the insightful answers.

I didn't mean to bring implementation details here, I just wanted to make sure there will be an easy way at the launchfile level to remap and group nodes under a specific context name, to be able to set permissions without having to touch/compile any code. And that seems to be the case :+1:

ivanpauno commented 5 years ago

Updated the document with details about the layer in which each change will be implemented.

ivanpauno commented 5 years ago

@nburek this proposal switches from one Participant per Node to one Participant per Context. That would directly affect the LIVELINESS_MANUAL_BY_NODE liveliness policy. Do you think it is reasonable to change it to LIVELINESS_MANUAL_BY_CONTEXT, or should we implement LIVELINESS_MANUAL_BY_NODE on top of DDS?

nburek commented 5 years ago

@ivanpauno Good question. In my opinion, manual_by_topic is the more useful of the policies anyway, so I don't have a strong opinion on it remaining manual_by_node or being changed to manual_by_context.

Unless you remove the guarantees about subscribers only being able to request a liveliness QoS as granular as the publisher offers then my gut tells me that it wouldn't be totally trivial to implement manual_by_node on top of DDS without some other metadata being passed along when establishing connections. So it would definitely be a lot more work right now to support it.

The only problem I could foresee with manual_by_context is that if node composition causes components to use the same context (I'm not actually sure if it does or not) then it becomes a little weird when multiple components are asserting that liveliness. It might be unfortunate to lose that level of granularity offered by manual_by_node as opposed to the full context, and it would be a slight API break, but I don't see it being a one-way door. The manual_by_node could always be added back in the future if people thought there was value in it.

The short answer: if you're trying to reduce the amount of changes needed to move to multiple nodes per participant, then I think it's reasonable to switch to manual_by_context to save a lot of work, as long as there aren't major concerns with what that could mean for components used in composition.

wjwwood commented 5 years ago

Do you think it is reasonable to change it to LIVELINESS_MANUAL_BY_CONTEXT

I don't think this is particularly useful, since you want to know if another node is available regardless of the composition of the system. I'm trying to think of a use case that might be served with context instead of node and I'm having a hard time. Maybe if you want to know if a process is still "alive" but even then you could have more than one context per process, so... I don't know if that's useful.

Unless you remove the guarantees about subscribers only being able to request a liveliness QoS as granular as the publisher offers then my gut tells me that it wouldn't be totally trivial to implement manual_by_node on top of DDS without some other metadata being passed along when establishing connections. So it would definitely be a lot more work right now to support it.

Can you expand on this? I don't remember what guarantee you're speaking about.

The only problem I could foresee with manual_by_context is that if node composition causes components to use the same context (I'm not actually sure if it does or not) then it becomes a little weird when multiple components are asserting that liveliness.

It may or may not put them in the same context, but I think the most common thing will be to put them in the same context in order to take advantage of intra-process communication and/or to use fewer participants.
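To make that composition case concrete, here is a minimal rclpy sketch of manually composing two nodes in one process; both use the same default context, so under this proposal they would sit behind a single DDS participant (class and node names are made up for illustration):

```python
# Minimal sketch (rclpy): two nodes composed in one process, sharing one context.
# Under the proposed mapping they would share a single DDS participant instead of
# creating one participant each. Class/node names here are only for illustration.
import rclpy
from rclpy.executors import SingleThreadedExecutor
from rclpy.node import Node


class Talker(Node):
    def __init__(self):
        super().__init__('talker')


class Listener(Node):
    def __init__(self):
        super().__init__('listener')


def main():
    rclpy.init()  # both nodes below share this default context
    talker = Talker()
    listener = Listener()

    executor = SingleThreadedExecutor()
    executor.add_node(talker)
    executor.add_node(listener)
    try:
        executor.spin()
    finally:
        executor.shutdown()
        talker.destroy_node()
        listener.destroy_node()
        rclpy.shutdown()


if __name__ == '__main__':
    main()
```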


Another option is to just remove node liveliness and only provide topic liveliness.

nburek commented 5 years ago

Do you think it is reasonable to change it to LIVELINESS_MANUAL_BY_CONTEXT

I don't think this is particularly useful, since you want to know if another node is available regardless of the composition of the system. I'm trying to think of a use case that might be served with context instead of node and I'm having a hard time. Maybe if you want to know if a process is still "alive" but even then you could have more than one context per process, so... I don't know if that's useful.

In the case where you do have a single context for your process, it can be used to provide a signal that you haven't deadlocked your executor, which wouldn't be caught by the AUTOMATIC level, since that could be handled by a thread in the rmw layer that sends the liveliness check. Your timer that asserts the liveliness can also check on the health of other components of the system before asserting it. However, these break down if you combine multiple systems in a single context that are each trying to use that level of assertion, because you don't know which is healthy, only that something in the context is alive.
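As a rough illustration of that manual-assertion pattern, here is a sketch using per-topic liveliness (the level that exists regardless of this proposal) rather than the node/context level under discussion; the QoS enum and method names assume rclpy's QoS API, and everything_ok() is a made-up placeholder:

```python
# Sketch: a publisher offering MANUAL_BY_TOPIC liveliness that only asserts it
# from a timer once an application-level health check passes. The enum/method
# names assume rclpy's QoS API; everything_ok() is a hypothetical placeholder.
import rclpy
from rclpy.duration import Duration
from rclpy.node import Node
from rclpy.qos import LivelinessPolicy, QoSProfile
from std_msgs.msg import String


class HealthyTalker(Node):
    def __init__(self):
        super().__init__('healthy_talker')
        qos = QoSProfile(
            depth=10,
            liveliness=LivelinessPolicy.MANUAL_BY_TOPIC,
            liveliness_lease_duration=Duration(seconds=2),
        )
        self.pub = self.create_publisher(String, 'chatter', qos)
        # Assert liveliness well inside the 2 s lease, but only while healthy.
        self.timer = self.create_timer(0.5, self.heartbeat)

    def everything_ok(self) -> bool:
        # Hypothetical application-level health check (queues, worker threads, ...).
        return True

    def heartbeat(self):
        if self.everything_ok():
            self.pub.assert_liveliness()


def main():
    rclpy.init()
    node = HealthyTalker()
    try:
        rclpy.spin(node)
    finally:
        node.destroy_node()
        rclpy.shutdown()


if __name__ == '__main__':
    main()
```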

Unless you remove the guarantees about subscribers only being able to request a liveliness QoS as granular as the publisher offers then my gut tells me that it wouldn't be totally trivial to implement manual_by_node on top of DDS without some other metadata being passed along when establishing connections. So it would definitely be a lot more work right now to support it.

Can you expand on this? I don't remember what guarantee you're speaking about.

If you look at the design doc for liveliness (https://github.com/ros2/design/blob/gh-pages/articles/qos_deadline_liveliness_lifespan.md) it states that:

In order for a Subscriber to listen to a Publisher's Topic the level of liveliness tracking they request must be equal or less verbose than the level of tracking provided by the Publisher and the time until considered not alive set by the Subscriber must be greater than the time set by the Publisher.

Right now we just rely on DDS to provide this check.

Another option is to just remove node liveliness and only provide topic liveliness.

Yes, that's another option. If we don't think that the liveliness assertion at the context level provides enough meaningful value then it is probably the better option to just remove it until we have an actual need for it.

ivanpauno commented 5 years ago

Maybe if you want to know if a process is still "alive" but even then you could have more than one context per process, so... I don't know if that's useful.

I thought of it as a "process" keep-alive. In DDS it works in a similar way. The recommendation is one Participant per process, but you may have more than one participant per process, so it's a similar situation.

The only problem I could foresee with manual_by_context is that if node composition causes components to use the same context

Combining the potential manual_by_context and composition (in the same context) doesn't sound like a good idea.


The short answer: if you're trying to reduce the amount of changes needed to move to multiple nodes per participant, then I think it's reasonable to switch to manual_by_context to save a lot of work, as long as there aren't major concerns with what that could mean for components used in composition.

Another option is to just remove node liveliness and only provide topic liveliness.

So yes, I basically want to reduce the amount of changes. Considering that the API is new and not really common or useful, I will go ahead with one of these two options: removing it or moving it from node to context. I slightly prefer removing it, because I don't think manual by context will have many use cases (and it will probably cause confusion).

eboasson commented 4 years ago

This subject was brought to my attention and I really do feel I must provide my view as well, in particular because (in my view, but backed up by a hard fact) the premise underlying the proposal is incorrect. To wit, although in “What is a Domain Participant”:

A Domain Participant is a type of DDS entity. Participants also group other entities, like Publishers, Subscribers, Data Writers, Data Readers, etc. But participants do more than that:

  • Each Participant does discovery on its own. Creating more than one Participant increases CPU usage and network I/O load.
  • Each Participant keeps track of other Domain Participants and DDS entities. Using more than one will duplicate that data.
  • Each Participant may create multiple threads for event handling, discovery, etc. The number of threads created per participant depends on the DDS vendor.

For those reasons, Participants are heavyweight.

none of the points raised is strictly incorrect, in reality this all depends very much on choices of DDS implementors rather than on requirements imposed by the DDS specification. The “hard fact” will be no surprise: Eclipse Cyclone DDS and ADLINK’s Vortex OpenSplice do not suffer from these points to anything remotely approximating the degree to which many other implementations do. (I am not familiar with the inner workings of all DDS implementations, so I don’t know whether these are the only two or not.)

Furthermore, the maximum number of Domain participants is rather small. For example, RTI connext is limited to 120 participants per domain.

Again, this is not actually imposed by the specification, it rather follows from the default mapping to UDP port numbers. There is no strict requirement that each participant has its own unique port for unicast traffic, let alone a need for those to be “well-known” because the port numbers are included in the participant discovery data, so multicast participant discovery eliminates most limitations quite naturally.

The DDSI specification merely suggests that each participant be reachable at a unique unicast address but actually neglects to define whether that is a unicast address at networking level or a unique identifier for the participant. That leaves room for having multiple participants sharing an address. As the protocol does define a way of including the GUID of a destination in full, it allows such multiplexing. DDS implementations appear to have converged on always including full addressing information, so you really need only a fixed number of addresses per process, regardless of the number of participants.

And as an extreme example: when using OpenSplice with shared memory all network traffic is funnelled through a single process with a single instance of the DDSI stack. Even if you have thousands of participants in a single machine, it can still run using only a fixed set of 4 UDP ports and a handful of threads. Discovery takes place between nodes rather than between participants and so having 1 or 10 participants in a node only affects the number of samples describing participants (to a first-order approximation).

Cyclone DDS (and OpenSplice without shared memory) similarly shares its DDSI stack among all participants in a process and performs discovery once per process (within a DDS domain anyway), thereby giving a pretty significant reduction in overhead for processes with many participants/nodes.

The downsides of this proposal are spelled out quite correctly in the proposal itself. There is additional complexity for introspection/discovery and security, as well as rework of all DDS-based RMW implementations. I’m not much bothered by the increase in resource usage by adding one’s own discovery data, that should be a relatively small amount.

These downsides are significant enough that I would consider this proposal a workaround: the root cause is the profligate use of resources of most DDS implementations. Given the existence of an (open, free) DDS implementation that is a bit more frugal, there is an alternative workaround involving far less effort: leave the mapping unchanged and provide advice on configuring ROS2 for minimal resource usage in cases where there are many nodes in a single process.

ivanpauno commented 4 years ago

Thanks for the comment @eboasson.

These downsides are significant enough that I would consider this proposal a workaround: the root cause is the profligate use of resources of most DDS implementations. Given the existence of an (open, free) DDS implementation that is a bit more frugal, there is an alternative workaround involving far less effort: leave the mapping unchanged and provide advice on configuring ROS2 for minimal resource usage in cases where there are many nodes in a single process.

I agree that this is in fact a workaround for the bad performance of creating multiple domain participants in many DDS implementations. I didn't know that those issues weren't a result of the standard itself, but just a result of implementation details of some DDS vendors. AFAIK, the issue is not with just one DDS vendor, but with quite a few of them.

I still consider the workaround "good", considering the fact that it will improve performance in many of the current implementations. IMO, it's also conceptually fine to decouple the ROS node from the middleware node (it might also be a good thing to do in other rmw implementations not based on DDS).

Regarding whether it's valuable to also apply the change in OpenSplice and CycloneDDS, I'm not sure. If we don't apply the changes in OpenSplice and CycloneDDS, cross-vendor support will be broken (as the discovery mechanisms will be different). I'm personally ok with breaking cross-vendor support, though I don't know if there is a desire to maintain it.

leave the mapping unchanged and provide advice on configuring ROS2 for minimal resource usage in cases where there are many nodes in a single process.

Can you share more details? What has to be configured in CycloneDDS in order to use minimal resources when many nodes are created in the same process?

eboasson commented 4 years ago

Regarding whether it's valuable to also apply the change in OpenSplice and CycloneDDS, I'm not sure. If we don't apply the changes in OpenSplice and CycloneDDS, cross-vendor support will be broken (as the discovery mechanisms will be different). I'm personally ok with breaking cross-vendor support, though I don't know if there is a desire to maintain it.

As things stand now, cross-vendor support seems to see very little use in the ROS community today, and if what I have seen in the DDS world translates to the robotics community, it is unlikely that it will matter to many ROS2-based systems in the future. So you could argue that throwing away the interoperability is acceptable.

I would however be careful in doing that: the ecosystem is building up quite nicely and I dare say one of the key reasons for that is that components interact nicely, and that it matters to everyone that they do. Fragmenting the DDS support by having two incompatible RMW implementations because of something that is but a minor detail for most users (today anyway) goes contrary to that spirit and might well do more harm to the ecosystem than it would do good by saving some time.

In my view the question is therefore whether the performance improvement when using Connext or FastRTPS is sufficient benefit for making the effort on all DDS implementations; or, conversely, whether the group of people that would see a meaningful improvement from it would be happy to use Cyclone (or possibly OpenSplice) instead.

leave the mapping unchanged and provide advice on configuring ROS2 for minimal resource usage in cases where there are many nodes in a single process.

Can you share more details? What has to be configured in CycloneDDS in order to use minimal resources when many nodes are created in the same process?

Sure — all you need to do is use it! 😄 The defaults of Cyclone work fine for this, it's OpenSplice with shared memory where the configurable bits become interesting.

ivanpauno commented 4 years ago

I would however be careful in doing that: the ecosystem is building up quite nicely and I dare say one of the key reasons for that is that components interact nicely, and that it matters to everyone that they do. Fragmenting the DDS support by having two incompatible RMW implementations because of something that is but a minor detail for most users (today anyway) goes contrary to that spirit and might well do more harm to the ecosystem than it would do good by saving some time.

I don't think "it's a minor detail for most users" (more on that later). Still, applying the changes in all the DDS-based rmw implementations sounds better than in just a few. The hard part is really getting the first one working; migrating the others should be more straightforward.

In my view the question is therefore whether the performance improvement when using Connext or FastRTPS is sufficient benefit for making the effort on all DDS implementations; or, conversely, whether the group of people that would see a meaningful improvement from it would be happy to use Cyclone (or possibly OpenSplice) instead.

I don't think it's just a group of people. Most real-world ROS 2 applications will be composed of many nodes, and many of them can probably be executed in the same process (20 nodes in the same process is not crazy). The performance improvement in those cases is really huge. Limiting good performance to some particular implementations doesn't sound like a good option (there are many reasons to choose one DDS vendor or another, e.g. AFAIK Cyclone doesn't support security or the liveliness/deadline QoS policies).


Sure — all you need to do is use it! smile The defaults of Cyclone work fine for this, it's OpenSplice with shared memory where the configurable bits become interesting.

Sounds great!


To mention other DDS vendor features that might also solve the performance issue, Fast-RTPS has a discovery server, which probably also provides good performance when creating many Participants in the same process. I haven't tested it, but it sounds like it could address the problem.

wjwwood commented 4 years ago

Furthermore, the maximum number of Domain participants is rather small. For example, RTI connext is limited to 120 participants per domain.

Again, this is not actually imposed by the specification, it rather follows from the default mapping to UDP port numbers. There is no strict requirement that each participant has its own unique port for unicast traffic, let alone a need for those to be “well-known” because the port numbers are included in the participant discovery data, so multicast participant discovery eliminates most limitations quite naturally.

Can this restriction be lifted (or the cap significantly increased) when using cycloneDDS and still maintain cross-vendor communication?

If not, that seems to be a big advantage of the approach proposed in this pull request. ROS 1 systems have frequently been known to reach and exceed 120 nodes.

This is kind of a killer feature that we need. Without it all the other considerations (e.g. "we don't really need this we just need more efficient DDS implementations") are kind of moot in my opinion.

Given the existence of an (open, free) DDS implementation that is a bit more frugal, there is an alternative workaround involving far less effort: leave the mapping unchanged and provide advice on configuring ROS2 for minimal resource usage in cases where there are many nodes in a single process.

Unfortunately, our default implementation (Fast-RTPS) also has this scaling issue (as far as I understand) and so long as that's the case we need to improve the performance for this use case (many nodes). I'm not aware of a satisfactory suggestion we can give our users that will address this, and I don't consider "re-architect your software to use fewer nodes" to be acceptable.


I'm not yet convinced we should abandon this proposal, even though DDS vendors "may" be more efficient (i.e. the standard doesn't tie their hands), because in practice this change would help many of the vendors and would even help overhead with cycloneDDS (perhaps to a lesser degree).

The argument which seems to essentially boil down to "well just use cycloneDDS" is fine, but there will always be (for the foreseeable future, imo) users that want to use different implementations, e.g. a vendor that supports security like Fast-RTPS, a vendor that provides certification assistance like Connext micro, or a non-DDSI implementation, and I think they would benefit from this change.

I would be interested to see evidence that this change is more expensive than the current approach for implementations like cycloneDDS (i.e. evidence that it is a step backwards).

ivanpauno commented 4 years ago

Updated the document based on comments.

We recently had a new meeting with @hidmic @ruffsl @kyrofa @mikaelarguedas @wjwwood. Please check the updated security section of the document. Planned changes:

I estimate 2 weeks for the changes I have to do. @ruffsl any estimation?

joespeed commented 4 years ago

per @eboasson

  1. If you only need it in a single machine then defer making this change
  2. If 1 process, single computer then cyclonedds already does this use case. Participants are inexpensive in cyclonedds architecture.
  3. We know how to make cyclonedds support 100, 200, 500 DDS participants within a single process and still use the network for discovery and have everything still work without making ANY change to ROS 2.
  4. DDSI discovery is expensive because of the interoperability mandate. Can tweak discovery a bit for this use case if needed. ROS would work as-is without any changes. Would need to be a bit creative with the DDSI protocol, which can cause interoperability challenges.

ivanpauno commented 4 years ago

If you only need it in a single machine then defer making this change

That is a particular use case.

If 1 process, single computer then cyclonedds already does this use case. Participants are inexpensive in cyclonedds architecture.

That's not the case for many other DDS vendors.

We know how to make cyclonedds support 100, 200, 500 DDS participants within a single process and still use the network for discovery and have everything still work without making ANY change to ROS 2.

Again, this doesn't seem to apply to many other DDS vendors.

DDSI discovery is expensive because of the interoperability mandate. Can tweak discovery a bit for this use case if needed. ROS would work as-is without any changes. Would need to be a bit creative with the DDSI protocol, which can cause interoperability challenges.

AFAIU, that tweak would be CycloneDDS specific, and probably break interoperability.


Please, limit the discussion only to solutions that comply with the following requirements:

If most DDS vendors include in the future a standard solution which improves performance without the need for the proposed workaround in the rmw implementations, and without breaking interoperability (i.e. the solution is part of the DDS standard and the interoperability standard), that would be great. In that case, we could step back and use a one-to-one node-participant mapping again. But I don't see that happening in the future; "use one Participant per process" is a very widespread recommendation among different DDS vendors.

ros-discourse commented 4 years ago

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/improving-ros-2-cpu-by-simplifying-the-ros-2-layers/13808/1

dirk-thomas commented 4 years ago

@ivanpauno Can this be closed? (If there is anything pending it might be better to summarize that in a new ticket.)

ros-discourse commented 4 years ago

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-foxy-fitzroy-released/14495/1

ivanpauno commented 4 years ago

@ivanpauno Can this be closed? (If there is anything pending it might be better to summarize that in a new ticket.)

Closed or merged? I was planning to do some modifications, as the current description isn't super accurate.

If there's a desire to have a rendered version of this documentation together with the release, feel free to merge and I will follow up in another PR.

dirk-thomas commented 4 years ago

Closed or merged?

Certainly merged.

I was planning to do some modifications, as the current description isn't super accurate.

:+1:

If there's a desire to have a rendered version of this documentation together with the release, feel free to merge and I will follow up in another PR.

Just depends on the time frame. It can wait until next week. If it takes you longer please feel free to merge this and follow up with a separate PR.

ivanpauno commented 4 years ago

@hidmic I've cleaned up the document, can you take another look?

ivanpauno commented 4 years ago

@hidmic PTAL, I think I've addressed all your comments.

ivanpauno commented 4 years ago

@hidmic friendly ping

ivanpauno commented 4 years ago

@hidmic all comments have been addressed, PTAL