open-rmf / rmf_ros2

Internal ROS infrastructure for RMF
Apache License 2.0

Generic Monitor Node and Support Libraries #125

Open arjo129 opened 3 years ago

arjo129 commented 3 years ago

Feature request

Description

Currently we have a number of issues that require fail-over capabilities within nodes. Here are a few:

The schedule node already has very strong fail-over support through the use of a specialized monitor node, which makes it a good starting point for this new feature. There is also the stubborn_buddies library, which enables basic fail-over support through link-time composition.
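To make the monitor-node idea concrete, here is a minimal sketch of heartbeat-based fail-over detection in plain Python. It is not the schedule node's actual monitor and not the stubborn_buddies API. The `HeartbeatMonitor` class, the `timeout_s` parameter, and the `on_failover` hook are hypothetical names chosen for illustration, and timestamps are passed in explicitly so the logic can be run without ROS.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat per node and triggers fail-over when one is overdue.

    All names here are hypothetical; this is an illustration, not an RMF API.
    """

    timeout_s: float                    # assumed liveliness deadline, in seconds
    on_failover: Callable[[str], None]  # assumed hook that starts a replacement node
    _last_seen: Dict[str, float] = field(default_factory=dict)
    _failed: Set[str] = field(default_factory=set)

    def heartbeat(self, node: str, now: float) -> None:
        """Record that `node` was alive at time `now` and clear any failed flag."""
        self._last_seen[node] = now
        self._failed.discard(node)

    def check(self, now: float) -> List[str]:
        """Trigger fail-over once for each node whose heartbeat is overdue."""
        newly_failed = [
            node
            for node, seen in self._last_seen.items()
            if now - seen > self.timeout_s and node not in self._failed
        ]
        for node in newly_failed:
            self._failed.add(node)
            self.on_failover(node)
        return newly_failed


# Usage: record replacements in a list instead of launching real nodes.
started = []
monitor = HeartbeatMonitor(timeout_s=2.0, on_failover=started.append)
monitor.heartbeat("schedule_node", now=0.0)
monitor.check(now=1.0)  # within the deadline, nothing happens
monitor.check(now=3.5)  # overdue, so on_failover("schedule_node") is called once
```

A generic monitor along these lines only detects that a node has died. Restoring that node's internal state in the replacement is a separate problem, and likely the part that differs most between node types.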

Implementation considerations

These are just some quick mental notes about this feature:

mxgrey commented 3 years ago

Another con is that every ros2 topic requires us to use a different Message type.

Is this sentence backwards? Every message type needs to be on its own topic, but we don't need each topic to use a different message type.

arjo129 commented 3 years ago

Hmm, I think I meant "every node may have a different internal state, leading to a proliferation of different message types."

mxgrey commented 3 years ago

Personally I'd like us to finish ad hoc fail-over implementations for at least three very different kinds of nodes before we start worrying too much about how to encapsulate all fail-over into one implementation. I have a pretty strong feeling that the needs for doing efficient and reliable fail-over will be so different between the different types of nodes that trying to encapsulate it all into one implementation may actually add complexity overall with little tangible benefit.

These questions are certainly good to keep in mind as we go forward, but I would avoid putting this goal on the roadmap until we have some more concrete fail-over implementations to account for besides just the schedule node.