Multi-Machine Launching

mlanting commented 5 years ago

This is an initial draft of a design document outlining our plans for adding multi-machine launching capability to ros launch for ROS2. We'd like to put it out early in the design process so we can take community feedback into account as we move forward.

Distribution Statement A; OPSEC #2893

ruffsl commented 5 years ago

Are there any lessons or infrastructure we'd like to integrate/piggyback with? A long while ago we had a similar idea of integrating orchestration software to manage roslaunch for ROS1: https://github.com/ros/ros_comm/issues/643

Rather than reinventing the wheel, perhaps we could open up an interface that makes it simpler to plug roslaunch into Kubernetes, Swarm, Nomad, Docker, or other orchestration models outside containers?

ivanpauno commented 5 years ago

I agree that it would be good to build this on top of an existing orchestration tool.

It's not clear to me from the document whether you would be able to run processes on different machines from the same launch file. That was possible in ROS 1 using the machine tag. There has been some discussion here on how to refactor the current ExecuteProcess action to make this easier.
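
For readers unfamiliar with the ROS 1 machine tag mentioned above, here is a minimal sketch of the idea: each node may name a machine, and the launcher resolves that name to an address. This only parses the XML with the Python standard library; it is not how roslaunch itself is implemented. The host names, package name, and the helper `nodes_by_address` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# ROS 1-style launch XML. Host, package, and node names are hypothetical.
LAUNCH_XML = """
<launch>
  <machine name="base" address="base.local" user="robot" env-loader="/opt/ros/noetic/env.sh"/>
  <machine name="arm" address="arm.local" user="robot" env-loader="/opt/ros/noetic/env.sh"/>
  <node machine="base" pkg="demo_pkg" type="driver" name="driver"/>
  <node machine="arm" pkg="demo_pkg" type="planner" name="planner"/>
  <node pkg="demo_pkg" type="monitor" name="monitor"/>
</launch>
"""


def nodes_by_address(xml_text, default_address="localhost"):
    """Group node names by the address of the machine they are assigned to."""
    root = ET.fromstring(xml_text)
    # Map each declared machine name to its network address.
    machines = {m.get("name"): m.get("address") for m in root.iter("machine")}
    placement = {}
    for node in root.iter("node"):
        machine = node.get("machine")
        # Nodes without a machine attribute stay on the launching host.
        address = machines[machine] if machine else default_address
        placement.setdefault(address, []).append(node.get("name"))
    return placement


placement = nodes_by_address(LAUNCH_XML)
```

In this sketch a single launch file places `driver` and `planner` on two different hosts, while `monitor` stays local, which is the capability the question above is asking about.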

piraka9011 commented 5 years ago

Thanks for the contribution @mlanting

Here are my suggestions:

Let's get this into an outline like the other docs. Suggested outline below:
- Preface/Background
- Goals
  - In Scope
  - Out of scope
- Features/Capabilities
- Proposed Approach
  - Implementation
  - Risks/Issues
- Alternatives? (Docker, Kubernetes...)
Is there a reason you aren't proposing extending the previous launch verb? Wouldn't it be better to extend ros2launch with the ability to read a list of package_name/launch_files and pass the attach/detach args if needed:

$ ros2 launch [package_name [launch_file_name] [launch_arguments]]

That, or specifying a YAML configuration file with a system tag that specifies where the system should run:

# Command
$ ros2 launch --yaml-file <filename>
# YAML File
launch:
  package: <pkg>
  node: <node_name>
  arg1: [1, 2, 3]
  condition1: ['a', 'b', 'c']
  system: my_robot.local
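
One way to picture how a launcher might act on such a system tag is sketched below. It assumes the YAML has already been parsed into a Python dict, and that the file holds a list of entries rather than a single one; both are assumptions made for illustration, as are the package and node names and the `build_commands` helper. The `--yaml-file` option above is part of the proposal, not an existing flag. The sketch only builds command lines and does not run anything.

```python
import shlex

# Hypothetical parsed form of the proposed YAML file (a list of entries).
config = {
    "launch": [
        {"package": "demo_pkg", "node": "talker", "system": "my_robot.local"},
        {"package": "demo_pkg", "node": "listener", "system": "localhost"},
    ]
}


def build_commands(cfg):
    """Return one argv list per entry, wrapping remote entries in ssh."""
    commands = []
    for entry in cfg["launch"]:
        run = ["ros2", "run", entry["package"], entry["node"]]
        if entry.get("system", "localhost") == "localhost":
            # Local entries run directly on the launching host.
            commands.append(run)
        else:
            # Remote entries are passed to ssh as a single quoted command string.
            commands.append(["ssh", entry["system"], shlex.join(run)])
    return commands


commands = build_commands(config)
```

A real implementation would also have to set up the ROS environment on the remote host before calling `ros2 run`, which this sketch leaves out.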

We are releasing a Docker plugin for launch (as suggested by @ruffsl) for the ROSCon security workshop. This allows you to run nodes/launch files in Docker containers and specify Docker arguments accordingly. Do you think that would support your ability to orchestrate nodes across systems and simplify the design?

pjreed commented 5 years ago

Providing the ability to work on top of an existing orchestration tool could be very useful, but I'm hesitant about requiring it. Adding something like docker or kubernetes is a very significant dependency that I know some people won't want to be required as part of their ROS installation, and they may not be feasible for some low-resource platforms or available for some architectures.

In ROS1 systems, I've done multi-machine launching in docker environments through abuse of the env-loader attribute on machines, but it would be very neat to have a formalized interface for integrating different orchestration systems.
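
As a rough illustration of what a formalized interface could look like, the sketch below defines an abstract backend with optional implementations, so that Docker is one pluggable choice rather than a required dependency. Every class and function name here is hypothetical and none of it is part of launch; the image name is a placeholder. The backends only construct command lines.

```python
import shlex
from abc import ABC, abstractmethod


class LaunchBackend(ABC):
    """Hypothetical interface: turn a process argv into a command for a host."""

    @abstractmethod
    def command_for(self, host, argv):
        ...


class LocalBackend(LaunchBackend):
    def command_for(self, host, argv):
        # No orchestration tool needed: run the process as-is.
        return list(argv)


class SshBackend(LaunchBackend):
    def command_for(self, host, argv):
        return ["ssh", host, shlex.join(argv)]


class DockerBackend(LaunchBackend):
    def __init__(self, image):
        self.image = image  # placeholder image name supplied by the caller

    def command_for(self, host, argv):
        # Talks to the remote Docker daemon over ssh.
        return ["docker", "-H", f"ssh://{host}", "run", "--rm", self.image, *argv]


# Backends are looked up by name, so a minimal install can omit Docker entirely.
BACKENDS = {"local": LocalBackend(), "ssh": SshBackend()}


def plan(backend_name, host, argv):
    return BACKENDS[backend_name].command_for(host, argv)


ssh_cmd = plan("ssh", "my_robot.local", ["ros2", "run", "demo_pkg", "talker"])
```

Registering `DockerBackend("my_ros_image")` under another key would add container support without changing `plan` or the other backends.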

piraka9011 commented 5 years ago

> Providing the ability to work on top of an existing orchestration tool could be very useful, but I'm hesitant about requiring it. ...

+1, I'm totally on board with this, just wanted to bring up the fact that there may be similar implementations that we can leverage so we don't reinvent the wheel.

As before, I suggest writing this out in the design doc as a background/justification as it explains why such a tool is needed.

mlanting commented 5 years ago

I tried pushing an update a couple days ago to my fork, but it doesn't seem to be coming through. Possibly because the roslaunch branch was merged into gh-pages and then removed a few days after I originally submitted this PR?

ivanpauno commented 5 years ago

> I tried pushing an update a couple days ago to my fork, but it doesn't seem to be coming through. Possibly because the roslaunch branch was merged into gh-pages and then removed a few days after I originally submitted this PR?

I'm not sure, but that's probably the case. Consider rebasing onto master and re-targeting the PR to it.

ros-discourse commented 4 years ago

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2020-07-16/15468/1

mlanting commented 4 years ago

We've created a new version of the design document with some much more concrete ideas to discuss, but since the document has changed entirely I figured it'd be more appropriate to create a new PR: https://github.com/ros2/design/pull/297

ros2 / design

Multi-Machine Launching #255