Add generalized interface generation architecture proposal

hidmic commented 3 years ago

First step towards addressing https://github.com/ros2/rosidl/issues/560 (might be worth going through the discussion there before reviewing). This is a first draft proposal to get some eyes on it. It still lacks the CPS schema extension proposal.

hidmic commented 3 years ago

@azeey @sloretz @IanTheEngineer FYI

hidmic commented 3 years ago

@dirk-thomas if you happen to have a bit of free time (if that even exists :sweat_smile:), would you take a look and share your thoughts? I'd very much appreciate it.

koonpeng commented 3 years ago

Should rosidl_build_configure generate the build system files directly instead of the common build specification (CPS) files?

The rationale is that node-gyp, for example, contains many platform-specific and other conditional configurations. Correctly translating all of those conditionals into a CPS does not look like a simple job, not to mention the effort needed to maintain it so that it correctly tracks new configurations node-gyp adds in the future.

If we assume that language bindings are strongly tied to the build system (this seems to be the case for most languages. cython, nodejs, cgo all use specialized build tools afaik), it makes sense for rosidl_build_configure to generate files specific to a language + build system. It makes writing plugins much simpler as you don't have to reverse engineer all the configurations these specialized tools implicitly provide. The downside is that each plugin will only be tailored for a specific language + build system combination.

ivanpauno commented 3 years ago

I second @koonpeng, I think that introducing a common build specification (cbs) makes everything more complex with little extra value. And without introducing that "cbs", I don't see much value in splitting the configuration step from the build step, I would only have one step.

IMO having a non-cmake plugin mechanism would be a huge improvement, but I think it's ok if some plugins continue generating cmake (or files for whatever build system is convenient for that language).

sloretz commented 3 years ago

If we assume that language bindings are strongly tied to the build system (this seems to be the case for most languages. cython, nodejs, cgo all use specialized build tools afaik), it makes sense for rosidl_build_configure to generate files specific to a language + build system

@koonpeng It sounds like that assumption means one language one build system, but I think one of the goals is to enable generation by any build system, like generating C++ bindings with either CMake or with Bazel.

hidmic commented 3 years ago

The rationale is that node-gyp, for example, contains many platform-specific and other conditional configurations

Would you mind sharing an example? I'd like to understand where the difficulties show up.

The downside is that each plugin will only be tailored for a specific language + build system combination.

That's precisely the problem that this is trying to address. It is definitely simpler for generators to be coupled with a build system, but then you're forced to port or extend them [1]. As of today, we have three (3) type representation generators and eight (8) type support generators in the core alone.

[1] As I see it, if a given generator allows extension to support other build systems, then some form of intermediate representation must necessarily exist. If we can standardize that representation, we get massive port effort reduction.

I don't see much value in splitting the configuration step from the build step, I would only have one step.

I disagree. Building interfaces outside the package build system requires additional logic to ensure proper incremental builds, to use generated artifacts (e.g. shared libraries) within the package build, and to land them along with their metadata (e.g. CMake exported targets) in the package install space.

koonpeng commented 3 years ago

Would you mind sharing an example? I'd like to understand where the difficulties show up.

I did some research on the build systems for some language bindings.

nodejs: uses node-gyp, lots of conditionals https://github.com/nodejs/node-gyp/blob/master/addon.gypi.
electron: uses node-gyp with special options. https://www.electronjs.org/docs/tutorial/using-native-node-modules#manually-building-for-electron. Arguments like --target and --dist-url might not be possible to represent in CPS.
cpython: https://github.com/python/cpython/tree/master/Lib/distutils/, a lot of platform-specific logic, may also insert implicit flags.
cgo: no build system files, compiler flags are defined as comments for each source file and built by the go compiler. The go compiler probably adds implicit flags as well.

I'm sure it can work even if we don't fully follow the logic of each specialized tool but then we won't be doing 1:1 builds.

It sounds like that assumption means one language one build system, but I think one of the goals is to enable generation by any build system, like generating C++ bindings with either CMake or with Bazel.

We can still have multiple build systems for each language by having multiple plugins. I'm not sure there is much value in the flexibility to generate files for any language binding with any build system. We rarely want to build nodejs bindings with maven, for example.

would it be feasible to generate multiple CPS files that can be selected from based on those conditions, or are there too many conditionals?

the CPS file is not intended to cover all possible configurations of a package; rather, it is meant to be generated by the build system and to describe the artifacts of one or more extant configurations for a single architecture.

I'm inclined to say that there are too many conditions. From what I see, it is not just platform-specific conditions. There are often conditions like the version of the interpreter, the existence of certain libraries, the location of libraries, libraries specified by name or relative path or absolute path, etc. Given all these conditionals and the footnote on CPS that its intended use is to be generated by a build system for a particular environment, the best way would be for rosidl_build_configure to generate the build system files, then use the build system to generate the CPS file. But then I would argue that the 2 steps should be different tools.

ivanpauno commented 3 years ago

I disagree. Building interfaces outside the package build system requires additional logic to ensure proper incremental builds, to use generated artifacts (e.g. shared libraries) within the package build, and to land them along with their metadata (e.g. CMake exported targets) in the package install space.

I'm not sure what you mean.

That's precisely the problem that this is trying to address. It is definitely simpler for generators to be coupled with a build system, but then you're forced to port or extend them [1]. As of today, we have three (3) type representation generators and eight (8) type support generators in the core alone.

[1] As I see it, if a given generator allows extension to support other build systems, then some form of intermediate representation must necessarily exist. If we can standardize that representation, we get massive port effort reduction.

By an extension, do you mean for example rosidl_generator_py using the output of rosidl_generator_c? I don't agree that an intermediate representation "must" exist, most build systems (if not all of them) have a mechanism to import external dependencies.

@koonpeng It sounds like that assumption means one language one build system, but I think one of the goals is to enable generation by any build system, like generating C++ bindings with either CMake or with Bazel.

But you can still call the tool from bazel (whether it generates cmake behind the scenes or not), right? I don't understand the value of building the generated code with bazel instead of cmake, what is it?

hidmic commented 3 years ago

I think the design doc could really use a diagram of how all the data flows.

@sloretz Agreed.

It seems like the .cps generation is separated from the code generation. When is each called? In a CMake project, when are the three tools called? Is the .cps created at CMake configure time, the code generated at build time, and ament_meta_build called at build time? What does ament_meta_build produce? Who uses its output?

@sloretz Build specifications use source code generators. In a CMake project, one would invoke rosidl_*_build_configure followed by ament_meta_build during the configuration stage to generate CMake code that in turn generates and builds source code:

In a CMakeLists.txt

execute_process(COMMAND rosidl_*_build_configure OUTPUT_FILE build.spec)
execute_process(COMMAND ament_meta_build -b cmake INPUT_FILE build.spec OUTPUT_FILE build.cmake)
include(build.cmake)

In build.cmake

add_custom_command(
    OUTPUT ...
    COMMAND rosidl_*_generate ...
)

hidmic commented 3 years ago

I did some research on the build systems for some language bindings.

@koonpeng Thank you !

nodejs: uses node-gyp, lots of conditionals https://github.com/nodejs/node-gyp/blob/master/addon.gypi.

It has several conditionals, but they seem rather simple to resolve. The format itself is pretty simple too: the way it lists compiler and linker flags is almost a 1-to-1 mapping to the generic build spec I'm proposing.

electron: uses node-gyp with special options. https://www.electronjs.org/docs/tutorial/using-native-node-modules#manually-building-for-electron. Arguments like --target and --dist-url might not be possible to represent in CPS.

Those arguments seem specific to electron i.e. the build system generator. I don't think that's a requirement on the build specification, but on the translation of that specification. One build specification could be translated to different .gypi files, each suitable for a specific version of electron. The same would apply for different versions of CMake (a translation that's compatible with >CMake 3.5, a translation that requires >CMake 3.10).
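
To illustrate the point about version-specific translations, here is a minimal sketch, assuming a build specification that only records a target name and a required C++ standard. The function name and its arguments are hypothetical and not part of the proposal. The only CMake fact relied on is that the `cxx_std_*` compile features are available from CMake 3.8 onwards, while the `CXX_STANDARD` target property is older.

```python
def cmake_cxx_standard(target, standard, cmake_version):
    """Render a C++ standard requirement for a given CMake version.

    Hypothetical helper: `cmake_version` is a (major, minor) tuple that the
    translation targets, e.g. (3, 5) or (3, 10).
    """
    if cmake_version >= (3, 8):
        # cxx_std_* compile features exist from CMake 3.8 onwards.
        return "target_compile_features({} PUBLIC cxx_std_{})".format(
            target, standard)
    # Older CMake releases fall back to the CXX_STANDARD target property.
    return "set_target_properties({} PROPERTIES CXX_STANDARD {})".format(
        target, standard)


# The same (hypothetical) specification entry, translated twice.
for version in [(3, 5), (3, 10)]:
    print(version, cmake_cxx_standard("example_msgs__cpp", 14, version))
```

The specification stays the same in both cases; only the translation changes.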

cpython: https://github.com/python/cpython/tree/master/Lib/distutils/, a lot of platform-specific logic, may also insert implicit flags.

Hmm, for Python I'd think the build specification would be translated to a setup.py suitable for distutils invocation.

cgo: no build system files, compiler flags are defined as comments for each source file and built by the go compiler. The go compiler probably adds implicit flags as well.

This one's interesting. In this case, the build specification would only state that a build system has to run the go compiler on the set of generated files. Nothing more.

Just to be super clear, the build spec is not a substitute for build systems. It's a blueprint to generate what each build system needs (CMake code, Starlark code, Groovy code, setup.py files, .gypi files, etc.).

We can still have multiple build systems for each language by having multiple plugins. I'm not sure there is much value in the flexibility to generate files for any language binding with any build system. We rarely want to build nodejs bindings with maven, for example.

@koonpeng That's an option, if it turns out we cannot afford a standard build specification. I'd really try to avoid per-generator templates (and the combinatorial explosion that comes along).

Given all these conditionals and the footnote on CPS that its intended use is to be generated by a build system for a particular environment, the best way would be for rosidl_build_configure to generate the build system files, then use the build system to generate the CPS file. But then I would argue that the 2 steps should be different tools.

I think we're mixing things up a bit. CPS is a format that describes package installs. Here I'm proposing a variation of the CPS format to describe package builds. Build configuration generates the latter, not CPS.

By an extension, do you mean for example rosidl_generator_py using the output of rosidl_generator_c? I don't agree that an intermediate representation "must" exist, most build systems (if not all of them) have a mechanism to import external dependencies.

@ivanpauno I mean that if we have build system-specific plugins for rosidl_generator_c, it's quite likely we'll have some intermediate representation (e.g. a Python dict) to resolve templates. The build specification I'm proposing is a (non-trivial) generalization of that.
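
A minimal sketch of that idea, assuming a plain Python dict as the intermediate representation and one template per build system. The template contents, the `render` helper, and the context keys are all hypothetical; they only show how a single representation can resolve several build-system-specific templates.

```python
from string import Template

# Hypothetical per-build-system templates, for illustration only.
TEMPLATES = {
    "cmake": Template("add_library($name SHARED $sources)"),
    "bazel": Template('cc_library(name = "$name", srcs = [$quoted_sources])'),
}


def render(build_system, name, sources):
    """Resolve one template from a shared intermediate representation."""
    # The dict below plays the role of the intermediate representation.
    context = {
        "name": name,
        "sources": " ".join(sources),
        "quoted_sources": ", ".join('"{}"'.format(s) for s in sources),
    }
    return TEMPLATES[build_system].substitute(context)


print(render("cmake", "example_msgs__c", ["point.c", "pose.c"]))
print(render("bazel", "example_msgs__c", ["point.c", "pose.c"]))
```

Once such a dict exists for one generator, standardizing its shape across generators is what the build specification would amount to.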

But you can still call the tool from bazel (whether it generates cmake behind the scenes or not), right? I don't understand the value of building the generated code with bazel instead of cmake, what is it?

@ivanpauno some build system A can delegate to another build system B, sure. But (a) you need to properly embed B in A (ensuring rebuilds keep working; pushing some of A's output into B, especially if a call to B needs to build on top of a previous one; pulling B's output into A such that it can be used during the build and on install, etc.), (b) you force a dependency on B, and (c) there's a performance penalty in doing (a). For Bazel, properly embedding CMake also means staying within the build sandbox.

koonpeng commented 3 years ago

Hmm, for Python I'd think the build specification would be translated to a setup.py suitable for distutils invocation.

What if I run ament_meta_build with --build-system node-gyp? It must know all the conditionals and implicit flags added by distutils to generate the .gypi file, we probably also need to know all conditionals and implicit flags of node-gyp to unset those that are not compatible for a python module. Or it could still generate setup.py and use a gypi that delegates to distutils, but then that would create the same disadvantages discussed between cmake/bazel delegation.

hidmic commented 3 years ago

What if I run ament_meta_build with --build-system node-gyp? It must know all the conditionals and implicit flags added by distutils to generate the .gypi file, we probably also need to know all conditionals and implicit flags of node-gyp to unset those that are not compatible for a python module. Or it could still generate setup.py and use a gypi that delegates to distutils, but then that would create the same disadvantages discussed between cmake/bazel delegation.

@koonpeng Yours is a good example. Correct build spec granularity is key. While you could specify a shared library build for a CPython extension, I'd rather treat it as a component type of its own. That way, we push build details to each ament_meta_build plugin, for which there will be infrastructure (or not, in which case the implementation can delegate, as you say, or not support that component type at all).

koonpeng commented 3 years ago

That way, we push build details to each ament_meta_build plugin, for which there will be infrastructure (or not, in which case the implementation can delegate, as you say, or not support that component type at all).

@hidmic I think I may have misunderstood the goal of build system plugins. I thought it was to support generating build system files for any language with CBS, but is it correct that the goal is actually just to support many code generators for the languages they are built for? A build system plugin may choose to support multiple languages but most commonly they would only support one (node-gyp for nodejs, distutils for python). You still get the advantage of being able to generate build system files for multiple code generator plugins of one language, e.g. one python code generator may require c++14, but a new fancy one may require c++20.

But you can still call the tool from bazel (whether it generates cmake behind the scenes or not), right? I don't understand the value of building the generated code with bazel instead of cmake, what is it?

@ivanpauno An example of the problem I faced is that node-gyp knows how to build a node module, but doesn't know the link libraries. cmake knows the link libraries but doesn't know how to build a node module. There are some ways around this like node-gyp -> cmake -> node-gyp but things start to get ugly.

hidmic commented 3 years ago

I thought it was to support generating build system files for any language with CBS, but is it correct that the goal is actually just to support many code generators for the languages they are built for? A build system plugin may choose to support multiple languages but most commonly they would only support one (node-gyp for nodejs, distutils for python). You still get the advantage of being able to generate build system files for multiple code generator plugins of one language, e.g. one python code generator may require c++14, but a new fancy one may require c++20.

I think we're close. Clearly, the design document needs better, more thorough explanations (and examples, many examples).

As I see it, build system adapter plugins (i.e. ament_meta_build plugins that read CBS and output build systems) deal with components. Some components may be language-agnostic, like a binary shared library, an executable, or an arbitrary collection of files [1], and some components may be specific to a given language, like a Python package or a Java jar file.

It is up to each plugin whether and how it supports a given component type. For example:

A node-gyp plugin could choose to deal with Node.js packages and bindings natively, and not support any other component type.
A cmake plugin could choose to deal with binary static/shared libraries, executables (where the supported languages are those that the underlying compiler can handle e.g. C, C++, Fortran, Objective-C to name a few gcc supports), and CPython bindings natively.
A distutils plugin could delegate everything that's not a Python package nor binding to cmake (i.e. literally calling cmake before/during/after the build).

There's no restriction here, only a path for proper integration. Some plugins will be better at it than others.
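
The three plugin examples above can be sketched as follows. This is a rough illustration under stated assumptions: the `AdapterPlugin` class, its `resolve` method, and the component type names are hypothetical, since the schema and plugin API are not fleshed out yet.

```python
class AdapterPlugin:
    """Hypothetical build system adapter plugin (illustration only)."""

    def __init__(self, name, native_types, fallback=None):
        self.name = name
        self.native_types = set(native_types)
        self.fallback = fallback  # optional plugin to delegate to

    def resolve(self, component_type):
        """Return the name of the plugin that ends up building the component."""
        if component_type in self.native_types:
            return self.name
        if self.fallback is not None:
            # Delegation, e.g. distutils calling cmake for binary libraries.
            return self.fallback.resolve(component_type)
        raise NotImplementedError(
            "{} does not support component type {!r}".format(
                self.name, component_type))


# Component type names below are assumptions, not part of any schema.
cmake = AdapterPlugin(
    "cmake",
    {"static_library", "shared_library", "executable", "cpython_binding"})
node_gyp = AdapterPlugin("node-gyp", {"nodejs_package", "nodejs_binding"})
distutils = AdapterPlugin(
    "distutils", {"python_package", "cpython_binding"}, fallback=cmake)

print(distutils.resolve("python_package"))
print(distutils.resolve("shared_library"))
```

Here `node_gyp.resolve("shared_library")` would raise, which corresponds to a plugin choosing not to support a component type at all.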

[1] I have not fleshed out the schema yet, but it is my intention to allow source code generators to force a build system via arbitrary command support in CBS. It completely defeats the purpose of ament_meta_build, but it'll make the transition smoother. It also lowers the bar for folks that just need to get things done (e.g. someone with an unsupported, snowflake build tool and tight schedules).

hidmic commented 3 years ago

An example of the problem I faced is that node-gyp knows how to build a node module, but doesn't know the link libraries. cmake knows the link libraries but doesn't know how to build a node module.

I will say that this design does not fully address that problem. node-gyp will still have to figure out how to get that information on its own. I have another design in mind, to get ament to export CPS files for cross build-system package consumption, but that's orthogonal to the interface generation pipeline.

peterpolidoro commented 3 years ago

Could it make sense to use something like a GNU Guix Package as a generalized interface, using lisp s-expressions rather than JSON to describe the package?

hidmic commented 3 years ago

@peterpolidoro Interesting, I didn't know about Guix. I would have to look into it in more depth. I will say that revamping ROS 2's package management system is ~way~ massively out of scope here, but we can entertain the idea of using code for tool-agnostic package build (and install) descriptions. IMO a good description should be general yet easy to consume (low implementation effort, soft dependencies if any, human readable [1]). Would a build system be able to consume it? Would a tool be able to generate a build system out of it? Can a scraper consume it?

[1] Which makes me a bit wary about lisp. I don't have anything against it, but most people see parentheses and run :sweat_smile:.

peterpolidoro commented 3 years ago

Take a good look at Guix before running away screaming from the parentheses. Here is an interesting video about it, for example: Guix in the age of containers. It may not be appropriate for this particular topic and I would never presume to suggest revamping the entire ROS package management system. Although you may be tempted when you see what it can do. The benefits of functional package management with the ability to rollback versions and create minimal containers are kind of mind blowing once you have used it for a while. It generalizes packages written in any language. Writing package declarations in lisp seems crazy at first, but it is actually super elegant and powerful.

koonpeng commented 3 years ago

I will say that this design does not fully address that problem. node-gyp will still have to figure out how to get that information on its own. I have another design in mind, to get ament to export CPS files for cross build-system package consumption, but that's orthogonal to the interface generation pipeline.

@hidmic how about providing the plugins a function to resolve a message/package's required build flags? This function could use the CPS files or any other method under the hood. As it is now, only cmake knows the build flags, so the only way for any other build system to build rosidl related targets is to delegate to cmake. That wouldn't make this tool very helpful.

hidmic commented 3 years ago

I've addressed some comments and questions by clarifying the proposal. It's still lacking a nice diagram and examples. I'll add those next.

hidmic commented 3 years ago

how about providing the plugins a function to resolve a message/package's required build flags?

@koonpeng at the package-level, that information can be propagated by the build system. I understand your use case, but as I said in https://github.com/ros2/rosidl/issues/560#issuecomment-756214398 what you describe is not a generator if we go by its current definition (see architecture review in this proposal), and I intend to scope this work to that definition.

Perhaps you can get ahead of me and kickstart a design to better support multi build-system ROS workspaces? E.g. by exporting package information in a tool-agnostic format like CPS.

The benefits of functional package management with the ability to rollback versions and create minimal containers are kind of mind blowing once you have used it for a while. It generalizes packages written in any language. Writing package declarations in lisp seems crazy at first, but it is actually super elegant and powerful.

@peterpolidoro It is quite cool. But I think it doesn't solve the problem we're trying to address here. This generalization is meant to support users that are already committed to (or constrained by) some build-system and now want to use ROS 2 interfaces. Doing that right now means porting a lot of CMake logic, or just letting CMake do the job (which is not nearly as easy nor convenient as it sounds).

Guix (and please correct me if I'm wrong) looks like a very interesting alternative to the typical ROS workspace-based workflow. And yes, it could be used as a vehicle to export package information across build-systems. But that's a solution for a different problem.

That is not to say that it wouldn't be cool if you were willing to propose a design to build ROS packages using Guix. Not sure how the @ros2/team feels about this though.

hidmic commented 3 years ago

FYI I added pipeline diagrams in https://github.com/ros2/design/pull/310/commits/78cd763d9c1be51179ed8b30ebad04b91c95d570. Would a data-flow diagram that shows how the current architecture can be cast into the proposed one help?

jacobperron commented 3 years ago

Would a data-flow diagram that shows how the current architecture can be cast into the proposed one help?

+1 Also matching the proposed tools to the stage in the pipeline would make it more clear I think.

I'm still a little fuzzy on the proposal. In practice, it sounds like we would be replacing all of the CMake logic with a set of Python tools that others can extend via plugins. Here's a list of the proposed tools with inputs and outputs as I understand:

rosidl_generate + rosidl_typesupport_generate
- Input: IDL files
- Output: generated language-specific code
rosidl_build_configure + rosidl_typesupport_build_configure
- Input: IDL files
- Output: generates CBS files
ament_meta_build
- Input: CBS files
- Output: generated build system files

Does that look correct?
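
The data flow in the list above could be sketched like this. It is only a toy model under loud assumptions: the three functions stand in for the proposed tools, the dict stands in for a CBS file, and none of the names, fields, or file layouts come from the proposal itself.

```python
def generate(idl_files):
    """Stand-in for rosidl_generate: IDL files in, generated code out."""
    outputs = {}
    for idl in idl_files:
        stem = idl.rsplit(".", 1)[0]
        outputs[stem + ".h"] = "/* generated from {} */".format(idl)
        outputs[stem + ".c"] = "/* generated from {} */".format(idl)
    return outputs


def build_configure(package, idl_files):
    """Stand-in for rosidl_build_configure: IDL files in, build spec out."""
    sources = []
    for idl in idl_files:
        stem = idl.rsplit(".", 1)[0]
        sources += [stem + ".h", stem + ".c"]
    # Hypothetical spec layout; the real CBS schema is defined elsewhere.
    return {"name": package + "__c", "type": "shared_library",
            "sources": sources}


def meta_build(spec, build_system):
    """Stand-in for ament_meta_build: build spec in, build files out."""
    if build_system != "cmake":
        raise ValueError("no adapter for {!r} in this sketch".format(build_system))
    c_sources = [s for s in spec["sources"] if s.endswith(".c")]
    return "add_library({} SHARED {})".format(spec["name"], " ".join(c_sources))


spec = build_configure("example_msgs", ["msg/Point.idl"])
print(meta_build(spec, "cmake"))
```

Note that the spec lists the files that `generate` is expected to produce, which is why code generation and build configuration need to agree with each other.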

As an exercise, I tried to map these to the plugins necessary for Java support:

rosidl_generate_java
- Generates Java and native C code (.java, .c, *.h)
rosidl_build_configure_java
- Lists expected source files (.java, .c, *.h)
- Describes dependencies of the generated code (e.g. rcljava_common, rcl_interfaces, rosidl_generate_c)
- This seems odd since we're listing dependencies of rosidl_generate_java in this separate package
- I don't know what else would happen here?
ament_meta_build_java-gradle
- Generates Gradle package files for building the generated source files.
Optionally, we could also have ament_meta_build_java-cmake
- Generates CMake package files for building the generated source files.

Am I somewhat close to what you are imagining?

hidmic commented 3 years ago

Also matching the proposed tools to the stage in the pipeline would make it more clear I think.

+1

Am I somewhat close to what you are imagining?

You're spot on.

This seems odd since we're listing dependencies of rosidl_generate_java in this separate package

Code generation and build specification/configuration are tightly coupled. Today both live on the same packages (as python generators and ament_cmake extensions). I expect plugins to follow the same pattern.

I don't know what else would happen here?

Describe what the build system has to do to generate, build, and install generated artifacts e.g.: generate source files calling the code generation tooling, then build them into a shared library, then install that shared library somewhere.
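
As a rough sketch of such a description, assume each entry in the specification names what it requires, so that a build system adapter can derive the generate, build, install order. The entry names, the `type` values, and the `requires` field are hypothetical. The ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical specification entries; not the proposed CBS schema.
SPEC = {
    "example_msgs__c_sources": {
        "type": "generated_sources",  # produced by the code generator
        "requires": [],
    },
    "example_msgs__c": {
        "type": "shared_library",     # built from the generated sources
        "requires": ["example_msgs__c_sources"],
    },
    "example_msgs__c_install": {
        "type": "install",            # lands the library in the install space
        "requires": ["example_msgs__c"],
    },
}


def build_order(spec):
    """Order the entries so that each one comes after what it requires."""
    graph = {name: set(entry["requires"]) for name, entry in spec.items()}
    return list(TopologicalSorter(graph).static_order())


print(build_order(SPEC))
```

The specification states what has to happen and in what order; how each step is carried out is left to the build system.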

jacobperron commented 3 years ago

Describe what the build system has to do to generate, build, and install generated artifacts e.g.: generate source files calling the code generation tooling, then build them into a shared library, then install that shared library somewhere.

Okay, thanks. I guess in my example I wouldn't supply an exact build command, since it would be dependent on what system is chosen (CMake or Gradle), but we could supply things like Java version and compile flags. Thanks for clarifying.

hidmic commented 3 years ago

@sloretz @jacobperron I've updated the proposal, added a draft CBS schema, added examples. PTAL!

hidmic commented 3 years ago

I guess in my example I wouldn't supply an exact build command, since it would be dependent on what system is chosen (CMake or Gradle), but we could supply things like Java version and compile flags.

That's the idea. In most cases, one shouldn't specify a command but an appropriate component type. It's the meta builder plugin's responsibility to figure out how to build that component type later on.

niosus commented 3 years ago

@hidmic @jacobperron guys, I hate to hijack the discussion here but I think we at Apex.AI did something very relevant to this and I wanted to convey how important I find this proposal to be. We did an experimental bazelization of our internal ROS2 fork with some pretty neat results. Bazel is not important here though. The main part is that I had to dig through the whole message generation pipeline and compose it in such a way that bazel could build all of the messages natively. So I just want to support the statement that:

In most cases, one shouldn't specify a command but an appropriate component type. It's the meta builder plugin's responsibility to figure out how to build that component type later on.

It would have made my last couple of weeks much easier if there were a single configurable binary (or script for that matter) that would spit out all the headers, sources and additional files that I would then just need to pipe into my build system of choice.

If you guys need any specific help (I know I'm late to the party) please do not hesitate to tag me.
