Cargo integration in Meson via an "unstable-cargo" module (try 2)

mesonbuild / meson

The Meson Build System

http://mesonbuild.com

Apache License 2.0

5.33k stars 1.53k forks source link

Cargo integration in Meson via an "unstable-cargo" module (try 2) #2617

Closed nirbheek closed 6 years ago

nirbheek commented 6 years ago

This was made primarily so we can build librsvg with Meson, which now works: https://git.gnome.org/browse/librsvg/log/?h=wip/meson, and it actually required fewer hacks than I expected it to.

The integration can and should definitely be better, but I think the best way to get there is to have something that works right now and slowly improve Cargo till we have everything we need. Cargo is not going to magically grow API for giving greater control to build systems; we have to work towards it.

I have been talking to @alexcrichton about the things that Meson would need to integrate better with Cargo, and he has been very receptive and encouraging of our efforts, and in the need for adding features to Cargo that allow Meson to control what it does a bit better.

Here's what you can do right now:

'Extract' build outputs from Cargo.toml files in your source tree with the unstable-cargo module and mostly use them like you would any other target. For example:

cargo = import('unstable-cargo')
# Build the staticlib
librsvg_internals = cargo.static_library('rsvg_internals', toml : 'Cargo.toml')

librsvg = library('rsvg-' + rsvg_api_major, librsvg_sources,
                  dependencies : [librsvg_deps, librsvg_internals])

You can also specify the list of sources if you wish:

cargo.executable('mesonbin',
                 toml : 'Cargo.toml',
                 # This is optional, since ninja keeps track of dependencies on source files.
                 sources : ['src/main.rs'],
                 # Yes, installation works
                 install : true)

The sources list is optional because cargo will output a nice depfile for us and ninja can then keep track of when to rebuild the target.

Meson TODO:

Cross-compilation
Cargo tests and benchmarks
cdylibs or rust-specific library types
ninja dist does not support vendoring crate sources yet (needed for librsvg)

Cargo TODO:

https://github.com/rust-lang/cargo/pull/4711
CARGO_TARGET_DIR should be an argument --target-dir
Options to set the output library/executable name (prefix/suffix); perhaps also in Cargo.toml
In general, the ability to more accurately control build outputs from via cargo arguments
- This is where our work comes in; something half-working is the best way to figure out what is needed to make it work well ;)
Dry-run downloading of sources by cargo vendor so we can ship a list of files instead of the sources themselves (cc @alexlarsson)

Cargo long-term TODO:

Output rustc invocations needed to build a crate so we can run everything ourselves
Overall, Meson should be able to output build artifacts exactly where it wants, and with the rustc flags that we want.

nirbheek commented 6 years ago

@jpakkane I think this is failing because there's no cargo; not sure why the error message is getting eaten up. Could you please add cargo to the docker image?

jpakkane commented 6 years ago

CI image updated.

her001 commented 6 years ago

Besides the one style change in the review, is anything blocking this? Is there something that I can do to help?

I'm not excited to give all of my GitHub profile and repo data to SideCI just to see why it fails, but I assume it is the style change too.

jpakkane commented 6 years ago

The bigger question here is how to join together Rust and other languages. Specifically something like this:

build C library with Meson
  -> build Rust "wrapper" or equiv that links with the library and possibly exposes a C API
  -> build a C executable with Meson that links against the Rust library

This is the minimal set of things we'd need to be able to speak about true interoperability rather than each system working on its own in an isolated universe.

nirbheek commented 6 years ago

Linking to a C library from Rust is something that is not implemented in this PR right now, but it is possible to add. I am just hesitant to add it till I try to port a project that works like that (will happen when I try to build gstreamer rust plugins with meson, but I can't spend time on that if the effort will go nowhere).

This PR implements one-way integration: compiling Rust-only code as a static library which exposes a C API that is used by other code. It was written specifically for librsvg's use-case, and I expect other GNOME projects that are slowly porting their codebases internally to Rust to be able to use this too.

I think this is a good start, and the rest can grow organically as mixed Rust and C projects start using Meson. Currently, they all use hand-rolled makefiles or Autotools because it is impossible to mix Cargo and C with Meson.

You have to start somewhere.

jpakkane commented 6 years ago

This PR implements one-way integration: compiling Rust-only code as a static library which exposes a C API that is used by other code.

But unless I'm missing something the result of those is not actually used to link, they are just installed and presumed working.

But the bigger issue is that the proposed approach breaks the file layout contract we have with users and developers and internally. The most important one of these is that all temporary and work files must go in the target private directory. Scattering them about the build dir is not acceptable.

In this vein the contract is also that if you do, say, an executable call in some dir, then the final output goes to the corresponding build directory. Not to a random subdirectory thereof. Changing this breaks a lot of assumptions and needs changing in e.g. rpath evaluations an all that stuff.

The proper solution to this ties in with this message from above:

I have been talking to @alexcrichton about the things that Meson would need to integrate better with Cargo, and he has been very receptive and encouraging of our efforts, and in the need for adding features to Cargo that allow Meson to control what it does a bit better.

this would mean being able to do something like this:

cargo --work-dir=subdir/foo@exe --final-output-dir=subdir <other args>

putting all the internal temporary stuff to the private dir and final outputs to the build tree.

nirbheek commented 6 years ago

But unless I'm missing something the result of those is not actually used to link, they are just installed and presumed working.

That's what the test does (because I don't know enough Rust yet), but I tested this with librsvg which does indeed implement a C API from Rust and uses it from C examples and that worked just fine. I'll improve the test.

putting all the internal temporary stuff to the private dir and final outputs to the build tree.

The temporary directory is set with the CARGO_TARGET_DIR env variable (it is incorrectly set to target_dir instead of target_private_dir), but yes the final output is placed inside that instead of in the target_dir.

Ideally we should also be able to set the workdir with an argument instead of using an env var, and that will improve things for all build systems.

In this vein the contract is also that if you do, say, an executable call in some dir, then the final output goes to the corresponding build directory. Not to a random subdirectory thereof. Changing this breaks a lot of assumptions and needs changing in e.g. rpath evaluations an all that stuff.

Good point, this would cause issues with shared libraries built with Rust. However, this module only supports static libs and executables (rust cdylibs don't work well right now), does that also cause issues? The implementation is with a CustomTarget(), and the value of @OUTPUT@ is correct, so it should turn out fine, no?

zeenix commented 6 years ago

FWIW,

this is awesome!
I've a wip branch to port geoclue to meson but i've been reluctant to work on that cause i was hoping to also port to rust slowly and I wasn't sure that would work.

danigm commented 6 years ago

That module could be great. We are building Fractal with meson with a custom script [1] and it could be great to have a native support for cargo.

[1] https://gitlab.gnome.org/danigm/fractal/blob/master/meson.build

alatiera commented 6 years ago

I am also using a custom script, pretty much identical to Fractal's, for building gnome-podcasts.

Would be nice to have upstream support and not rely on hacky configs!

federicomenaquintero commented 6 years ago

METOO :smile:

We are about to move librsvg to meson, and any changes to make it easier to combine C and Rust code will be much appreciated.

I'm making a little diagram of our current build process, and where we want to get to. It's a bit convoluted since we are integrating rsvg-rs (the Rust bindings for librsvg) into librsvg's own source tree, so that things like the integration tests and the binaries for utilities can be redone in Rust.

jpakkane commented 6 years ago

If you want to see progress on this issue, please get some comments from Cargo developers on the path issue discussed above.

nirbheek commented 6 years ago

If you want to see progress on this issue, please get some comments from Cargo developers on the path issue discussed above.

I think this is a profoundly unhelpful comment to make. Upstream cargo has no reason to care about Meson, and an approach like this will ensure that they won't, because people will just use some other build system or continue to use ugly hacks till their entire codebase is ported over to Rust and Meson is unnecessary.

Realistically, we need to improve our integration so we get some buy-in for Meson in non-trivial Rust projects, which will lead to pressure on Cargo to make changes. Having something that works for a specific subset of use-cases is how you improve the state of things for all the other cases.

I will be updating this PR by rebasing it on top of current master. That's all I can do.

nirbheek commented 6 years ago

Another thing I forgot to mention: every project is shipping similar hacks, and having a module with a central location for these hacks is useful in general. We can ensure that these projects keep building once the hacks become unnecessary, which is even more useful.

I also don't see any technical objections to this PR, it's literally just wrapping a custom target that does the necessary setup.

zeenix commented 6 years ago

@jpakkane From what I understood, @nirbheek answered your questions/concerns related to this PR in his replies to your specific questions/concerns and there is an unanswered questions to you as well.

I agree with @nirbheek completely that by blocking cargo integration on cargo devs caring about meson, isn't a very pragmatic approach here.

nirbheek commented 6 years ago

Just to be clear, I do agree with @jpakkane that Cargo needs to have better support for controlling where it places build outputs (at the very least). My point is that merging this and having projects use it will actually place us in a better position for working towards that with upstream.

Note that this is exactly what happened with glib tools: gdbus-codegen, glib-compile-resources, gtk-doc, etc. They are much better now after we added hacky support for them in the beginning.

jpakkane commented 6 years ago

My point is that merging this and having projects use it will actually place us in a better position for working towards that with upstream.

On this I disagree. Cargo developers have stated that integrating better with other build systems is an official goal and something they will be willing to spend resources on. Getting them to comment on this issue (does not need to be exhaustive, but should have at least something) is a good way to test how that works in practice.

people will just use some other build system or continue to use ugly hacks till their entire codebase is ported over to Rust and Meson is unnecessary

We have very limited resources. It would seem more prudent to use them to improve the development experience of people who are using Meson rather than on those who are only using it as a temporary crutch and plan to dump it as soon as possible.

And just to be clear: it is the goal of Meson to provide native support for Rust as a first class language and have it work just like any other language even in mixed-language projects without needing any external tooling, even Cargo. The same goes for every other programming language as well. Supporting mixed language projects natively from a single build definition is at the core of what Meson is.

NickeZ commented 6 years ago

We have very limited resources. It would seem more prudent to use them to improve the development experience of people who are using Meson rather than on those who are only using it as a temporary crutch and plan to dump it as soon as possible.

I think we have an opportunity here to become the standard build system for people migrating away from C/C++. There will probably be a lot of projects using mixed code bases for a long while. A bit like it is with C/C++ right now. If meson enables super easy usage of C together with cargo-based rust projects there is no reason to completely get rid of all your C code.

it is the goal of Meson to provide native support for Rust as a first class language and have it work just like any other language even in mixed-language projects without needing any external tooling, even Cargo.

This is a noble goal, but it seems quite useless to me. Anything but toy rust code will use external packages which means that meson has to reimplement cargo (even supporting multiple versions of packages compiled into the same binary).

I think this PR should be merged and issues should be created in the cargo repository. The first CARGO TODO item in the OP seems already merged. Is the rest of that list valid? Should we create a tracking issue here with a checkbox list that links to upstream issues/PRs?

zeenix commented 6 years ago

Just for the record, We've been having these discussions for a long time now and this PR is the only thing put forward in this time to add proper support for Rust in meson for the first time.

Having said that, we should try and get a cargo dev to comment here but no reason that should block this PR.

aruiz commented 6 years ago

On this I disagree. Cargo developers have stated that integrating better with other build systems is an official goal and something they will be willing to spend resources on. Getting them to comment on this issue (does not need to be exhaustive, but should have at least something) is a good way to test how that works in practice.

Wait, what? The Rust community is doing just fine with Cargo, so I think it is up to us to build those bridges, I don't think we need to get them to prove anything.

Do we want to make Meson useful to the Rust community or not? It is us the ones who have to make progress towards that goal, not the other way around.

acfoltzer commented 6 years ago

Wait, what? The Rust community is doing just fine with Cargo, so I think it is up to us to build those bridges, I don't think we need to get them to prove anything.

Do we want to make Meson useful to the Rust community or not? It is us the ones who have to make progress towards that goal, not the other way around.

It can be a bit of both. The Cargo team is definitely interested in how to make the tooling play nice in the context of other tools. Meson is not the only potential client here, and improvements informed by Meson's needs will yield improvements for the others.

That said, expecting Cargo to completely cater to Meson's needs is unrealistic. Implementing cargo meson will not help our work on Cargo CAmkES integration, for example.

jpakkane commented 6 years ago

That said, expecting Cargo to completely cater to Meson's needs is unrealistic.

This is not about that at all. The core issue is about the most basic integration pieces between any two build systems where one is assumed to call the other during build (which is what Cargo devs are advocating people to do to integrate Rust in their existing large projects). Let's start with the two obvious questions:

How does file layout work? If you invoke Cargo it writes out a ton of directories and temp files in the build dir it uses. In Meson these all should go in the target private directory and only the final results should be put in the main build dir. Does Cargo aim to support these kinds of file layouts (or whatever layouts CMake, Bazel, Scons et al would want it to conform to)?
Cargo documentation currently seems to advocate that if any non-Rust dependencies are needed, the user should write a build.rs that invokes a new process to build the external library. In an embedding scenario this would need to be the exact opposite: the main build system builds the library and merely tells Cargo where it is in the build dir and Cargo should pick it up from there. What are the plans to support that? How does it affect the many current projects that call external build systems from their build.rs files?

Cargo CAmkES integration

What is this? Googling gives zero meaningful results.

alexcrichton commented 6 years ago

Hello! I've been working on Cargo for quite some time now so hopefully I can help answer any questions about Cargo or otherwise make sure that any feature requests and such end up in the right locations on Rust side. As has been said here the Rust community is quite eager to improve integration with existing systems. While we in the Cargo team in Rust are still sorting out the best next concrete actions to take place a previously accepted RFC is the most up-to-date in terms of current high-level thinking for how this would work.

As pointed out by @NickeZ any nontrivial Rust project will be pulling in crates from crates.io, but also like @jpakkane mentioned it's often critical for Cargo to closely agree with the surrounding build system to ensure everything works out. To that end I'll reiterate the three main goals we have of build system integration with Cargo:

Make it easy for the outer build system to control the aspects of building that are under its purview (e.g. artifact management, caching, network access).
Make it easy to depend on arbitrary crates in the crates.io ecosystem.
Make it easy to use Rust tooling like rustfmt or the RLS with projects that depend on the external build system.

Also to reiterate though we're still in the relatively early days of build system integration so things may be a bit rough! I'll hope y'all will bear with us while we continue to iterate on our end and make sure that the integration here can be everything it needs to be one day!

In the meantime, though, it may also be helpful to explan what's possible today and how that could solve things.

this would mean being able to do something like this:
cargo --work-dir=subdir/foo@exe --final-output-dir=subdir <other args>
putting all the internal temporary stuff to the private dir and final outputs to the build tree.

To achieve something like this I think what you'll want to do is probably define a wrapper that invokes cargo with the --message-format=json flag. This flag indicates to Cargo that it should print information to stderr (or stdout, I forget!) about the build. For example Cargo will print out all the artifacts that it's finished compiling along the way. Some more information about this can be found here, but as with most projects that documentation is in a perpetual state of "could be improved" :)

To that end you may be able to achieve this workflow by setting CARGO_TARGET_DIR to the private directory (subdir/foo@exe I think this case?) and then the wrapper would listen to the JSON messages coming out of Cargo and copy select artifacts into subdir, for example anything ending in .a (a staticlib).

Writing a wrapper around Cargo isn't always the easiest today, but it's possible to do on stable at least! We're of course always looking for ways to improve this, so this could certainly be an issue for Cargo itself to have a more "official way" of placing final artifacts into a different location.

I hope that helps! I may have missed a question though so please let me know if I did so! And also if y'all have any more questions, don't hesitate to ask!

nirbheek commented 6 years ago

That's a great mid-way solution, @alexcrichton, thanks for suggesting it. These sorts of bootstrapping problems are hard to get going on, and I am glad that you've pitched in with something workable. 😄

I am not sure when I can work on this next, but I'll try to find time between now and RustFest in Paris so I can complain to you again about any bugs/problems I find. 😉

If someone else manages to get to this before I do, I'll be glad to help in any way I can.

matklad commented 6 years ago

The issue for overriding the place where cargo puts final artifacts is https://github.com/rust-lang/cargo/issues/4875.

NickeZ commented 6 years ago

I've done a cargo wrapper that takes --outdir and --target-dir as arguments. I also managed to link in a static rust library and call it from C. It works! but it is not pretty...

https://github.com/NickeZ/meson/commit/cd9cebebdcc8cc3dad811ee1d070cad4d780d885

https://github.com/NickeZ/meson/tree/some-cargo-integration

nirbheek commented 6 years ago

There is a WIP Cargo PR for implementing --out-dir now: https://github.com/rust-lang/cargo/pull/5203

alatiera commented 6 years ago

Btw the PR mentioned above implementing cargo build --out-dir is now merged. https://github.com/rust-lang/cargo/pull/5203

matklad commented 6 years ago

ouput-dir is available in current nightly. So, the

cargo --work-dir=subdir/foo@exe --final-output-dir=subdir <other args>

command line proposed by @jpakkane

would look like this:

CARGO_TARGET_DIR=subdir/foo@exe cargo -Z output-dir build --output-dir=subdir <other args>

The -Z output-dir will go away once this feature is stable.

Note that currently there's no ability to control the names of the artifacts, only the output directory. This is because there's no single name to control: depending on the platform, the same crate may produce different-named artifacts (foo vs foo.exe, libfoo.so vs libfoo.dll), and, in some cases, the artifact can consist of more that one file (for example, debuginfo is separate on windows and macs). Here are the tests which show which files are expected to be produced for which platforms: https://github.com/rust-lang/cargo/blob/b9aa315158fe4d8d63672a49200401922ef7385d/tests/testsuite/out_dir.rs

zeenix commented 6 years ago

Note that currently there's no ability to control the names of the artifacts, only the output directory.

I guess that is not a problem as long as the naming scheme is stable (enough) and meson can easily predict the files that will be generated for each specific target?

matklad commented 6 years ago

I guess that is not a problem as long as the naming scheme is stable (enough) and meson can easily predict the files that will be generated for each specific target?

The naming scheme is absolutely stable. The base name is roughly the name of the package, and prefixes/suffixes depend on the target and crate-type.

So, manually running cargo build --out-dir out once, seeing what is produced for a crate and adding these specific file names to meson rules would work perfectly.

It's also rather straightforward to just look at the crate's manifest, predict which files are going to be produced, and put that into meson.

However, building a fully automated solution, which parses Cargo.toml mechanically and derives the output artifacts name would be doable, but tricky.

jpakkane commented 6 years ago

I have never liked this approach for combining projects. This latest discussion is exactly the reason why. The only way this will ever work is if you have a full serialisation for everything between the two build systems. The internal state of things is inherently unstable, nondocumented and liable to change on a whim, because that is the exact thing you need to have in a build system. No build system developer is ever going to commit to it.

So, manually running cargo build --out-dir out once, seeing what is produced for a crate and adding these specific file names to meson rules would work perfectly.

Except no. That list needs to be known beforehand, before anything is built due to the way Ninja and Meson work. This would require us changing everything which is a massive maintenance burden.

It's also rather straightforward to just look at the crate's manifest, predict which files are going to be produced, and put that into meson.

Except no. This would require us to build a parser and all that stuff which is a maintenance burden.

However, building a fully automated solution, which parses Cargo.toml mechanically and derives the output artifacts name would be doable, but tricky.

This would be a massive development and maintenance burden on us again.

The more time I spend on this the more convinced I become that having two build systems live inside the same build directory is a terrible , unworkable and just plain bad design.

nirbheek commented 6 years ago

So, manually running cargo build --out-dir out once, seeing what is produced for a crate and adding these specific file names to meson rules would work perfectly.

Except no. That list needs to be known beforehand, before anything is built due to the way Ninja and Meson work. This would require us changing everything which is a massive maintenance burden.

I disagree, and I don't think this is too much of a burden. We know where the output will be placed now, and the filenames are predictable. That's enough for us. If cargo changes the names, I expect it will not be a backwards-incompatible change, so it won't be an undue burden on us.

We have a similar situation with pdb files too, and I am sure there are many other instances where we can't control the output of a command but know what it will be.

It's also rather straightforward to just look at the crate's manifest, predict which files are going to be produced, and put that into meson.

Except no. This would require us to build a parser and all that stuff which is a maintenance burden.

That's not not needed for this PR. This PR requires people to keep the Cargo.toml in sync with meson.build manually, and that's fine for now.

However, building a fully automated solution, which parses Cargo.toml mechanically and derives the output artifacts name would be doable, but tricky.

I agree that this is too error-prone. Let's not do this.

The only way this will ever work is if you have a full serialisation for everything between the two build systems.

I completely disagree, of course. Perfect is the enemy of shipping solutions. :)

I am sure that we will get to that point eventually; there are plenty of other people who want that. But in the meantime, we should get something in that doesn't block GNOME and GStreamer from using Rust with Meson.

I expect that more people will send patches to Cargo for improvements once they hit the restrictions that the current command-line syntax place on what people can do from meson build files. That is, after all, how FOSS works, and is what we have observed in the past with other toolchains such as D and GNOME tools.

jpakkane commented 6 years ago

Perfect is the enemy of shipping solutions.

Shipping solutions that are known to be unworkable in the long run is not good either.

To clarify, let's state what we are and are not committed to do. We will:

add functionality needed to build Rust applications at the same level as Cargo but by doing the compilation ourselves, this may include writing a converter from crates to Meson (build.rs nothwithstanding)
support combining Rust with any of the other supported languages natively and transparently (by building all of them ourselves as above)
add support for providing and consuming Rust dependencies from the system in a build system independent method such as pkg-config (but which can be any other such mechanism)

What we will not do is do some hacky integration thingy between the two, especially one that relies, in any way, on undocumented internals or any API or ABI that is not guaranteed to be stable and supported for the foreseeable future.

This is not limited to Rust. We will provide the same integration and support for every language that wants to work together with us. If we did every language the same way as Rust then Meson would compile Java just by calling to Maven because it is the standard, C# just by calling msbuild because it is the standard, D just by calling to dub because it is the standard, C++ just by calling into CMake because it is the standard, C just by calling into Autotools because it is the standard and so on.

matklad commented 6 years ago

@jpakkane could elaborate a bit which current aspects of Cargo make it difficult to integrate it with meson? That is, why can't meson treat Cargo as a compiler, and not as a build system? Regardless of the direction of Rust support in meson, it would be very useful for Cargo team to know constraints of the current Cargo which prevent seamless integration with other build systems!

Re-reading this thread, I've seen the following specific points being raised:

"work files must go in the target private directory". This has always been supported via CARGO_TARGET_DIR environment variable (docs)
"the final output goes to the corresponding build directory, not to a random subdirectory thereof.". This is now possible via the --out-dir flag. The flag is nightly-only because it is a relatively new addition to Cargo. Once we are sure that it works as expected and indeed helps other build systems, we stabilize and document it.
" the main build system builds the library and merely tells Cargo where it is in the build dir and Cargo should pick it up from there." This is supported via build overrides: one can specify the libraries to link to via env vars/configuration files, and Cargo would skip build.rs altogether (docs).

I'd be delighted to hear about and fix more build system integration problems :)

jpakkane commented 6 years ago

it would be very useful for Cargo team to know constraints of the current Cargo which prevent seamless integration with other build systems

If this needed to be summarized into one sentence, it would be this:

Every technical and architectural decision in Cargo seems to be made to be as hostile to cooperation as possible.

They probably aren't, but that is an easy impression to get.

Perhaps the biggest issue is that everything in Cargo is tightly coupled. Everything must be built in one invocation. There is no way of getting dependencies from the system via a mechanism such as, but not necessarily, pkg-config.

When combined with the fact that Rust has no standard library to speak of, every package depends on tens of other crates that depend on other crates and so on. This is perhaps the worst thing about NPM and Cargo chose to copy it blindly. If you wish to use any one single crate, pretty soon you find yourself downloading dozens of them. This by itself could be tolerable, but let's look at Cargo.lock of librsvg.

Some crates are there multiple times (these include rand, syn and winapi) with different versions. This is bad and directly against coding requirements and standards in many organizations. Let's take Google as an example. Obviously I can't and don't speak for them so the following is analysis based on publicly released information. As far as I know Google does not have Rust in their list of approved languages so they would not be using it in any case.

Google has a requirement that any third party dependency they have must be imported in their monorepo. Further, they have a requirement that there can only be one version of any dependency and that everyone in the entire organization must use the same version. In practice this means that Cargo (and by extension the crates it provides) can not be used in such an environment. Trying to tell them to change their entire workflow just to be able to use Cargo (and, by extension, Rust, if Cargo is to be The Only Way to compile Rust) is a losing battle.

The situation is similar in Debian. Embedded dependencies (vendored in Cargo terms) must not be used but instead any shared dependencies must be provided individually by the system. Let's assume that this would be technically possible today. Every new source package that is added to Debian must be placed in the new queue and manually reviewed by a human. When Meson was added to Debian it consisted of one single package and the review took on the order of a few weeks.

For comparison librsvg has ~50 crates it depends on. Thus if we were to do things properly, simply getting the dependencies into Debian would take one year. I don't have personal experience with how e.g. Red Hat Enterprise Linux does things but based on things I have been told the situation is roughly similar. I've also been lead to believe that the people doing these reviews are very much not happy with the dependency explosion that Cargo has wrought upon them.

These are not problems with Meson integration with Rust as such. However they are extremely important points that should be considered when designing such a build and dependency system. As an example having an extensive, high quality standard library would help tremendously because the amount of crates needed for most programs would go down a lot. This is what Python does, for example, and it has worked for them extremely well.

why can't meson treat Cargo as a compiler, and not as a build system

Because that only works under very simple circumstances and even then is not reliable or performant.

The first thing is that there needs to be more metadata available than is exposed currently. In fact all of it to make things actually work.

Let's take the simple case like this:

exe (in C) -> rustlib -> plainlib (in C)

Even in the simplest case of static libraries it is not enough to link the exe against the rustlib. You also need to have plainlib on the link line. Thus saying "here is my output library" is worthless on its own. The build system of exe can not know about plainlib unless Cargo explicitly tells about it. For extra challenge note that plainlib might be built by rustlib using build.rs or it might be built by the same build system that builds exe.

Let's now assume that both rustlib and plainclib need to be built as shared libraries. I know Rust does not do shared libraries well currently, but let's assume that it is only implemented in Rust but exposes (and imports) only a plain C interface. Say a drop-in Rust replacement of libssl3.so How would you run this executable from the build dir directly?

You can't unless you set up rpaths properly. The exe and the two libraries are scattered about the build tree in some way. The only way to make it work is to set rpaths correctly in all of these. Even more importantly those rpath entries must be cleaned out when doing an install (which is a distro requirement) (but not others). You might need to specify an rpath that will be in effect after install. It can't be done only during install, there is some magic that needs to happen beforehand. As an extra challenge this also needs to work on OSX, which does everything in a weird way including install_name shenanigans and all the other things it has.

It is left as an exercise to the reader to calculate the amount of integration points and lines of code needed to make all this work.

The end result is that on the boundary of any two build systems there must be a standardised data serialisation roughly in scope with pkg-config and in practice more since pkg-config does not deal with rpath or any of the other stuff. This immediately brings up two questions:

Who creates the specification for such a serialisation?
Would build systems implement it even if it existed or would they stick to their own silos as they do currently?

Multiple build systems in one are also a performance problem. If you have only one build system, it has a global understanding of all work that needs to be done and can schedule work accordingly. With multiple build systems this is no longer possible. You have to split things into chunks. First you build everything using build system 1 that does not depend on anything built with build system 2. Then those things that build with build system 2 that have their dependencies currently met. Then system 1 again and so on until convergence.

This is equivalent to ye olden Makefile days where Make first goes into one directory, builds everything there, then goes to the next and so on. This is incredibly slow, you can easily get 2x build time speedups merely by having one scheduler that works with global information. Given that Rust is slow to compile, adding extra perf bottlenecks is not a great design.

Then there is Cargo's vendoring approach. It works fine for single self-contained projects but falls down the second you have two vendored subprojects within one superproject. I wrote a long blog post about this.. In this case not only do Cargo bundles contain multiple versions of the same libraries, they get multiplied by every single dependency used in the "just call into Cargo" method.

The final thing is the fact that Cargo as a build system does so very little. Anything that is not a simple "build library or exe" gets shunted to build.rs that each project writes on their own. A design of this type makes sense if you are low on resources, which build systems always are, and want to get something working quickly. The downside is that now you have a Turing complete piece right in the middle of your build definition. This is bad, because if programmers are good at something, it is programming nonstandard square wheels that work for them but not necessarily together with anything else. If you have a case where you need to do something special, such as to use an already built dependency library X somewhere in the build dir instead of having the subproject build it itself, it is no longer a case of simply specifying dependencies. Instead you get to read the build definition code and patch it to make it do the thing you want it to. This is the sort of unfun busywork that no-one should need to do.

In closing, the reason why Meson does not call Cargo and treat it as a compiler is that it is a fundamentally broken concept. It might work for simple cases but becomes unwieldy when things get more complicated. The goal of Meson is that it should be simple to combine projects written in different programming languages in any combination and get a result out reliably with good performance. Farming work out to other tools without cohesion is not a good way of achieving that goal.

Extra footnote: note that all of the above is only an issue if you want to build all of these things in a single invocation in a single build directory. There are other ways of achieving all this. As an example if there was a "pkg-config for Rust" [1] then building could be done by compiling and installing the components one by one into a staging directory like, for example, Flatpak building works.

I'd be delighted to hear about and fix more build system integration problems

Now you have heard. I make no guarantees as to how delighted it made you feel.

[1] For getting Rust dependencies, not for getting C library flags for building sys crates. At least it did not exist the last time I looked, Google does not immediately find anything and it's 3 AM now so forgive me for not doing more thorough research.

sdroege commented 6 years ago

While these points have nothing to do with meson/cargo integration, you brought them up here as arguments.

For comparison librsvg has ~50 crates it depends on. Thus if we were to do things properly, simply getting the dependencies into Debian would take one year.

Did you ever install a Perl (the language used by most Debian infrastructure) package on Debian? That usually pulls in a few dozen other dependencies too, thanks to the popularity of CPAN. Similarly with Haskell, and probably other languages (JS/NPM? C#/.NET also has a huge amount of packages, and you can load different versions of the same library just fine there too) but those are the two I can think of right now.

While not optimal for the way how Debian works, this is not even close to a new problem. (Also with regards to Haskell/GHC: dynamic linking situation is more or less the same, even for more or less the same reasons).

Google has a requirement that any third party dependency they have must be imported in their monorepo. Further, they have a requirement that there can only be one version of any dependency and that everyone in the entire organization must use the same version.

I assume Go is a language Google is using. While Go does not have an official tool for handling dependencies like Rust, making the situation even worse/more manual, generally all Go code out there is vendoring its dependencies. Which leads to exactly the same situation they would be in with Rust, just that in the case of Rust there is a tool and this tool easily gets you the latest compatible/latest version: everything might be using different versions of dependencies and you have to port software to the latest/a common version to following those requirements.

for example, Flatpak building works.

While Flatpak does not encourage vendoring of sources, it certainly does for binaries (for every dependency not part of platform). Which is a different problem but has many of the same problem, just that Flatpak does not give you a tool to manage these dependencies bundled with the binaries.

Generally these are problems with any modern language that values a big ecosystem over a monolithic standard library and making usage of dependencies easy (which arguably at this point, but not historically, also includes the most enterprise of all enterprise languages: C# with nuget and Java with maven/gradle). As compared to C where people often implement things on their own or copy code because using dependencies is hard, resulting in an even worse situation: multiple more or less bad implementations for the same thing, or copied and possibly modified versions without a tool to track them.

We need to find solutions to these problems, so much is clear, but they're not only problems that exists with Rust and especially it's not something that meson can fix alone.

For the actual points related to meson/cargo integration, they are all taken into account in the Cargo build system integration RFC already that was mentioned here a few times. That takes time, and until then providing a imperfect solution to make it easy for people to use meson in a mixed C/Rust project would be useful. Otherwise people would just not use meson for this, which is probably also not what you want.

zeenix commented 6 years ago

it would be very useful for Cargo team to know constraints of the current Cargo which prevent seamless integration with other build systems
If this needed to be summarized into one sentence, it would be this:

Every technical and architectural decision in Cargo seems to be made to be as hostile to cooperation as possible.

I am very confused as to how you got the impression of any hostility from a "Can you please tell us how we can do better to cooperate with you?" gesture.

jpakkane commented 6 years ago

There has obviously been a communication error. The comment I made has nothing, I repeat nothing to do with people participating in this thread. It is merely about technical decisions made by Rust and Cargo several years ago. As far as I know nobody on this discussion has had anything to do with them, nor would it even matter if they did. The point was only and specifically about choices such as:

not committing to a stable ABI (meaning everything must be compiled from source for every project always)
not adding a "pkg-config -like" mechanism for getting prebuilt dependencies from outside the current build tree (meaning mixing different build systems is hard and all projects become one big clump)
not committing to a stable standard library (meaning even the simplest of Rust projects depend on tens of crates)
publicly stating that nobody should call rustc, only Cargo may do that (again making integration hard because ideally you'd want to do executable('foo', 'something.c', 'something.rs') and have it work out of the box with only a single build toolchain)

If anyone has gotten the impression that this is some sort of an attack against people on this thread, then obviously I have miscommuncated. I most sincerely apologize for this and try to be clearer in my communication in the future.

zeenix commented 6 years ago

If anyone has gotten the impression that this is some sort of an attack against people on this thread, then obviously I have miscommuncated. I most sincerely apologize for this and try to be clearer in my communication in the future.

That's very great to hear. Thanks for explaining and apologizing. Hopefully, I was the only one who misunderstood you.

matklad commented 6 years ago

Thanks a lot for a thorough reply @jpakkane! That was a lot to think about.

There are a lot to be said about relative advantages of various approaches to code resue, including "I'll just write my own hash-map" of C, "urllib2 definitely will solve all the problems of urllib" of Python and "who doesn't want a package to left-pad a string" of JavaScript; neither of them is ideal. The sheer number of packages to manage in Rust is definitelyly a drawback, and occasional version duplication is a necessary evil to make huge dependency graphs work out in practice.

However, Rust has chosen to rely heavely on dependencies, and at this point this is a constraint, and not something which can be realistically changed. Any build system for Rust probably needs to support crates.io package format and dependencies, otherwise it would likely be only marginally useful for Rust development.

And let's not forget about the benefits of dependencies as well! The librsvg case is actually an interesting example in this respect. Looks like some of duplication comes from cssparser crate. Finding parser for css in standard library of a language is unlikely, writing one by hand is scary, so a little code duplication seems not as bad in comparison? Another interesting thing to note that, although there are rand 0.3 and rand 0.4 in the lockfile, 0.3 depends on 0.4, and reexports it, adapting API to 0.3, so the actual duplication is smaller than it seems at first.

The point of version duplication across packages is very interesting! I've filed https://github.com/rust-lang/cargo/issues/5332 about it. Today it is sort-of possible to avoid duplication by pretending that all packages are a part of a single workspace, and that's what Google & Facebook are doing. The funniest thing is that Rust itself employs similar hacks to deduplicate dependencies between rustc, Cargo and RLS. We definitely need a first-class solution here.

The build scripts are definitelyly a pain-point of integrating Rust projects into absolutely anything! However, I don't fully agree with

The final thing is the fact that Cargo as a build system does so very little. Anything that is not a simple "build library or exe" gets shunted to build.rs that each project writes on their own.

It is true that build.rs itself is Turing complete. However, the effects of build.rs are extremely restricted, and amount to two things:

build.rs can generate some Rust code,
build.rs can print several key=value pairs to stdout (which then used for finding native libraries).

My understanding is that code-generation use-case is not actually problematic here. And, for native-libraries, Cargo includes a mechanism to completely bypass the build-script and just provide the information about native libraries directly, via build overrides. So, I believe

Instead you get to read the build definition code and patch it to make it do the thing you want it to.

is perhaps not actually necessary? By the same token, exe (in C) -> rustlib -> plainlib (in C) could actually work if the main build system provides plainlib to Cargo? Granted, the problem here is that the human author of the main build system script has to manually specify dependency of rustlib on plainlib, but this seems to be an ergonomics problem, and not a fundamental restriction? We plan to somehow make native dependencies in Cargo more declarative, so that it would be possible to automatically detect them.

Also, I'd like to assure you that

nobody should call rustc, only Cargo may do that

is not the current state of things! Dropbox and Facebook don't use Cargo for building their internal code, and the build system integration RFC specifically includes a path (build plans) to hand all of the work of building the code over to the main build system. However, it is very important that users do have access to crates.io packages, and implementing that without Cargo will be a lot of work.

jpakkane commented 6 years ago

Any build system for Rust probably needs to support crates.io package format and dependencies,

Sure. And if the definition was more declarative (as in without the current heavy dependency on build.rs) this would be straightforward to do.

Today it is sort-of possible to avoid duplication by pretending that all packages are a part of a single workspace, and that's what Google & Facebook are doing.

Having this sort of deduplication as a first class concept is what Meson has been doing for several years now. And we want to do it for all supported languages, and even multi-language projects.

It is true that build.rs itself is Turing complete. However, the effects of build.rs are extremely restricted

You can not be "just a little bit Turing complete" like you can't be "just a little bit pregnant". If Turing completeness is exposed, people will use it for all sorts of nasty things you did not think of. Especially if they are not given "proper" building blocks and tools to solve their problems.

My understanding is that code-generation use-case is not actually problematic here.

I read through the Cargo manual and based on that (not actual experimentation) it would seem that Cargo's support of code generation during build seems to be lacking. It only supports generating some Rust files statically and providing some compiler flags. This seems a bit inadequate for common and complex code generation steps needed for real world projects such as Protocol Buffers. Especially if you don't get the dependencies from the system but instead want to build them in the same build directory at the same time (for extra points: in a different programming language or mixture thereof).

Cargo includes a mechanism to completely bypass the build-script and just provide the information about native libraries directly

Which only works if the only thing the build.rs does is to build a dependency library. And because of Turing completeness you can't know that. You have to manually inspect every dependency.

specifically includes a path (build plans) to hand all of the work of building the code over to the main build system

But at what level is that information? If it has anything like compiler flags or arguments, then it is at the wrong semantic level. What every build system developer wants is basically what is in Cargo.toml. That is, list of sources and dependencies in some sort of a declarative format.

nobody should call rustc, only Cargo may do that

is not the current state of things!

But the message that Rust developers publicly state is that this should not be done. And that if you do, then you are on your own and nothing is supported. This is important mindset issue.

We plan to somehow make native dependencies in Cargo more declarative

Having an "as declarative as possible" definition for dependencies is what Meson was designed for: a minimal non-turing complete declaration good enough to support almost all use cases that modern SW development has.

However, it is very important that users do have access to crates.io packages, and implementing that without Cargo will be a lot of work.

If one ignores the build.rs issue then it is some amount of work, yes. But not a huge amount. For comparison adding proper deduplication and cross-language embedding support to Cargo would take on the order of 10x to 100x more effort. It is also work, based on my experience, that no-one really wants to do. Everyone wants to work on things that are cool and sexy, and this one definitely isn't.

nirbheek commented 6 years ago

Since this PR is not going to be merged, I am closing it. Future work can happen on new PRs. The code in this PR is archived in: https://github.com/nirbheek/meson/tree/cargo-module

In the meantime, the workaround for people wanting to do this is to use the wrapper script noted above.

xclaesse commented 4 years ago

I haven't fully read the proposed code, but something along those lines is definitely needed.

It's sad that this initiative has been shutdown. A few points:

I think the Meson project should be more open to experimental/unstable modules. That's the least you can do when you refuse (for good reason) to have functions in your language.
From a maintenance POV, I think we should be able to assign Meson modules to other maintainers than @jpakkane. I've got nothing against him, and he does a wonderful work, but we have to start delegating more to be able to scale. I've got the feeling that's what we are doing now with the cmake module, and that's a good sign. We should definitely do the same with a potential Rust module, based on this PR or not.
The GNOME module was the worst of the worst, but those tools improved because Meson started to use them. Most of the patches that got merged into glib/gtkdoc were not even conceivable without trying first, to spot exactly what's broken/missing. I don't think a rust module can be even half as bad as the GNOME module was, and still largely is.
Same argument about the new cmake module, it's a HUGE code base, it can't build half the cmake projects, totally fails in cross build cases, etc... But we allowed it because it has a clear maintainer who wants to improve it. And contributions are coming every week, I don't think @jpakkane gets involved, and that's a good and hard thing to be able to delegate.
It's easy to deprecate a Meson module as a whole if a better design is found, we did that with the python3 module.
Things doesn't need to be perfect to be useful.

I'm not reopening this PR, but I strongly encourage @jpakkane to state he would accept a rust/cargo module if someone steps in and propose a new PR as well as dedicating themself to maintain it. Unless this is accepted beforehand, I don't think anyone will want to waste his time like @nirbheek sadly did.

marc-h38 commented 4 years ago

I think the Meson project should be more open to experimental/unstable modules. That's the least you can do when you refuse (for good reason) to have functions in your language.

From a maintenance POV, I think we should be able to assign Meson modules to other maintainers than @jpakkane. [...] we have to start delegating more to be able to scale.

Things doesn't need to be perfect to be useful.

Would it possible to merge here in this repo just enough hooks and changes to make meson's "core" a bit more modular and allow mesonbuild/modules/unstable_cargo.py and its tests to live in a different git repo? Because no matter how many warnings, disclaimers and "arbitration clauses" you put in the way, anything that lives in the main repo will be perceived with some sort of commitment.

My 2 cents, pardon the noise if I'm being too naive and underestimating the minimal, "meson core" changes required.

BTW https://lwn.net/Articles/805840/ (very long thread)

"It seems the Rust community is not serious about shared libraries with API/ABI stability and without vendored dependencies, and therefore I claim it's not serious about competing with C/C++ on Linux."

xclaesse commented 4 years ago

Would it possible to merge here in this repo just enough hooks and changes to make meson's "core" a bit more modular and allow mesonbuild/modules/unstable_cargo.py and its tests to live in a different git repo? Because no matter how many warnings, disclaimers and "arbitration clauses" you put in the way, anything that lives in the main repo will be perceived with some sort of commitment.

That would mean committing to a stable API, which is even more difficult, especially in Python, IMHO. Also it means that when you build a project you have to find/install all those "meson plugins" it uses... I think it's opening a can of worms.

I personally think "unstable" disclaimers are useless. If it's useful, people will use it regardless, and if they have issues/regressions/improvements they submit patches. The wonderful world of Open Source :-)

BTW https://lwn.net/Articles/805840/ (very long thread)

"It seems the Rust community is not serious about shared libraries with API/ABI stability and without vendored dependencies, and therefore I claim it's not serious about competing with C/C++ on Linux."

I'm not actually a big fan of Rust, I agree it has some serious flaws like that one. Mostly not the language, but all the ecosystem around it. From Meson PoV, cargo is not that different from any other external build system, it "just" has to know the configure/build commands, and where to find built libraries/executable. I would love to see a Meson module that would do just that, and cargo would just be a special case of it.

marc-h38 commented 4 years ago

That would mean committing to a stable API,

No, I meant this only for experimental/staging modules. Think "out of tree" Linux drivers which regularly get broken by internal API changes.

which is even more difficult, especially in Python, IMHO.

Yes - even core Python developers don't seem sure what is a stable Python API and what is not. From https://lwn.net/Articles/795019/

Serhiy Storchaka raised the issue by listing the rules that he thought (!!!) governed the public/private question for names in modules.

This confusion is probably the main reason why Python developers seem generally cautious and don't assume an API is stable unless clearly stated/documented.

Also it means that when you build a project you have to find/install all those "meson plugins" it uses...

Again I meant this only for experimental modules, so having to find extra stuff is "The Feature". It's a the "unstable" disclaimer that you cannot pretend you didn't notice.