rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0

[Request for Experiment] Sysroot building functionality #4959

Open japaric opened 6 years ago

japaric commented 6 years ago

Summary

This RFE proposes adding Xargo's functionality of building a sysroot on the fly to Cargo as a nightly-only feature.

Motivation

Many (all?) no-std / embedded developers have to Install One More Tool because the ability to build core / std from source is missing in Cargo. Apart from the inconvenience, the Xargo wrapper is far from perfect: it can trigger unnecessary sysroot rebuilds because it doesn't replicate Cargo's fingerprinting mechanism; it can sometimes fail to trigger a necessary sysroot rebuild (japaric/xargo#189); it doesn't track changes in the sysroot source code (japaric/xargo#139); and it doesn't understand the +nightly command line argument because it's not a rustup shim (japaric/xargo#123), among other deviations from the behavior users expect from a built-in Cargo subcommand.

Apart from all these issues, Xargo also ties no-std / embedded development to the nightly channel. This experiment will hopefully be a first step towards enabling no-std / embedded development on the stable channel.

Implementation details

User interface: [sysroot]

A [sysroot] section will be added to the Cargo configuration file:

# .cargo/config
[sysroot]
rust-src = "~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src"

[sysroot.core]
path = "libcore" # paths are relative to `rust-src` when not absolute
stage = 1 # see multi-stage builds section below

[sysroot.compiler_builtins]
git = "https://github.com/rust-lang-nursery/compiler-builtins"
features = ["mem"]
stage = 2

There the user can specify the crates that will be included in the sysroot. Only path and git dependencies are allowed. The [sysroot.rust-src] setting is a convenience that lets the user use relative paths.
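To make the rules above concrete, here is a minimal sketch of how such a table could be validated once parsed. It is not part of the proposal: the function name `resolve_sysroot_crates`, the `/opt/rust-src` path, and the default stage of 1 are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical in-memory form of a [sysroot] table after TOML parsing.
SYSROOT_CONFIG = {
    "rust-src": "/opt/rust-src",  # assumed absolute path, for illustration only
    "core": {"path": "libcore", "stage": 1},
    "compiler_builtins": {
        "git": "https://github.com/rust-lang-nursery/compiler-builtins",
        "features": ["mem"],
        "stage": 2,
    },
}


def resolve_sysroot_crates(config):
    """Return {crate: resolved spec}, rejecting anything but path/git sources."""
    rust_src = config.get("rust-src")
    crates = {}
    for name, spec in config.items():
        if name == "rust-src":
            continue  # a setting, not a crate
        has_path, has_git = "path" in spec, "git" in spec
        if has_path == has_git:
            raise ValueError(f"{name}: exactly one of `path` or `git` is required")
        resolved = dict(spec)
        resolved.setdefault("stage", 1)  # assumed default; the proposal does not say
        if has_path:
            path = PurePosixPath(spec["path"])
            if not path.is_absolute():
                if rust_src is None:
                    raise ValueError(f"{name}: relative path needs `rust-src`")
                path = PurePosixPath(rust_src) / path  # relative to rust-src
            resolved["path"] = str(path)
        crates[name] = resolved
    return crates
```

Calling `resolve_sysroot_crates(SYSROOT_CONFIG)` resolves `libcore` against `rust-src`, while a registry-style entry such as `{"version": "1.0"}` is rejected.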

Behavior

If a [sysroot] setting is found in the Cargo configuration file:

Cargo will (re)build the sysroot crates for the target and place the build artifacts in $TARGET_DIR/sysroot/lib/rustlib/$TARGET before executing subcommands that involve invoking rustc or rustdoc. Then when invoking the actual subcommand Cargo will append the argument --sysroot=$TARGET_DIR/sysroot to all its rustc and rustdoc invocations except the ones used to build build scripts (build.rs).
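The argument handling described here could look roughly like the sketch below. The helper names `sysroot_lib_dir` and `with_sysroot` are hypothetical, and the example target in the usage note is only an illustration; only the directory layout and the `--sysroot` flag come from the text above.

```python
import posixpath


def sysroot_lib_dir(target_dir, target):
    """Directory that receives the sysroot artifacts for `target`."""
    return posixpath.join(target_dir, "sysroot", "lib", "rustlib", target)


def with_sysroot(args, target_dir, is_build_script):
    """Append --sysroot to a rustc/rustdoc argument list, except for build scripts."""
    if is_build_script:
        # build.rs is compiled for the host, so it keeps the default sysroot.
        return list(args)
    return list(args) + ["--sysroot=" + posixpath.join(target_dir, "sysroot")]
```

For example, `with_sysroot(["src/main.rs"], "target", False)` gains `--sysroot=target/sysroot`, whereas the same call with `is_build_script=True` returns the arguments unchanged.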

The usual fingerprinting mechanism applies to the sysroot build: for example changes to [profile] in Cargo.toml and changes in the sysroot source code will trigger a sysroot rebuild.
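As a rough illustration of the idea (not Cargo's actual fingerprinting, which tracks more inputs), a rebuild check could hash the profile settings together with the sysroot sources. The function names and the choice of SHA-256 over file contents are assumptions of this sketch.

```python
import hashlib
import json
import os


def sysroot_fingerprint(profile, source_dirs):
    """Hash the profile settings plus the bytes of every file under source_dirs."""
    digest = hashlib.sha256()
    digest.update(json.dumps(profile, sort_keys=True).encode())
    for root_dir in sorted(source_dirs):
        for dirpath, dirnames, filenames in os.walk(root_dir):
            dirnames.sort()  # make the traversal order deterministic
            for filename in sorted(filenames):
                path = os.path.join(dirpath, filename)
                digest.update(os.path.relpath(path, root_dir).encode())
                with open(path, "rb") as handle:
                    digest.update(handle.read())
    return digest.hexdigest()


def needs_rebuild(stored, profile, source_dirs):
    """A rebuild is needed when no fingerprint is stored or it no longer matches."""
    return stored != sysroot_fingerprint(profile, source_dirs)
```

With this shape, either editing a file under the sysroot sources or changing a [profile] value yields a different fingerprint, which is the behavior the paragraph above asks for.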

The sysroot crates will always be built using the release profile to not regress the performance and binary size of dev builds when switching to builds that use [sysroot].

Multi-stage builds

There are crates in the std facade, like the test crate, that have implicit dependencies on other members of the facade. Building sysroots that include these crates requires multi-stage builds.

In multi-stage builds the sysroot will be built as follows: all the crates in the first stage are built against the default sysroot; then all the crates in the second stage are built using the stage 1 artifacts as a custom sysroot (i.e. --sysroot is passed to rustc); then all the crates in the third stage are built using the stage 1 and 2 artifacts as a custom sysroot; the process continues until all stages are built; finally the artifacts of all the stages are placed in $TARGET_DIR/sysroot/lib/rustlib/$TARGET.
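The staging order can be sketched as a small planning function. This is only an illustration under assumptions: `plan_stages` is a hypothetical name, and the stage numbers given to `alloc` and `std` in the usage note are made up, since the proposal only assigns stages to `core` and `compiler_builtins`.

```python
from collections import defaultdict


def plan_stages(crates):
    """Return the build plan for `crates`, a mapping of crate name -> stage number.

    Each entry lists the crates of one stage and the earlier stages whose
    artifacts make up the custom sysroot it is built against. An empty
    `sysroot_from` list means the default sysroot is used.
    """
    by_stage = defaultdict(list)
    for name, stage in crates.items():
        by_stage[stage].append(name)

    plan, earlier = [], []
    for stage in sorted(by_stage):
        plan.append({
            "stage": stage,
            "crates": sorted(by_stage[stage]),
            "sysroot_from": list(earlier),
        })
        earlier.append(stage)
    return plan
```

For `{"core": 1, "compiler_builtins": 2, "alloc": 2, "std": 3}` the plan builds `core` against the default sysroot, then `alloc` and `compiler_builtins` against stage 1, then `std` against stages 1 and 2.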

Potential additions / tweaks

Xargo doesn't require the user to specify a rust-src setting because it assumes that both rustup and the rust-src component are installed and it uses rustc --print sysroot to get the path to the Rust source. We could do the same here by either: (a) probing for the existence of rust-src and printing a helpful error message when it's not installed, or (b) committing to always ship rust-src with the toolchain.
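Option (a) might look something like the following sketch. `rustc --print sysroot` and `rustup component add rust-src` are existing commands; the helper names are hypothetical, and the relative source location is taken from the example `rust-src` path earlier in this proposal, so other toolchain versions may lay it out differently.

```python
import os
import subprocess

# Relative location taken from the example `rust-src` path in the [sysroot] section.
RUST_SRC_SUFFIX = os.path.join("lib", "rustlib", "src", "rust", "src")


def toolchain_sysroot(rustc="rustc"):
    """Ask rustc for the sysroot of the active toolchain."""
    result = subprocess.run(
        [rustc, "--print", "sysroot"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def find_rust_src(sysroot):
    """Return the rust-src directory under `sysroot`, or raise with a hint."""
    candidate = os.path.join(sysroot, RUST_SRC_SUFFIX)
    if not os.path.isdir(candidate):
        raise FileNotFoundError(
            f"rust-src not found at {candidate}; "
            "try `rustup component add rust-src`"
        )
    return candidate
```

A caller would combine the two as `find_rust_src(toolchain_sysroot())`, so the path follows whichever toolchain is active instead of being hard-coded.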

Xargo always rebuilds the sysroot in release mode but we could have the Cargo implementation use the dev profile when --release is not passed to the subcommand.

Future steps

Revisit rust-lang/rfcs#1133 to see if there's a desire for the changes proposed there that aren't included in this RFE: eliminating the concept of the sysroot, versioning the crates in the std facade, etc.

UPDATE(2018-01-20): Don't pass --sysroot to rustc when building build scripts (build.rs). Those should be built against the default sysroot because they always run on the host. If the build scripts were to be built against the custom sysroot, the sysroot would need to contain the rust-std artifacts of the host, and that would require copying (or linking) those into $TARGET_DIR/sysroot/lib/rustlib/$HOST, which is a waste of space.


cc @alexcrichton @nrc @Ericson2314

Ericson2314 commented 6 years ago

I do plan to someday get back to that. This seems like a great step until then! Especially if rustboot can utilize it.

Ericson2314 commented 6 years ago

Also, fwiw, my plan was to draft a sub-RFC based on what I implemented in https://github.com/rust-lang/cargo/pull/2768, which I considered not just a convenient but also meaningful subset. The idea was more (opt-in) ditching sysroots altogether, so only rustc itself need think about bootstrapping.

jdub commented 6 years ago

The tweak to assume the presence and location of the rust-src component is worth adopting. It makes Xargo so easy to use, and I suspect* it's what most users want to build. Using your own sysroot sources is preeeeeeetty advanced.

* no idea how we'd get data for this, besides a survey

nrc commented 6 years ago

I think it would be good to restrict the crates which can be placed in the sysroot to the usual ones we would expect there, unless there is good reason not to?

Are there other parts of the program other than build scripts that should be built with the host profile? Proc macros/custom derives seem like one thing. Are there more?

I wonder if there is a more user-friendly surface syntax we could layer on top of the custom sysroot facility? It seems like for any given platform and version, the sysroot setup will be the same, so it would be good to be able to set that all up once and then point to it from user crates, rather than having to spell it out each time.

SimonSapin commented 6 years ago

rust-src = "~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src"

Could this path be discovered by running rustc --print sysroot, rather than being hard-coded? It’d be nice to be able to switch between cargo +nightly build and e.g. cargo +nightly-2018-01-24 build, or different nightly dates in different projects, without editing a config file.

FenrirWolf commented 6 years ago

That's what's proposed in the potential additions/tweaks section

aidanhs commented 6 years ago

@jdub

The tweak to assume the presence and location of the rust-src component is worth adopting. It makes Xargo so easy to use, and I suspect* it's what most users want to build. Using your own sysroot sources is preeeeeeetty advanced.

For my usecase (wasm) I want to be able to get a copy of the rust std, hack around with it to create my own integrations with JS directly in libstd, and then build it. I'd say that experimenting with libstd (either to experiment with targets, or with the intent of making PRs against the main Rust repo) should not be an advanced use case and any solution should take care to have it as a first-class feature.

That's not to say that we can't make the rust-src assumption! I just want to make sure that in the process we don't throw away the plan to have correct rebuilds of sysroots when files are changed (because we assume rust-src components are immutable) - this is a nasty pain point of xargo today, as the RFE identifies.

nrc commented 6 years ago

Another thing to think about, if we integrate rustup functionality into Cargo, does this sysroot stuff interact with any of the cross-compilation-oriented toolchain features from rustup?

SimonSapin commented 6 years ago

I didn’t mean that Cargo would necessarily have rustup-specific functionality. Rustup could for example set an environment variable that Cargo (and rust-gdb!) would read to know where to find std sources. Or maybe have a conventional path relative to the rustc binary. (Or both.)

nrc commented 6 years ago

It's kind of the plan that we'll merge rustup and Cargo this year

Ericson2314 commented 6 years ago

I brought it up in https://github.com/rust-lang/rfcs/pull/1133#issuecomment-362354174 and @matklad posted a response https://github.com/rust-lang/rfcs/pull/1133#issuecomment-362355002, but for posterity I think it's better I mirror and elaborate my concerns on this proposal here.

  1. The sysroot impedes work on some high priorities such as...

    • ...tools/tool features like incremental compilation and IDE support. Both of those involve storing more metadata and intermediate artifacts per crate for caching purposes. It's far easier to implement that if all libraries are built the same way, because then just one build system needs to handle each crate, and it's easier to ensure the adjustment will have the desired uniform effect on each crate.

    • ...module system namespacing / lighter extern crate. See thread starting https://internals.rust-lang.org/t/the-great-module-adventure-continues/6678/186?u=matklad . Explicit dependencies are passed by Cargo to rustc directly, so we can automatically import them without being reckless. But it would be reckless to import the entire sysroot so as of today we'd need to do (expensive) traversals of sysroot and source.

    [I always found sysroots and built-in standard libraries aesthetically displeasing and inconvenient in misc ways, but credit to @matklad for those great specific examples!]

  2. Explicit bootstrapping stages are...

    • ...cumbersome. Cargo works by looking at the needed dependencies of each crate, and then deriving a plan automatically. Explicit stages roughly require the user to hand-compute a topological sort, essentially doing Cargo's work.

    • ...error-prone. If the user writes the bootstrapping stages wrong, only rustc, not Cargo, will catch the mistake. Also, in the Rust code itself, each library may freely link all the libraries in the previous stage, which may not be the author's intention.

    • ...brittle. Small changes to the dependency graph can greatly change the topological sort, requiring the user to redo everything all over again.

    • ...unperformant. A full dependency graph exposes more potential parallelism.
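To illustrate the point about hand-computed topological sorts, here is a small sketch that derives the build order from declared dependencies instead of stage numbers. The dependency graph is a simplified assumption, not the real facade graph, and `derive_build_waves` is a hypothetical name; `graphlib.TopologicalSorter` is from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical, simplified dependency graph: crate -> crates it depends on.
FACADE_DEPS = {
    "core": set(),
    "compiler_builtins": {"core"},
    "alloc": {"core"},
    "std": {"alloc", "compiler_builtins", "core"},
    "test": {"std"},
}


def derive_build_waves(deps):
    """Group crates into waves; every crate in a wave can be built in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on cyclic graphs
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # crates whose dependencies are all built
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Here the waves fall out of the graph without any `stage = N` annotations, and a change to the graph only requires editing the affected edges. A real scheduler would likely mark each crate done as soon as it finishes rather than waiting for a whole wave, which is where the extra parallelism would come from.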

matklad commented 6 years ago

@Ericson2314, here's one more argument for, eventually, getting rid of sysroot altogether: currently, we have to use one set of compiler flags for stdlib, for all use cases. Building your own std should allow one to use the precise flags one wants. The particular problems today are:

RalfJung commented 6 years ago

How are the stability guarantees around this? xargo breaks every now and then because something in rustc changes. (Though I assume if rustc CI tests that this works, that would happen less often.)

Also, compiling libstd requires nightly features -- so if we want to permit custom sysroots on stable, and people start patching libstd, they could easily end up in a situation where upgrading the compiler breaks their code.

elichai commented 6 years ago

Sounds like a great idea.

jethrogb commented 5 years ago

“Implementation details” section should be updated to discuss [patch] sections for sysroot building.

lygstate commented 3 years ago

It looks like https://github.com/rust-lang/wg-cargo-std-aware/issues/51 could resolve this issue.

tchernobog commented 9 months ago

I bumped into this issue while trying to solve the problem of building for a target triplet equal to that of the host, but still for a sysroot (as the target is an embedded device).

I pass a custom gcc binary which is built to support the right sysroot paths.

Cargo currently misbehaves by always appending -L native=/usr/lib/x86_64-linux-gnu, which results in rustc receiving "-L" "/usr/lib/x86_64-linux-gnu" as an option. This breaks linking, because the linker then picks up differently built libraries from the host system, outside the sysroot.

Right now I am working around this via a hack and a rustc + linker wrapper, but it's not great.

Should I open a separate issue for that?
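For readers wondering what a rustc wrapper for this could do, one possible shape is an argument filter like the sketch below. This is not the workaround mentioned above, whose details are not given; the function names and the `/opt/sdk/sysroot` path in the usage note are hypothetical. It only relies on rustc's `-L [KIND=]PATH` syntax.

```python
import posixpath


def _inside(path, root):
    """True if `path` is `root` itself or lies below it."""
    path, root = posixpath.normpath(path), posixpath.normpath(root)
    return path == root or path.startswith(root.rstrip("/") + "/")


def drop_foreign_search_paths(args, allowed_roots):
    """Remove `-L [kind=]path` pairs whose absolute path is outside allowed_roots."""
    kept, i = [], 0
    while i < len(args):
        if args[i] == "-L" and i + 1 < len(args):
            path = args[i + 1].split("=", 1)[-1]  # strip an optional `native=` style prefix
            foreign = posixpath.isabs(path) and not any(
                _inside(path, root) for root in allowed_roots
            )
            if not foreign:
                kept.extend(args[i:i + 2])  # keep relative paths and in-sysroot paths
            i += 2
            continue
        kept.append(args[i])
        i += 1
    return kept
```

With `allowed_roots=["/opt/sdk/sysroot"]`, a pair such as `-L native=/usr/lib/x86_64-linux-gnu` is dropped, while `-L dependency=target/debug/deps` and paths under the sysroot pass through untouched.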