rust-lang / wg-cargo-std-aware

Repo for working on "std aware cargo"

Plan for removal of `rustc-dep-of-std` #51

Open ehuss opened 4 years ago

ehuss commented 4 years ago

This issue is to discuss the strategy for removing the rustc-dep-of-std feature and the rust-std-workspace-* packages, which allow standard library crates to use crates from crates.io. This is relatively low priority, but will be a nice-to-have once stdlib dependencies are supported (and possibly stabilized).

Alex sketched out an outline at https://github.com/rust-lang/wg-cargo-std-aware/issues/27#issuecomment-529021295.

I have been working on explicit dependencies, so I've been thinking about this lately. Assuming Cargo has explicit stdlib dependencies, and they are stabilized, then these crates can be modified to use explicit stdlib dependencies instead of rustc-dep-of-std. When built under normal circumstances, Cargo will pass pathless --extern flags to cause rustc to add those dependencies to the extern prelude, and rely on the resolver to load those crates from the sysroot. A complication with how rustbuild works remains: how will Cargo know what to do with these dependencies when building the standard library when there is no sysroot?

There are a few approaches:

Use build-std

Rustbuild would use the build-std mode of Cargo to build the standard library. It would somehow need to tell Cargo the location of the stdlib source (which is in the workspace root, not the sysroot). Cargo would also need to detect when it is building a std dependency to properly draw the dependency edge to the real thing. Rustbuild would also need a base package to build, since it needs something to kick it off. This could be an empty library.

I'm not a big fan of this approach. Overall it seems messy, and much less flexible. I'm also not sure how features could be passed in to the standard library.

Normal build with explicit-std

Rustbuild would do a build of the standard library the same way it does today. The only difference is that it needs to tell Cargo that explicit stdlib dependencies shouldn't use the sysroot (since it is in the process of building it). We instead want Cargo to pass --extern core=deps/libcore.rlib style options with the correct path.
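To make the difference between the two modes concrete, here is a minimal sketch of how the `--extern` arguments could differ. It is not Cargo's actual code: the `StdSource` enum and the `extern_args` function are hypothetical names invented for illustration. Only the pathless form and the `core=deps/libcore.rlib` form come from the discussion above.

```rust
use std::path::PathBuf;

/// Where an explicit stdlib dependency is resolved from (hypothetical model).
enum StdSource {
    /// Normal build: pass a pathless `--extern` and let rustc load the crate
    /// from the sysroot.
    Sysroot,
    /// Building the standard library itself: there is no sysroot yet, so point
    /// rustc at the freshly built rlib.
    InTree(PathBuf),
}

/// Build the `--extern` arguments for one stdlib dependency.
fn extern_args(name: &str, source: &StdSource) -> Vec<String> {
    let value = match source {
        StdSource::Sysroot => name.to_string(),
        StdSource::InTree(path) => format!("{}={}", name, path.display()),
    };
    vec!["--extern".to_string(), value]
}
```

With this model, `extern_args("core", &StdSource::Sysroot)` yields the pathless `--extern core`, while the `InTree` variant with `deps/libcore.rlib` yields `--extern core=deps/libcore.rlib`.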

Some suggestions for how to tell Cargo to behave this way:

I'm enticed by the automatic option, since it "just works" and is relatively simple. Alex wasn't too enthusiastic when I mentioned it though 😜 .

roblabla commented 4 years ago

So, my input on this: I work on a custom out-of-tree libstd for a very niche target (a custom OS I'm working on). As part of this, I pull quite a lot of dependencies. I currently use Xargo's multi-stage builds to avoid having to add a rustc-dep-of-std feature to all my deps. However, it slows down the build quite a bit (since it can't run all builds in parallel) and it's, well, not really an official thing.

I'm definitely interested in a system that's "automatic" in the sense of not having to modify the Cargo.toml of my sub-dependencies. Is that something that's possible?

Ericson2314 commented 4 years ago

Sounds great! Just keep in mind the use-case where previously rustc-specific crates migrate to crates.io. I think rather than patch stdlib deps to be normal deps, we should be patching normal deps to be stdlib deps, so for those future versions of the compiler we just drop the patch and the regular thing happens.

alexcrichton commented 4 years ago

My personal preference would be to pursue the "patch" syntax for this. For example the backtrace crate would depend on core via -Z explicit-std, and then the rust-lang/rust workspace would have the following:

[patch.sysroot]
std = { path = "src/libstd" }
core = { path = "src/libcore" }
alloc = { path = "src/liballoc" }

# maybe not necessary, unsure
proc_macro = { path = "src/libproc_macro" }
test = { path = "src/libtest" } 

That way all crates.io crates with explicit deps would get rewritten to the path dependencies, and then the source code in rust-lang/rust would continue using path dependencies everywhere. Additionally this would allow us to build all of rust-lang/rust in one cargo build because rustc, for example, would have an implicit dependency on std (and things like proc_macro) which would be overridden via the [patch.sysroot].

I would prefer to avoid supporting a sort of magical "just match up the names in the crate graph" behavior because Cargo has largely avoided that sort of dependency matching, and I think that [patch] at least conceptually fits the bill pretty nicely here. Implementing this would also support users with custom implementations of the standard library and/or libcore, since they could use [patch] as well to pull in a custom version.
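As a rough illustration of the idea, the sketch below models a `[patch.sysroot]` table as an explicit lookup: a sysroot crate is replaced by a path dependency only when the table names it, and otherwise still comes from the sysroot. `[patch.sysroot]` is a proposal in this thread, not an existing Cargo feature, and the `Source` enum and both function names are hypothetical.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Where a sysroot crate such as `std` or `core` comes from (hypothetical model).
#[derive(Debug, PartialEq)]
enum Source {
    /// Prebuilt crate loaded from the sysroot.
    Sysroot,
    /// Path dependency substituted by a `[patch.sysroot]` entry.
    Path(PathBuf),
}

/// Resolve one sysroot dependency, honoring the patch table if it has an entry.
/// Nothing is matched implicitly: only names listed in the table are overridden.
fn resolve_sysroot_dep(name: &str, patches: &HashMap<String, PathBuf>) -> Source {
    match patches.get(name) {
        Some(path) => Source::Path(path.clone()),
        None => Source::Sysroot,
    }
}

/// The first three entries of the manifest fragment above, written out by hand.
fn rust_workspace_patches() -> HashMap<String, PathBuf> {
    let mut patches = HashMap::new();
    patches.insert("std".to_string(), PathBuf::from("src/libstd"));
    patches.insert("core".to_string(), PathBuf::from("src/libcore"));
    patches.insert("alloc".to_string(), PathBuf::from("src/liballoc"));
    patches
}
```

Under these assumptions, `core` resolves to the `src/libcore` path dependency, while a crate missing from the table (here `test`) still resolves to the sysroot.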

roblabla commented 4 years ago

> That way all crates.io crates with explicit deps would get rewritten to the path dependencies, and then the source code in rust-lang/rust would continue using path dependencies everywhere.

This means that we'd still have to modify every crate's Cargo.toml to give it an explicit std = { stdlib = true } or something like that, right? Is there a reason why we couldn't patch implicit dependencies on std? I find it really odd to make a distinction between an explicit and an implicit dependency on std, since they all seem (at least to me) to end up doing the same thing.

ehuss commented 4 years ago

My confusion around [patch] is that I think you don't want it to apply to everything in the rust-lang/rust workspace, right? At least in the current design of rustbuild, that would cause rustc and all the tools to rebuild the standard library multiple times, because they have independent cargo target directories.

It might be nice for rustbuild to share the target directories, though I do not know what challenges there are there. Or maybe [patch.sysroot] would only be enabled in certain cases? Did you have other ideas on how it could avoid recompiling the standard library for things like tools?

> Is there a reason why we couldn't patch implicit dependencies to std?

I think a hypothetical [patch.sysroot] could also handle implicit std dependencies. My prototype is designed to handle them for build-std.

alexcrichton commented 4 years ago

That's an excellent point! My "ideal strategy" using [patch] would require one target directory per stage, which means that building the compiler is just one cargo build invocation rather than the two that it is today. Tools are an interesting point I hadn't considered, though. I think you're right that if my patch strategy is to work we'd have to move everything into one target directory and then basically find an actual fix for any bugs that arise from using just one target directory. I forget what the bugs actually are, or if there are any, or if they've just simply been solved via other means by this day and age.

@roblabla ah yeah I agree with @ehuss and it wouldn't require that everyone write out std = { .. }, Cargo will understand that if you don't have sysroot deps then you implicitly depend on today's sysroot (proc_macro, test, std, etc), and those dependency edges would be synthesized by Cargo.
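A rough sketch of that rule, not Cargo's actual resolver: a package that declares no sysroot dependencies gets the default set synthesized for it, and a package that declares some gets exactly those. The function name and the constant are hypothetical, and the list holds only the crates named above (the "etc" is left open).

```rust
/// Sysroot crates a package depends on when it declares none explicitly.
/// Illustrative only: the comment above names these three and then "etc".
const IMPLICIT_SYSROOT_DEPS: &[&str] = &["std", "proc_macro", "test"];

/// Return the sysroot dependency edges for a package: the explicit ones if any
/// were declared, otherwise the synthesized implicit set.
fn effective_sysroot_deps(explicit: &[&str]) -> Vec<String> {
    let chosen = if explicit.is_empty() {
        IMPLICIT_SYSROOT_DEPS
    } else {
        explicit
    };
    chosen.iter().map(|name| name.to_string()).collect()
}
```

In this model, a crate that writes nothing gets `std`, `proc_macro`, and `test`, while a crate that explicitly lists only `core` depends on `core` alone.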

lygstate commented 3 years ago

> My personal preference would be to pursue the "patch" syntax for this. For example the backtrace crate would depend on core via -Z explicit-std, and then the rust-lang/rust workspace would have the following:
>
> [patch.sysroot]
> std = { path = "src/libstd" }
> core = { path = "src/libcore" }
> alloc = { path = "src/liballoc" }
>
> # maybe not necessary, unsure
> proc_macro = { path = "src/libproc_macro" }
> test = { path = "src/libtest" }
>
> That way all crates.io crates with explicit deps would get rewritten to the path dependencies, and then the source code in rust-lang/rust would continue using path dependencies everywhere. Additionally this would allow us to build all of rust-lang/rust in one cargo build because rustc, for example, would have an implicit dependency on std (and things like proc_macro) which would be overridden via the [patch.sysroot].
>
> I would prefer to avoid supporting a sort of magical "just match up the names in the crate graph" behavior because Cargo has largely avoided that sort of dependency matching, and I think that [patch] at least conceptually fits the bill pretty nicely here. Implementing this would also support users with custom implementations of the standard library and/or libcore, since they could use [patch] as well to pull in a custom version.

Looks good, is this on the roadmap?

Ericson2314 commented 3 years ago

Do we have the ability to write explicit standard library deps yet? I would think that the syntax for that and these overrides would go hand-in-hand.