Formal support for linking rlibs using a non-Rust linker

adetaylor commented 4 years ago

I'm working on a major existing C++ project which hopes to dip its toes into the Rusty waters. We:

Use a non-Cargo build system with static dependency rules (and 40000+ targets)
Sometimes build a single big binary; sometimes lots of shared objects, unit test executables, etc. - each containing various parts of our dependency tree.
Perform final linking using an existing C++ toolchain (based on LLVM 11 as it happens)
Want to have a few Rust components scattered throughout a very deep dependency tree, which may eventually roll up into one or multiple binaries

We can't:

Switch from our existing linker to rustc for final linking. C++ is the boss in our codebase; we're not ready to make the commitment to put Rust in charge of our final linking.
Create a Rust staticlib for each of our Rust components. This works if we're using Rust in only one place. For any binary containing several Rust components, there would be binary bloat and potentially violations of the one-definition-rule, by duplication of the Rust stdlib and any diamond dependencies.
Create a single Rust staticlib containing all our Rust components, then link that into every binary. That monster static library would depend on many C++ symbols, which wouldn't be present in some circumstances.

We can either:

Create a Rust staticlib for each of our output binaries, using rustc and an auto-generated .rs file containing lots of extern crate statements. Or,
Pass the rlib for each Rust component directly into the final C++ linking procedure.
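To make the first option concrete, here is a minimal sketch of generating that umbrella .rs file. The function name and the crate names are hypothetical; a real build system would derive the list from its own dependency graph.

```rust
/// Builds the source text of an "umbrella" crate that references every
/// Rust component which should end up in one staticlib.
/// (Hypothetical helper, for illustration only.)
fn generate_umbrella_source(crates: &[&str]) -> String {
    let mut src = String::from("// Auto-generated: do not edit.\n");
    for name in crates {
        // Crate names seen by rustc use underscores rather than hyphens.
        src.push_str(&format!("extern crate {};\n", name.replace('-', "_")));
    }
    src
}

fn main() {
    // Hypothetical component names.
    print!("{}", generate_umbrella_source(&["image_decoder", "url-parser"]));
}
```

The generated file would then be compiled with `rustc --crate-type=staticlib`, with each component's rlib supplied via `--extern name=path`.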

The first approach is officially supported, but is hard because:

We need to create a Rust staticlib as part of our C++ tool invocations. This is awkward in our build system. Our C++ targets don't keep track of Rust compiler flags (--target, etc.) and in general it just feels weird to be doing Rust stuff in C++ targets.
Specifically, we need to invoke a Python wrapper script to consider invoking rustc to make a staticlib for every single one of our C++ link targets. For most of our targets (especially unit test targets) there will be no rlibs in their dependency tree, so it will be a no-op. But the presence of this wrapper script will make Rust adoption appear intrusive, and of course will have some small actual performance cost.
For those link targets which do include Rust code, we'll delay invocation of the main linker whilst we build a Rust static library.

The second approach is not officially supported. An rlib is an internal implementation format within Rust, and its only client is rustc. It is naughty to pass them directly into our own linker command line.

But it does, currently, work. It makes our build process much simpler and makes use of Rust less disruptive.

Because external toolchains are not expected to consume rlibs, some magic is required:

The final C++ linker needs to pull in all the Rust stdlib rlibs, which would be easy apart from the fact they contain the symbol metadata hash in their names.
We need to remap __rust_alloc to __rdl_alloc etc.
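One possible way to express the second item, sketched under assumptions: the build system emits GNU-ld-style `--defsym` aliases through the C++ compiler driver. Only the `__rust_alloc` to `__rdl_alloc` pair is named above; the other three pairs are my assumption about what "etc." covers, and all of these names are rustc internals that can change between releases.

```rust
/// (symbol referenced by compiled Rust code, default implementation in std).
/// Assumption: these are the pairs meant by "etc."; they are not a stable interface.
const DEFAULT_ALLOCATOR_SYMBOLS: [(&str, &str); 4] = [
    ("__rust_alloc", "__rdl_alloc"),
    ("__rust_dealloc", "__rdl_dealloc"),
    ("__rust_realloc", "__rdl_realloc"),
    ("__rust_alloc_zeroed", "__rdl_alloc_zeroed"),
];

/// Produces one `-Wl,--defsym=alias=target` flag per pair, suitable for a
/// clang or gcc driver that forwards `-Wl,` options to a GNU-compatible linker.
fn allocator_defsym_flags() -> Vec<String> {
    DEFAULT_ALLOCATOR_SYMBOLS
        .iter()
        .map(|(alias, target)| format!("-Wl,--defsym={}={}", alias, target))
        .collect()
}

fn main() {
    for flag in allocator_defsym_flags() {
        println!("{}", flag);
    }
}
```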

But obviously the bigger concern is that this is not a supported model, and Rust is free to break the rlib format at any moment.

Is there any appetite for making this a supported model for those with mixed C/C++/Rust codebases?

I'm assuming the answer may be 'no' because it would tie Rust's hands for future rlib format changes. But just in case: how's about the following steps?

The Linkage section of the Rust reference is enhanced to list the two current strategies for linking C++ and Rust. Either:
- Use rustc as the final linker; or
- Build a Rust staticlib or cdylib then pass that to your existing final linker (I think this would be worth explicitly explaining anyway, so unless anyone objects, I may raise a PR)
A new rustc --print stdrlibs (or similar) which will output the names of all the standard library rlibs (not just their directory, which is already possible with target-libdir)
Some kind of new rustc option which generates a rust-dynamic-symbols.o file (or similar) containing the codegen which is otherwise done by rustc at final link-time (e.g. symbols to call __rdl_alloc from __rust_alloc, etc.)
The Linkage section of the book is enhanced to list this as a third supported workflow. (You can use whatever linker you want, but make sure you link to rust-dynamic-symbols.o and everything output by rustc --print stdrlibs)
Somehow, we add some tests to ensure this workflow doesn't break.
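For comparison, here is a rough sketch of what a build system has to do today without `--print stdrlibs`: ask rustc for the target libdir (which exists already) and strip the metadata hash from each rlib file name. The helper names are hypothetical, and this lists everything in that directory, which is cruder than the proposed option would be.

```rust
use std::fs;
use std::path::PathBuf;
use std::process::Command;

/// Strips the `lib` prefix, the `.rlib` suffix and a trailing `-<hash>`
/// from a file name, e.g. `libstd-0123abcd.rlib` becomes `std`.
fn crate_name_from_rlib(file_name: &str) -> Option<&str> {
    let stem = file_name.strip_prefix("lib")?.strip_suffix(".rlib")?;
    Some(stem.rsplit_once('-').map_or(stem, |(name, _hash)| name))
}

/// Lists (crate name, path) for every rlib in rustc's target libdir.
fn sysroot_rlibs() -> std::io::Result<Vec<(String, PathBuf)>> {
    let out = Command::new("rustc").args(["--print", "target-libdir"]).output()?;
    let libdir = PathBuf::from(String::from_utf8_lossy(&out.stdout).trim());
    let mut found = Vec::new();
    for entry in fs::read_dir(libdir)? {
        let path = entry?.path();
        let name = path
            .file_name()
            .and_then(|n| n.to_str())
            .and_then(crate_name_from_rlib)
            .map(str::to_owned);
        if let Some(name) = name {
            found.push((name, path));
        }
    }
    found.sort();
    Ok(found)
}

fn main() {
    match sysroot_rlibs() {
        Ok(list) => list.iter().for_each(|(n, p)| println!("{} -> {}", n, p.display())),
        Err(e) => eprintln!("could not list sysroot rlibs: {}", e),
    }
}
```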

A few related issues:

#64191 wants to split the compile and link phases of rustc. This discussion has spawned from there.
@dtolnay's marvellous https://github.com/dtolnay/cxx is not quite as optimal as it could be, because users can't use -Wl,--start-group, -Wl,--end-group on the linker line. (Per https://github.com/rust-lang/rust/issues/64191#issuecomment-629418541)
the difficulties of using the staticlib-per-C++-target model happen to be magnified by #73047

@japaric @alexcrichton @retep998 @dtolnay I believe this may be the sort of thing you may wish to comment upon! I'm sure you'll come up with reasons why this is even harder than I already think. Thanks very much in advance.

bjorn3 commented 2 years ago

But I did notice that Cargo has specific knowledge of the rustc-std-workspace* names baked into it

The only occurrences of rustc-std-workspace-core in the cargo source code are for tests and the implementation of -Zbuild-std. It doesn't have any special meaning for regular compilation. https://github.com/rust-lang/cargo/search?q=rustc-std-workspace-core

The patching of rustc-std-workspace-* happens at https://github.com/rust-lang/rust/blob/1103d2e914b67c18b0deb86073c26c6aefda761d/Cargo.toml#L128-L132

The rustc-std-workspace-core and rustc-std-workspace-alloc crates on crates.io are empty. Docs.rs doesn't render empty files correctly, it seems: https://docs.rs/crate/rustc-std-workspace-core/latest/source/src/lib.rs The rustc-std-workspace-std on crates.io re-exports libstd from the sysroot: https://docs.rs/crate/rustc-std-workspace-std/latest/source/src/lib.rs

The patched versions in the rust repo depend on the respective crate and re-export it:

https://github.com/rust-lang/rust/blob/1103d2e914b67c18b0deb86073c26c6aefda761d/library/rustc-std-workspace-core/Cargo.toml#L13-L14

https://github.com/rust-lang/rust/blob/1103d2e914b67c18b0deb86073c26c6aefda761d/library/rustc-std-workspace-core/lib.rs#L1-L2

When compiling the standard library it is not possible to depend on any crate being in the sysroot as the sysroot is empty at that point. It is only filled after building the standard library.

danakj commented 2 years ago

I linked https://github.com/rust-lang/rust/issues/94138#issuecomment-1048077884 here. The summary from there is that I've now hit my first case of a linker error due to linking rlibs. And the error does not occur if you link the --emit=obj object file instead.

mhammerly commented 2 years ago

On the topic of allocator shims: I took a crack at implementing a -C force-allocator-shim per @jsgf's suggestion elsewhere and when I was 90% finished I saw https://github.com/rust-lang/rust/pull/86844 "Support #[global_allocator] without the allocator shim" from @bjorn3 who has been active here. #86844 has been open since July so I put my PR up to see if it'd bump the discussion.

I think either PR can be used to work around the allocator shim issue. I think #86844 would work with a single-.o --emit=obj strategy (right?) while #94389 behaves like a normal allocator shim with a separate .o in the .rlib. If I run my local #94389 rustc with --emit=obj the resulting .o has no allocator symbols in it.

Archives seem nicer than --emit=obj to me, but I'm not really equipped to argue either way.

pcwalton commented 2 years ago

For those interested, I have a minimal test case to demonstrate the duplicate symbols problem here: https://github.com/pcwalton/rust-staticlib-duplicate-symbols-test

The problem can be triggered with just two crates, plus a C program that links them. Crate a exports symbols foo (mangled) and bar (unmangled, so C can call it). Crate b links against a as an rlib (i.e. in Rust) and exports a symbol baz (unmangled) which calls foo internally. The C program c links against a and b as staticlibs, in that order, and calls bar and baz.

Here's how the linking works. The linker first goes to find bar and resolves it to the object file inside liba. Then it looks up baz and resolves it to the object file inside libb. The linker then has to resolve the call to foo inside baz, but it can't use the copy inside liba because rustc gave it a different symbol name. So the linker goes to find foo inside libb, but the object file containing libb's copy of foo also contains a definition for bar. This causes a multiple definition error, and ld fails to link.

Note that whether you get a linking error is sensitive to many factors. If you pass -lb -la instead of -la -lb, you get no error, because the linker will find both bar and baz inside libb and won't look inside liba at all. If you remove the call to foo inside the definition of baz, you get no error, because ld won't look at the object file inside libb containing the duplicate copy of bar. There are even cases (not this one) in which changing the relative alphabetical ordering of symbols will cause the linker to resolve names in different orders and can make the difference between linking succeeding and failing.
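The resolution order described above can be modelled with a toy linker. This is only a sketch: it assumes a traditional Unix linker that visits archives once, in command-line order, and the object names and the two mangled names for foo are made up for illustration.

```rust
use std::collections::BTreeSet;

/// One object file inside an archive: what it defines and what it references.
struct Object {
    name: &'static str,
    defines: &'static [&'static str],
    needs: &'static [&'static str],
}

/// Pulls in an archive member only if it defines a currently undefined symbol;
/// pulling a member that redefines an existing symbol is an error.
fn link(root_needs: &[&'static str], archives: &[&[Object]]) -> Result<Vec<&'static str>, String> {
    let mut defined: BTreeSet<&'static str> = BTreeSet::new();
    let mut undefined: BTreeSet<&'static str> = root_needs.iter().copied().collect();
    let mut pulled: Vec<&'static str> = Vec::new();
    for archive in archives {
        let mut progress = true;
        while progress {
            // Rescan the same archive until no further member is pulled.
            progress = false;
            for obj in archive.iter() {
                let wanted = obj.defines.iter().any(|s| undefined.contains(s));
                if pulled.contains(&obj.name) || !wanted {
                    continue;
                }
                for &sym in obj.defines {
                    if !defined.insert(sym) {
                        return Err(format!("multiple definition of `{}` (in {})", sym, obj.name));
                    }
                    undefined.remove(sym);
                }
                for &sym in obj.needs {
                    if !defined.contains(sym) {
                        undefined.insert(sym);
                    }
                }
                pulled.push(obj.name);
                progress = true;
            }
        }
    }
    if undefined.is_empty() { Ok(pulled) } else { Err(format!("undefined: {:?}", undefined)) }
}

/// Staticlib of crate a: foo (mangled one way) and bar in one object.
fn liba() -> Vec<Object> {
    vec![Object { name: "liba/a.o", defines: &["foo_hash_a", "bar"], needs: &[] }]
}

/// Staticlib of crate b: baz, plus a bundled copy of crate a whose foo
/// carries a different (hypothetical) mangled name.
fn libb() -> Vec<Object> {
    vec![
        Object { name: "libb/b.o", defines: &["baz"], needs: &["foo_hash_b"] },
        Object { name: "libb/a.o", defines: &["foo_hash_b", "bar"], needs: &[] },
    ]
}

fn main() {
    let (a, b) = (liba(), libb());
    // -la -lb: fails with a duplicate definition of `bar`.
    println!("{:?}", link(&["bar", "baz"], &[a.as_slice(), b.as_slice()]));
    // -lb -la: succeeds, liba is never consulted.
    println!("{:?}", link(&["bar", "baz"], &[b.as_slice(), a.as_slice()]));
}
```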

pcwalton commented 2 years ago

Semantically --emit obj seems fine to me, but the codegen units = 1 problem would be a real practical blocker. Something that emits a plain .a file would be a fine replacement and seems straightforwardly compatible with multiple cgus. In practice this could be a plain .rlib, or an rlib with an extra step of stripping out the rmeta files if that confuses the linker.

There is a flag you can pass to ld—ld -r—that should allow us to continue to use codegen units with --emit obj, by allowing separate object files to be combined into a single relocatable one for subsequent linking. The RFC that comes out of this should probably mention ld -r as a potential alternative solution. That being said, I don't love ld -r simply because in my experience it's not that well tested and can have issues. I also don't know if Windows has such a thing.
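As a sketch of what that extra step might look like from a build tool, the following only constructs the `ld -r` command line and prints it; it does not run the linker. The helper name and the object file names are hypothetical.

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) an `ld -r` invocation that merges several
/// per-codegen-unit objects into one relocatable object.
fn relocatable_link_command(output: &Path, objects: &[&Path]) -> Command {
    let mut cmd = Command::new("ld");
    cmd.arg("-r").arg("-o").arg(output).args(objects);
    cmd
}

fn main() {
    // Hypothetical file names for two codegen units of one crate.
    let objects = [Path::new("a.cgu0.o"), Path::new("a.cgu1.o")];
    let cmd = relocatable_link_command(Path::new("crate_a.o"), &objects);
    println!("{:?}", cmd);
}
```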

pcwalton commented 2 years ago

What would folks here think if I wrote a pre-RFC for a minimal feature that stabilizes some of what rustc does today with rlibs? Effectively this would just state that rlibs are archives in the native system format, containing object files in the native system format, plus some extra .rmeta files, that comprise all the symbols in the crate, but not any upstream dependencies. The RFC would not specify the format of the metadata—that would be considered perma-unstable and subject to change. Effectively this RFC would simply be a minimal commitment that rustc will not move away from emitting native system archives if --crate-type=rlib is specifically passed to the compiler.
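To illustrate what "archives in the native system format" means in practice on Unix-like targets, here is a minimal sketch that lists the members of an `ar` archive held in memory. It is an assumption-laden toy: it only handles short member names, does not resolve the GNU long-name table, and the member names in `main` are illustrative rather than what rustc actually emits.

```rust
/// Lists (member name, size in bytes) for a Unix `ar` archive.
/// Each member has a 60-byte header: name (16), mtime (12), uid (6),
/// gid (6), mode (8), size (10), then the two terminator bytes.
fn list_members(bytes: &[u8]) -> Result<Vec<(String, usize)>, String> {
    const MAGIC: &[u8] = b"!<arch>\n";
    if !bytes.starts_with(MAGIC) {
        return Err("not an ar archive".to_string());
    }
    let mut pos = MAGIC.len();
    let mut members = Vec::new();
    while pos + 60 <= bytes.len() {
        let header = &bytes[pos..pos + 60];
        if !header.ends_with(b"`\n") {
            return Err(format!("bad member header at offset {}", pos));
        }
        let name = String::from_utf8_lossy(&header[0..16])
            .trim_end()
            .trim_end_matches('/')
            .to_string();
        let size = String::from_utf8_lossy(&header[48..58])
            .trim()
            .parse::<usize>()
            .map_err(|e| format!("bad size field: {}", e))?;
        members.push((name, size));
        // Member data is padded to an even offset.
        pos += 60 + size + (size % 2);
    }
    Ok(members)
}

/// Test helper: encodes one member with a GNU-style `name/` field.
fn member(name: &str, data: &[u8]) -> Vec<u8> {
    let mut out = format!(
        "{:<16}{:<12}{:<6}{:<6}{:<8}{:<10}`\n",
        format!("{}/", name), 0, 0, 0, "644", data.len()
    )
    .into_bytes();
    out.extend_from_slice(data);
    if data.len() % 2 == 1 {
        out.push(b'\n');
    }
    out
}

fn main() {
    let mut archive = b"!<arch>\n".to_vec();
    archive.extend(member("lib.rmeta", b"meta!"));
    archive.extend(member("a.cgu0.o", b"obj0"));
    println!("{:?}", list_members(&archive));
}
```

A non-Rust linker would simply ignore members it does not reference, which is why the extra metadata member is usually harmless.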

bjorn3 commented 2 years ago

For some targets we may need to invent our own format if there exists no standard archive format. For example rust-gpu uses tar for its spirv target. Furthermore there may be performance benefits to using different archive formats or even emitting a single object file containing both code and crate metadata in case only a single cgu is used. Also without knowing where to find libcore and other parts of the standard libraries rlibs are useless. We would have to add some way to know which crates an rlib needs to link to. And finally it would permanently close the door for things like mir-only rlibs or similar things where codegen would happen more lazily to improve compiler performance.

pcwalton commented 2 years ago

For some targets we may need to invent our own format if there exists no standard archive format.

This is fine. I only care about existing architectures.

Furthermore there may be performance benefits to using different archive formats or even emitting a single object file containing both code and crate metadata in case only a single cgu is used.

Note that I only suggest stabilizing --crate-type=rlib when explicitly specified as such. We could introduce new crate types if needed, and switch them to be the default.

Also without knowing where to find libcore and other parts of the standard libraries rlibs are useless.

That's not a problem in practice; our build system knows where to find them.

We would have to add some way to know which crates an rlib needs to link to.

No, we wouldn't. This feature is intended for non-Rust build systems, which already know all that information.

And finally it would permanently close the door for things like mir-only rlibs or similar things where codegen would happen more lazily to improve compiler performance.

See above; I'm only suggesting stabilizing the contents of explicit --crate-type=rlib. Nothing in this RFC precludes the possibility of introducing --crate-type=mir-rlib or --crate-type=rlib2 or something like that, and making a future rlib2 the default.

Can you explain what solution you would prefer for our use case? We cannot do any of (a) using rustc for final link; (b) avoiding diamond dependencies; (c) sacrificing parallelism by forbidding multiple CGUs per crate.

bjorn3 commented 2 years ago

That's not a problem in practice; our build system knows where to find them.

How does it know for sysroot crates without depending on unstable implementation details, which kind of defeats the point of stabilizing this?

No, we wouldn't. This feature is intended for non-Rust build systems, which already know all that information.

They can't know which parts of the standard library are used unless rustc tells them. You can't simply add all individual crates in the rustlib directory for the target, as that may include unused crates or even crates which may only be linked in specific cases. (For example, only one of panic_abort and panic_unwind may be linked, and one of them must be linked if libstd is used. Their existence is not guaranteed though. We may switch to something similar to the allocator shim.)

See above; I'm only suggesting stabilizing the contents of explicit --crate-type=rlib. Nothing in this RFC precludes the possibility of introducing --crate-type=mir-rlib or --crate-type=rlib2 or something like that, and making a future rlib2 the default.

Such a feature would likely require every crate to be compiled with it. For as long as we don't have stable -Zbuild-std that means the standard library would have to be compiled with this feature, but that would break build systems depending on rlibs containing useful object files.

Can you explain what solution you would prefer for our use case? We cannot do any of (a) using rustc for final link; (b) avoiding diamond dependencies; (c) sacrificing parallelism by forbidding multiple CGUs per crate.

Have rustc tell you exactly what you need to link and have this rustc invocation generate any necessary support files like the allocator shim.

On a side note: The current compilation model of rustc hasn't been well thought out. It consists of a bunch of features bolted on top of each other creating a bit of a mess. I feel like we should first document the current compilation model and have a unifying vision for a better compilation model before we can think about how to incrementally improve (or maybe even rewrite from scratch) the compilation model of rustc to prevent painting ourselves into a corner with suboptimal compilation performance while at the same time having bad integration with the C compilation model and permanently bad dylib support.

bjorn3 commented 2 years ago

By the way I forgot if I asked this already, but would a rustc command to bundle multiple rlibs into a single staticlib without requiring any rust source for this staticlib work? So something like rustc --crate-type staticlib -Zlink-libs liba.rlib libb.rlib -o libcombined.a. This should allow diamond dependencies I think. You could run it just before invoking the system linker.

pcwalton commented 2 years ago

That may work as long as rustc isn't going to magically read any files that the build system doesn't know about. Keep in mind that many large-scale build systems are distributed and so the tool has to precisely know all of the dependent files that rustc is going to look at before invoking the compiler. If it's just fundamentally converting one link line with the system linker into a two-step rust-linker plus system-linker process then it seems OK though.

I'll let @jsgf comment on whether this works for Buck specifically, since I'm not as familiar with its inner workings.

mhammerly commented 2 years ago

@bjorn3 Just making sure I understand your suggestion correctly: before the final link, the non-Rust build system (GN + Ninja, Buck, Bazel, whatever) pulls a flat list of Rust dependencies out of the dependency graph, runs your rustc -Zlink-libs command, and uses the output in the link line?

If that's right, I think a cruder version of that idea was covered in the OP:

Create a Rust staticlib for each of our output binaries, using rustc and an auto-generated .rs file containing lots of extern crate statements. Or,

I'm not positive but I think the downsides listed in the OP would still apply to your idea too. Propagating flags like --target or -Z link-native-libraries=no to the Rust pre-link would be difficult, and I think the build speed impact on a large project of basically double-linking all the Rust code would be rough. Those points were written about gn + ninja, but I think they'd apply to Buck as well from my (limited) experience.

I can't speak to issues with compiler performance, but I think the way to improve support for big multi-language projects is to make rustc's observable behavior less special:

generate an "exportable" unbundled static archive (what @pcwalton is offering to advance here)
generate an unbundled shared library without dylib's baggage (so: skip the allocator shim, and defer errors about missing panic or alloc error handlers to the "boss toolchain")
- this is a separate rabbit hole, just including it for completion

Granted those, there are some gaps we need official solutions for:

generating an allocator implementation since rustc won't do it for us (#86844)
- from @bjorn3's comment we may need this for panic_* later too
providing the standard library to the linker since rustc is no longer bundling it
- users relying on implicit deps on the stock sysroot could expand OP's described rustc --print stdrlibs on the link line
- or, users can pass --sysroot /dev/null and provide explicit sysroot deps. "Unused deps" checks can be handled with #96067

For what it's worth, the unbundled static archive + explicit sysroot deps approach is working fine at Meta. I don't think we're that far from serviceable official support, but I'm sure I'd learn a lot from an RFC discussion.

I feel like we should first document the current compilation model and have a unifying vision for a better compilation model before we can think about how to incrementally improve (or maybe even rewrite from scratch) the compilation model of rustc to prevent painting ourselves into a corner with suboptimal compilation performance while at the same time having bad integration with the C compilation model and permanently bad dylib support.

How can we figure out whether beginning this revamp is worth prioritizing soon? The pre-RFC @pcwalton offered to write (or a full RFC) feels like a good way to get wider input on whether that's necessary.

bjorn3 commented 2 years ago

Just making sure I understand your suggestion correctly: before the final link, the non-Rust build system (GN + Ninja, Buck, Bazel, whatever) pulls a flat list of Rust dependencies out of the dependency graph, runs your rustc -Zlink-libs command, and uses the output in the link line?

Indeed

and I think the build speed impact on a large project of basically double-linking all the Rust code would be rough.

Thin archives would make it a simple listing of all files and generating any necessary auxiliary object files rather than copying everything I think.

adetaylor commented 2 years ago

Just making sure I understand your suggestion correctly: before the final link, the non-Rust build system (GN + Ninja, Buck, Bazel, whatever) pulls a flat list of Rust dependencies out of the dependency graph, runs your rustc -Zlink-libs command, and uses the output in the link line?

For what it's worth, this would make it potentially harder for us to introduce Rust into our project. Rust code may appear anywhere in our huge and complex dependency graph, eventually propagating to some final C++ linker target. Our build system discourages global knowledge by design (for speed/scalability reasons) so there's no way to know in advance whether a build target would have Rust anywhere in its dependency tree. We'd have to at least consider running rustc -Zlink-libs for every C++ linker invocation. This would have a small but non-zero impact on the build times of pure-C++ parts of our codebase, which would be difficult for us to justify at this point.

pcwalton commented 2 years ago

Based on feedback I'd like to revise my proposal to introduce a new crate type, --crate-type=rlib0, which is simply "what Rust calls an rlib today". This is intended for use with external build systems that need to be able to count on handling rlibs. The reason for introducing a new crate type is so that --crate-type=rlib can change in the future without breaking these external build systems.

jsgf commented 2 years ago

Woah, lots of activity here.

@bjorn3, I think it's worth restating what we're trying to achieve here, and what the environment and constraints are. I'll speak for @pcwalton and @mhammerly; I can't speak to @adetaylor's environment, but what he says resonates with me and I think there's a lot of commonality here.

So, we're talking about Rust interop with other languages. C++ is the primary one, but we're also interested in at least Python, Java, Ocaml, Haskell. Rust is very appealing in this role, because its execution model is very similar to C/C++ - there's no real operational difference between running a Rust binary and a C++ one from a deployment or monitoring point of view.

We would like to extend this to the compilation and linking point of view: given a piece of Rust code which exposes an API with a C ABI, we would like to be able to freely integrate it into a build while localizing the knowledge that it's implemented in Rust.

Note that while I'm mostly going to be addressing this in the context of "big builds out of a monorepo", I see all this as solving a much more commonplace problem. We're seeing an increasing number of projects where C/C++ codebases are being incrementally rewritten in Rust (eg curl), so we'll start seeing more and more hybrid Rust/C/C++ build and link cases at all scales. The more that we can make Rust Just Work in all kinds of build systems when interoperating with C, the easier it will be to justify RIIR projects. But this means trying to avoid things like special link time steps.

We have very large dependency graphs (as @adetaylor mentions), with arbitrary dependency edges between Rust and C++, in both directions. The top-level might be Rust, C++ or some other language which can bind with a C/C++ ABI. A single top-level dependency (eg executable but not necessarily) could have tens of thousands of transitive dependencies, reachable via hundreds of thousands of dependency paths - that is, a single Rust library could be depended upon via many paths, both via Rust and C++ (std or core being the obvious maximal cases, as every other crate will have a dependency edge to them).

(Also related: We have quite a lot of technologies based on post-processing object files and executables, eg BOLT - so having everything use a uniform representation helps a lot there too. We'd like Rust code to also benefit from those optimizations with minimal special effort.)

The goal here is to make sure that every crate is compiled precisely once, regardless of how many times it's depended upon, or which or how many dependency paths there are.

While the fundamental problem with staticlib is that it bundles all the transitive dependencies and causes massive symbol duplication (linker errors) and inefficiency (copying) in this compilation environment, it also shares a problem with any proposal which has a different crate type or compilation mode for "Rust for consumption by C++" vs "Rust for consumption by Rust" - it involves building the crate multiple times, which leads to the possibility of link time conflicts (either visible or invisible).

As a result, I've come to the conclusion that we need a single crate type and file format which is suitable for both C++ and Rust consumption, which is the essence of @pcwalton's proposal. rlib, as it currently stands, is actually fine for this.

(Though I'd probably prefer something like --crate-type rlib --emit universal-link=....)

To address some of the specifics:

For some targets we may need to invent our own format if there exists no standard archive format. For example rust-gpu uses tar for its spirv target.

We're only concerned with language interop, so we would only be concerned with using the same archive or object file format that other languages use in that environment. If there are no other languages or pre-existing formats, then Rust would be free to invent new things.

Furthermore there may be performance benefits to using different archive formats or even emitting a single object file containing both code and crate metadata in case only a single cgu is used.

If that's the case, then using a single object file with the rmeta embedded as a section (or similar) would be fine, and should be completely compatible with the platform linking model.

Also without knowing where to find libcore and other parts of the standard libraries rlibs are useless. We would have to add some way to know which crates an rlib needs to link to. How does it know for sysroot crates without depending on unstable implementation details, which kind of defeats the point of stabilizing this?

Yes, that's a big issue - I'll go into it a bit more below. @pcwalton's proposal is necessary but far from sufficient - there are a lot of other details to sort out. But this is a starting point.

And finally it would permanently close the door for things like mir-only rlibs or similar things where codegen would happen more lazily to improve compiler performance.

Not necessarily, but it does make things more complex (because that's intrinsically complex in a polyglot build environment). In the polyglot dependency graph I describe above, any point where you have a non-Rust -> Rust dependency would require the crate content to be something that the other language can reference (ie object code, or a proxy like llvm-bc for LTO). Within the connected subgraphs of Rust->Rust dependencies you could use Rust-specific representations, but that's still complex without analyzing the entire dependency graph (which we'd like to avoid if possible).

sysroot dependencies

There's a few problems here:

1. How do we know which parts of sysroot a given crate requires?
2. What are the dependencies between sysroot crates? (Edit: ALSO what dependencies do sysroot have on non-Rust libraries.)
3. How do we handle non-local configuration like allocators and panic handlers?

Fundamentally 1 is awkward because everything in sysroot is implicitly available to all crates, and there's no current requirement to specify any dependencies/requirements explicitly. (Proc-macros being a partial exception, by happenstance.) Distinguishing std from core users is particularly awkward, because the signal for that is buried inside the crate's source rather than specified externally (eg passed in as a rustc command-line argument). Likewise alloc etc are Just There. To solve this we just specify all the likely sysroot crates as dependencies for everything (incl std and core in the prelude, alloc and compiler_builtins as noprelude). An unfortunate consequence is that everything gets std regardless of whether it needs it, which is fine for the server apps I'm mostly interested in, but it doesn't work for mobile/embedded use-cases.

To solve 2, I currently have a hacky script which, given a list of "root" sysroot rlibs (std, core, alloc, and the rustc_std_workspace_* versions), uses rustc -Zls to walk the dependency graph and generate appropriate build-system rules encoding these dependencies. I hard code knowledge of non-Rust dependencies (libc, pthread, etc), and compile everything with -Zlink-native-libraries=no to suppress #[link] directives.

Hacky, but in practice it works well.

This is obviously unstable in several different ways, and it would be nice if one of the outputs of the Rust toolchain build process would be a formal dependency graph represented in some machine-readable way (eg as json, perhaps extracted from the Cargo metadata).
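Given such a machine-readable graph, the walk itself is simple. Here is a minimal sketch, assuming the graph has already been loaded into a map from crate name to direct dependencies; the function name is hypothetical and the edges in `main` are a deliberately tiny subset, not the real sysroot graph.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns every crate reachable from `roots`, including the roots themselves.
fn transitive_deps<'a>(
    graph: &BTreeMap<&'a str, Vec<&'a str>>,
    roots: &[&'a str],
) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<&'a str> = roots.to_vec();
    while let Some(krate) = stack.pop() {
        // Only expand a crate the first time it is visited.
        if seen.insert(krate.to_string()) {
            if let Some(deps) = graph.get(krate) {
                stack.extend(deps.iter().copied());
            }
        }
    }
    seen
}

fn main() {
    let mut graph = BTreeMap::new();
    graph.insert("std", vec!["alloc", "core"]);
    graph.insert("alloc", vec!["core"]);
    graph.insert("core", vec![]);
    println!("{:?}", transitive_deps(&graph, &["std"]));
}
```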

(See below for more on this.)

3 is very tricky, simply because of the non-local nature.

Allocators need to be genuinely global - all Rust code coexisting in a process must use the same allocator, regardless of how it relates to each other in the dependency graph. The standard approach might be to define some weak symbols for the allocator which can be overridden once. Multiple attempts to override should fail (typically with duplicate symbol linker errors). Rust has a more ergonomic approach to this, but it relies on rustc doing the final link. We'd like to find a mechanism which still allows rustc to give good error messages when it does the link, but is still correct if something else is linking.
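For readers less familiar with the Rust side, this is the ergonomic approach in question: a single `#[global_allocator]` item anywhere in the crate graph. The sketch below wraps the system allocator with a counter; the type and static names are made up for the example.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocations served so far.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Forwards to the system allocator while counting calls to `alloc`.
struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// At most one such item may exist across all crates linked together.
#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

fn main() {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let v = std::hint::black_box(vec![1u8; 1024]);
    let after = ALLOCATIONS.load(Ordering::Relaxed);
    println!("allocations while building a Vec of {} bytes: {}", v.len(), after - before);
}
```

The "at most one" rule is what rustc normally checks when it performs the final link, which is the part that gets lost when another linker is in charge.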

Have rustc tell you exactly what you need to link and have this rustc invocation generate any necessary support files like the allocator shim.

(The allocator shim is gone now, right?)

To be honest, I haven't fully grokked the panic-handling side of things. I'm not completely sure what the effect of specifying --panic=abort/unwind is, and what the implications of different crates being compiled with different panic handling modes is, or what the constraints are. But

for example only one of panic_abort and panic_unwind may be linked and one of them must be linked if libstd is used

In practice it looks like it's actually fine to include both, since they have non-conflicting symbols. But it would be preferable to have just one. But it wouldn't surprise me if there's some magic I'm overlooking.

link-time magic

Rustc has a few pieces of link-time magic (that I'm aware of, but I'm certainly missing things and getting details wrong):

allocator shim generation
panic-handler selection
choosing a special dylib for shim injection

As mentioned above, ideally we could avoid having an allocator shim at all. But if we must have one, it would be better to have a way for rustc to generate it as a real rlib, rather than slipping it into the crate as a side-effect - eg rustc --crate-type allocator-shim --emit link=.../liballocshim.rlib <other options>. That way we can explicitly have a rule to generate a shim and then depend on it like normal.

As I mentioned above, I don't fully understand all the constraints around panic-handlers, but it definitely falls into the link-time magic category.

And the "magic dylib" stuff makes dylibs very non-uniform and in effect the specific artifact that gets produced depends on, uh, the specific order the dependency graph is walked? Actually now that I write this, I don't really understand this mechanism either. But it also falls under the category of "dynamic linking / shared objects are not well supported".

(Separate binary builds from links)

This is a bit of an aside, but being able to build a --crate-type bin's Rust code separately from linking it seems closely related to all this, and an independently valuable thing to have.

staticlib link step

By the way I forgot if I asked this already, but would a rustc command to bundle multiple rlibs into a single staticlib without requiring any rust source for this staticlib work? So something like rustc --crate-type staticlib -Zlink-libs liba.rlib libb.rlib -o libcombined.a. This should allow diamond dependencies I think. You could run it just before invoking the system linker.

This is awkward for a few reasons:

one is @adetaylor's concern, that we'd like to avoid having to do O(dep graph) operations at link time, if we can avoid it. The link itself is unavoidably at that scale, but we'd still like to avoid more
secondly, we'd need to apply this to all "final" links, but only final links. That is, we have cases where there are sometimes intermediate linked artifacts (eg shared libraries) where we may or may not want to include the output of this "rustc staticlib" step
thirdly, it puts a requirement on all link steps - ie, we'd need to add a Rust special case to C++, OCaml, Haskell, ... links - wherever we do a link and there's a chance that Rust could end up somewhere in the dependency graph
(thin archives avoid the IO overhead, but they're not universally available)

I'm not saying we couldn't make it work, but it wouldn't be straightforward. And we're getting quite far from "rust objects are just ordinary objects" (ie, the "make RIIR easy" case I mentioned above).

Since Buck (v2) supports some degree of dynamic dependencies, we could have something like --emit sysroot-deps=... for each crate target which would emit the specific sysroot libraries in some form (and maybe shims, etc) which that crate requires. This would have to be abstracted rather than, eg, paths to objects (eg emit std,core,alloc rather than /usr/local/lib/rust/.../libstd-abc123.rlib) so we can map these to actual target names. That would still allow rustc to retain and export knowledge of the internal dependencies on sysroot while allowing us to put them to use as formal dependencies. (Note that still means we need the static dependency graph within sysroot.)

(Thinking about this more, I don't see how it could work since we don't know what sysroot libs to give to rustc, and it can't do anything with the code - let alone print the sysroot deps - without them. But maybe it would work for the more narrow scope of "things that need to be added to the final link" like shims.)

That said, dynamic dependencies are uncommon among build systems - I'm not sure Bazel or GN/ninja or even plain make could support this, for example. Buck v1 definitely can't.

On a side note: The current compilation model of rustc hasn't been well thought out. It consists of a bunch of features bolted on top of each other creating a bit of a mess. I feel like we should first document the current compilation model and have a unifying vision for a better compilation model before we can think about how to incrementally improve (or maybe even rewrite from scratch) the compilation model of rustc to prevent painting ourself into a corner with suboptimal compilation performance while at the same time having bad integration with the C compilation model and permanently bad dylib support

Very much agreed. While we have a bunch of practical motivations and specific goals, along with point fixes for specific problems, I'd love it if it also helps contribute to a "how can we systematically fix Rust's compilation model" conversation. There are a ton of engineering tradeoffs which need to be taken into consideration, and there's going to be a fair amount of give and take, but there's a lot of scope for improvement.

bjorn3 commented 2 years ago

Thanks for the extensive writeup! A couple of notes:

(The allocator shim is gone now, right?)

Not yet, and even with my PR it will only be gone if #[global_allocator] is used.

In practice it looks like it's actually fine to include both, since they have non-conflicting symbols.

They have conflicting symbols. It may just not look like it because one uses #[rustc_std_internal_symbol] which uses an unmangled name while the other uses a lang item where rustc chooses the same symbol name as the other crate.

I'm not completely sure what the effect of specifying --panic=abort/unwind is, and what the implications of different crates being compiled with different panic handling modes is, or what the constraints are.

If any crate uses -Cpanic=abort, panic_abort must be linked to avoid UB. Otherwise both are technically fine, but panic_unwind is preferred as otherwise compiling object files with unwinding support is kind of pointless.

jsgf commented 2 years ago

Thanks for the clarifications.

They have conflicting symbols. It may just not look like it because one uses #[rustc_std_internal_symbol] which uses an unmangled name while the other uses a lang item where rustc chooses the same symbol name as the other crate.

Right, we'll need to look into that.

If any crate uses -Cpanic=abort, panic_abort must be linked to avoid UB. Otherwise both are technically fine, but panic_unwind is preferred as otherwise compiling object files with unwinding support is kind of pointless.

I see - so we should consider a -Cpanic=abort as a build-wide config option - ie, something you specify on the top-level target (ie binary) and propagate to all its dependencies, along with the corresponding sysroot dependency.
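As a rough sketch of how a build system might apply that rule when choosing the sysroot panic runtime for a final link (the `PanicStrategy` enum and `select_panic_runtime` function are hypothetical names for illustration, not part of rustc or any build tool):

```rust
/// Panic strategy a crate was compiled with (-Cpanic=abort or -Cpanic=unwind).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PanicStrategy {
    Abort,
    Unwind,
}

/// Pick the panic runtime crate for one final link, following the rule
/// quoted above: if any crate uses abort, panic_abort must be linked;
/// otherwise panic_unwind is preferred.
pub fn select_panic_runtime(strategies: &[PanicStrategy]) -> &'static str {
    if strategies.iter().any(|s| *s == PanicStrategy::Abort) {
        "panic_abort"
    } else {
        "panic_unwind"
    }
}
```

Treating the strategy as a build-wide option means every crate in the slice would normally carry the same value, so this check mostly acts as a guard against mixed configurations.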

jsgf commented 2 years ago

Oh, I meant to add:

Such a feature would likely require every crate to be compiled with it. For as long as we don't have stable -Zbuild-std that means the standard library would have to be compiled with this feature, but that would break build systems depending on rlibs containing useful object files

@mhammerly has prototyped generating build rules for the sysroot libraries (using reindeer) so that they're built on the fly along with everything else. As far as the build system is concerned, they're just some slightly strange third-party libraries.

So something that requires the standard library to be built with specific flags is not necessarily a blocker (though it does mean yet more stuff to stabilize if it's required for a stable implementation of all this).

hlopko commented 2 years ago

Thank you all for your inputs, it's great that this issue is receiving attention from people with such varied perspectives! I'll try to represent the Bazel Rust (and C++) community here. On the level relevant to this issue there is no significant difference between Bazel, Buck, or GN. It's actually quite surprising how similarly we ended up doing things in Bazel/Blaze and in Buck (including the hacks :).

I think it's the same in Buck, so I'll only reiterate that discovering new dependencies from the information in the source code or during Rustc invocation in Bazel is inconvenient and hard (maybe actually impossible) to do with acceptable build speed at scale. Therefore we don't plan to support controlling things like #[global_allocator], #[link], or use panic_abort as _; in the user code. Those will be controlled through Bazel command line flags (and other mechanisms that we don't need to think about in this conversation) such as --custom_malloc, or through BUILD files. We also control whether std or only core should be provided on the Bazel level.

Because Bazel knows about these global properties, it could ask Rustc to generate the right allocator shims or pick the right rlib with the panic handler. We're fully in line with Buck, rustc --crate-type allocator-shim solution is ideal for us too. Important detail is that Bazel does this once for the whole build. If we had to ask Rustc to emit shims in a pre-linking step, that would have to happen once for each binary in the build (tests are binaries too, and we have many).

Just like Buck, we also have a bootstrapping build of Rustc and the standard library in Bazel (not yet open sourced). We know it's not a supported scenario, and we are paying the ongoing maintenance costs to keep it working (and to keep Rustc buildable with LLVM@HEAD). It would be great if we could make the bootstrap build and sysroot details more supported eventually, but we definitely don't want to stabilize those things in this issue. Having a stable output format is a great first step. Having a unifying vision for a better compilation model obviously sounds great and I'm happy to contribute.

bjorn3 commented 2 years ago

The allocator shim is an implementation detail and would likely be removed in favor of weak symbols in libstd if weak symbols were universally supported across all platforms. https://github.com/rust-lang/rust/pull/86844 in combination with a single mandatory #[global_allocator] somewhere would be better I think. If necessary this could be automatically provided by the build system in the form of compiling #[global_allocator] static ALLOC: std::alloc::System = std::alloc::System;.
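A minimal sketch of such a build-system-provided crate, assuming it is compiled once and added to every final link that contains Rust code. Only the #[global_allocator] declaration comes from the comment above; the crate layout and the `allocated_capacity` helper are hypothetical and exist only to exercise the allocator:

```rust
// Hypothetical crate that a build system could generate so that exactly
// one #[global_allocator] exists in each final link.
use std::alloc::System;

// Route all Rust heap allocations to the platform's default allocator.
#[global_allocator]
static ALLOC: System = System;

/// Illustrative helper: performs a heap allocation through ALLOC and
/// reports the capacity that was reserved.
pub fn allocated_capacity(requested: usize) -> usize {
    let buffer: Vec<u8> = Vec::with_capacity(requested);
    buffer.capacity()
}
```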

hlopko commented 2 years ago

As long as there is a stabilized way of wiring Rust with the system allocator I think our use case is covered. I don't see any reason why your solution wouldn't work for us, so thumbs up :)

jsgf commented 2 years ago

The allocator shim is an implementation detail and would likely be removed in favor of weak symbols in libstd if weak symbols were universally supported across all platforms

As an implementation detail, it would seem fine to use in the cases where weak symbols are not available, though it would obviously be nicer to have a single mechanism rather than having to support two. I guess it depends on how common platforms without weak symbol support are? Oh, that includes obscure little systems like Windows. Though maybe you could do something functionally equivalent at the library level with the library search order and/or /DEFAULTLIB:.

jsgf commented 2 years ago

Oh, another thing I forgot to mention -

The libstd build process copies parts of the llvm runtime into itself - such as clang_rt and the various sanitizer runtimes. This leads to duplicate symbols if you're linking with sanitized C/C++ code (or sometimes not). Instead rustc should link with the llvm-provided libraries when the sanitizer options are enabled. (This assumes the rustc llvm sanitizer is compatible with the clang llvm sanitizer. We guarantee this by compiling rustc with the same llvm that we use for clang, which also makes sure that LTO works.)

durin42 commented 2 years ago

That's also what we're moving towards (rapidly) for rustc/clang compatibility. We'll probably start working on ThinLTO support in the bazel rules for Rust in the next week or two.

The duplicate symbols for mixed C++/Rust sanitizer builds hadn't occurred to me. I'll file an internal tracking bug about that, thanks.

dureuill commented 2 years ago

Hello and apologies for bumping this issue,

We're hitting this big time as we try to scale to multiple independent CXX libraries in our CMake, C++ based project.

Lack of stability promises apart, are there known immediate issues with linking the rlibs directly with e.g. clang++ or g++?

As a way to avoid looking for the system rlibs, I'm experimenting with building a libbaserust.a through an empty cargo project configured to produce a staticlib.

This allows switching from:

g++ test.cpp -std=c++14 target/debug/build/rs2cpp-72074e3e9d2e0779/out/librsffi.a target/debug/librs2cpp.rlib target/debug/deps/*.rlib /home/tetrane/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/*.rlib -pthread -ldl

to:

g++ test.cpp -std=c++14 target/debug/build/rs2cpp-72074e3e9d2e0779/out/librsffi.a target/debug/librs2cpp.rlib target/debug/deps/*.rlib base-rust/target/debug/libbaserust.a -pthread -ldl

Is that worse or the same as the full rlib based approach?

Thank you for reading

bsilver8192 commented 2 years ago

I've got it working with bazelbuild/rules_rust#1350. That's linking the rlibs, plus a .c file to handle the allocator shims. Seems to be pretty robust for Linux without sanitizers, but it does require Rust-version-specific tweaks to the ordering for the system rlibs and contents of the .c file.

bazelbuild/rules_rust#1238 has some more discussion. Somewhere in the discussion related to that (can't find it now) there was mention of several other projects using similar approaches.

dureuill commented 2 years ago

Ah I don't seem to be encountering the issue with allocators in the libbaserust.a version.

I understand that it is an issue when linking with only the rlibs?

bsilver8192 commented 2 years ago

Hmmm, not sure. Maybe the staticlib includes those symbols? But I'm not sure if it's guaranteed to include all of them or just the ones used by the Rust code in it.

dureuill commented 2 years ago

Just to make sure we're talking about the same issue, when attempting to build with the following command line:

g++ test.cpp -std=c++14 target/debug/build/rs2cpp-72074e3e9d2e0779/out/librsffi.a target/debug/librs2cpp.rlib target/debug/deps/*.rlib /home/tetrane/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/*.rlib -pthread -ldl

I get linker errors like:

/usr/bin/ld: /home/tetrane/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-b8438dc0bcbbcc08.rlib(alloc-b8438dc0bcbbcc08.alloc.bf51ef42-cgu.0.rcgu.o): in function `alloc::alloc::dealloc':
/rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e//library/alloc/src/alloc.rs:107: undefined reference to `__rust_dealloc'
/usr/bin/ld: /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e//library/alloc/src/alloc.rs:107: undefined reference to `__rust_dealloc'
/usr/bin/ld: /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e//library/alloc/src/alloc.rs:107: undefined reference to `__rust_dealloc'
/usr/bin/ld: /home/tetrane/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-b8438dc0bcbbcc08.rlib(alloc-b8438dc0bcbbcc08.alloc.bf51ef42-cgu.0.rcgu.o): in function `alloc::alloc::realloc':
/rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e//library/alloc/src/alloc.rs:126: undefined reference to `__rust_realloc'
/usr/bin/ld: /home/tetrane/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-b8438dc0bcbbcc08.rlib(alloc-b8438dc0bcbbcc08.alloc.bf51ef42-cgu.0.rcgu.o): in function `alloc::raw_vec::finish_grow':
/rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e//library/alloc/src/raw_vec.rs:(.text._ZN5alloc7raw_vec11finish_grow17h0cbfb70809799e45E+0x50): undefined reference to `__rust_alloc'
/usr/bin/ld: /home/tetrane/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-b8438dc0bcbbcc08.rlib(alloc-b8438dc0bcbbcc08.alloc.bf51ef42-cgu.0.rcgu.o): in function `alloc::alloc::handle_alloc_error::rt_error':
/rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e//library/alloc/src/alloc.rs:383: undefined reference to `__rust_alloc_error_handler'
/usr/bin/ld: /home/tetrane/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-b8438dc0bcbbcc08.rlib(alloc-b8438dc0bcbbcc08.alloc.bf51ef42-cgu.0.rcgu.o): in function `alloc::alloc::alloc':
/rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e//library/alloc/src/alloc.rs:89: undefined reference to `__rust_alloc'
/usr/bin/ld: /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e//library/alloc/src/alloc.rs:89: undefined reference to `__rust_alloc'

Is that the issue at hand with the allocators?

If so, when making an additional empty rust project rustbase and setting it to compile with staticlib, and lastly using this static lib on top of the rlibs from my actual project:

g++ test.cpp -std=c++14 target/debug/build/rs2cpp-72074e3e9d2e0779/out/librsffi.a target/debug/librs2cpp.rlib target/debug/deps/*.rlib base-rust/target/debug/libbaserust.a -pthread -ldl

then the link is successful. My (limited) understanding is that whatever "base rust setup" is needed must be included even in the simplest staticlib. This behavior might be OS and toolchain dependent, though, and could also have "blind spots".

bsilver8192 commented 2 years ago

Yep, that's the allocator shims. There's more than just __rust_alloc and __rust_dealloc that you might hit with more complex code, and I'm not sure if your staticlib will include those or not.

dureuill commented 2 years ago

Do we know of any way of triggering the need for the missing functions from standard code, so that I could test if the staticlib includes these?

nm on the staticlib tells me that it contains ("T" type symbols) __rust_alloc, __rust_alloc_error_handler, __rust_alloc_error_handler_should_panic, __rust_alloc_zeroed, __rust_realloc, rust_oom, __rdl_alloc, __rdl_dealloc, __rdl_realloc, __rdl_alloc_zeroed, __rdl_oom and __rust_dealloc

(it also contains symbols from the std, which is part of the point I guess :-))
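One way to automate that check is to scan the nm output for the allocator entry points. The sketch below assumes the default nm text format (address, type letter, name) and uses the four symbol names mentioned in this thread; the exact set depends on the rustc version, and `missing_symbols` is a hypothetical helper, not an existing tool:

```rust
/// Allocator entry points named in this thread. Treat this list as an
/// assumption: the real set varies between rustc versions.
pub const ALLOCATOR_SYMBOLS: [&str; 4] = [
    "__rust_alloc",
    "__rust_dealloc",
    "__rust_realloc",
    "__rust_alloc_zeroed",
];

/// Return the wanted symbols that are NOT defined as global text ("T")
/// symbols in the given nm output.
pub fn missing_symbols<'a>(nm_output: &str, wanted: &[&'a str]) -> Vec<&'a str> {
    // Collect names from lines shaped like "<address> T <name>".
    // Undefined entries ("U <name>") and archive member headers have a
    // different shape and are skipped.
    let defined: Vec<&str> = nm_output
        .lines()
        .filter_map(|line| {
            let fields: Vec<&str> = line.split_whitespace().collect();
            match fields.as_slice() {
                [_addr, "T", name] => Some(*name),
                _ => None,
            }
        })
        .collect();

    wanted
        .iter()
        .copied()
        .filter(|sym| !defined.iter().any(|d| *d == *sym))
        .collect()
}
```

An empty result only shows that those particular symbols are present; it says nothing about other symbols a more complex program might need.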

hlopko commented 2 years ago

This is the file with definitions that we use with the latest rustc: https://github.com/google/crubit/blob/main/common/rust_allocator_shims.c.

This is where rustc declares them: https://stdrs.dev/nightly/x86_64-unknown-linux-gnu/alloc/alloc/index.html.

I'd expect that the staticlib defines all of these. Still, using a mostly empty staticlib and then .rlibs is not an officially supported mode, just as it is unsupported to define our own allocator shims :)

bjorn3 commented 2 years ago

I got a PR for removing the need for the allocator shim when using #[global_allocator]: https://github.com/rust-lang/rust/pull/86844 I need to update it some day.

dureuill commented 2 years ago

FWIW, if you're able to slightly relax the requirement of compiling each lib only once, we got a setup working with shared libraries.

Each Rust project is configured to generate a staticlib (rustlib.a), then this rustlib.a is linked with a C++ shim (since we're using cxx we have to compile the C++ part of the bridge anyway, so that's our shim) and into a shared library lib.so.

We pass -Bsymbolic to the linker when building the shared libraries, so that each of them looks up symbols internally first, then globally if missing, so as to limit the impact of diamond dependencies.

This works rather well with cxx, because most of the calls will be through the bridge part, which is typically unique to each .so. We just have to ensure that we use the same version of cxx (and of rust, probably) across all .so when using "vocabulary types" of cxx, such as rust::str from the C++, because it will pick the corresponding symbols (ctor, dtor, ...) in one of the .so, and then pass the resulting object to the .so corresponding to the bridge.

This caveat apart, I believe that this setup should minimize duplicated symbols issues and does not rely on unstable behavior, at the cost of using shared libraries (no more single executable, duplicated code in each shared library for a pretty hefty size).

(I realize that relaxing these requirements is probably not an option for OP, but I'm posting this assuming that others might have different requirements, and in the hope that my experience might be useful)

petrochenkov commented 2 years ago

I wanted to confirm one thing. Scenarios in which people want to use rlibs as "regular static libraries" also assume -Z link-native-libraries=false, right? I.e. the linking of libraries is managed entirely by some outer build system rather than rustc.

In https://github.com/rust-lang/rust/issues/99429 we may want to change representation of native libraries bundled into .rlib files, and it would be easier to do if we knew that third parties do not need to rely on the currently used representation (all individual object files from native libraries are copied to the rlib).

UPD: Hmm, it looks like -Z link-native-libraries is not actually respected when bundling native libraries into rlibs, not sure whether it's intentional or not.

bsilver8192 commented 2 years ago

I wanted to confirm one thing. Scenarios in which people want to use rlibs as "regular static libraries" also assume -Z link-native-libraries=false, right? I.e. the linking of libraries is managed entirely by some outer build system rather than rustc.

+1 for my use case being like this, I want to manage those via Bazel.

jsgf commented 2 years ago

@petrochenkov Yes, that's what I'd expect - the plan here is to make the build system deal with all dependencies uniformly, Rust and non-Rust.

UPD: Hmm, it looks like -Z link-native-libraries is not actually respected when bundling native libraries into rlibs, not sure whether it's intentional or not.

No, I think this is a bug, and I think I got this the wrong way around when I last mentioned it. Right now it has the effect of ignoring the bundled native library references at link time, but they're still included. I think it would be much more useful to make this skip embedding on a crate-by-crate basis, and have a separate option to ignore any bundled libraries at link time.

This would allow a more incremental approach of skipping bundled native libraries on a case-by-case basis, but still allow them for libstd (until it has fully specified dependencies).

pcwalton commented 1 year ago

Hello! I've posted a pre-RFC for minimal stabilization of the rlib format on internals. Please feel free to comment there. Thanks!

bjorn3 commented 1 year ago

As of https://github.com/rust-lang/rust/pull/86844 (scheduled for the 1.71 release) if you are directly linking the rlibs of the standard library rather than letting rustc handle linking, you will now need to define a static named __rust_no_alloc_shim_is_unstable which is at least 1 byte big. In addition if you are using #[global_allocator], you must stop defining __rust_alloc, __rust_dealloc, __rust_realloc and __rust_alloc_zeroed as they are now directly defined by the #[global_allocator] expansion rather than as part of the allocator shim. If you are using the default allocator in libstd you will need to keep defining them though.

Lupus commented 1 year ago

Is there a somewhat future-proof workaround available as part of some open-source project that one could leverage, maybe? There are some recipes in this discussion, but it's not clear how one could reconstruct the required "ugly hacks" to get going while we wait for the right solution to make it upstream.

I'm building some bindings from Rust to OCaml, everything worked great until I tried to link two such bindings libraries in one binary, which led me here with a bunch of linker errors at hand...

keith commented 1 year ago

Folks using bazel and rules_rust work around this today, so you can trace back that code or an example link command there to see the result

danakj commented 1 year ago

The current workaround is to build Rust rlibs, not staticlibs, and link those as you would .a files. You must explicitly link the stdlib rlibs as well though.
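As an illustration only, the argument list for such a link could be assembled along these lines. The `link_args` function is hypothetical, the file names in any call are placeholders (real sysroot rlibs carry hashes in their names), and the trailing -pthread -ldl flags are taken from the g++ commands earlier in this thread:

```rust
/// Assemble arguments for a C++ driver (e.g. g++) that links rlibs as if
/// they were ordinary .a archives.
///
/// Order matters for static archives: inputs that reference symbols come
/// before the archives that define them, so project rlibs precede the
/// standard library rlibs.
pub fn link_args(sources: &[&str], project_rlibs: &[&str], std_rlibs: &[&str]) -> Vec<String> {
    let mut args: Vec<String> = Vec::new();
    args.extend(sources.iter().map(|s| s.to_string()));
    args.extend(project_rlibs.iter().map(|s| s.to_string()));
    args.extend(std_rlibs.iter().map(|s| s.to_string()));
    // System libraries used by the commands shown earlier in the thread.
    args.extend(["-pthread", "-ldl"].iter().map(|s| s.to_string()));
    args
}
```

As noted earlier in the thread, a link built this way still needs the allocator shim symbols from somewhere, and the required ordering of the standard library rlibs may change between Rust versions.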

Lupus commented 1 year ago

Is compilation of an empty crate as a staticlib still a decent approach to avoid hunting for individual stdlib rlibs? Bazel rules around that are quite wordy as it seems...

Other than that, one needs to parse cargo manifest, build dependency graph, build a list of rlibs required for particular crate, pass that list to the linker?

durin42 commented 1 year ago

AIUI, it's a decent approach if-and-only-if you can consolidate all your Rust bits into a single unified target. Otherwise you run the risk of pulling stuff in twice, which either bloats your binary or causes linker errors depending on how you do it.
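A build system could flag that risk before linking. The sketch below assumes it already knows which crates each staticlib bundles; `duplicated_crates` and the input shape are hypothetical:

```rust
use std::collections::BTreeMap;

/// Given each staticlib going into one final link together with the crates
/// bundled inside it, return the crates that appear in more than one
/// staticlib, in sorted order.
pub fn duplicated_crates(staticlibs: &[(&str, Vec<&str>)]) -> Vec<String> {
    // Count how many staticlibs bundle each crate.
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for (_lib_name, crates) in staticlibs {
        for krate in crates {
            *counts.entry(*krate).or_insert(0) += 1;
        }
    }
    // Keep only crates bundled more than once.
    counts
        .into_iter()
        .filter(|(_, n)| *n > 1)
        .map(|(name, _)| name.to_string())
        .collect()
}
```

Every staticlib bundles the standard library, so with two or more staticlibs in one link this will always report at least std, which is exactly the duplication described above.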

tgross35 commented 11 months ago

@pcwalton did you ever move forward from pre-rfc to rfc with that?

I don't think I've seen this yet (it's a long thread...) but could rustc maybe gain the ability to turn a rlib into a .a? It would be a pretty fast operation that would let us do something before we get to the point of stabilizing rlib (still +1 for the RFC of course) and without needing to figure out a non-rlib distribution of std.

bjorn3 commented 11 months ago

We already have the staticlib crate type for bundling everything into a single .a file. Turning individual rlibs into .a files won't work as it would either duplicate the allocator shim and such between every such .a or omit them from all and produce unlinkable .a files. One idea I have is to produce a new crate type which is to staticlib as dylib is to cdylib. It would act like a dylib with respect to producing the allocator shim and bundling multiple crates, but produce a .a file instead of a .so file and add the necessary crate metadata to allow consuming it like a regular crate from rustc. See also the end of https://github.com/rust-lang/rust/issues/111594#issuecomment-1550087206
