Open adetaylor opened 4 years ago
The second approach is not officially supported. An rlib is an internal implementation format within Rust, and its only client is rustc. It is naughty to pass them directly into our own linker command line.
But it does, currently, work. It makes our build process much simpler and makes use of Rust less disruptive.
An `rlib` is not a complete package you may need to link. It could very well be just a section containing only rmeta (serialized "metadata" representing checked Rust code) from which `rustc` would generate code at the time it is "forced" to by a binary output (link/staticlib/cdylib). The fact that an `rlib` contains any machine code at all is just an optimisation.

What you really want is a `staticlib-nobundle` (not an actual thing) or something similar that would not bundle the dependencies into the produced `.a`.
Related reading at https://github.com/rust-lang/rust/issues/38913
I still don't understand what the requirements are exactly, but if rlibs (as they exist now) work for you, while staticlibs don't, then perhaps you need some variation of staticlibs with different bundling behavior for dependencies, rather than rlibs.
> The fact that `rlib` contains any machine code at all is just an optimisation.

Yep.

> What you really want is a `staticlib-nobundle` (not an actual thing) or something similar that would not bundle the dependencies into the produced `.a`.

I agree - this is exactly what we need.
@adetaylor Could you elaborate a bit?
Suppose you have a Rust crate `root` with some tree of dependencies:

```
      imm1 - dup
     /
root
     \
      imm2 - dup
            \
             indirect - native
```

where `imm1`, `imm2` and `dup` are other Rust crates and `native` is some native dependency like a C static library. What is the expected result of compiling `root` supposed to be?
Right now, it would be a `staticlib` and that's fine.
The problem comes in this scenario:

```
      imm1 - dup
     /
root
     \
      imm2 - dup

root2
     \
      imm2 - dup

c_plus_plus_binary_a
     \
      root2 - [...]

c_plus_plus_binary_b
     \
      root1 - [...]

       root1 - [...]
      /
c_plus_plus_binary_c
      \
       root2 - [...]
```

In this case, we can't build `root` and `root2` as `staticlib`s, because they will conflict when we come to build `c_plus_plus_binary_c`.
I don't mean right now, I mean what compilation of `root` and `root2` should produce to fit into your use case. `libroot.a` + `libimm1.a` + `libimm2.a` + `libdup.a` + `libroot2.a` + some file in JSON or another machine-readable format describing the dependency trees?
Exactly! Except that the JSON file probably isn't necessary. Our build system (gn + ninja) works extremely hard to establish static dependency relationships in advance, so that a rebuild builds the minimal required set. We would expect to use those relationships in the subsequent linker invocation unless there were some reason that we needed to query it dynamically.
This seems like a reasonable feature to me to support. The goal of rlibs was to reserve space for rustc to "do its thing", so that we intentionally stabilize some format rather than accidentally do so. I don't think there's a need for a "staticlib-nobundle" because that's basically what an rlib is. While @nagisa is right in that rlibs can have varying contents, I think that it's fine to say for a use case like this you'd pass some flag to rustc saying "make sure I can pass this to the linker". Rustc, after all, goes to great lengths to pass rlibs directly to the linker to make linking fast.
I think this could work well perhaps with `-Z binary-dep-depinfo` (I forget the exact flag name). That'd basically tell you "here's all the other rlibs you need to link for this target", which would locate the standard library for you and other deps (which you'd probably ignore).
The gotchas that I could think of are:
For allocators we'd probably just need to say "no custom allocator" or something similar. You'd probably run rustc with no arguments saying "generate me the allocator `*.a` file to pass to the linker", and that'd just run as part of your build system. That's also where we could shove anything else if it comes up in the future.
You'll be on your own for deduplicating Rust dependencies across dynamic libraries. In your example of `c_plus_plus_binary_c`, it's generally asking for trouble if `root1` and `root2` are dynamic libraries and try to share dependencies. The compiler has internal support for this but you'd just be on your own. (This is totally fine, just wanted to point it out!)
Overall I think we could get away with this by carefully designing how a "stable" pass-things-to-the-linker process would work. We could make sure to solve the constraints here while also giving ourselves some leeway so rustc can still change things in the future if needed. It seems worthwhile to me to integrate into other build systems like this, and it's definitely a pain point with rlibs since inception.
> I think that it's fine to say for a use case like this you'd pass some flag to rustc saying "make sure I can pass this to the linker".
My suspicion is that the best and most holistic implementation of this kind of feature is a different crate-type alongside `rlib`. Implementing functionality that reliably satisfies the requirements outlined by the OP can require a fair number of adjustments to how monomorphizations are generated, where they end up and what their visibility ends up being. I think the answers to these and many other similar questions are meaningfully different enough for `rlib` and the hypothetical `staticlib-nobundle` that the two shouldn't be conflated.

EDIT: Regardless of which way it goes, this feels to me like a fairly major change in at least the stance on what an `rlib` is; it would require extensive design work, an RFC and a dedicated implementer willing to implement said RFC.
Just to say: thanks all for the comments. I'm delighted that this doesn't seem like a completely idiotic request. From our (user's) point of view it is equally easy whether this is a `crate-type=static-nobundle` as opposed to `crate-type=rlib`; either's fine.
We will bear in mind that this seems like a plausible way forward as we continue to think about our linking strategies. We'll go quiet for a bit, but it's possible that in a few months we will pop up here and say "hi, we'd like to implement this, here's an RFC...". Watch this space. Obviously if anyone else gets there first that's even better :)
@adetaylor may I ask: have you found a workable solution? Or did you have to abandon Rust because of this issue?
In my experimentation I have used `--emit obj` to produce an object file, and create an archive from this (and other C++ object files part of the same library) using `ar`. I then use the system linker to produce the final artifacts. Is this not a viable solution?
> In my experimentation I have used --emit obj to produce an object file, and create an archive from this (and other C++ object files part of the same library) using ar. I then use the system linker to produce the final artifacts. Is this not a viable solution?
That doesn't work when using liballoc, as the allocator shim is only generated when compiling a crate that needs to be linked, and even then doesn't get included in the `--emit obj` object file but in a separate object file that gets deleted after linking. It can't be included in the main object file because you can pass `--crate-type lib --crate-type bin`, in which case the allocator shim must be included for the bin, but must not be included for the lib.
@bbjornse We haven't abandoned Rust, but we also haven't made any real progress in this direction.
I know several other teams at various organizations are using the approach of building rlibs then passing them all into the final C++ linker. This still works. It still requires nasty hacks as described in this issue description, and could break at any time.
> That doesn't work when using liballoc as the allocator shim is only generated when compiling a crate that needs to be linked and even then doesn't get included in the --emit obj object file, but a separate object file that gets deleted after linking.
Interesting, thanks. Could you please explain: what exactly is the allocator shim, and why is it generated rather than being part of a library? Does this relate to the `__rust_alloc` => `__rdl_alloc` remapping mentioned in the issue description?
Also, do you know if this is the only challenge with the object file approach? Are there other pieces which might be missing?
> @bbjornse We haven't abandoned Rust, but we also haven't made any real progress in this direction.
>
> I know several other teams at various organizations are using the approach of building rlibs then passing them all into the final C++ linker. This still works. It still requires nasty hacks as described in this issue description, and could break at any time.

Thank you for your reply. This is somewhat disheartening. I'm still not abandoning hope for the `--emit obj` approach though: as far as I could tell the output was similar to that of the rlib, sans rmeta sections, and as you say using the rlibs directly can be made to work. Although passing rlibs directly to the native linker might never be officially supported, it sounds reasonable to me to expect that object files generated with `--emit obj` can be used in this way.
> Interesting, thanks. Could you please explain: what exactly is the allocator shim, and why is it generated rather than part of a library?
The allocator shim is what forwards the allocator calls in liballoc to either a `#[global_allocator]` or the default allocator in libstd. Expanding it in place for `#[global_allocator]` won't work because if the crate with the `#[global_allocator]` and a dylib linking to libstd get linked together, the libstd default allocator wins, as the dylib needs to have an allocator shim to link and the only allocator available when linking the dylib is libstd's default allocator.
> Does this relate to the `__rust_alloc` => `__rdl_alloc` remapping mentioned in the issue description?
Correct
> Also, do you know if this is the only challenge with the object file approach? Are there other pieces which might be missing?
Honestly I am not sure.
> Although passing rlibs directly to the native linker might never be officially supported, it sounds reasonable to me to expect that object files generated with --emit obj can be used in this way.
Neither rlibs nor `--emit obj` contain the allocator shim. In fact they can't. For rlibs this is for the reason mentioned earlier. For `--emit obj` this is because with `--crate-type lib --crate-type dylib`, `--emit obj` will generate a single object file, but the rlib must not have the allocator shim and the dylib must have it.
I have been thinking about the Rust compilation model lately because of this and certain other deficiencies. No concrete actionable improvement yet though.
Thank you for your explanation. Have I understood correctly that:

1. If a crate defines a `#[global_allocator]`, then that crate defines symbols `__rg_alloc` etc., and all references to `__rust_alloc` etc. should be forwarded to the `__rg_*` equivalents.
2. Otherwise, `__rust_alloc` etc. should be forwarded to the `__rdl_*` equivalents.
3. These shims are only generated when producing final artifacts (`--crate-type {bin,dylib,cdylib,staticlib} --emit link`). In particular, they are not generated for `--crate-type lib --emit rlib,obj`.

I realize I don't understand why the rlib cannot include the shim. I see that it's only necessary when producing the final artifact, but I don't see how it would hurt to generate it immediately? (Unless I've misunderstood and `#[global_allocator]` is crate-local, of course.)
To be clear, the relation between my questions and this issue is whether `--emit obj` can do the trick instead of formal support for linking with rlibs or a hypothetical new `static-nobundle` crate type. Although `--emit obj` suffers from the same allocator-shim problems and standard-library dependency-discovery problems as noted in the OP, it does not suffer from the problem that rlibs aren't intended to be passed to the system linker. Rather, the object file output strikes me as something intended to be passed to the system linker. And I'm still optimistic that the two other issues can be solved.

I'm not sure what the object file is guaranteed to include and not include, though. Crucially, is it e.g. guaranteed not to include dependencies, thus avoiding diamond-dependency problems and being an improvement over the use of `staticlib`?
> I realize I don't understand why the rlib cannot include the shim. I see that it's only necessary when producing the final artifact, but I don't see how it would hurt to generate it immediately? (Unless I've misunderstood and #[global_allocator] is crate-local, of course.)
If you link to a dylib that already has an allocator shim, the rlib must not define an allocator shim; the allocator shim of the dylib must be used. At the time of generating the rlib it is not yet known whether it will be linked against a dylib - only when compiling the final artifact.
I see, thanks. Is this only when the rlib shim would use the default allocator? Would it hurt to immediately generate the shim for crates which include a `#[global_allocator]`?

If crates require a shim at all, this means they use liballoc, right? And the prebuilt libstd already includes an allocator shim forwarding to the default allocator. How come this works, when another dylib might be providing a global allocator? Could a similar technique, linking with a "default allocator shim" dylib, work in a liballoc + no-std setting?
When you dynamically link against libstd or any other dylib that uses liballoc, `#[global_allocator]` will be silently ignored. Instead the allocator shim embedded in said dylib will be used. This wouldn't be possible if rlibs that use `#[global_allocator]` immediately created the allocator shim.
It seems to me that the `--crate-type` switch controls the main output type of `rustc` (e.g. `rlib`, `staticlib`, `bin`, etc.) while the `--emit` switch can optionally ask `rustc` to retain additional, intermediate file types (e.g. `obj`, but also `mir`, `llvm-bc`, `dep-info`, etc.). It feels like the `--emit` switch can be useful for debugging what is happening during compilation, but should not be depended upon for end-to-end results, since the intermediate/internal files produced when building an `rlib` crate may change in the future. From this perspective, depending on the `--emit=obj` switch doesn't seem significantly different or less risky than depending on the details of what `--crate-type=rlib` produces.
I see that @nagisa has raised concerns that monomorphization within `rlib` might be fundamentally incompatible with the requirements here. FWIW, I've tried to explore this a bit and didn't find anything broken so far - please shout if I missed anything.
In particular, I tried building the following things:

```rust
// in crate1
pub fn sub_in_crate1<T: Sub<Output = T>>(a: T, b: T) -> T { a - b }

// in crate2a
#[no_mangle] pub fn sub_from_7_in_crate2a(x: i32) -> i32 { crate1::sub_in_crate1(7, x) }

// in crate2b
#[no_mangle] pub fn sub_from_42_in_crate2b(x: i32) -> i32 { crate1::sub_in_crate1(42, x) }
```
There would indeed be 2 copies of the monomorphized `sub_in_crate1::<i32>`:

```
$ nm out/rel/rust_odr_test | grep crate1
00000000001f23d7 T _ZN20rust_odr_test_crate113sub_in_crate117h56a8a4a969227185E
00000000001f2422 T _ZN20rust_odr_test_crate113sub_in_crate117hf5a81c2f183f301cE
```
But this seems fine.
(A more concrete and complete example is in a Chromium-specific code-review tool: see here.)
I guess the next step would be introducing an RFC (as suggested above), but before doing that it is probably desirable to prototype the change (e.g. getting `link_staticlib` to skip adding `each_linked_rlib` in the presence of a new, hypothetical `-Z staticlib-with-no-transitive-deps` switch). WDYT?
> (--crate-type vs --emit)
One problem right now is that `--emit` is currently hashed into the crate hash, so, for example, `--emit metadata` and `--emit link` will generate different crate hashes for otherwise identical command lines. I'd really prefer this weren't the case, but I think this particular thing isn't relevant to this issue, since by the time we get to linking, crate hashes and rmeta should be irrelevant.
But aside from that, I'd agree that crate-type and emit have somewhat overlapping responsibilities (broadly, what kind of file(s) do I want?), but I'd characterize it as "crate-type sets the intrinsic properties of the output file" and "emit selects a number of auxiliary outputs which can be cheaply generated along the way". But `--emit obj` and `--crate-type staticlib` are very close to each other really, ignoring the whole "and link in all the dependants" thing. (Does `--emit obj` also force codegen-units = 1, or is that just `--emit asm`?)
> ```
> 00000000001f23d7 T _ZN20rust_odr_test_crate113sub_in_crate117h56a8a4a969227185E
> 00000000001f2422 T _ZN20rust_odr_test_crate113sub_in_crate117hf5a81c2f183f301cE
> ```
I'm curious how this looks with `-Zsymbol-mangling-version=v0`, so that whatever's causing the difference isn't hidden in the hash.
I'm super interested in this effort. Currently Buck uses `staticlib` for C++ -> Rust dep edges, and I'd really prefer to have this be `rlib`/`staticlib-with-no-transitive-deps` and set up the dependencies to arrange for all the other stuff to appear on the final (C++) link line.
> (--crate-type vs --emit)
I agree that depending on `--emit obj`, without formal or documented guarantees about what the output will contain, feels fragile (like depending on linking with rlibs). Case in point: when experimenting with this I found that `--emit obj` works differently with `--crate-type lib` and with `--crate-type staticlib`.
I don't feel that `--emit` is just a debugging option, though; rather, I understand it as an option that makes `rustc` a powerful stand-alone tool usable in a non-Cargo context.
Reasons I still think `--emit obj` is a reasonable alternative:

- `--emit obj` is guaranteed to include object code. This is an improvement over rlib, where the presence of object code is an implementation detail (as noted above).
- It is a `staticlib` without "link in all the dependants" - which is (almost) exactly what we want (i.e. an improvement over `--crate-type staticlib`).
- A documented guarantee could specify how it behaves with each `--crate-type`, and in what contexts the output will be usable.

> Case in point, when experimenting with this I found that `--emit obj` works differently with `--crate-type lib` and with `--crate-type staticlib`.
Isn't this expected if `staticlib` links in all dependencies and `lib` doesn't? Or were there other differences?
`--emit obj` doesn't link in dependencies, as it doesn't link at all. It just emits the intermediate object file that would otherwise be given to the linker.
Ah. I was imagining a relocatable object file, like what `ld -r` produces, for `staticlib`. Not sure what behavior is actually desirable in that case.
I found the format of `--emit obj` to change with `--release`.
What do you mean by format? File format? If so, from which file format did it change to which?
Oh, and are you using cargo or rustc directly? If cargo, there is no stable way to locate all object files necessary for linking, as dependencies are not stored in a stable location.
Working on https://bugs.chromium.org/p/chromium/issues/detail?id=1296284, I think I have a better idea of what is needed or not needed here.
Requirements:

- … (ruling out `--crate-type=staticlib`.)

Current solution: We use an .rlib file as a collection of .o files, and we must link it with `-Wl,--whole-archive`.
The proposal to use `--emit=obj` makes a lot of sense to me: it is a Rust .o file, which is what we need.
However, an .rlib is a collection of .o files because rustc splits the compilation up (via `--codegen-units`). So I am left with a question:

Does `--emit=obj` cause slower compilation by forcing `--codegen-units=1`? If so, that would contradict requirement #3.
Otherwise, --emit=obj producing different outputs is not necessarily a problem, they may not be meaningful. Perhaps they are artifacts of different codegen units for example. We'd need to understand what the differences are.
We could easily make a mode that emits an archive of object files; I'm not sure what purpose `--emit=obj` has in not doing that. Maybe the option predates having multiple codegen units, and now we need to preserve it for backwards compatibility?
My current understanding of this is that the requirements can be split in three:
Looking at how each of these requirements are addressed by different alternatives:
I suspect this is already satisfied with `--emit obj`, with the only missing piece being a documented guarantee.

But the point about `--codegen-units` being forced to 1 is a valid concern. A quick search through rustc sources indicates this is true if we use `--emit obj=<dstfile>` - but not if we just use `--emit obj` (compiler/rustc_session/src/config.rs, function `should_override_cgus_and_disable_thinlto()`). Maybe someone knows if this is correct? EDIT: if this is true, I suspect users needing this functionality could use `--emit obj --out-dir=<outdir>` and `ar rcs cratename.a <outdir>/*`. Of course, `--emit archive=<file>` is preferable :slightly_smiling_face:
A `library-archives` component could be added to `rustup`, containing `.a` equivalents to the shipped `.rlib`s.

A third option could be to add a new `mlib` type (multi-lib) or similar, working just like rlibs do today (i.e. both for linking and for Rust compilation), and distribute those with the Rust toolchain.

Unaffected by the choice between `-emit=obj`, linking rlibs, or linking with the output of some new crate type `staticlib-nobundle`: the decision about what shim to use cannot be taken until link time, as only then do we know what shim should be used.

> A quick search through rustc sources indicates this is true if we use `--emit obj=<dstfile>` - but not if we just use `--emit obj` (compiler/rustc_session/src/config.rs, function `should_override_cgus_and_disable_thinlto()`). Maybe someone knows if this is correct?
https://github.com/rust-lang/rust/blob/4c55c8362de32be1530b2441c3e3a51e73edeb21/compiler/rustc_session/src/config.rs#L288 forces a single cgu for any `--emit obj`, I think.
I wonder if instead of allowing rlibs to be linked outside of rustc we could allow C object files to be linked with rustc, so rustc takes the place of the linker, handling any orchestration for Rust libraries and then forwarding to the system linker or any other linker provided with `-Clinker`.
> forces a single cgu for any `--emit obj` I think
I see that the function where `false` is returned is named `is_compatible_with_codegen_units_and_single_output_file()` - and the code using it at line 1537 seems to only look at emit flags with an explicit path:
```rust
let incompatible: Vec<_> = output_types
    .0
    .iter()
    .map(|ot_path| ot_path.0) // This one - only proceed if we actually have an explicit path?
    .filter(|ot| !ot.is_compatible_with_codegen_units_and_single_output_file())
    .map(|ot| ot.shorthand())
    .collect();
```
Which is what makes me suspect that it only forces cgu=1 with an explicit path?
> I wonder if instead of allowing rlibs to be linked outside of rustc we could allow C object files to be linked with rustc, so rustc takes the place of the linker, handling any orchestration for rust libraries and then forwarding to the system linker or any other linker provided with `-Clinker`.
I suspect this is a poorer fit for C++ codebases wanting to integrate Rust code into what they have. The link setup would then have to be recreated exactly as it were before, but indirectly through rustc. This is more complex both to read and write than just writing the desired linker invocation directly. It is also more fragile, as rustc must be trusted to produce the exact same linker invocation across rustc versions (sans differences in rust library names), and of course across runs.
> Which is what makes me suspect that it only forces cgu=1 with an explicit path?
`ot_path` here has the type `(OutputType, Option<PathBuf>)`, I think. `None` here represents the case where no path is given. The line you mentioned ignores the `Option<PathBuf>` and only returns the `OutputType`.
Semantically `--emit obj` seems fine to me, but the codegen-units = 1 problem would be a real practical blocker. Something that emits a plain .a file would be a fine replacement and seems straightforwardly compatible with multiple cgus. In practice this could be a plain .rlib, or an rlib with an extra step of stripping out the rmeta files if that confuses the linker.
Ideally this would also support thin archives in environments which have such a concept, to avoid some needless file copies (which I think is an orthogonal potential improvement for rlibs).
@bjorn3:

> I wonder if instead of allowing rlibs to be linked outside of rustc we could allow C object files to be linked with rustc, so rustc takes the place of the linker, handling any orchestration for rust libraries and then forwarding to the system linker or any other linker provided with -Clinker.
I don't think that would work in practice. For the cases I'm interested in, the linking step is immensely complex with lots of variants, the amount of code is enormous and mostly C++, with Rust comprising probably a fraction of a percent. It would be very hard to justify reengineering the build process to do linking via `rustc`. (The only way I can see that happening is if rustc had a wrapper to allow it to be a drop-in replacement for the existing linker. But if that were possible, I'm not sure what the benefit would be, since it wouldn't be able to do any Rust-specific things.)
For the cases where the top-level binary is Rust, we do use `rustc` as the linker, and inject all the non-Rust objects via `-Clink-arg` etc. so they can be consumed by the underlying linker.
Somewhat unrelated aside: What's the distinction between `--emit` and `--crate-type`? For the most part I think the options make sense to me, but using `--crate-type` for `cdylib`/`staticlib` seems like an exception; they seem a much closer match for `--emit`. But I don't have a crisp mental model for distinguishing when which option is the most appropriate. `--crate-type` for "binary" vs "library" makes sense, and `--emit` for "deps"/"asm" seems reasonable. But the middle ground of "rlib" vs "dylib" vs the native link variants seems like it could also be emit. And then there's the problem that `--emit metadata`'s meaning changes depending on what other emit options you have.
> That line you mentioned ignores the `Option` and only returns the `OutputType`.
Ah, you are right of course. Feeling stupid now :slightly_smiling_face:
Not giving up though; I'll suggest one more way in which the `is_compatible_with_codegen_units_and_single_output_file()` function can live up to its name. Looking at the code using the list of incompatible output types (shortened a bit):
```rust
if !incompatible.is_empty() {
    match codegen_units {
        Some(n) if n > 1 => {
            if matches.opt_present("o") {
                // <...>
                codegen_units = Some(1);
            } // ELSE don't touch codegen_units?
        }
        _ => {
            codegen_units = Some(1);
        }
    }
}
```
I.e. it appears cgu is set to 1 only if the `-o` option is used, or if it was not set. Testing this:
```
$ echo 'pub mod mod1 { pub fn f() {} } pub mod mod2 { pub fn f() {} }' | rustc --crate-type lib --emit obj -C codegen-units=2 --out-dir=/tmp/example -
warning: ignoring emit path because multiple .o files were produced
warning: 1 warning emitted

$ ls /tmp/example
rust_out.rust_out.b25ea2e0-cgu.0.rcgu.o  rust_out.rust_out.b25ea2e0-cgu.1.rcgu.o
```
So it would appear we can get codegen-units > 1 with `--emit obj`, albeit with a warning?
Of course, a mode for emitting archives as @tmandry suggested would in any case be preferable :slightly_smiling_face:
> So it would appear we can get codegen-units > 1 with --emit obj, albeit with a warning?
It's a bit awkward if we don't know what their filenames are, though. Ideally it's something we could know in advance (e.g. by specifying them, or they're computable in a straightforward way). I guess we could use artifact notifications, or just inspect what was left in a previously empty directory after the fact.
Another vote for an archive output; both `.rlib` and `.a` are fine for us. FTR, in our setup lld links `.rlib` even with metadata just fine. I can see how having a `--emit=archive=<file>` would be nice to use with pipelining (where metadata and rlib are already split into separate outputs).

Also +1 to @jsgf; in Bazel it's preferable to specify output filenames explicitly as well.
It seems the discussion is converging on the question of the output file. Do folks have opinions on generating allocator shims? @bbjornse wrote:

> Unaffected by the choice between -emit=obj, linking rlibs, or linking with the output of some new crate type staticlib-nobundle. The decision about what shim to use cannot be taken until link time, as only then do we know what shim should be used.

In our setup we actually know what allocator to use before link time (we do not support specifying the allocator in the source code), so having a mode where we can ask rustc to emit specific shims would be super useful for us. Something like `--emit=system_allocator_shims`. Are there other shims besides the allocator?
I came across an issue with using an archive output. The linker drops object files from an archive when there is no unresolved symbol that the object file satisfies. In particular, this means that in a test binary, a file of unit tests which are registered with static initializers produces object files that get dropped if linked through an archive. This happens because the object files have no external calls into them, so have no unresolved symbols. This gets especially interesting in Rust where a .rs file can be split into many object files, so even hacks to call some method in each .rs file do not resolve it.
We are working around this while we're linking .rlib files by passing --whole-archive to the linker for the rlibs. However not all platforms support --whole-archive. The AIX linker for example does not: https://chromium-review.googlesource.com/c/chromium/src/+/3470980/
This means we need to either:
But I've also noticed that files cannot be extracted from a thin archive, which was proposed earlier: https://github.com/rust-lang/rust/issues/73632#issuecomment-1038604003.
Static initializers don't work reliably even when using rustc as the linker. I think you probably shouldn't use static initializers with Rust for the time being, until it is fixed.
Good points, thanks @danakj! I apologize for long reply, I had a late evening coffee and now I'm out of control :)
I'll try to reply to your points in the order you wrote them.
Static initializers are problematic in a purely C++ world too, and using `--whole-archive` is in many situations the right solution. I understand that in your scenario you have to use whole-archive. Would you agree that this is not the most common case, though? For example, in our codebase I see an order of magnitude fewer 'whole-archived' libraries (alwayslink in Bazel lingo). Also, we don't want to default to 'whole-archive' in the whole build, because that would have code-size consequences.
Your scenario is maybe related to https://github.com/rust-lang/rust/issues/64191. Assuming we can split compilation and linking, and the result of compilation is an archive, it seems we want to link the archive for "crate-type=bin" as 'whole-archive' (I actually don't know, I'm just guessing). Of course, if the result of compilation is a directory with objects, instead of 'whole-archiving' we could just put them on the linker command line directly, to achieve the same effect.
Are you sure AIX linker is not doing an equivalent of "whole-archive" by default? From https://www.ibm.com/docs/en/aix/7.1?topic=l-ld-command I read:
The ld command processes all input files in the same manner, whether they are archives or not. It includes the symbol tables of all objects, discarding only symbol definitions that duplicate existing symbols. Unlike some other versions of the ld command, you do not need to order archive files so references precede definitions. Furthermore, you do not need to list an archive file more than once on the command line
That gives me the impression that AIX doesn't do link-time dead-code elimination (or it does, but uses a fixpoint iteration to resolve symbols first)? Later, in the doc for `cdtors`:
The linker gathers information about C++ static constructor or destructor functions and saves this information in the output file. The incl suboption tells the linker which archive members to search when creating the saved information. The possible values are: all Searches all members of all archives for constructor or destructor functions. This is the default. ...
And that is maybe saying that stuff is 'whole-archived' by default. Honestly I don't really understand what that page says, but if you have access to AIX machine, maybe it's worth trying?
I'll ignore the AIX issue for now
Of course both these outputs are equivalent, you can always create an archive from objects, and unarchive an archive to get objects. Maybe the question is what is more convenient more often. I come from the Bazel perspective, and that colors my preference:
@jsgf to make sure I'm not completely wrong. I understand thin archives as a disk-space-saving mechanism for C/C++ builds, where each .o file is created by a separate compiler invocation. The compiler typically writes the .o file to disk. Then a different tool is used to create an archive, which normally just copies the contents of the .o files into a new .a file. So in the end each .o file's content is stored twice on disk. Thin archives allow us to reference the original .o file instead of copying the content. In our workloads this optimization is superseded by --start-lib/--end-lib support in gold or lld (there you don't even need to create the thin archive).
For rustc, it seems that it should keep the object file content in memory and only write the archive to disk, so there is nothing to reference from the thin archive. Unless we use incremental compilation and rustc is able to update only specific object files (and here probably comes @jsgf to teach me a lesson :), in which case it would be beneficial to write object files separately.
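For concreteness, the copy-vs-reference difference can be shown with GNU `ar` (the member files below are junk stand-ins rather than real objects, which `ar` happily accepts; assumes GNU binutils):

```shell
set -e
# Create two ~100 KB stand-in "object" files.
head -c 100000 /dev/urandom > big1.o
head -c 100000 /dev/urandom > big2.o

ar rc  normal.a big1.o big2.o   # regular archive: member contents are copied
ar rcT thin.a   big1.o big2.o   # 'T' modifier: thin archive, stores paths only

ls -l normal.a thin.a           # thin.a is only a few hundred bytes
```

If the original .o files are deleted or moved, a thin archive becomes useless, which is why it is purely a build-tree optimization.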
@bjorn3 I believe that with the recent push for stabilizing native linking modifiers (https://github.com/rust-lang/rust/pull/93901) we may be getting there slowly but steadily :)
Native linking modifiers fix it for C static libraries. They don't apply to regular Rust dependencies.
Yup, what I meant was that with that there is a way to make C/C++ static initializers work correctly when using rustc as the linker driver.
Thanks for the detailed reply @hlopko, what you said makes a lot of sense.
I didn't think that AIX might always link in everything - that's very good news. I agree with your reading of those quotes. Second, thanks for going over thin archives for me, that makes a lot of sense and I am not concerned then, as rustc would not be producing such.
Regarding single file vs multiple files, GN also constructs a build graph up front, like Bazel. My understanding of the tool, however, is that it groups inputs by file type. So a linker line looks something like:
```
rspfile_content = "{{inputs}} {{solibs}} {{libs}} {{rlibs}}"
```
Where the variables here come from GN itself, not from the user-written build scripts or templates. The {{inputs}} here are .o files, and come from C++ files in the current library. If Rust were to produce a library file, it gets grouped with all the other libraries, making it difficult to --whole-archive just the single library that is the peer of the C++ .o files for this mixed-language library. So I've been thinking of how to make Rust closer to C++ such that the .o files are peers, and one way would be to have the build work with .o files. However like you say, a dynamic set of files poses challenges. In the end it's just software and we can make anything work with enough effort, so having good properties that lead to efficiencies is what matters. Toward that I can agree that a deterministic (single) output is good.
Last, I don't want to hijack this bug with problems of static initializers themselves - I know there's problems being dealt with elsewhere. The point was just to explain the problems we're having with linking rustc output as a library with C++ output, that would not be present if we could just work with object files across the board. I should note we're not using rustc to make the binary, as it's the C++ linker doing that work, so the --crate-type is never executable for rustc, and we wouldn't benefit from https://github.com/rust-lang/rust/issues/64191 as a result. In Bazel systems my understanding is this problem doesn't occur in C++ because tests are linked into a dynamic library, thus all objects are kept. If a static library is used for the tests, then --whole-archive is needed in the same way.
So tl;dr I see your points and I certainly hope it would be reasonable to teach GN to treat a "peer" library, for a mixed-language inter-dependent target, differently from the others. But if that failed, I'm also encouraged to know we could extract the objects.
@hlopko
[thin archives]
Yeah, I think you're right. From my perspective, thin archives are mostly a disk IO/copy bandwidth saving rather than specifically for disk space saving, though that's a nice side benefit. If the flow is cc -> .o -> ar -> .a, then it can get expensive if you have, for example, lots of debug info enabled (typically 10x the .text size). But if you're right that llvm directly emits .a without an intermediate .o (and I think you are), then that's moot.
`--start-lib`/`--end-lib` does give similar benefits to thin archives, but unfortunately they're incompatible with link groups (`--start-group`/`--end-group`). Though that's only relevant for `gold`, since `lld` effectively acts as if the entire link line were in a link group.
And for the record, my primary interest is building with Buck which shares many characteristics with Bazel.
An additional constraint with all this is that we need to have explicit dependencies on all the sysroot crates, including all their dependencies, expressed as `prebuilt_rust_library` rules (for now, though we're also looking into building libstd from source with Buck rules). In addition, we must prevent rustc from implicitly using the sysroot. (This isn't strictly necessary for pure Rust builds, but it is to get the right sysroot libraries to a final C++ linker.)
This turns out to be surprisingly hard - not the dependencies themselves, but working out when a crate has a dependency on a sysroot crate. (I have a little python script which can generate sysroot deps from the crate metadata itself.)
We can generally assume everything depends on `std` (or `core` for `#![no_std]` crates). This is reasonably simple (but awkward, because you can't identify no_std crates from the metadata alone). But it seems you can also define a dependency on `rustc-std-workspace-(std|core|alloc)` or `compiler_builtins` in a Cargo.toml, which cargo recognizes and basically suppresses on the assumption that rustc will just pluck that dependency from its sysroot. We must suppress sysroot access and replace it with an explicit dependency on the appropriate sysroot rule targets.
This seems reasonably easy to do in an ad-hoc way for small examples, but I've yet to make it work for a full-scale build using a non-trivial number of third-party crates.io deps.
But it seems you can also define a dependency on rustc-std-workspace-(std|core|alloc) or compiler_builtins in a Cargo.toml, which cargo recognizes and basically suppresses on the assumption that rustc will just pluck that dependency from its sysroot.
No, the rustc-std-workspace-* crates are on crates.io with empty content. In the Rust build system there is a `[patch.crates-io]` section replacing them with alternative versions which re-export the respective standard library crate. The compiler-builtins version on crates.io is the exact same one used by the standard library (in fact, the standard library gets it from crates.io).
No, the rustc-std-workspace-* crates are in crates.io with an empty content. In the rust build system there is a [patch.crates-io] section replacing them with alternative versions which re-export the respective standard library crate. The compiler-builtins version on crates.io is the exact same used by the standard library. (in fact the standard library gets it from crates.io)
I think I see - I'll need to dig into it some more. But I did notice that Cargo has specific knowledge of the rustc-std-workspace-* names baked into it, and they don't appear as normal resolved dependencies in the `cargo metadata` output. This led me to believe that they're intended to be resolved against whatever sysroot your rustc is using (since quite a few crates.io packages depend on them, I guess as a result of being used within the sysroot).
I'm working on a major existing C++ project which hopes to dip its toes into the Rusty waters. We:
We can't:
- Use `rustc` for final linking. C++ is the boss in our codebase; we're not ready to make the commitment to put Rust in charge of our final linking.
- Generate a `staticlib` for each of our Rust components. This works if we're using Rust in only one place. For any binary containing several Rust components, there would be binary bloat and potentially violations of the one-definition rule, by duplication of the Rust stdlib and any diamond dependencies.
- Generate a single `staticlib` containing all our Rust components, then link that into every binary. That monster static library would depend on many C++ symbols, which wouldn't be present in some circumstances.

We can either:
- Generate a `staticlib` for each of our output binaries, using `rustc` and an auto-generated `.rs` file containing lots of `extern crate` statements. Or,
- Pass the `rlib` for each Rust component directly into the final C++ linking procedure.

The first approach is officially supported, but is hard because:
- We'd need to generate the `staticlib` as part of our C++ tool invocations. This is awkward in our build system. Our C++ targets don't keep track of Rust compiler flags (`--target`, etc.) and in general it just feels weird to be doing Rust stuff in C++ targets.
- We'd need to run `rustc` to make a `staticlib` for every single one of our C++ link targets. For most of our targets (especially unit test targets) there will be no `rlib`s in their dependency tree, so it will be a no-op. But the presence of this wrapper script will make Rust adoption appear intrusive, and of course will have some small actual performance cost.

The second approach is not officially supported. An `rlib` is an internal implementation format within Rust, and its only client is `rustc`. It is naughty to pass them directly into our own linker command line.

But it does, currently, work. It makes our build process much simpler and makes the use of Rust less disruptive.
Because external toolchains are not expected to consume `rlib`s, some magic is required:
- We need to locate the standard library `rlib`s, which would be easy apart from the fact that they contain the symbol metadata hash in their names.
- We need some command-line magic to alias `__rust_alloc` to `__rdl_alloc` etc.

But obviously the bigger concern is that this is not a supported model, and Rust is free to break the `rlib` format at any moment.

Is there any appetite for making this a supported model for those with mixed C/C++/Rust codebases?
I'm assuming the answer may be 'no' because it would tie Rust's hands for future `rlib` format changes. But just in case, how about the following steps?
- Officially document the two options: use `rustc` as the final linker; or generate a `staticlib` or `cdylib` then pass that to your existing final linker (I think this would be worth explicitly explaining anyway, so unless anyone objects, I may raise a PR).
- Add `rustc --print stdrlibs` (or similar) which will output the names of all the standard library rlibs (not just their directory, which is already possible with `target-libdir`).
- Add a `rustc` option which generates a `rust-dynamic-symbols.o` file (or similar) containing the codegen which is otherwise done by `rustc` at final link time (e.g. symbols to call `__rdl_alloc` from `__rust_alloc`, etc.).
- Document that it's acceptable to pass `rlib`s directly to a non-Rust linker (along with `rust-dynamic-symbols.o` and everything output by `rustc --print stdrlibs`).

A few related issues:
- #64191 wants to split the compile and link phases of rustc. This discussion has spawned from there.
- Linking `rlib`s directly currently requires `-Wl,--start-group`/`-Wl,--end-group` on the linker line. (Per https://github.com/rust-lang/rust/issues/64191#issuecomment-629418541)
- The downsides of the `staticlib`-per-C++-target model happen to be magnified by #73047.

@japaric @alexcrichton @retep998 @dtolnay I believe this may be the sort of thing you may wish to comment upon! I'm sure you'll come up with reasons why this is even harder than I already think. Thanks very much in advance.