Closed bjorn3 closed 4 years ago
When replacing getopts/src/lib.rs with #[derive(Debug)] enum Name { Short(char) }
and replacing libtest/lib.rs
with use getopts;
, it still fails to link with:
error: linking with `cc` failed: exit code: 1
|
= note: "cc" "-m64" "-L" "/Users/bjorn/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "/Users/bjorn/Documents/rustc_codegen_cranelift/build_sysroot/target/debug/deps/test.4ovhqzay5djkkv45.rcgu.o" "-o" "/Users/bjorn/Documents/rustc_codegen_cranelift/build_sysroot/target/debug/deps/libtest.dylib" "-Wl,-exported_symbols_list,/var/folders/pr/dsmnz7795h18nncdpws5d9h00000gp/T/rustc8q1pCg/list" "/Users/bjorn/Documents/rustc_codegen_cranelift/build_sysroot/target/debug/deps/test.4efq6d1me4ylpya9.rcgu.o" "-Wl,-dead_strip" "-nodefaultlibs" "-L" "/Users/bjorn/Documents/rustc_codegen_cranelift/build_sysroot/target/debug/deps" "-L" "/Users/bjorn/Documents/rustc_codegen_cranelift/build_sysroot/target/debug/build/backtrace-sys-76a10846402b651c/out" "-L" "/Users/bjorn/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "-Wl,-force_load" "-Wl,/var/folders/pr/dsmnz7795h18nncdpws5d9h00000gp/T/rustc8q1pCg/libgetopts-45b52c2aaff9e689.rlib" "-Wl,-force_load" "-Wl,/var/folders/pr/dsmnz7795h18nncdpws5d9h00000gp/T/rustc8q1pCg/libunicode_width-455d06287369bcce.rlib" "-Wl,-force_load" "-Wl,/var/folders/pr/dsmnz7795h18nncdpws5d9h00000gp/T/rustc8q1pCg/librustc_std_workspace_std-c5d4248dcba5f655.rlib" "-L" "/Users/bjorn/Documents/rustc_codegen_cranelift/build_sysroot/target/debug/deps" "-lstd" "/var/folders/pr/dsmnz7795h18nncdpws5d9h00000gp/T/rustc8q1pCg/libcompiler_builtins-825863beebf37c8c.rlib" "-lSystem" "-lresolv" "-lc" "-lm" "-dynamiclib" "-Wl,-dylib"
= note: Undefined symbols for architecture x86_64:
"_$LT$$RF$T$u20$as$u20$core..fmt..Debug$GT$::fmt::hc095cc20a601c01a", referenced from:
l_anon.70a160287dcd40dee570aa25e956b0d9.1 in libgetopts-45b52c2aaff9e689.rlib(getopts-45b52c2aaff9e689.1kczxwl8hmq7fci7.rcgu.o)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
It fails to link on macOS. It works fine on Linux however. I haven't tested it on Windows.
@rustbot modify labels: +O-macos
triage: P-high for now (until I can prove to myself it deserves lower priority). Removing I-nominated label. Assigning to self.
cc @alexcrichton and @michaelwoerister
Would it be possible to get a reduced test case that involves just poking around rustc to work with?
@alexcrichton https://github.com/rust-lang/rust/issues/64872#issuecomment-536184496 has been my reduced test case for now. I tried to reproduce it with a standalone example, but I failed. I may be able to reduce it more in a few days, when I get to use a macOS again.
Ah true yeah, but I'm also curious if this can be reproduced without using the cranelift backend, because it seems like this could be a bug in that backend vs in the common shared code?
It can be reproduced without the cranelift backend. I thought that it was a bug in the cranelift backend at first, but when I couldn't find a difference between cg_llvm and cg_clif, I tried to use cg_llvm, which failed too. Running cargo build
is enough. I linked to the cg_clif repo, because I know that that specific setup to compile libtest fails.
Ok I want to poke around, but it doesn't currently reproduce because patches fail. Would it be possible to fix that and/or try to reduce further to not require a bunch of rustup stuff and patching sysroot files directly?
I am not sure if those patches are actually required to reproduce the problem. I am currently working on fixing the patches though.
Pushed a new set of patches.
The patches are not necessary. I have uploaded a repro with just the necessary Cargo.toml and a script to fetch the sysroot src at https://github.com/bjorn3/rust_issue_64872_repro.
Sorry I do not have time to continue to investigate this and help minifying, but wanted to be sure to say so.
No problem. I will try to minify it myself once I have time to do so.
I've reproduced the issue via the repository @bjorn3 linked above. I'll see if I can do any mini-fication.
nominating for discussion: I want T-compiler to consider whether, in the short term, we should revert PR #64324 as a way to address this bug.
(It seems to me like the risks of linker errors due to mixed optimization levels are less than the risks presented here in this ticket, and that is an argument for reverting #64324 in the short term.)
In the meantime I will continue to investigate the bug independently, but the higher level bit of whether to revert seems worth discussion.
discussed in yesterdays T-compiler meeting. There was some support for hypothetically reverting PR #64324 in the short term. But there was also concern that no one would do the follow-up work afterward, and also that it would be a shame to lose the various improvements we got from PR #64324
Overall we decided to leave this nominated to be revisited next week, and in the meantime I will try to minimize the bug and maybe identify what its root cause is.
I spent some time today working to minimize the input that stills sees a linker failure.
Here's what I've gotten down to so far:
// libtest/src/lib.rs
#![crate_name = "test"]
extern crate getopts;
// getopts/src/lib.rs
#[derive(Debug)]
enum Name { _Short(u32) }
// libstd/lib.rs
#![no_std]
#![feature(needs_panic_runtime)]
#![needs_panic_runtime]
#![feature(lang_items)]
#![feature(rustc_attrs)]
#![feature(alloc_error_handler)]
#![feature(allocator_internals)]
#![default_lib_allocator]
extern crate alloc as alloc_crate;
use core::panic::PanicInfo;
#[panic_handler]
pub fn rust_begin_panic(_info: &PanicInfo<'_>) -> ! {
loop { }
}
pub mod prelude {
pub mod v1 {
pub use core::prelude::v1::{Debug};
}
}
use alloc_crate::alloc::Layout;
#[alloc_error_handler]
pub fn rust_oom(_layout: Layout) -> ! { loop { } }
#[rustc_std_internal_symbol]
pub unsafe extern fn __rdl_alloc(_size: usize, _align: usize) -> *mut u8 { loop { } }
#[rustc_std_internal_symbol]
pub unsafe extern fn __rdl_dealloc(_ptr: *mut u8,
_size: usize,
_align: usize) { loop { } }
#[rustc_std_internal_symbol]
pub unsafe extern fn __rdl_realloc(_ptr: *mut u8,
_old_size: usize,
_align: usize,
_new_size: usize) -> *mut u8 { loop { } }
#[rustc_std_internal_symbol]
pub unsafe extern fn __rdl_alloc_zeroed(_size: usize, _align: usize) -> *mut u8 { loop { } }
(And I confirmed that this is still tickling the same bug, because when I change the Cargo.toml
for libstd to say crate-type = ["cdylib", "rlib"]
instead of crate-type = ["dylib", "rlib"]
, it compiles without a linker error.)
More data: I've reduced all of libcore
, libstd
, getopts
, libtests,
and the crate using them down to something really small. (I'll put the reduced test case up in its own repo, or a gist, later today...)
It seems like the problem here manifests itself when both libcore
and getopts
have a derive(Debug)
on a struct with a field of the same type. In my particular minification, the two crates both have structs that eventually get to a field of type u32
.
if you eliminate that commonality, then the linking proceeds successfully.
This makes me hypothesize that the bug here might be something where the compilation of the downstream crate (getopts
here) is able to see that there was a monomorphization of fmt::Debug
for u32
(or maybe &u32
, not certain), and so the compiler decides that getopts
should be able to link to that rather than generate its own monomorphization of fmt::Debug
for the same type
&u as &dyn fmt::Debug
, as described in my readme@pnkfelix Can you take a look at the symbols defined in the upstream crate? Is the missing function present there (but maybe as a local symbol). nm
should do the trick.
(And I confirmed that this is still tickling the same bug, because when I change the Cargo.toml for libstd to say crate-type = ["cdylib", "rlib"] instead of crate-type = ["dylib", "rlib"], it compiles without a linker error.)
This sounds like rustc
assuming that it can re-use a monomorphization from a Rust dylib, even though that isn't possible anymore since #64324. Maybe it has something to do with libstd
being available both as rlib and as dylib? (cc @alexcrichton)
@michaelwoerister here is some nm
output:
% nm -m target/debug/deps/libstd.dylib
0000000000000f50 (__TEXT,__text) non-external (was a private external) __ZN4core18real_drop_in_place17h4b4dd2bb45fd3677E
0000000000000f60 (__TEXT,__text) non-external (was a private external) __ZN4core6Object6method17h3c307b2b614e132dE
0000000000000f20 (__TEXT,__text) external __ZN4core6unused17hd61432153ff90e94E
0000000000000ea0 (__TEXT,__text) external ___rdl_alloc
0000000000000f00 (__TEXT,__text) external ___rdl_alloc_zeroed
0000000000000ec0 (__TEXT,__text) external ___rdl_dealloc
0000000000000ee0 (__TEXT,__text) external ___rdl_realloc
0000000000000f70 (__TEXT,__text) non-external (was a private external) ___rust_probestack
0000000000000e80 (__TEXT,__text) external _rust_eh_personality
0000000000001020 (__DATA,.rustc) external _rust_metadata_std_c479832d91af15fa73418ccc139c3e9
(undefined) external dyld_stub_binder (from libSystem)
%
For this reduction, the symbol we are looking for is __ZN4core6Object6method17h3c307b2b614e132dE
, so that does seem to confirm the hypothesis that rustc
is making assumptions about what monomorphizations (or vtables? Take a look at my minimization, linked above) it can re-use.
@michaelwoerister wrote:
Maybe it has something to do with libstd being available both as rlib and as dylib?
I don't think this explains it. I tried removing rlib
from all the crate-type
settings in the Cargo.toml
s (so that they all say crate-type = ["dyllb"]
), and the bug still reproduces then.
There was further discussion with @michaelwoerister and @alexcrichton on zulip. I think I'll have a patch up today that should address the problem here, hopefully without causing any problems anywhere else.
Removing nomination label since it seems like this is in hand.
By the way, @alexcrichton and I managed to construct a very nice minimized version of the bug here: https://gist.github.com/alexcrichton/b3ba3becdf2009973270d2aef3453670
(based on my further testing, this is readily reproducible without using any unstable features, and thus is a stable-to-beta regression. See discussion in #65781 )
Probably regressed in #64324 (Fix mixing crates with different
share_generics
)