Open danieldeankon opened 4 years ago
Transferred the issue to rust-lang/rust. This is a rustc issue, and I don't think there is a lot that Cargo can do about it. rustc is doing work, it just never finishes.
I have bisected this to #73526. cc @cuviper, are there any known issues with the LLVM 11 upgrade causing rustc to go into any kind of infinite loops? When I attached with gdb, it looked like it was deep inside a proc macro (vk-shader-macros).
I'm not aware of any LLVM 11 issues like that.
@cuviper Did you remove I-hang and T-compiler labels intentionally or was it conflicting with the fact that I just added them?
@LeSeulArtichaut I only meant to change the regression label. I used the Android client, so I guess it batched that update, with clobbering. 🤕
Adding them back now, and I’ll mark this conversation as resolved to limit the noise
@rustbot ping llvm
Hey LLVM ICE-breakers! This bug has been identified as a good "LLVM ICE-breaking candidate". In case it's useful, here are some instructions for tackling these sorts of bugs. Maybe take a look? Thanks! <3
cc @camelid @comex @cuviper @DutchGhost @hdhoang @heyrutvik @higuoxing @JOE1994 @jryans @mmilenko @nagisa @nikic @Noah-Kennedy @SiavoshZarrasvand @spastorino @vertexclique
Assigning P-high
as discussed as part of the Prioritization Working Group procedure and removing I-prioritize
.
I couldn't reproduce this issue with my MacOS laptop (x86_64-apple-darwin
). I tried compiling with stable (1.46.0), nightly-2020-09-18
and latest nightly (nightly-2020-09-23
), resulting in a successful compilation in less than 15 minutes. Unless I missed something, I guess this probably means this is a Linux-only issue?
@ehuss Did you also use a Linux machine for your previous bisection?
It would be great if other people could take a look and eventually find an MCVE which would be very helpful to resolve this bug. @rustbot ping cleanup
Hey Cleanup Crew ICE-breakers! This bug has been identified as a good "Cleanup ICE-breaking candidate". In case it's useful, here are some instructions for tackling these sorts of bugs. Maybe take a look? Thanks! <3
cc @AminArria @camelid @chrissimpkins @contrun @DutchGhost @elshize @ethanboxx @h-michael @HallerPatrick @hdhoang @hellow554 @imtsuki @JamesPatrickGill @kanru @KarlK90 @LeSeulArtichaut @MAdrianMattocks @matheus-consoli @mental32 @nmccarty @Noah-Kennedy @pard68 @PeytonT @pierreN @Redblueflame @RobbieClarken @RobertoSnap @robjtede @SarthakSingh31 @senden9 @shekohex @sinato @smmalis37 @spastorino @Stupremee @turboladen @woshilapin @yerke
I can reproduce it on x86_64-unknown-linux-gnu
with rustc 1.48.0-nightly (fb1dc34a8 2020-09-21)
.
I think I found a mvce. (The build is currently running for about ~10 minutes).
Cargo.toml
[dependencies]
vk-shader-macros = "0.2.6"
shader.vert
#version 450
void main() {}
main.rs
const _VERT: &[u32] = vk_shader_macros::include_glsl!("shader.vert");
fn main() {}
I couldn't reproduce this issue with my MacOS laptop [...] and latest nightly (
nightly-2020-09-23
)I can reproduce it on
x86_64-unknown-linux-gnu
withrustc 1.48.0-nightly (fb1dc34a8 2020-09-21)
.
2020-09-21 to 2020-09-23 includes the LLVM 11-rc3 update. @Stupremee can you try with the latest nightly?
Same result
I can confirm, this only happens on Linux. Windows and macOS do not seem to be affected. It doesn't seem to matter if it is --release
or not. Tested with rustc 1.48.0-nightly (8b4085359 2020-09-23)
.
Thanks for making the smaller repro @Stupremee! I imagine it will be difficult to narrow it down more. shaderc-sys is a huge dependency.
Attaching with a debugger, the function to_string
is returning an empty string, causing the code here to go into an infinite loop because it is unable to generate a unique string.
My C++ is fairly rusty, and I don't have any guesses what could cause that apparently straightforward code to misbehave.
Yep, thats what I found out as well. But it's still super weird that it works on stable, which I haven't figured out right now.
My current hypothesis is that LLVM 11 is performing some incorrect optimization of the SPIRV-Tools library, which would explain why it works on stable.
At least, I cannot find any reason that the code linked by @ehuss above would be allowed to have to_string
return an empty string.
(Of course I still need to confirm this hypothesis.)
Another report of shaderc-sys corruption when used as a proc-macro: https://github.com/rust-lang/rust/issues/77532#issuecomment-703302256. Similarly, a seemingly simple ostringstream
construction is returning a corrupt string.
I suspect this may be a C++ symbol clash with rustc's static libstdc++, since proc-macro libraries are loaded in the same process.
Maybe setting RTLD_DEEPBIND
during dlopen
might fix that, rust-analyzer had problems with proc macros until it set that flag
Yeah that fixes it
Long term, we might be better off loading proc macros in a helper process. I think the token stream is already marshalled enough that it could be sent inter-process.
Looks like that wasn't the fix, I was using the wrong LLVM
LD_DEBUG=bindings
may help confirm whether the proc-macro is binding to symbols from our static libstdc++, and which symbols exactly. We can also compare that with a pre-LLVM11 build to see if this issue already existed, perhaps just with less problematic symbols. See man ld.so
for more options -- might want to use a LD_DEBUG_OUTPUT
file too.
I have confirmed this regressed in 7ce71c362be9a89e7897ac066aba6e3e6f747800, at least, which I think we had not done before. That matches our prior assumption of being triggered by an LLVM upgrade.
Currently experimenting with the debug options @cuviper suggested.
https://github.com/Mark-Simulacrum/rust-issue-76980 contains a minimal reproduction with just a cc crate dependency to compile the C++ file. On stable, the proc macro sees 2, on beta it sees 0. I'm still investigating the cause though.
I tried LD_DEBUG
myself on your minimal reproducer, and there are a lot of bindings from /lib64/libstdc++.so.6
to .../lib/libLLVM-1x-rust-1.47.0-nightly.so
, which seems bad. I don't see any new bindings that way -- this was also happening before LLVM 11 -- but there's a sequence of internal bindings that seem to show that we're on a different track:
With e482c86b9 (just before LLVM 11 landed): debug.e482c86b9.477027.gz
binding file /lib64/libstdc++.so.6 [0] to /lib64/libc.so.6 [0]: normal symbol `strcmp' [GLIBC_2.2.5]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9use_facetISt5ctypeIcEERKT_RKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9use_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEERKT_RKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9has_facetISt7num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9use_facetISt7num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEEERKT_RKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSo9_M_insertImEERSoT_' [GLIBCXX_3.4.9]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSo6sentryC1ERSo' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNKSt5ctypeIcE13_M_widen_initEv' [GLIBCXX_3.4.11]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libc.so.6 [0]: normal symbol `memcmp' [GLIBC_2.2.5]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE13_M_insert_intImEES3_S3_RSt8ios_basecT_' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEE8_M_pbumpEPcS5_l' [GLIBCXX_3.4.21]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libc.so.6 [0]: normal symbol `memcpy' [GLIBC_2.14]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSo6sentryD1Ev' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE10_M_replaceEmmPKcm' [GLIBCXX_3.4.21]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt8ios_baseD2Ev' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt8ios_base17_M_call_callbacksENS_5eventE' [GLIBCXX_3.4.6]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt8ios_base20_M_dispose_callbacksEv' [GLIBCXX_3.4.6]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt3_V214error_categoryD2Ev' [GLIBCXX_3.4.21]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt14error_categoryD2Ev' [GLIBCXX_3.4.15]
With 7ce71c362 (with LLVM 11): debug.7ce71c362.477169.gz
binding file /lib64/libstdc++.so.6 [0] to /lib64/libc.so.6 [0]: normal symbol `strcmp' [GLIBC_2.2.5]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9use_facetISt5ctypeIcEERKT_RKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9has_facetISt7num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSo9_M_insertImEERSoT_' [GLIBCXX_3.4.9]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSo6sentryC1ERSo' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt16__throw_bad_castv' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `__cxa_allocate_exception' [CXXABI_1.3]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `__cxa_throw' [CXXABI_1.3]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `__cxa_get_globals' [CXXABI_1.3]
binding file /lib64/libstdc++.so.6 [0] to /lib64/ld-linux-x86-64.so.2 [0]: normal symbol `__tls_get_addr' [GLIBC_2.3]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `__cxa_init_primary_exception' [CXXABI_1.3.11]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt14get_unexpectedv' [GLIBCXX_3.4.20]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt13get_terminatev' [GLIBCXX_3.4.20]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libgcc_s.so.1 [0]: normal symbol `_Unwind_RaiseException' [GCC_3.0]
binding file /lib64/libgcc_s.so.1 [0] to /lib64/libgcc_s.so.1 [0]: normal symbol `_Unwind_Find_FDE' [GCC_3.0]
binding file /lib64/libgcc_s.so.1 [0] to /lib64/libc.so.6 [0]: normal symbol `dl_iterate_phdr' [GLIBC_2.2.5]
binding file /lib64/libgcc_s.so.1 [0] to /lib64/libc.so.6 [0]: normal symbol `strlen' [GLIBC_2.2.5]
binding file /lib64/libgcc_s.so.1 [0] to /lib64/libpthread.so.0 [0]: normal symbol `pthread_once'
binding file /lib64/libstdc++.so.6 [0] to /lib64/libgcc_s.so.1 [0]: normal symbol `_Unwind_GetLanguageSpecificData' [GCC_3.0]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libgcc_s.so.1 [0]: normal symbol `_Unwind_GetRegionStart' [GCC_3.0]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libgcc_s.so.1 [0]: normal symbol `_Unwind_GetIPInfo' [GCC_4.2.0]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNK10__cxxabiv117__class_type_info11__do_upcastEPKS0_PKvRNS0_15__upcast_resultE' [CXXABI_1.3]
binding file /lib64/libgcc_s.so.1 [0] to /lib64/libgcc_s.so.1 [0]: normal symbol `_Unwind_GetCFA' [GCC_3.3]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libgcc_s.so.1 [0]: normal symbol `_Unwind_SetGR' [GCC_3.0]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libgcc_s.so.1 [0]: normal symbol `_Unwind_SetIP' [GCC_3.0]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `__cxa_begin_catch' [CXXABI_1.3]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt9basic_iosIcSt11char_traitsIcEE11_M_setstateESt12_Ios_Iostate' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `__cxa_end_catch' [CXXABI_1.3]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `__cxa_get_globals_fast' [CXXABI_1.3]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libgcc_s.so.1 [0]: normal symbol `_Unwind_DeleteException' [GCC_3.0]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt9exceptionD2Ev' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `__cxa_free_exception' [CXXABI_1.3]
binding file /lib64/libstdc++.so.6 [0] to /home/jistone/.rustup/toolchains/COMMIT/bin/rustc [0]: normal symbol `free' [GLIBC_2.2.5]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSo6sentryD1Ev' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_assignERKS4_' [GLIBCXX_3.4.21]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt8ios_baseD2Ev' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt8ios_base17_M_call_callbacksENS_5eventE' [GLIBCXX_3.4.6]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt8ios_base20_M_dispose_callbacksEv' [GLIBCXX_3.4.6]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt3_V214error_categoryD2Ev' [GLIBCXX_3.4.21]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZNSt14error_categoryD2Ev' [GLIBCXX_3.4.15]
I don't know that we should read anything into that difference -- seems like it could be happenstance. In any case, I think we're on very shaky ground loading dynamic C++ proc macros into the rustc process with our static libstdc++.
This also looks a bit related: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42679
Thanks a lot for all these details in the debugging process, it's really enjoyable to read.
My small contribution, which confirms that things can be right if the symbols are actually the same instead of clashing: With rust 1.47.0 (preview release) compiled with the system llvm (11 rc2), itself built with the system libc++ (11 rc2) which is also used as the default system c++ stdlib, the proc macro ends up using the same libc++ as rustc itself, and the example project outputs 2 as expected.
Jonathan Wakely pointed me to one of his Stack Overflow answers: https://stackoverflow.com/questions/46746878/is-it-safe-to-link-c17-c14-and-c11-objects/49119902#49119902
Where you have problems is if you link together objects compiled with different versions of GCC and you have used unstable features from a new C++ standard before GCC's support for that standard is complete.
But we're using GCC 5.5 and building for C++14, which he tells me should be fine in that regard.
There's another caveat about std::string
vs. std::__cxx11::string
, but our build of LLVM does appear to be using the new ABI there, and my Fedora 32 system is also using the new ABI when building the reproducer
I note that the answer (at least as I interpret it) seems to be about linking together a bunch of code (compiled against and potentially inlining header definitions etc), but still linking to a single copy off libstdc++ -- maybe I'm misinterpreting though?
Yeah, that SO situation is not a perfect match.
When I step through the reproducer's ostream<<
in libstdc++-10.2.1-1.fc32.x86_64
, here's where it goes wrong:
src/foo.cpp
1 #include <string>
2 #include <sstream>
3
4 extern "C" int custom_to_string(unsigned int id) {
5 std::stringstream os;
6 os << id;
7 return os.str().length();
8 }
libstdc++-v3/include/bits/ostream.tcc
60 template<typename _CharT, typename _Traits>
61 template<typename _ValueT>
62 basic_ostream<_CharT, _Traits>&
63 basic_ostream<_CharT, _Traits>::
64 _M_insert(_ValueT __v)
65 {
66 sentry __cerb(*this);
67 if (__cerb)
68 {
69 ios_base::iostate __err = ios_base::goodbit;
70 __try
71 {
>72 const __num_put_type& __np = __check_facet(this->_M_num_put);
73 if (__np.put(*this, *this, this->fill(), __v).failed())
74 __err |= ios_base::badbit;
75 }
76 __catch(__cxxabiv1::__forced_unwind&)
77 {
78 this->_M_setstate(ios_base::badbit);
79 __throw_exception_again;
80 }
81 __catch(...)
82 { this->_M_setstate(ios_base::badbit); }
83 if (__err)
84 this->setstate(__err);
85 }
86 return *this;
87 }
At line 72, this->_M_num_put
is NULL
. Then _check_facet
calls __throw_bad_cast()
, which is caught and line 78 sets badbit
. Back in custom_to_string
, I can indeed see that stream state:
(gdb) p os.rdstate()
$15 = std::_S_badbit
That's why the stringstream
is empty.
I still don't know why we get that NULL
, but there was a tell in what I posted earlier -- just the facet lines...
Before:
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9use_facetISt5ctypeIcEERKT_RKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9use_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEERKT_RKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9has_facetISt7num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9use_facetISt7num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEEERKT_RKSt6locale' [GLIBCXX_3.4]
After:
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9use_facetISt5ctypeIcEERKT_RKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale' [GLIBCXX_3.4]
binding file /lib64/libstdc++.so.6 [0] to /lib64/libstdc++.so.6 [0]: normal symbol `_ZSt9has_facetISt7num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale' [GLIBCXX_3.4]
I guess that may need stepping through the stream constructor, but that's a mess of locale code...
If I change all dynamic symbols in libLLVM-11-rust-1.48.0-nightly.so
that are of binding STB_GNU_UNIQUE and I change them to binding GLOBAL, the problem goes away.
fn main() {
let mut file = std::fs::File::open("libLLVM-11-rust-1.48.0-nightly.so").unwrap();
let mut elf = elfkit::elf::Elf::from_reader(&mut file).unwrap();
elf.load_all(&mut file).unwrap();
let dynsym = elf.sections.iter_mut().find(|sec| sec.name == b".dynsym").unwrap();
let syms = dynsym.content.as_symbols_mut().unwrap();
for sym in syms {
if sym.bind == elfkit::types::SymbolBind::STB_GNU_UNIQUE {
sym.bind = elfkit::types::SymbolBind::GLOBAL
}
}
let mut f = std::fs::File::create("test.so").unwrap();
elf.to_writer(&mut f).unwrap();
}
I found this by observing (with LD_DEBUG) that symbols with version LLVM_11
are being bound to symbols with version GLIBCXX_3.4
. This is not normal behavior. I'm guessing there are some special rules for STB_GNU_UNIQUE
.
25504: binding file /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0] to /home/jethro/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-11-rust-1.48.0-nightly.so [0]: normal symbol `_ZNSt8numpunctIcE2idE' [GLIBCXX_3.4]
25504: binding file /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0] to /home/jethro/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-11-rust-1.48.0-nightly.so [0]: normal symbol `_ZNSt8numpunctIcE2idE' [GLIBCXX_3.4]
25504: binding file /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0] to /home/jethro/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-11-rust-1.48.0-nightly.so [0]: normal symbol `_ZNSt7__cxx118numpunctIcE2idE' [GLIBCXX_3.4.21]
25504: binding file /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0] to /home/jethro/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/../lib/../lib/libLLVM-11-rust-1.48.0-nightly.so [0]: normal symbol `_ZNSt7__cxx118numpunctIcE2idE' [GLIBCXX_3.4.21]
...
Should we be building the LLVM shared object with some flags to only export LLVM symbols, not other unrelated symbols?
I'm not sure, but we may need libstdc++
interop symbols between librustc_driver.so
(with crate rustc_llvm
's C++ wrappers) and libLLVM.so
.
I do see some STL use in rustc_llvm, but that looks to be all inlined/linked statically based on objdump -TC librustc_driver-cc82839033d44d0f.so|grep LLVM_11
I tried reading the LLVM CMake files but I can't make much out of it. According to https://github.com/llvm/llvm-project/blob/master/llvm/tools/llvm-shlib/CMakeLists.txt#L1-L3 by default the shared object should only contain the C functions, but that's clearly not the case for us and I don't see how we're enabling any deviation from the default.
Alternatively, we might compile LLVM with -fno-gnu-unique
.
FWIW, I can also reproduce this with my own local rust/llvm build configured with [llvm] static-libstdcpp = true
-- so the versions of the static and dynamic libstdc++ are identical.
Hmm, but it also requires [llvm] link-shared = true
. When I set that false
, the reproducer behaves properly.
For this post I consulted with my friendly neighborhood LLVM expert, @petrhosek.
The libstdc++ symbols get exported from libLLVM.so
because we don't compile with -fvisibility=hidden
. If they didn't get exported the dynamic linker wouldn't override them. However, turning on this flag would likely break the rustllvm
bridge, which calls into LLVM's private C++ ABI.
So it looks like our options are one of:
rustllvm
into libLLVM.so
somehow (@petrhosek said we could build a libLLVM.a
and then link that into a rustllvm
shared library, but this would likely require adding support for building a static library to LLVM's build system)I've opened up https://github.com/rust-lang/rust/pull/77634 to switch off ThinLTO in our x86_64-unknown-linux-gnu CI, which I believe would mean that we end up with a statically linked LLVM. It's not something I want to merge into master, since it currently would break downloading LLVM from CI (I think), but it is something I would feel comfortable backporting to stable, presuming the performance hit is not too big. (It's also performance every other platform has, which is not entirely unreasonable).
This is essentially option 3 above.
I suspect that in the long run option 4 is the right thing for us -- it seems feasible to do, at least, and we could also replace rustllvm with a commit directly to LLVM since we maintain a fork anyway which expands the C ABI for us (I think this is true)?
we could also replace rustllvm with a commit directly to LLVM since we maintain a fork anyway which expands the C ABI for us (I think this is true)?
Sort of, but not really. I guess you can say that rustc_llvm
(formerly rustllvm
) is that expanded C ABI, because it wraps C++ calls in an extern "C"
API for Rust FFI. But this stands alone so it can be used with external LLVM too. The only added APIs within our forked LLVM are MCSubtargetInfo::getCPUTable
and getFeatureTable
, which are still C++, and neither are essential. It would be leaving external LLVM users (distros) behind if we made rust-forked APIs mandatory within LLVM.
Okay, looking at results from #77634, it looks like that's a very viable fix for stable (and likely 1.48 beta).
Wall times https://perf.rust-lang.org/compare.html?start=9fdaeb393a16951f6fdef087193fef576e36aba6&end=6f967be1d4ff00423ed16df7cb77f4074a9d6b80&stat=wall-time are pretty good -- definitely not bad enough that we can't ship it.
While the options @tmandry lists would work, they're not necessary. As I already pointed out, the C++ symbols in libLLVM.so have different symbol versions (LLVM_xxx
), so shared objects that were linked against the system libstdc++ (GLIBCXX_xxx
) should never link to them. There's just an oddity with some of the symbols where the dynamic loader is trying to make them globally unique in the process despite the symbol versioning that's applied. Additionally, rustc_llvm doesn't actually link to the STL symbols exported by libLLVM.so.
I don't know why this changed between LLVM 10 and LLVM 11, but here you can clearly see the difference:
$ objdump -T stable-x86_64-unknown-linux-gnu/lib/libLLVM-10-rust-1.46.0-stable.so |grep _ZNSs4nposE
000000000077cfd8 w DO .rodata 0000000000000008 LLVM_10 _ZNSs4nposE
^
weak symbol
$ objdump -T nightly-x86_64-unknown-linux-gnu/lib/libLLVM-11-rust-1.48.0-nightly.so | grep _ZNSs4nposE
00000000008214b8 u DO .rodata 0000000000000008 LLVM_11 _ZNSs4nposE
^
STB_GNU_UNIQUE symbol
So, again, here are the options I propose that are much less “invasive” in the sense that they don't affect linking flows or code generation, just what symbols are exported from the libraries we already build and how:
STB_GNU_UNIQUE
-bound symbols, in one of three ways:
a. Find out what changed between LLVM 10 and LLVM 11 to see if there's a flag we can set to prevent this behavior.
b. Compile LLVM with the -fno-gnu-unique
compiler flag
c. Run the tool from https://github.com/rust-lang/rust/issues/76980#issuecomment-704480880 on libLLVM.so after LLVM is built.Started building a pre-release with ThinLTO disabled in https://github.com/rust-lang/rust/pull/77642, so if that works I think we're covered for 1.47.0. We still need to fix it for 1.48.0 onwards, but for them we have the time to think about and test a proper solution (@jethrogb's proposals are interesting).
The pre-release is out! You can download it following these instructions: https://blog.rust-lang.org/inside-rust/2020/10/07/1.47.0-prerelease-2.html
Problem I am seeing the build process hang at the last few steps when attempting to build this particular repository on Linux using either the nightly or the beta toolchain. I've seen this happen for the repository both when building locally (Linux Mint) and in Github's CI (Ubuntu). This does not happen with the stable toolchain, and I have not seen it happen with any toolchain on Windows.
I expect to see the build process finish in approximately 20 minutes like it does with the stable toolchain, but I observe a hanging build. If there is a problem with the build process, I expect to see an indication of what it is, especially if I pass a verbose flag, but I observe no further output once it gets into the hanging state---the only output I see is from steps that have already finished, which provides no insight into why it hangs.
Steps
Switch to nightly toolchain:
Fetch the following repository:
Attempt to build the binaries:
The last thing I see before it gets into a hanging state:
Notes
I was able to reproduce this behaviour in Github CI here: https://github.com/danieldeankon/hypermine/pull/2/checks?check_run_id=1138809646
CI / test (ubuntu-latest, stable) finished in 17m 39s. CI / test (ubuntu-latest, beta) has been going for two hours and counting at time of writing, and similarly for CI / test (ubuntu-latest, nightly).
Output of
cargo version
: cargo 1.48.0-nightly (8777a6b1e 2020-09-15)My operating system is Linux Mint 19.2 Cinnamon.