Open alkeryn opened 2 years ago
I'm unable to reproduce the issue. I tried stable, beta, nightly, with/without `/etc/hosts`, ...
Thanks to your gdb backtrace, I can also see that the crash seems to be inside `/usr/lib/libnss_myhostname.so.2`, which is a systemd library. This doesn't mean the Rust code isn't responsible for the crash, but the call passes through the glibc `getaddrinfo` function, which should have rejected invalid inputs.
Which versions of the `systemd-libs` and `glibc` packages do you currently have installed? Is your system fully updated?
Hey, yeah, I was unable to reproduce it on my Debian-based VPS.
On the system where it does occur, the `systemd-libs` version is 251.4-1 and the `glibc` version is 2.36-2.
Yes, I updated it yesterday. Isn't it odd that it tries to use that library, knowing it is a static build?
@Urgau oh wait, I put part of my original issue in a comment block, so the conditions were missing.
You need to compile with `rustc -C target-feature=+crt-static main.rs`.
Sorry, I missed that it was commented out.
Okay, thanks for the info.
> Isn't it odd that it tries to use that library, knowing it is a static build?

Well, yes, but mostly no. Generally a static build includes most or all of the libraries it would otherwise link to dynamically, but some libraries aren't linked in at link time and are instead located and loaded at run time. That's the case here for domain name resolution: there are many different ways it could be done, and including all of them isn't possible.
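To make the run-time lookup concrete, here is a minimal sketch (the helper name `resolve` is hypothetical and not taken from the issue). A literal IP address is parsed by the standard library itself, while a host name such as `localhost` is handed to the system resolver, which is the `getaddrinfo` path visible in the backtrace.

```rust
use std::io;
use std::net::{SocketAddr, ToSocketAddrs};

/// Hypothetical helper: resolve a "host:port" string into socket addresses.
fn resolve(host_port: &str) -> io::Result<Vec<SocketAddr>> {
    // Literal IPs are parsed by std directly; host names go to the
    // system resolver (getaddrinfo on Linux).
    Ok(host_port.to_socket_addrs()?.collect())
}

fn main() {
    // No name lookup is needed: the string is already an IP address and port.
    println!("{:?}", resolve("127.0.0.1:8080"));

    // This one needs the system resolver, the path that crashed in the report.
    match resolve("localhost:8080") {
        Ok(addrs) => println!("localhost -> {:?}", addrs),
        Err(e) => eprintln!("lookup failed: {e}"),
    }
}
```

In a regular dynamically linked build, both calls are expected to return normally; the crash discussed here was only reported for the statically linked glibc build.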
> @Urgau oh wait, I put part of my original issue in a comment block, so the conditions were missing. You need to compile with `rustc -C target-feature=+crt-static main.rs`. Sorry, I missed that it was commented out.

Thanks, I was about to ask.
I'm now able to reproduce the crash, and I'm almost 100% sure it's a glibc bug. Unfortunately, `glibc` advises against static linking, so I'm not sure reporting the crash to them will help.
I would, however, advise you to use `musl`, a `glibc` replacement that is known to work with static linking and is supported natively by the Rust compiler. Just install the target with `rustup +nightly target install x86_64-unknown-linux-musl` and build for it with `rustc -C target-feature=+crt-static --target=x86_64-unknown-linux-musl main.rs`.
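When switching between these build setups, it can help to confirm what a given binary was actually compiled with. The following is a small sketch, not something from the thread: the helper name `link_summary` is made up, and it relies only on the compile-time `target_env` and `target_feature = "crt-static"` configuration values.

```rust
/// Hypothetical helper: describe the libc flavour and C runtime linkage
/// this binary was compiled for, based on compile-time configuration.
fn link_summary() -> String {
    let libc = if cfg!(target_env = "musl") {
        "musl"
    } else if cfg!(target_env = "gnu") {
        "glibc"
    } else {
        "other"
    };
    // Set when the C runtime is linked statically (`+crt-static`).
    let crt = if cfg!(target_feature = "crt-static") {
        "static"
    } else {
        "dynamic"
    };
    format!("libc: {libc}, C runtime: {crt}")
}

fn main() {
    println!("{}", link_summary());
}
```

Built with the two commands above, the reported combination would be glibc plus static for the crashing build and musl plus static for the suggested one.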
@Urgau thanks! I do wonder why I can't reproduce it on a Debian server, but that's not important.
I see. Still, I wouldn't have expected a Rust program to segfault without using unsafe. Even though the segfault comes from glibc, couldn't Rust handle it gracefully in one way or another?
Anyway, thanks for the tips!
> I do wonder why I can't reproduce it on a Debian server, but that's not important.

I also tested on a Debian-based system and couldn't reproduce the crash. The problem probably comes from the recent `glibc` upgrade in Arch Linux. This may be a recent regression in `glibc`, but as I said, `glibc` advises against static linking, so I don't know if they will do anything about it.
> I see. Still, I wouldn't have expected a Rust program to segfault without using unsafe. Even though the segfault comes from glibc, couldn't Rust handle it gracefully in one way or another?

The segfault is not in the Rust code; it's in the `systemd` lib, probably because `glibc` passed some invalid values (speculation). There's nothing the Rust runtime can do in this situation: we don't have control over `glibc`, `systemd`, or whatever else.
`SIGSEGV` means invalid memory access. This generally means that some piece of code tried to access a place in memory that it doesn't have permission to access. That could leave some state invalid, corrupting other state and maybe even more. The only sensible thing to do in this situation is to abort.
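Because the crashing process itself cannot recover, a segfault can only be observed from outside that process. The sketch below is Unix-only and purely illustrative: the helper name `terminating_signal` is hypothetical, and a shell that sends `SIGSEGV` to itself stands in for a crashing program.

```rust
use std::io;
use std::os::unix::process::ExitStatusExt;
use std::process::Command;

/// Hypothetical helper: run a shell script and return the number of the
/// signal that terminated it, or `None` if it exited normally.
fn terminating_signal(script: &str) -> io::Result<Option<i32>> {
    let status = Command::new("sh").arg("-c").arg(script).status()?;
    Ok(status.signal())
}

fn main() -> io::Result<()> {
    // Stand-in for a crashing child: the shell sends SIGSEGV to itself.
    match terminating_signal("kill -s SEGV $$")? {
        Some(sig) => println!("child was killed by signal {sig}"),
        None => println!("child exited normally"),
    }
    Ok(())
}
```

The parent only learns that the child died from a signal; the child's state is gone, which matches the point that aborting is the only sensible outcome inside the crashing process.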
Well, thank you for all the details! :) Should we close the issue or report it to the glibc devs?
Reproduced on Manjaro (`Linux ww 5.10.136-1-MANJARO`) with glibc 2.36, same backtrace.
It's probably worth reporting this upstream.
The segfault here is a null pointer dereference on this line: https://github.com/systemd/systemd/blob/b45730389ba025489ec8d445bc91534fef515c28/src/basic/memory-util.c#L12
I suspect that the problem is that thread-locals aren't initialized. Whether that's caused by our unsupported linkage, or it's some other kind of bug in rustc or glibc/systemd is unclear. But I'm a C novice, so that's not saying much.
This is exactly why you should not link glibc statically. Your glibc dlopened the systemd library which probably depends on glibc too and thus brought in a second glibc. That is guaranteed to cause issues.
You should either stop linking glibc statically or switch to a musl target, which supports static linking (and even does so by default today). I don't think upstream glibc would treat this as a bug.
I think it would make sense to print a warning when trying to link glibc statically.
Both of those functions will segfault when trying to resolve localhost on any port if the following condition is met:
`rustc -C target-feature=+crt-static main.rs`
I tried this code:
I expected to see this happen: the address is resolved
Instead, this happened: the program segfaults
`rustc --version --verbose`:
`uname -a`:
(This is the latest Arch Linux. I could not reproduce the bug on another distro, but still, it shouldn't segfault.)
For the backtrace, `RUST_BACKTRACE=1` did not work and gave the following output:
So here is a backtrace made with gdb (don't mind the gef plugin being installed):
Backtrace
```
[ Legend: Modified register | Code | Heap | Stack | String ]
──────────────────────────────────────────────────────────────── registers ────
$rax   : 0x0
$rbx   : 0x3
$rcx   : 0x007ffff7cf838e → 0x310b77fffff0003d ("="?)
$rdx   : 0x007ffff7ff8490 → <_dl_static_dtv+16> add BYTE PTR [rax], al
$rsp   : 0x007fffffffcaf0 → "/proc/sys/net/ipv6/conf/all/disable_ipv6"
$rbp   : 0x007fffffffcc20 → 0x007fffffffcce0 → 0x0000000000000010
$rsi   : 0x007ffff7d99dd5 → 0x6225206125000200
$rdi   : 0x007ffff79c1c88 → 0x0000000000000005
$rip   : 0x007ffff79a5196 → mov r12, QWORD PTR [rax+0x8]
$r8    : 0x0
$r9    : 0x0
$r10   : 0x1000
$r11   : 0x206
$r12   : 0x0
$r13   : 0x007fffffffcaf0 → "/proc/sys/net/ipv6/conf/all/disable_ipv6"
$r14   : 0x007fffffffccf0 → 0x0000000000000000
$r15   : 0x007fffffffcca0 → 0xe1efbb33d283a048
$eflags: [ZERO carry PARITY adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]
$cs: 0x33 $ss: 0x2b $ds: 0x00 $es: 0x00 $fs: 0x00 $gs: 0x00
──────────────────────────────────────────────────────────────────── stack ────
0x007fffffffcaf0│+0x0000: "/proc/sys/net/ipv6/conf/all/disable_ipv6" ← $rsp, $r13
0x007fffffffcaf8│+0x0008: "s/net/ipv6/conf/all/disable_ipv6"
0x007fffffffcb00│+0x0010: "v6/conf/all/disable_ipv6"
0x007fffffffcb08│+0x0018: "all/disable_ipv6"
0x007fffffffcb10│+0x0020: "ble_ipv6"
0x007fffffffcb18│+0x0028: 0xffffffffffffff00
0x007fffffffcb20│+0x0030: 0x0000000000000000
0x007fffffffcb28│+0x0038: 0x007ffff79a4e33 → lea rdx, [rax+0xb]
────────────────────────────────────────────────────────────── code:x86:64 ────
   0x7ffff79a5187 je 0x7ffff79a51d0
   0x7ffff79a5189 lea rdi, [rip+0x1caf8] # 0x7ffff79c1c88
   0x7ffff79a5190 call QWORD PTR [rip+0x1cc82] # 0x7ffff79c1e18
 → 0x7ffff79a5196 mov r12, QWORD PTR [rax+0x8]
   0x7ffff79a519d mov r13, rax
   0x7ffff79a51a0 test r12, r12
   0x7ffff79a51a3 je 0x7ffff79a51e5
   0x7ffff79a51a5 sub r12, 0x1
   0x7ffff79a51a9 mov eax, 0x3ffffe
────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "main", stopped 0x7ffff79a5196 in ?? (), reason: SIGSEGV
──────────────────────────────────────────────────────────────────── trace ────
[#0] 0x7ffff79a5196 → mov r12, QWORD PTR [rax+0x8]
[#1] 0x7ffff79ad6b1 → jmp 0x7ffff79ad518
[#2] 0x7ffff79a1045 → mov rbx, QWORD PTR [rsp]
[#3] 0x7ffff79aa1a6 → _nss_myhostname_gethostbyname4_r()
[#4] 0x7ffff7f248ae → getaddrinfo()
[#5] 0x7ffff7ed5cf6 → std::sys_common::net::{impl#6}::try_from()
[#6] 0x7ffff7ece64c → core::convert::{impl#6}::try_into<(&str, u16), std::sys_common::net::LookupHost>()
[#7] 0x7ffff7ece64c → std::sys_common::net::{impl#5}::try_from()
[#8] 0x7ffff7ece64c → core::convert::{impl#6}::try_into<&str, std::sys_common::net::LookupHost>()
[#9] 0x7ffff7ece64c → std::net::addr::{impl#30}::to_socket_addrs()
───────────────────────────────────────────────────────────────────────────────
gef➤ bt
#0 0x00007ffff79a5196 in ?? () from /usr/lib/libnss_myhostname.so.2
#1 0x00007ffff79ad6b1 in ?? () from /usr/lib/libnss_myhostname.so.2
#2 0x00007ffff79a1045 in ?? () from /usr/lib/libnss_myhostname.so.2
#3 0x00007ffff79aa1a6 in _nss_myhostname_gethostbyname4_r () from /usr/lib/libnss_myhostname.so.2
#4 0x00007ffff7f248ae in getaddrinfo ()
#5 0x00007ffff7ed5cf6 in std::sys_common::net::{impl#6}::try_from () at library/std/src/sys_common/net.rs:205
#6 0x00007ffff7ece64c in core::convert::{impl#6}::try_into<(&str, u16), std::sys_common::net::LookupHost> () at library/core/src/convert/mod.rs:590
#7 std::sys_common::net::{impl#5}::try_from () at library/std/src/sys_common/net.rs:190
#8 core::convert::{impl#6}::try_into<&str, std::sys_common::net::LookupHost> () at library/core/src/convert/mod.rs:590
#9 std::net::addr::{impl#30}::to_socket_addrs () at library/std/src/net/addr.rs:961
#10 0x00007ffff7eb91eb in main::main ()
#11 0x00007ffff7eb9ef3 in core::ops::function::FnOnce::call_once ()
#12 0x00007ffff7eb9159 in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#13 0x00007ffff7eb8fc9 in std::rt::lang_start::{{closure}} ()
#14 0x00007ffff7ecb7bf in core::ops::function::impls::{impl#2}::call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> () at library/core/src/ops/function.rs:280
#15 std::panicking::try::do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> () at library/std/src/panicking.rs:492
#16 std::panicking::try + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> () at library/std/src/panicking.rs:456
#17 std::panic::catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> () at library/std/src/panic.rs:137
#18 std::rt::lang_start_internal::{closure#2} () at library/std/src/rt.rs:128
#19 std::panicking::try::do_call () at library/std/src/panicking.rs:492
#20 std::panicking::try () at library/std/src/panicking.rs:456
#21 std::panic::catch_unwind () at library/std/src/panic.rs:137
#22 std::rt::lang_start_internal () at library/std/src/rt.rs:128
#23 0x00007ffff7eb8fb1 in std::rt::lang_start ()
#24 0x00007ffff7eb9273 in main ()
gef➤
```