Closed ckaran closed 5 years ago
I get the same issue on nightly:
Thread 1 "veloren-voxygen" received signal SIGILL, Illegal instruction.
0x00005555559e9f05 in x11_dl::xlib_xcb::Xlib_xcb::open () at /home/o01eg/.cargo/registry/src/github.com-1ecc6299db9ec823/x11-dl-2.18.3/src/link.rs:65
65 }
(gdb) disassemble
Dump of assembler code for function x11_dl::xlib_xcb::Xlib_xcb::open:
0x00005555559e9eb0 <+0>: push %rbx
0x00005555559e9eb1 <+1>: sub $0x30,%rsp
0x00005555559e9eb5 <+5>: mov %rdi,%rbx
0x00005555559e9eb8 <+8>: lea 0x307429(%rip),%rsi # 0x555555cf12e8
0x00005555559e9ebf <+15>: lea 0x4d5c72(%rip),%rcx # 0x555555ebfb38
0x00005555559e9ec6 <+22>: lea 0x8(%rsp),%rdi
0x00005555559e9ecb <+27>: mov $0xa,%edx
0x00005555559e9ed0 <+32>: mov $0x2,%r8d
0x00005555559e9ed6 <+38>: callq 0x5555559e89a0 <x11_dl::link::DynamicLibrary::open_multi>
0x00005555559e9edb <+43>: cmpq $0x1,0x8(%rsp)
0x00005555559e9ee1 <+49>: jne 0x5555559e9f05 <x11_dl::xlib_xcb::Xlib_xcb::open+85>
0x00005555559e9ee3 <+51>: movups 0x10(%rsp),%xmm0
0x00005555559e9ee8 <+56>: movups 0x20(%rsp),%xmm1
0x00005555559e9eed <+61>: movups %xmm1,0x18(%rbx)
0x00005555559e9ef1 <+65>: movups %xmm0,0x8(%rbx)
0x00005555559e9ef5 <+69>: movq $0x1,(%rbx)
0x00005555559e9efc <+76>: mov %rbx,%rax
0x00005555559e9eff <+79>: add $0x30,%rsp
0x00005555559e9f03 <+83>: pop %rbx
0x00005555559e9f04 <+84>: retq
=> 0x00005555559e9f05 <+85>: ud2
Minimal example:
extern crate x11_dl;
use x11_dl::xlib_xcb;

fn main() {
    unsafe {
        let xlib_xcb = xlib_xcb::Xlib_xcb::open().unwrap();
    }
}
It was broken between https://github.com/rust-lang/rust/commit/24a9bcbb7cb0d8bdc11b8252a9c13f7562c7e4ca (nightly-2019-07-05) and https://github.com/rust-lang/rust/commit/481068a707679257e2a738b40987246e0420e787 (nightly-2019-07-06)
It seems to be caused by https://github.com/rust-lang/rust/pull/62150, and it doesn't affect libraries other than xlib_xcb.
I suppose the minimum supported Rust version comes from Debian Jessie. What if 1.8.x kept the Debian Jessie minimum, and 1.9.x set a new minimum supported Rust version that has MaybeUninit?
Do you know which code on your end is creating the zeroed reference?
A quick search brought up
which looks... very bad.^^ There's not even a comment explaining what this is doing, or under which conditions it is safe to call (something that should be done for every unsafe fn). How is it different from transmute_copy?
@RalfJung I'm still digging into this, but from the minimal example from @o01eg , it seems to be crashing at this line in release mode:
Ah yes, that's incorrect. ManuallyDrop has no effect on the rule that data must be "valid", such as references not being NULL. You have to use MaybeUninit for that.
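A minimal sketch of that distinction, assuming a hypothetical `Table` struct with a single function-pointer field (these names are illustrative and not taken from x11-dl). The placeholder lives in a `MaybeUninit<Table>`, which is allowed to hold any bit pattern, and it only becomes a `Table` after a valid value has been written:

```rust
use std::mem::MaybeUninit;

// Hypothetical stand-in for a table of dynamically loaded functions.
struct Table {
    callback: extern "C" fn(i32) -> i32,
}

// Hypothetical function used to fill the table.
extern "C" fn double(x: i32) -> i32 {
    x * 2
}

fn build_table() -> Table {
    // A zeroed MaybeUninit<Table> is fine: nothing is assumed about its contents.
    // A zeroed Table (even inside ManuallyDrop) would contain a null fn pointer,
    // which is not a valid value.
    let mut slot = MaybeUninit::<Table>::zeroed();
    unsafe {
        // Write a fully valid value first, then claim the slot is initialized.
        slot.as_mut_ptr().write(Table { callback: double });
        slot.assume_init()
    }
}

fn main() {
    let table = build_table();
    println!("{}", (table.callback)(21));
}
```

Wrapping the same zeroed value in `ManuallyDrop<Table>` instead would only suppress the destructor; the null function pointer would still be invalid.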
I'm new to MaybeUninit, just heard about it when it was stabilized. Can you initialize parts of the struct at a time or does it have to all be in one go with as_mut_ptr().write()?
My other thought to work through this would be to generate a struct with all fields as Options, write to the fields as you go, then convert to a struct with the same fields which aren't contained in Options.
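The "struct of Options" idea could be sketched roughly as follows. `Partial`, `Complete`, and the field names are hypothetical; the real x11-dl structs are generated by a macro and have many more fields:

```rust
type Entry = extern "C" fn() -> i32;

// Every field starts out as None and is filled in as symbols are resolved.
#[derive(Default)]
struct Partial {
    open: Option<Entry>,
    close: Option<Entry>,
}

// Same fields, but without the Option wrappers.
struct Complete {
    open: Entry,
    close: Entry,
}

impl Partial {
    // Returns None if any field was never filled in.
    fn finish(self) -> Option<Complete> {
        Some(Complete {
            open: self.open?,
            close: self.close?,
        })
    }
}

// Hypothetical functions standing in for resolved symbols.
extern "C" fn fake_open() -> i32 { 1 }
extern "C" fn fake_close() -> i32 { 2 }

fn main() {
    let mut partial = Partial::default();
    partial.open = Some(fake_open);
    partial.close = Some(fake_close);
    let complete = partial.finish().expect("all fields set");
    println!("{} {}", (complete.open)(), (complete.close)());
}
```

This needs no unsafe code for the initialization itself, at the cost of declaring every field twice.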
Can you initialize parts of the struct at a time or does it have to all be in one go with as_mut_ptr().write() ?
Unfortunately that's not possible yet in Rust. :/ This requires a solution for https://github.com/rust-lang/rfcs/pull/2582.
But you can get close, by getting a raw pointer to a field: &mut (*uninit.as_mut_ptr()).field as *mut _.
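As a sketch of field-by-field initialization: Rust versions released after this discussion provide `std::ptr::addr_of_mut!`, which yields a raw field pointer without creating an intermediate `&mut` reference. The `Pair` struct and `add_one` function below are hypothetical, and the example assumes a toolchain recent enough to have that macro:

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

// Hypothetical struct mixing plain data with a function pointer.
struct Pair {
    first: u32,
    second: extern "C" fn(u32) -> u32,
}

extern "C" fn add_one(x: u32) -> u32 {
    x + 1
}

fn init_pair() -> Pair {
    let mut uninit = MaybeUninit::<Pair>::uninit();
    let p = uninit.as_mut_ptr();
    unsafe {
        // Each write goes through a raw pointer to one field, so no reference
        // to uninitialized data is ever created.
        addr_of_mut!((*p).first).write(7);
        addr_of_mut!((*p).second).write(add_one as extern "C" fn(u32) -> u32);
        // Sound only because every field has been written above.
        uninit.assume_init()
    }
}

fn main() {
    let pair = init_pair();
    println!("{} {}", pair.first, (pair.second)(1));
}
```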
Ahhh okay, that's enough to get me started. I'll take a crack at it tomorrow when I have access to a Linux machine, thanks!
Do you know which code on your end is creating the zeroed reference?
A quick search brought up
which looks... very bad.^^ There's not even a comment explaining what this is doing, or under which conditions it is safe to call (something that should be done for every unsafe fn). How is it different from transmute_copy?
It's been a long time since I wrote that code, so I don't remember what my reasoning was at the time, but it may be that I only knew of transmute and not transmute_copy, but needed a function that can operate on types of different sizes. I'd be afraid to meddle with some of this code myself as I've been away from this project for 2 years, but I believe it would be safe to remove that ugly function and use transmute_copy instead.
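For context, a small sketch of how the two differ on types of different sizes. `transmute` rejects mismatched sizes at compile time, while `transmute_copy` performs no such check and simply reads `size_of::<Dst>()` bytes from the source. The `first_half` helper is hypothetical and only meant as an illustration:

```rust
use std::mem;

// Reinterprets the first two bytes of a u32 as a u16.
fn first_half(value: u32) -> u16 {
    // mem::transmute::<u32, u16>(value) would fail to compile: the sizes differ.
    // transmute_copy is sound here only because u16 is not larger than u32,
    // so the read stays within the bounds of `value`.
    unsafe { mem::transmute_copy::<u32, u16>(&value) }
}

fn main() {
    // Which two bytes come first depends on the target's endianness.
    println!("{:#06x}", first_half(0x1234_5678));
}
```

If the destination type were larger than the source, the same call would read out of bounds, which is why it needs the same care as the hand-written function it would replace.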
As for this bug coming up again, using mem::uninitialized instead of mem::zeroed once fixed it, but some of the Rust devs seem to be doubling down on breaking backwards compatibility and closing issues when it's brought up. I'm not really sure what can be done without breaking library compatibility. If, from the start, the macro had generated an inner struct made up of only function pointers and without a destructor, this bug probably never would have happened, but the library would have been even more awkward to use.
Interesting further discussion here: https://github.com/rust-lang/rust/issues/52898#issuecomment-513052832
I believe it would be safe to remove that ugly function and use transmute_copy instead.
It would be great if someone who's up-to-date on the project could look into this. transmute_copy is a very sharp knife only to be used with great care, but at least it is a fairly widely known knife, which makes it much better than an entirely undocumented unsafe function with a mysterious name. (What does the "union" part of it mean?)
some of the Rust devs seem to be doubling down on breaking backwards compatibility and closing issues when it's brought up
I think that is a very unfair characterization of what is going on. But this is being discussed in the other thread mentioned by @bschwind.
Oh and for completeness' sake, the likely cause of this SIGILL is https://github.com/rust-lang/rust/pull/62150.
Btw, my best guess for where the bad function pointers are declared is
If you want to zero/"uninit"-initialize that struct, these should definitely be Option<fn>. A fn may never be 0.
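A short sketch of why `Option<fn>` fits here: the null value is used to represent `None`, so the wrapper adds no space and an all-zero bit pattern is a valid value. The `RawFn` alias and helper names are hypothetical:

```rust
use std::mem;

// Hypothetical signature standing in for one of the generated fields.
type RawFn = unsafe extern "C" fn(i32) -> i32;

fn zeroed_slot() -> Option<RawFn> {
    // Valid: all-zero bytes represent None for Option<fn>.
    // The same call with a bare RawFn as the target type would be undefined behavior.
    unsafe { mem::zeroed() }
}

fn same_size() -> bool {
    // The Option wrapper costs no extra space for function pointers.
    mem::size_of::<Option<RawFn>>() == mem::size_of::<RawFn>()
}

fn main() {
    println!("{} {}", zeroed_slot().is_none(), same_size());
}
```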
@Daggerbot How are the compile times if you have two versions of the struct, one with Option
I realize there is a good chance this isn't helping because you now have two versions of this giant struct. Perhaps you can work on an array of usizes directly instead and transmute that.
The revert PR has just been merged in Rust upstream so fixing this issue isn't as urgent any more: https://github.com/rust-lang/rust/pull/63343 . They will do the change sooner or later though so it should not be ignored.
fixing this issue isn't as urgent any more
No, it's just as urgent. We only reverted this in Rust so that users of this crate don't have to suffer, and we want to un-do the revert as soon as we can. Basically, we granted you a reprieve. We would much appreciate it if y'all could help us in this (help us keep the compiler clean and not accumulate cruft) by fixing this bug in this crate. :)
We will also land a lint on nightly soon (hopefully in time for the beta) that will help find code that uses mem::zeroed/mem::uninitialized the wrong way. It won't find all the bugs, but I think it should be able to find the bug in the code here.
@RalfJung it's not urgent for me because my project uses Rust nightly as well as x11-rs. As for helping this crate, I can't because the problem mostly seems to be blocked on maintainers. There is an open PR to fix it but they aren't merging it. And yeah I saw the lint PR as well, it's good to have it (would be good to warn on bad assume_init use too, right now the example right at the top of MaybeUninit docs is linting only for mem::uninitialized, not for MaybeUninit).
We can't do the same warning on general assume_init, so we'd have to detect the specific case of zeroed().assume_init(). Sure, would be nice to do, but I think that's way less common. I mean that basically can only happen if people write unsafe code without reading any docs.
Indeed the lint is firing:
warning: the type `std::mem::ManuallyDrop<xrandr::Xrandr>` does not permit being left uninitialized
--> x11-dl/src/link.rs:59:17
|
59 | = ::std::mem::uninitialized();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
::: x11-dl/src/xrandr.rs:16:1
|
16 | / x11_link! { Xrandr, xrandr, ["libXrandr.so.2", "libXrandr.so"], 70,
17 | | pub fn XRRAddOutputMode (dpy: *mut Display, output: RROutput, mode: RRMode) -> (),
18 | | pub fn XRRAllocGamma (size: c_int) -> *mut XRRCrtcGamma,
19 | | pub fn XRRAllocModeInfo (name: *const c_char, nameLength: c_int) -> *mut XRRModeInfo,
... |
88 | | globals:
89 | | }
| |_- in this macro invocation
|
= note: this means that this code causes undefined behavior when executed
= help: use `MaybeUninit` instead
With an ongoing PR, this now even points at the field causing the issue:
warning: the type `std::mem::ManuallyDrop<xrandr::Xrandr>` does not permit being left uninitialized
--> x11-dl/src/link.rs:59:17
|
59 | = ::std::mem::uninitialized();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
::: x11-dl/src/xrandr.rs:16:1
|
16 | / x11_link! { Xrandr, xrandr, ["libXrandr.so.2", "libXrandr.so"], 70,
17 | | pub fn XRRAddOutputMode (dpy: *mut Display, output: RROutput, mode: RRMode) -> (),
18 | | pub fn XRRAllocGamma (size: c_int) -> *mut XRRCrtcGamma,
19 | | pub fn XRRAllocModeInfo (name: *const c_char, nameLength: c_int) -> *mut XRRModeInfo,
... |
88 | | globals:
89 | | }
| |_- in this macro invocation
|
note: Function pointers must be non-null (in this struct field)
--> x11-dl/src/link.rs:30:9
|
30 | $(pub $fn_name: unsafe extern "C" fn ($($param_type),*) -> $ret_type,)*
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
::: x11-dl/src/xrandr.rs:16:1
|
16 | / x11_link! { Xrandr, xrandr, ["libXrandr.so.2", "libXrandr.so"], 70,
17 | | pub fn XRRAddOutputMode (dpy: *mut Display, output: RROutput, mode: RRMode) -> (),
18 | | pub fn XRRAllocGamma (size: c_int) -> *mut XRRCrtcGamma,
19 | | pub fn XRRAllocModeInfo (name: *const c_char, nameLength: c_int) -> *mut XRRModeInfo,
... |
88 | | globals:
89 | | }
| |_- in this macro invocation
= note: this means that this code causes undefined behavior when executed
= help: use `MaybeUninit` instead
Issue #90 seems to have come back again...
The issue is really, really weird though; when I build in debug mode, I don't have this bug, but when I rebuild (completely from scratch) in release mode, it comes back to bite me. Here's the output from lldb. Compiler info, etc., are in the Meta section below.
Meta