rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/
Apache License 2.0
5.9k stars 1.56k forks source link

Can we stop generating illegal instructions? #1454

Closed expeditiousRubyist closed 8 years ago

expeditiousRubyist commented 8 years ago

In my experiments in programming Rust, I've noticed that the panic! macro seems to generate an instruction called ud2 on x86-64 -- an instruction that is intentionally invalid. Doing this, I presume, is to intentionally raise a SIGILL and crash the program, a practice I have not seen in any other systems programming language. This should likely be for the same reason that no one terminates a process by intentionally dereferencing a null pointer -- the OS has more graceful ways of stopping it, and nobody wants to add "segmentation fault" or "illegal instruction" (or similar output messages depending on the OS) to their program's output. On Windows, this becomes particularly annoying, as running into either of these errors will cause a window to appear for a few seconds informing the user that the program has stopped working.

Would it be unreasonable to simply print the appropriate error message and call std::process::exit()?

nagisa commented 8 years ago

There’s a bunch of reasons to generate the unreachable (in LLVM, which compiles down to ud2 on x86… most of the time) instruction. Generating this instruction provides hints to the optimiser, that the instruction in question is not reachable. Not generating this instruction at all would result in slower compilation/worse optimisations. Namely:

  1. After a diverging call. Diverging call must not return and returning from one is a UB. Generating unreachable there enables optimisations and ability to discover bugs in user code faster (there has been cases where they’ve been discovered much faster). This is also the kind of unreachable you see generated with panic!. Panic macro calls a diverging function which begins unwinding and must not return;
  2. In matches/switches where “default” branch is unreachable, because the match is provably exhaustive in Rust. You can’t always tell that on the LLVM side. A valid optimisation;
  3. And many more.

If you’re hitting (executing) a ud2/unreachable (other than invoking ::std::intrinsics::unreachable yourself), you’ve invoked some sort of undefined behaviour in your code. It is not guaranteed unreachable instruction will generate ud2; Actually reaching that code is UB and might result in anything ranging from nothing happening to a null pointer dereference, launch of all the nuclears USA possesses and ejection of your CPU out of its socket all at once.

If you can prove there’s no code invoking UB in your code and the libraries you use, its a bug in rustc.

sfackler commented 8 years ago

@nagisa we use unreachable in other contexts, like an OOM, that valid code can hit.

EDIT: we actually specifically use the abort intrinsic instead, which is lowered to a ud2 as well.

mahkoh commented 8 years ago

@nagisa You're confusing abort and unreachable and OP is complaining not because of the existence of abort but because unwinding is not implemented on windows and thus panicking causes an abort.

nagisa commented 8 years ago

@sfackler alloc::oom is an intrinsics::abort. I find it interesting llvm.trap() compiles to ud2 on x86, but otherwise is it the only other intrinsic that results in ud2? Seems fixable by generating llvm.debugtrap() instead?

mahkoh commented 8 years ago

Seems fixable by generating llvm.debugtrap() instead?

There is no need for a fix because nothing is broken.

nagisa commented 8 years ago

@mahkoh

unwinding is not implemented on windows

32bit MSVC only, IIRC. The OP has explicitly mentioned x86-64. Both of x86-64-msvc and x86-64-gnu have implementations of unwinding.

mahkoh commented 8 years ago

Then there is another bug somewhere but this has nothing to do with ud2.

sfackler commented 8 years ago

@mahkoh I'm not sure how I follow. An OOM calls intrinsics::abort which lowers to ud2. A SIGILL may not provide the best user experience, which is exactly what @nagisa said.

mahkoh commented 8 years ago

@sfackler Then OOM should do something else. Changing abort into a breakpoint is nonsensical.

huonw commented 8 years ago

AFAIK, panic!() does not/should not execute an ud2 except on 32-bit windows (https://github.com/rust-lang/rust/issues/25869). Could you post more details about your set-up (e.g. the output of rustc -vV) & the code you're running, @expeditiousRubyist?

expeditiousRubyist commented 8 years ago

@huonw sure. The code I'd been fiddling around with was more or less this: https://gist.github.com/expeditiousRubyist/2997ba449d6ee5cb70dc

And when compiled to object code and disassembled using the following commands

rustc -C opt-level=3 --emit obj cell.rs
objdump -d -M intel cell.o

my code produced the following disassembly: https://gist.github.com/expeditiousRubyist/7f5922cf732bc36c34c0

My result of running rustc -vV is as follows:

rustc 1.4.0 (8ab8581f6 2015-10-27)
binary: rustc
commit-hash: 8ab8581f6921bc7a8e3fa4defffd2814372dcb15
commit-date: 2015-10-27
host: x86_64-pc-windows-gnu
release: 1.4.0
huonw commented 8 years ago

Oh, were you calling cell_get_fixnum from C (or some other non-Rust language)? If so, panicking isn't supported across language boundaries. panic! isn't designed as a robust error handling strategy, but rather the last resort when a program cannot continue, and this is actually one of the big motivations for std::panic::recover (nightly-only atm).

A good way to resolve this is to use some other way to handle errors, e.g. return a worked/failed boolean from the function:

pub unsafe extern fn cell_get_fixnum(cell: *mut Cell, num: *mut i32) -> i32 {
    match *cell {
        Cell::Fixnum(i) => { *num = i; 1 }
        Cell::Double(f) => { *num = f as f64; 1 }
        _ => { 0 }
    }
}

(Incidentally, if you're calling things from other languages you'll need the extern as well as pub etc.)

expeditiousRubyist commented 8 years ago

Well the intention was to "print error message and stop application" upon a bad state, which is why I was using panic!, although it would seem that if its behavior is intended to be undefined when run from another program, I may just have to write my own crash() function to handle this all gracefully (even if it's not necessarily idiomatic Rust). Still, I am confused as to why Rust emits illegal instructions in the first place. Although I am not the most knowledgeable about compilers in general, I did find myself asking a professor today his opinions on when it would be appropriate for a compiler to emit such an instruction. His response was a simple "never," leaving me to question the logic behind these choices.

But anyways, thanks for the advice with regards to an alternative way to handle those errors. Although I was under the impression that extern is for importing functions written in C, and not necessarily needed for exporting functions, as I have been able to get functions to be visible and callable from C without the keyword, so long as I use #[no_mangle].

nagisa commented 8 years ago

Still, I am confused as to why Rust emits illegal instructions in the first place. … leaving me to question the logic behind these choices.

What you see in your disassembly, namely the call 4d <_ZN10sys_common6unwind12begin_unwind21h11851751900855523014E+0x1d> is a call to diverging function (i.e. one in form of fn begin_unwind(…) -> !). Ergo, if implemented properly (and there’s no UB), the function should never return. Putting ud2 there is a safety switch for that invariant.

Consider what would happen, if ud2 wasn’t here. The function, which begins unwinding, would’ve returned and executed completely unrelated code! The resulting issue report would then be not about the ud2 instruction, but rather about bad code generation and would require much more investigation to discover there was some UB at work.

For now I filled https://github.com/rust-lang/rust/issues/30791 resolving which would make panic!() properly diverging in the particular case you encountered.

huonw commented 8 years ago

Yeah, if you want certain exit behaviour writing it yourself will be better than relying on panic! to do something appropriate (which it probably isn't e.g. the error formatting of panic! is not a good diagnostic for people who are using the library/application rather than writing it).

Although I am not the most knowledgeable about compilers in general, I did find myself asking a professor today his opinions on when it would be appropriate for a compiler to emit such an instruction. His response was a simple "never," leaving me to question the logic behind these choices.

It's a little more subtle than never: it's a very effective way to stop execution when there's a problem, e.g. Clang/LLVM sometimes emits a ud2 when it detects undefined behaviour, and the Linux kernel uses it for its BUG macro. Also, apparently it is something CPUs understand, and can use to avoid speculatively reading/decoding code that it doesn't need to:

[ud] is used as a signal to the instruction fetcher that the processor should not waste efforts decoding the following memory since the execution will continue at a different location.

Also, it might be worth noting that rustc isn't directly choosing to emit ud2: depending on the situation is either inserted by the LLVM compiler backend automatically or comes because rustc calls the llvm.trap intrinsic of the LLVM compiler backend, and the LLVM developers have decided ud2 is the appropriate instruction for that. (I think the main alternative for the latter would be calling the C abort function, but this would be less appropriate in a freestanding context like writing an operating system, where libc and hence abort doesn't automatically exist.)

Although I was under the impression that extern is for importing functions written in C, and not necessarily needed for exporting functions, as I have been able to get functions to be visible and callable from C without the keyword, so long as I use #[no_mangle].

There's two uses of the extern keyword:

The latter is what I'm talking about. Both ends of a function call foo() need to know the calling convention: the callsite of foo() needs to know where to put the arguments and the internals of foo itself need to know where to look to find those arguments. By default a fn uses the Rust calling convention which may not/is not the same as the C calling convention. The latter is what the C side of the call will assume and so one needs to ensure the Rust function matches by writing extern fn to switch the ABI (that's a short hand for extern "C" fn). This doesn't affect the visibility of the symbol/whether it can be called, but it can lead to some very confusing bugs, where a call in C like foo(0, 1, 2, 3) doesn't end up with those values on the Rust side.


Thanks for filing, but I don't think there's any action to be taken here, especially now that https://github.com/rust-lang/rust/issues/30791 has been opened: usually, it's a bug in the program (or some library it is using) if a ud2 is executed, and hence Windows' response isn't entirely wrong (in some cases like https://github.com/rust-lang/rust/issues/25869, it is a compiler bug but I don't think this is one of them). As such, I'm closing, but feel free to continue the discussion, although it may be more appropriate to move to https://users.rust-lang.org/ or stack overflow. :smile: