Michael-F-Bryan / rust-ffi-guide

A guide for doing FFI using Rust
https://michael-f-bryan.github.io/rust-ffi-guide/
Creative Commons Zero v1.0 Universal

Catching *nix signals (segfault in particular)? #63

Closed TheDan64 closed 6 years ago

TheDan64 commented 6 years ago

As part of supporting multiple versions of LLVM in inkwell, I ran into a case where perfectly valid input to a certain function, asking for nonexistent data (similar to looking up a key that doesn't exist in a hash table), would cause a segfault in a small subset of LLVM versions. This is presumably an LLVM bug, since it works fine in previous versions.

Thinking about this, I'm wondering: how terrible of an idea is it to set up a signal handler (i.e. via the signal crate) to catch that segfault and return the error case that I would have normally returned in LLVM versions that don't segfault?

I'm thinking at best the segfault could just be a null pointer dereference, but in the worst case it could have done something crazy like corrupting the whole stack...

Is this a crazy idea? What if I first look at the LLVM source code (maybe there's a fix I can look at in a newer version) and am able to verify that the segfault is relatively harmless in each offending version, i.e. a null pointer dereference or a read from (but not a write to) invalid memory?

Michael-F-Bryan commented 6 years ago

I believe the consensus is that you shouldn't try to "catch" a segfault or mask it in any way. The way I've heard it described, a segfault is just another way to find out there's a programming bug somewhere, so the correct thing to do is to crash loudly so someone can fix the bug.

That said, because you want to still support the segfaulting versions you'll probably want to do something so users don't encounter segfaults during normal use. I can think of a couple ways to deal with the issue while still supporting the offending versions, listed from least to most hacky:


EDIT: I did a little googling and came across this C++ thread. I think one of their comments summarizes things quite well.

You can't catch segfaults. Segfaults lead to undefined behavior - period (err, actually segfaults are the result of operations also leading to undefined behavior. Anyways, if you got a segfault, you also got undefined behavior invoked, so it doesn't really matter...). And the OS takes control from your program ASAP, which actually is a Good Thing.

TheDan64 commented 6 years ago

Thanks for looking into it! I've thought about those ideas too, and I think the problems are as follows:

re: C++ thread: The way I understood it, "You can't catch segfaults" refers to the try/catch mechanics of C++. You can't catch a segfault that way because it's not a C++ exception. But you can totally set up a signal handler for SIGSEGV and "catch" it that way, which is what the signal library does, but in Rust. The point about segfaults always being UB is probably a good one, though.

I guess there's just no good solution...

Michael-F-Bryan commented 6 years ago

But you can totally set up a signal handler for SIGSEGV and "catch" it that way, which is what the signal library does but in rust.

Signal handlers are just callbacks that get fired when that particular signal is received, so how do you plan to resume code flow? As far as I can tell, you'd need to stash away a return pointer and then jump to it in the signal handler if a segfault was encountered. I guess you could always execute the thing in another thread and use a "segfault encountered" flag to return the error, but I'm not sure if that'll work because LLVM isn't Send or Sync.
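To illustrate the "just callbacks" point, here is a minimal sketch, assuming a POSIX-like target. It declares the C library's `signal` and `raise` by hand instead of going through the signal crate, and it raises a harmless SIGINT rather than a real SIGSEGV. The names `on_signal`, `SIGNAL_SEEN` and `handler_ran_and_execution_resumed` are made up for this example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hand-written bindings to the C library so the sketch needs no crates.
// Assumption: a POSIX-like target where `sighandler_t` is pointer-sized.
unsafe extern "C" {
    fn signal(signum: i32, handler: usize) -> usize;
    fn raise(signum: i32) -> i32;
}

// SIGINT is 2 on Linux and macOS; it stands in for SIGSEGV here.
const SIGINT: i32 = 2;

// Flag set by the handler (hypothetical name).
static SIGNAL_SEEN: AtomicBool = AtomicBool::new(false);

// The handler is an ordinary callback. It only stores to an atomic,
// which is one of the few things that is safe to do in a handler.
extern "C" fn on_signal(_signum: i32) {
    SIGNAL_SEEN.store(true, Ordering::SeqCst);
}

/// Installs the handler, raises SIGINT on the current thread, and
/// reports whether the handler ran before control came back here.
fn handler_ran_and_execution_resumed() -> bool {
    SIGNAL_SEEN.store(false, Ordering::SeqCst);
    unsafe {
        signal(SIGINT, on_signal as *const () as usize);
        // The handler runs, then `raise` returns to this exact spot.
        raise(SIGINT);
    }
    SIGNAL_SEEN.load(Ordering::SeqCst)
}

fn main() {
    println!("handler ran: {}", handler_ran_and_execution_resumed());
}
```

Note where control goes once the handler returns: straight back to the interrupted code. For a SIGSEGV caused by a bad memory access, that would typically mean retrying the faulting instruction, and POSIX leaves the behavior undefined in that case, which is why you'd have to jump somewhere else rather than simply return.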

TheDan64 commented 6 years ago

So I was able to work around this issue for the LLVM versions in question by first calling a similar function that takes the exact same input params, returning early if it returns an Err and discarding the Ok otherwise. I was lucky that the other function existed.
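The shape of that workaround might look roughly like the sketch below. Everything in it is a stand-in: `safe_probe`, `fragile_lookup`, `guarded_lookup` and `LookupError` are hypothetical names, not inkwell or LLVM APIs, and a panic is used to model the segfault.

```rust
#[derive(Debug, PartialEq)]
struct LookupError;

// Hypothetical stand-in for the sibling function: it takes the same
// input and reports missing data as an Err instead of crashing.
fn safe_probe(table: &[(&str, u32)], key: &str) -> Result<(), LookupError> {
    if table.iter().any(|(k, _)| *k == key) {
        Ok(())
    } else {
        Err(LookupError)
    }
}

// Hypothetical stand-in for the function that crashes on missing data
// in the affected versions. The panic models the segfault.
fn fragile_lookup(table: &[(&str, u32)], key: &str) -> u32 {
    table
        .iter()
        .find(|(k, _)| *k == key)
        .map(|(_, v)| *v)
        .expect("the real function would segfault here")
}

// Probe first: return early on Err, discard the Ok, and only then
// call the function that cannot cope with missing data.
fn guarded_lookup(table: &[(&str, u32)], key: &str) -> Result<u32, LookupError> {
    safe_probe(table, key)?;
    Ok(fragile_lookup(table, key))
}

fn main() {
    let table = [("a", 1), ("b", 2)];
    println!("{:?}", guarded_lookup(&table, "a"));
    println!("{:?}", guarded_lookup(&table, "missing"));
}
```

The fragile call is never reached with input it cannot handle, so no signal handling is needed. The cost is one extra call per lookup, and it only works when such a sibling function exists.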