catch_unwind and unwinding catch guarantees

eaglgenes101 commented 4 years ago

While thinking of proposals for delimited continuations for Rust, it occurred to me that catch_unwind is actually rather underspecified in what guarantees it makes, some of which have soundness implications in code out in the wild right now.

Is it guaranteed to catch all unwinding panics? And once it does, is it guaranteed that "normal" execution resumes from the end of the catch_unwind call?
Will it catch stack unwinds originating from language features other than panics (say, if escape continuations are introduced into the language)?
Does catch_unwind have well-defined interactions with unwinds originating from other languages? Or is letting such a thing happen straight up undefined behaviour? (This might not be in the scope of the unsafe code guidelines.)

All the documentation says right now is that catch_unwind may not catch all panics, as panics may cause process termination instead, which is not much to go by even to make guesses based on the "spirit" of catch_unwind.

gnzlbg commented 4 years ago

Is it guaranteed to catch all unwinding panics?

Yes, catch_unwind catches all unwinding Rust panics.

Will it catch stack unwinds originating from language features other than panics

That's for the RFCs introducing those features to say. Right now, unwinding in Rust is only possible with panic! or with resume_unwind, and catch_unwind is guaranteed to catch those. The behavior of unwinding in any other way is undefined.
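As a minimal sketch of the two unwinding sources named above, the following assumes the default panic=unwind strategy. The helper name run_caught is invented for illustration and is not part of std.

```rust
use std::panic::{self, catch_unwind, resume_unwind};

/// Hypothetical helper: runs `f` and, if it unwound, returns the panic
/// payload rendered as a string.
fn run_caught<F: FnOnce() + panic::UnwindSafe>(f: F) -> Result<(), String> {
    catch_unwind(f).map_err(|payload| {
        // `panic!("literal")` carries a `&'static str`; formatted panics carry a `String`.
        if let Some(s) = payload.downcast_ref::<&'static str>() {
            s.to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "<non-string payload>".to_string()
        }
    })
}

fn main() {
    // Unwinding started by panic! is caught.
    let a = run_caught(|| panic!("from panic!"));
    assert_eq!(a, Err("from panic!".to_string()));

    // Unwinding started by resume_unwind is caught as well.
    let b = run_caught(|| resume_unwind(Box::new("from resume_unwind")));
    assert_eq!(b, Err("from resume_unwind".to_string()));

    // No unwinding: execution leaves catch_unwind via an ordinary return.
    assert_eq!(run_caught(|| {}), Ok(()));
}
```

In every case control continues after the catch_unwind call with an ordinary Result value, which is the "normal execution resumes" behaviour the question asks about.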

Does catch_unwind have well-defined interactions with unwinds originating from other languages?

Right now, the behavior of unwinding in any other way that's not via a panic! or a resume_unwind is undefined. The WG-FFI-Unwind is working on defining the behavior for more cases, but no RFCs have been merged yet.

All the documentation says right now is that catch_unwind may not catch all panics, as panics may cause process termination instead, which is not much to go by even to make guesses based on the "spirit" of catch_unwind.

catch_unwind guarantees that it catches all unwinding Rust panics that reach it. A panic that's configured to abort the program will never reach any catch_unwind, so... sure... catch_unwind won't catch it, but... there is nothing to catch, program execution never reaches the catch_unwind in that scenario.

EDIT: I agree that the docs of catch_unwind could be clearer in this respect. What that part of the docs tries to say is that you shouldn't use catch_unwind + panic for "control flow", because that relies on the assumption that panics always unwind, and that assumption is incorrect, e.g., when compiling with -C panic=abort. Saying that it doesn't catch all panics is a bit weird, because aborts cannot be caught (the API is called catch_unwind, not catch_panic, which kind of hints that it only catches unwinding).
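To make the "control flow" point concrete, here is a rough sketch contrasting the two styles. The function names are hypothetical, and the -1 fallback is chosen only for illustration.

```rust
use std::panic::catch_unwind;

/// Fragile: uses a panic as an early exit. Under `-C panic=abort` the process
/// terminates inside the closure and the `-1` fallback is never reached.
fn parse_with_panic(input: &str) -> i32 {
    catch_unwind(|| input.parse::<i32>().expect("not a number")).unwrap_or(-1)
}

/// Robust: the failure travels as a value, so the panic strategy is irrelevant.
fn parse_with_result(input: &str) -> i32 {
    input.parse::<i32>().unwrap_or(-1)
}

fn main() {
    // `cfg!(panic = "unwind")` reports the strategy this crate was compiled with.
    if cfg!(panic = "unwind") {
        assert_eq!(parse_with_panic("x"), -1);
    }
    assert_eq!(parse_with_result("x"), -1);
    assert_eq!(parse_with_result("7"), 7);
}
```

The first function behaves like the second only when panics unwind, which is the hidden assumption the docs warn about.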

eaglgenes101 commented 4 years ago

Primarily, what I'm trying to determine is whether the assumption that execution which makes it out of a catch_unwind call is always of the non-exceptional kind is one that unsafe code can rely on for soundness. This assumption may break if other unwinding sources, such as the ones I mentioned, are introduced into the Rust execution model. And at least from what I know, it is a common assumption to make implicitly.

An RFC that tries to introduce new sources of unwinding is going to run into problems from inertia if this assumption is quietly ossified into the Rust ecosystem, so I am trying to make it either explicitly spelled out as something that can be relied on, or something that end users should avoid assuming.

Ixrec commented 4 years ago

Could you give an example of code that would like to rely on this?

eaglgenes101 commented 4 years ago

Okay, personally, I'd prefer to have code not rely on this. However, it's something that could reasonably be thought up independently by someone unfamiliar with RAII trying to make their code unwind safe, and I want an official word about whether it's something that future unwinding proposals will have to deal with.
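A hypothetical illustration of the two styles being contrasted (all names here are invented for the sketch and do not come from any real crate): the first restores state after catch_unwind returns, the second uses a drop guard.

```rust
use std::cell::Cell;
use std::panic::{catch_unwind, resume_unwind, AssertUnwindSafe};

/// Pattern A: restore the flag after `catch_unwind` returns.
/// Correct only if control leaves `catch_unwind` by an ordinary return.
fn with_flag_catch(flag: &Cell<bool>, f: impl FnOnce()) {
    flag.set(true);
    let result = catch_unwind(AssertUnwindSafe(f));
    flag.set(false); // cleanup placed after the call
    if let Err(payload) = result {
        resume_unwind(payload); // keep propagating the panic
    }
}

/// Pattern B: a drop guard restores the flag on any exit path that runs destructors.
struct Reset<'a>(&'a Cell<bool>);

impl Drop for Reset<'_> {
    fn drop(&mut self) {
        self.0.set(false);
    }
}

fn with_flag_guard(flag: &Cell<bool>, f: impl FnOnce()) {
    flag.set(true);
    let _reset = Reset(flag);
    f();
}

fn main() {
    let flag = Cell::new(false);
    with_flag_catch(&flag, || assert!(flag.get()));
    assert!(!flag.get());
    with_flag_guard(&flag, || assert!(flag.get()));
    assert!(!flag.get());
}
```

Pattern A is the one that silently depends on catch_unwind being the only way an exceptional exit can be observed. Pattern B depends only on destructors running.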

gnzlbg commented 4 years ago

Primarily, what I'm trying to determine is that if the assumption that the execution that makes it out of a catch_panic call is always of the non-exceptional kind, is one that can be relied on by unsafe code for unsoundness.

There is no catch_panic, only catch_unwind, and right now for catch_unwind the only way in which anything else can happen is if your code has undefined behavior, in which case we provide no guarantees. So right now, the only way in which execution can make it out of a catch_unwind is via a return, not via unwinding. You cannot, however, write code that relies on undefined behavior remaining undefined, and if we were to add, e.g., a longjmp-type of facility for Rust, there might be other ways in which one can "return" from a catch_unwind (or any other Rust function). It's up to the RFCs adding these language features to make sure that doing so does not break existing Rust code.

bjorn3 commented 4 years ago

Introducing longjmp is impossible without making rayon unsafe.

let mut v = Vec::new();
let mut env;
if setjmp(&mut env) {
    // UB: v is mutated without joining the scoped thread, which still holds a reference to v.
    v.push(0);
    return;
}
rayon::scope(|s| {
    s.spawn(|_| {
        println!("{:?}", v);
    });
    longjmp(&env);
});

gnzlbg commented 4 years ago

@bjorn3 that would be true if we were to make longjmp safe, but I haven't seen any proposals for that.

If longjmp is unsafe, there would be certain requirements for when it is safe to longjmp or not, and one could be that you can't longjmp over frames containing values with destructors that rely on you not deallocating them without running destructors for safety - the s: scope value in your example would be one of those, another would be Pin, etc. A much simpler model would be to say that longjmp is only safe if no values with destructors are deallocated by it (i.e. there is a range of designs that would make sense here).

(EDIT: in general, a longjmp just deallocates memory without running destructors, similar to mem::forget, and there are already types for which that is unsound, but achieving that right now in Rust requires unsafe code, so we just have to make sure that for longjmp unsafe code is required as well).
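As a small sketch of the mem::forget comparison: mem::forget itself is a safe function, and what it shares with a longjmp is only that the destructor never runs. The Guard type and drops_after function below are made up for this illustration.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical guard that counts how many times its destructor ran.
struct Guard<'a>(&'a Cell<u32>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns the number of destructor runs after either dropping or forgetting the guard.
fn drops_after(forget: bool) -> u32 {
    let count = Cell::new(0);
    let g = Guard(&count);
    if forget {
        mem::forget(g); // destructor is skipped
    } else {
        drop(g); // destructor runs here
    }
    count.get()
}

fn main() {
    assert_eq!(drops_after(false), 1);
    assert_eq!(drops_after(true), 0);
}
```

Any type whose safety argument needs the destructor to run before its memory is reused has to rule out both exits, which is why the requirement ends up on the unsafe side.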

RalfJung commented 4 years ago

You cannot, however, write code that relies on undefined behavior remaining undefined

I am not sure if that clause is sufficient here... this is not just about other kinds of unwinding being undefined; in my view, other kinds of unwinding don't exist in the Abstract Machine. So there's nothing to even take into account.

There's also nothing else that unsafe Rust code currently could do to protect itself against unwinding, should that be necessary.

It's up to the RFCs adding these language features to make sure that doing so does not break existing Rust code.

Fully agreed. Thus much of what we are talking about here is mere speculation.

gnzlbg commented 4 years ago

I am not sure if that clause is sufficient here... this is not just about other kinds of unwinding being undefined; in my view, other kinds of unwinding don't exist in the Abstract Machine. So there's nothing to even take into account.

I agree that other kinds of unwinding don't exist in the Abstract Machine today. Right now, we acknowledge that these other kinds of unwinding exist when you leave the Rust AM (e.g. through FFI, inline assembly, etc.), but we just say that if these other types of unwinding interact with the Rust AM, the behavior is undefined (e.g. in the language reference about unwinding through FFI).

The question is whether users can rely on these other kinds of unwinding never becoming part of the Rust AM. We do not currently guarantee anywhere that these other kinds of unwinding will never be allowed, and I tend to think that we would have to guarantee that for users to be able to rely on it. This is confusing enough that different users reading the same docs will understand things differently. I think it should be part of the job of the definition of undefined behavior to clarify things like this.

eaglgenes101 commented 4 years ago

Yes, that's pretty much it. Reasoning about the abstract machine as it is now gives conclusions about what will work now. What I'm seeking is particular information about what the abstract machine will be in the future so I can reason about what will hold in the future.

RalfJung commented 4 years ago

I think the question here has been answered in terms of what happens for panics in Rust itself. For other kinds of unwinding, see https://github.com/rust-lang/rfcs/pull/2945.

rust-lang / unsafe-code-guidelines

catch_unwind and unwinding catch guarantees #223