follow-up: longjmp annotations and optimization

nikomatsakis commented 4 years ago

As discussed in this week's meeting, we realized that permitting longjmp out of a "C" function will lose optimization potential in a case like this:

fn foo(x: &mut u32) {
    *x += 1;
    bar();
}

At least at present, the compiler should be able to move the write to *x down below the call to bar(), so long as it also performs the write on the unwinding path. However, a longjmp out of bar() would make that reordering observable.
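To make the reordering concrete, here is a minimal sketch that models the transformed function by hand. The names `foo_reordered`, `WriteLater`, and `final_value` are hypothetical, and `bar` is passed in as a closure purely for illustration. A panic stands in for unwinding; a real longjmp across Rust frames cannot be demonstrated safely, so it is only described afterwards.

```rust
use std::panic::{self, AssertUnwindSafe};

// The function from the example, with `bar` passed in so the caller can vary it.
fn foo_original(x: &mut u32, bar: impl FnOnce()) {
    *x += 1;
    bar();
}

// Hand-written model of the reordered version (an assumption about what the
// optimizer could do): the increment happens after `bar`, both on normal
// return and during unwinding, because it lives in the guard's Drop.
fn foo_reordered(x: &mut u32, bar: impl FnOnce()) {
    struct WriteLater<'a>(&'a mut u32);
    impl Drop for WriteLater<'_> {
        fn drop(&mut self) {
            *self.0 += 1;
        }
    }
    let _guard = WriteLater(x);
    bar();
}

// Runs `f` on a fresh counter, catches any panic, and returns the final value.
fn final_value(f: impl FnOnce(&mut u32)) -> u32 {
    let mut x = 0;
    let _ = panic::catch_unwind(AssertUnwindSafe(|| f(&mut x)));
    x
}
```

Calling `final_value(|x| foo_original(x, || {}))` and `final_value(|x| foo_reordered(x, || {}))` gives the same result, and so do the variants where `bar` panics, which is why the reordering is unobservable under return and unwinding. A longjmp past `foo_reordered` would skip the guard entirely, so the caller's setjmp site would see the write from `foo_original` but not from `foo_reordered`.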

We discussed an idea where people annotate functions that "may longjmp" in some way -- two ideas were #[longjmp] and #[pof], though neither is ideal. It would be UB for a function's frame to be deallocated by a longjmp unless the function carries this annotation. Further:

The compiler can warn if a #[longjmp] function calls another longjmp fn (or a fn that may be one, e.g., through a fn pointer) with pending destructors in scope. This carries a risk of false warnings since the fn may in fact not longjmp.
The compiler can warn if a non-longjmp fn calls a longjmp one. This carries a risk of missed warnings since calls by pointer could target a longjmp fn.

We would also suppress reordering optimizations around function calls in longjmp-functions.
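The two warning rules can be sketched as a toy checker over a hand-built call list. Everything here is hypothetical (the `Function`, `Call`, `Callee`, and `Warning` types and the `lint` function are invented for illustration, and no such attribute or lint exists in rustc); it only shows how the false-warning and missed-warning cases fall out of the rules.

```rust
#[derive(Debug, PartialEq)]
enum Warning {
    // A may-longjmp fn makes a possibly-jumping call while destructors are pending.
    PendingDestructors { caller: &'static str },
    // A fn without the annotation calls a fn known to longjmp.
    UnannotatedCaller { caller: &'static str, callee: &'static str },
}

enum Callee {
    Known { name: &'static str, may_longjmp: bool },
    // Call through a fn pointer: the target, and so its annotation, is unknown.
    Pointer,
}

struct Call {
    callee: Callee,
    pending_destructors: bool,
}

struct Function {
    name: &'static str,
    may_longjmp: bool,
    calls: Vec<Call>,
}

fn lint(f: &Function) -> Vec<Warning> {
    let mut warnings = Vec::new();
    for call in &f.calls {
        match (f.may_longjmp, &call.callee) {
            // Rule 1: annotated caller, callee annotated or unknown, destructors
            // pending. May be a false warning if the callee never jumps.
            (true, Callee::Known { may_longjmp: true, .. }) | (true, Callee::Pointer)
                if call.pending_destructors =>
            {
                warnings.push(Warning::PendingDestructors { caller: f.name });
            }
            // Rule 2: unannotated caller, callee known to longjmp.
            (false, Callee::Known { name, may_longjmp: true }) => {
                warnings.push(Warning::UnannotatedCaller { caller: f.name, callee: *name });
            }
            // Everything else is silent, including pointer calls from an
            // unannotated fn: this is where warnings can be missed.
            _ => {}
        }
    }
    warnings
}
```

In this model a pointer call from an unannotated function produces no warning at all, which is the missed-warning risk mentioned above.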

Amanieu commented 4 years ago

It occurs to me that this optimization might be unsound even without longjmp. Consider the case where x points to a memory-mapped file and bar calls exit(). You would expect the write to x to be reflected in the file on exit, but that won't happen if the write is moved after the call.

nikomatsakis commented 4 years ago

@Amanieu Yes, so I brought up the idea that &mut references into shared memory were simply not compatible with this optimization, and I thought the conclusion might be that you should not create &mut references into shared memory (you should instead prefer raw pointers or &Cell). But memory-mapped files are a good example of shared memory in practice, and I'm not sure whether avoiding &mut in such cases is really practical. Maybe using &mut there is widespread practice?
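For reference, a minimal sketch of what the &Cell alternative might look like for the example function. The names `foo_cell` and `observed_by_bar` are hypothetical, and `bar` is again a closure for illustration. Because a `&Cell<u32>` may be aliased, `bar` can legally read the value, so (as far as I understand) the write cannot be moved past the call.

```rust
use std::cell::Cell;

// Same shape as `foo`, but through a shared, interior-mutable reference.
fn foo_cell(x: &Cell<u32>, bar: impl FnOnce()) {
    x.set(x.get() + 1);
    bar();
}

// `bar` holds its own reference to the same Cell and reads it during the call,
// which would not be allowed if `foo` held an exclusive `&mut u32`.
fn observed_by_bar() -> u32 {
    let x = Cell::new(0);
    let mut seen = 0;
    foo_cell(&x, || seen = x.get());
    seen
}
```

Here `bar` observes the incremented value, so the ordering of the write and the call is part of the program's visible behavior even without longjmp or exit().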

BatmanAoD commented 4 years ago

I also suggested #[cancelable]. #[cancel-safe] may be more descriptive. The Linux pthreads man page uses the term "async-cancel-safe" for some functions, but I'm not sure what the "async" part means.

We discussed whether an annotation is sufficient and agreed that it probably is. This prevents annotating function pointers as #[cancelable], but that is acceptable (and in any case, annotations for function pointers may eventually be added to the language).

We also agreed that when the annotation is introduced, we can specify that using longjmp or pthread_exit to skip destructors in functions that are not annotated with #[cancelable] is always UB.

petrochenkov commented 4 years ago

One common case for longjumps is jumping from a code cache (jitted code produced by any system that does interpretation and needs to speed it up in common cases, e.g. a simulator) to regular code in exceptional situations (e.g. some access violation in the simulated system).

The handler for exceptional situations normally resides somewhere at the top of the code tree, so if functions that can terminate with longjump need to be annotated explicitly, then pretty much the whole codebase will have to be annotated.

EDIT: Unless some inference is done for code that we see and know, and annotations are needed only for "external" code, which includes the function that jumps into the code cache.

EDIT2: The longjump in this case is required to be a "teleportation" rather than unwinding, since there's no common stack between the code cache and regular code, so you have to tweak the longjump behavior on platforms where it unwinds by default. However, the requirement that *x += 1 must not be moved over bar() still holds.

BatmanAoD commented 4 years ago

I was actually thinking about that: a global annotation like #![cancelable] could make all the functions longjmp-safe, couldn't it?

bjorn3 commented 4 years ago

That won't make dependencies longjmp-safe, while making it much easier to forget about longjmp-safety when you use it.

BatmanAoD commented 4 years ago

Per Niko's suggested warning scheme, it would emit a warning for every single call into a non-longjmp-safe dependency. So in practice, it would only be convenient to use in isolation or with other dependencies designed with longjmp-safety in mind.

nikomatsakis commented 4 years ago

> The Linux pthreads man page uses the term "async-cancel-safe" for some functions, but I'm not sure what the "async" part means.

I imagine the "async" in "async cancel safe" refers to whether an asynchronous signal could cancel the function at any point (versus saying that it can be canceled at each point where it invokes another function). The lint we were proposing (check that no dtors are in scope at each function call) would therefore make things "cancel safe" but not "async cancel safe".
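A small sketch of the code shape that lint would be checking for, using a hypothetical `Guard` type whose destructor sets a flag (all names here are invented for illustration). The closure passed as `bar` just records whether the destructor had already run when the call happened.

```rust
use std::cell::Cell;

// Sets the flag when dropped, standing in for any destructor with a visible effect.
struct Guard<'a>(&'a Cell<bool>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.set(true);
    }
}

// A destructor is still pending when `bar` is called, so the proposed lint
// would warn here: if `bar` never returned, the drop would be skipped.
fn pending_at_call(dropped: &Cell<bool>, bar: impl FnOnce()) {
    let _guard = Guard(dropped);
    bar();
}

// The guard's scope ends before the call, so nothing is skipped if `bar`
// never returns. This is "cancel safe" only at the call to `bar`.
fn none_pending_at_call(dropped: &Cell<bool>, bar: impl FnOnce()) {
    {
        let _guard = Guard(dropped);
    }
    bar();
}
```

In `pending_at_call` the flag is still false while `bar` runs; in `none_pending_at_call` it is already true. Neither shape says anything about a cancellation that arrives in the middle of the inner block, which is the "async" case the lint would not cover.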

nikomatsakis commented 4 years ago

I feel like the "annotate a ton of code" use case might also be handled by procedural macros at the module level, though I don't know how well that works. It feels like a bit of an edge case to me I guess. Still, we could permit it at the module level for sure.
