gtk-rs / gtk-rs-core

Rust bindings for GNOME libraries
https://gtk-rs.org/gtk-rs-core
MIT License
272 stars 103 forks source link

Not catching panics across FFI boundaries #74

Open SimonSapin opened 5 years ago

SimonSapin commented 5 years ago

https://github.com/gtk-rs/cairo/pull/257 added bindings for “user data” owned by cairo objects. Each user data entry has a destructor function pointer that cairo calls when the object is destroyed. This helps solve life time issues, since with reference counting it can be hard to predict when an object will be destroyed exactly.

The bindings are generic and accept user data of any type. They forward destructor function pointer calls to Drop, which potentially panic and unwind. However, until https://github.com/rust-lang/rust/issues/58794 is resolved, unwinding into C is undefined behavior.

A possible fix is using std::panic::catch_unwind in the destructor function. However even if we stash the panic payload somewhere at that point, there is no good place in the code to call resume_unwind. So the best we can do might be to abort the process.

sdroege commented 5 years ago

Abort seems the best, yes.

SimonSapin commented 5 years ago

Or wait until the compiler does this abort for us? I think that’s the expected eventual resolution of https://github.com/rust-lang/rust/issues/58794 (though with an opt-out that I’m not sure is implemented yet.)

sdroege commented 5 years ago

Personally I'd wait for the compiler, I don't want to change everything back yet another time only to revert the change again in a bit.

sdroege commented 3 years ago

Made this issue a bit more generic. This applies to all gtk-rs code currently.

jhg commented 3 years ago

Are there news about if the compiler implement it or not yet?

sdroege commented 3 years ago

There's a whole WG about that: https://rust-lang.github.io/rfcs/2797-project-ffi-unwind.html / https://github.com/rust-lang/project-ffi-unwind . It's not finished yet.

jhg commented 3 years ago

Thanks for the information @sdroege 🤗

SimonSapin commented 3 years ago

The status quo is that unwinding across FFI is Undefined Behavior. Therefore all callbacks called from C should be wrapped into something that uses catch_unwind + abort.

Only if and when that WG propose languages changes that end up implemented, maybe that wrapping will become unnecessary.

jhg commented 3 years ago

@SimonSapin I found rust-lang/rust#58794 is closed and in rust-lang/rust#58760 link to https://github.com/rust-lang/rust/pull/55982 merged PR. Is that also UB when the FFI is Rust function to be used from C or that fix the UB? (even if it's in nightly only yet and it is not unwind but at least is not undefined, it will abort).

And thanks about the information about FFI and unwind.

sdroege commented 3 years ago

Currently any unwinding across extern "C" functions is UB, even if all those functions happens to be implemented in Rust. That's part of what that WG is working on solving.

For example this adds support for an extern "C-unwind" ABI that explicitly allows unwinding (and AFAIU causes unwinding through extern "C" to abort as it should).

bilelmoussaoui commented 3 years ago

@sdroege this should be moved to gtk-rs-core

piegamesde commented 2 years ago

I don't understand one thing in here: the initial issue description is specifically about Cairo "user data". But doesn't the problem also apply in any signal callbacks? I've already panicked in my code more than once and it's never blown up badly, so was it just by luck?

sdroege commented 2 years ago

Yes, it's about any callbacks that go through FFI.

piegamesde commented 2 years ago

So if I'm getting a stack trace like

   2: glib::source::fnmut_callback_wrapper_local::{{closure}}
             at ~/.cache/cargo/registry/src/github.com-1ecc6299db9ec823/glib-0.14.0/src/source.rs:180:9
   3: glib::source::trampoline
             at ~/.cache/cargo/registry/src/github.com-1ecc6299db9ec823/glib-0.14.0/src/source.rs:90:5
   4: <unknown>
   5: g_main_context_dispatch
   6: <unknown>
   7: g_main_context_iteration
   8: g_application_run
   9: <O as gio::application::ApplicationExtManual>::run_with_args
             at ~/.cache/cargo/registry/src/github.com-1ecc6299db9ec823/gio-0.14.0/src/application.rs:30:13

then that's totally not guaranteed to work in the general case?


If I understand the linked threads above correctly, then this ought to work in a defined matter some time in the future. Nevertheless, I'd argue for running signal callbacks in a catch_unwind. This would allow at least a minimum of clean shutdown within Gtk. Because the current behavior straight up aborts the Gtk main loop, leaving any resources open.

How do other language bindings handle exceptions?

sdroege commented 2 years ago

If I understand the linked threads above correctly, then this ought to work in a defined matter some time in the future. Nevertheless, I'd argue for running signal callbacks in a catch_unwind.

Something like that existed in gtk-rs for a long time. Then Rust changed to make it defined behaviour (directly abort the process) and we removed it. Then this change was reverted again and we reverted the removal. Then Rust again made it defined behaviour and we removed the gtk-rs code. And again it was reverted and promised to be added again in the near future.

And here we are now :) It seems like it's really close (I hope for real) now and actually works like that in nightly already if I understand correctly.

The behaviour in any case would be to kill the process because that's the only thing you can safely do in such situations.

How do other language bindings handle exceptions?

Unwinding across language boundaries is generally a bad idea and I'm not aware of any language binding where that is handled in any other way than UB or killing the process.

sdroege commented 2 years ago

There are some updates in https://github.com/rust-lang/rust/pull/86155#issuecomment-860278358 . It looks like this is finally going to be solved in Rust in the foreseeable future.

A6GibKm commented 11 months ago

"C-unwind" was stabilized in 1.71, see https://blog.rust-lang.org/2023/07/13/Rust-1.71.0.html#c-unwind-abi.

sdroege commented 11 months ago

That's not sufficient. What we need is that the "C" ABI aborts on panics across FFI boundaries again, which is what it's supposed to do.

With "C-unwind" we could allow panics across FFI boundaries in certain selected cases. Like, the callback to g_list_foreach() can panic and unwind from Rust to C to Rust without problems.

sdroege commented 3 days ago

With 1.81 this should be solved finally: https://github.com/rust-lang/rust/pull/116088