PyO3 / pyo3

Rust bindings for the Python interpreter
https://pyo3.rs
Apache License 2.0
11.99k stars 740 forks source link

PyO3 is incompatible with `ctypes` #1683

Open LegionMammal978 opened 3 years ago

LegionMammal978 commented 3 years ago

For a personal project, I'm trying to provide a Rust callback function to a Python module as a raw function pointer. However, if I attempt to acquire the GIL within the callback function, it causes the program to panic due to out-of-order GIL dropping. As a minimal example:

use pyo3::prelude::*;

extern "C" fn callback() {
    println!("begin callback");
    let _ = Python::acquire_gil();
    println!("end callback");
}

fn main() -> PyResult<()> {
    Python::with_gil(|py| {
        PyModule::import(py, "ctypes")?
            .call_method1("CFUNCTYPE", ((),))?
            .call1((callback as *const () as usize,))?
            .call0()?;
        Ok(())
    })
}

The program panics before end callback can be printed:

$ target/debug/gil_test 
begin callback
thread 'main' panicked at 'The first GILGuard acquired must be the last one dropped.', /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/gil.rs:290:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'main' panicked at 'The first GILGuard acquired must be the last one dropped.', /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/gil.rs:290:17
stack backtrace:
   0:     0x5599ade9a3b0 - std::backtrace_rs::backtrace::libunwind::trace::h04d12fdcddff82aa
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/../../backtrace/src/backtrace/libunwind.rs:100:5
   1:     0x5599ade9a3b0 - std::backtrace_rs::backtrace::trace_unsynchronized::h1459b974b6fbe5e1
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x5599ade9a3b0 - std::sys_common::backtrace::_print_fmt::h9b8396a669123d95
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:67:5
   3:     0x5599ade9a3b0 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::he009dcaaa75eed60
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:46:22
   4:     0x5599adeb383c - core::fmt::write::h77b4746b0dea1dd3
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/fmt/mod.rs:1078:17
   5:     0x5599ade98252 - std::io::Write::write_fmt::heb7e50902e98831c
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/io/mod.rs:1518:15
   6:     0x5599ade9c515 - std::sys_common::backtrace::_print::h2d880c9e69a21be9
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:49:5
   7:     0x5599ade9c515 - std::sys_common::backtrace::print::h5f02b1bb49f36879
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:36:9
   8:     0x5599ade9c515 - std::panicking::default_hook::{{closure}}::h658e288a7a809b29
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:208:50
   9:     0x5599ade9c1b8 - std::panicking::default_hook::hb52d73f0da9a4bb8
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:227:9
  10:     0x5599ade9cc51 - std::panicking::rust_panic_with_hook::hfe7e1c684e3e6462
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:593:17
  11:     0x5599ade81d4e - std::panicking::begin_panic::{{closure}}::h67f425c0631152f9
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:522:9
  12:     0x5599ade81bd9 - std::sys_common::backtrace::__rust_end_short_backtrace::h726b4232d880f43b
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:141:18
  13:     0x5599ade81c7b - std::panicking::begin_panic::h4b52fd2d813c0948
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:521:12
  14:     0x5599ade60896 - <pyo3::gil::GILGuard as core::ops::drop::Drop>::drop::{{closure}}::h8e6c3302036811a8
                               at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/gil.rs:290:17
  15:     0x5599ade62dc1 - std::thread::local::LocalKey<T>::try_with::hd7fb52379d7bc09d
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/thread/local.rs:272:16
  16:     0x5599ade60798 - <pyo3::gil::GILGuard as core::ops::drop::Drop>::drop::h5070dea984a3f707
                               at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/gil.rs:285:17
  17:     0x5599ade6c57f - core::ptr::drop_in_place::hca1b1d785a584a01
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ptr/mod.rs:175:1
  18:     0x5599ade6c2c0 - core::ptr::drop_in_place::h7d2dc00e35e21d91
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ptr/mod.rs:175:1
  19:     0x5599ade6c5ff - core::ptr::drop_in_place::hddef70a15a5993ae
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ptr/mod.rs:175:1
  20:     0x5599ade5e144 - pyo3::python::Python::with_gil::ha183e49a7426795a
                               at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/python.rs:158:5
  21:     0x5599ade5dc91 - pyo3_test::main::h0624e64f2681f1e6
                               at /home/lm978/nested/pyo3_test/src/main.rs:10:5
  22:     0x5599ade5e642 - core::ops::function::FnOnce::call_once::h7c181d0d3d6c9cdf
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ops/function.rs:227:5
  23:     0x5599ade5dbaa - std::sys_common::backtrace::__rust_begin_short_backtrace::haa621927d913d7d2
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:125:18
  24:     0x5599ade5deb6 - std::rt::lang_start::{{closure}}::he8806ad14534304d
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/rt.rs:66:18
  25:     0x5599ade9d167 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h57e2a071d427b24c
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ops/function.rs:259:13
  26:     0x5599ade9d167 - std::panicking::try::do_call::h81cbbe0c3b30a28e
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:381:40
  27:     0x5599ade9d167 - std::panicking::try::hbeeb95b4e1f0a876
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:345:19
  28:     0x5599ade9d167 - std::panic::catch_unwind::h59c48ccb40a0bf20
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panic.rs:396:14
  29:     0x5599ade9d167 - std::rt::lang_start_internal::ha53ab63f88fee728
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/rt.rs:51:25
  30:     0x5599ade5de87 - std::rt::lang_start::h1e6a375f16e4e70e
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/rt.rs:65:5
  31:     0x5599ade5dcca - main
  32:     0x7f930dcc2bf7 - __libc_start_main
  33:     0x5599ade5d4ca - _start
  34:                0x0 - <unknown>
thread panicked while panicking. aborting.
Illegal instruction (core dumped)

However, it can clearly be seen that the GILGuards are dropped in the correct order. Is this panic caused by a bug in PyO3, or am I doing something incorrectly?

davidhewitt commented 3 years ago

Hmm. Thank you for reporting, this is 100% a bug in PyO3.

For a bit of context of what's going on: we have some internal tracking inside PyO3 which ensures that the user cannot get into unsafety with two particular thorns:

Unfortunately in your example the mitigation we've got in place for the first point is directly clashing with the mitigation for the second point.


PyO3 manages these defences by tracking the GIL acquire and release calls internally. It also uses this internal tracking to make some optimizations, e.g. to the implementation of Clone and Drop for Py<T>. An assumption PyO3 makes as part of this is that if PyO3 has acquired the GIL and calls into Python, when Python again calls Rust code the GIL is still held. This assumption is broken by the ctypes module as you demonstrate, because as ctypes says in its docs:

The function will release the GIL during the call.

Unless a way can be found to fix the internal tracking, I guess this implies that it's not sound. Although ctypes is a proven example I wouldn't be suprised if this means that other ways could also be found to innocently break the internal tracking.

If it's not sound:


For the foreseeable future, I think that we should document PyO3 is incompatible with ctypes.

We also need to remove a couple of optimizations based on PyO3's internal tracking:

I'll open some issues to track these in the morning.

davidhewitt commented 3 years ago

... I think if we remove Python::acquire_gil (which is the problematic API) so that this particular check is not necessary, along with the optimizations mentioned above, then your example is sound.

This will take at least a couple of deprecation-and-release cycles to allow the ecosystem to adjust.

LegionMammal978 commented 3 years ago

Thank you for your help; in my actual use case, the callback isn't called directly by ctypes but indirectly by a C library loaded via ctypes, and I was hoping to manipulate certain Python objects which I had stored in a static thread-local variable. Also, I have a few minor questions I hope you could answer:

ChillFish8 commented 3 years ago

I think depreciating Python::with_gil and Python::acquire_gil would cause a considerable challenge where we use it as and when it's needed rather than passing the GIL through many, many functions to get there. In this case, it would essentially involve completely re-designing the system due to the lifetime constraints.

When playing around with cytpes with Rust and PyO3 I ended up having the core functionality be set up agnostic and creating separate public APIs to Ctypes and Pyo3 to avoid crossing between the two bounds. (although I can safely say unless you really, really, really want performance PyPy support Ctypes is considerably more hassle than its worth in a lot of cases.

mejrs commented 3 years ago

I think depreciating Python::with_gil and Python::acquire_gil

I think he meant: deprecate acquire_gil and gilguard, and possibly remove some optimizations in with_gil.

davidhewitt commented 3 years ago

Yep absolutely. Python::with_gil is both incredibly useful and safe, because it doesn't have the drop order problems of acquire_gil.

mejrs commented 3 years ago

Do we want to get rid of acquire_gil usage in pyo3, like in doc examples, tests, internal code etc? I can do some of that next week, I think.

davidhewitt commented 3 years ago

Do we want to get rid of acquire_gil usage in pyo3

Yes, especially removing it from examples and the guide would be a great first step towards it's deprecation. Thanks ☺️