Open LegionMammal978 opened 3 years ago
Hmm. Thank you for reporting, this is 100% a bug in PyO3.
For a bit of context of what's going on: we have some internal tracking inside PyO3 which ensures that the user cannot get into unsafety with two particular thorns:
1. GILGuard must be dropped in reverse order to acquisition (as seen in the error message).
2. GILGuard with different lifetimes used to be able to lead to use-after-free in safe code (see #864). This is because we have an internal GILPool construct to deal with the "GIL-bound references".

Unfortunately in your example the mitigation we've got in place for the first point is directly clashing with the mitigation for the second point.
PyO3 manages these defences by tracking the GIL acquire and release calls internally. It also uses this internal tracking to make some optimizations, e.g. to the implementation of Clone and Drop for Py<T>. An assumption PyO3 makes as part of this is that if PyO3 has acquired the GIL and calls into Python, then the GIL is still held when Python calls back into Rust code. This assumption is broken by the ctypes module as you demonstrate, because as ctypes says in its docs:
> The function will release the GIL during the call.
Unless a way can be found to fix the internal tracking, I guess this implies that it's not sound. Although ctypes is a proven example, I wouldn't be surprised if other ways could also be found to innocently break the internal tracking.
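To make the clash concrete, here is a toy, std-only Rust model of this kind of bookkeeping. It is not PyO3's actual code: ToyGil, ToyGuard, acquire, release and external_release are hypothetical names, and the drop-order rule (a guard that found the lock free must be the last one released) is an assumed simplification of the check described above. The point is only that correctly nested guards get reported as an error once something outside the tracker releases the lock.

```rust
/// Toy model: `held` is the real lock state, `count` is the library's own
/// bookkeeping of live guards. All names here are hypothetical.
#[derive(Default)]
struct ToyGil {
    held: bool,
    count: usize,
}

/// Records whether the lock was free when this guard was acquired.
struct ToyGuard {
    found_unlocked: bool,
}

impl ToyGil {
    fn acquire(&mut self) -> ToyGuard {
        let found_unlocked = !self.held;
        self.held = true;
        self.count += 1;
        ToyGuard { found_unlocked }
    }

    /// Assumed rule: a guard that found the lock free is taken to be the
    /// outermost one, so it must be the only live guard when released.
    fn release(&mut self, guard: ToyGuard) -> Result<(), &'static str> {
        if guard.found_unlocked && self.count != 1 {
            return Err("outermost guard released while others are alive");
        }
        self.count -= 1;
        if guard.found_unlocked {
            self.held = false;
        }
        Ok(())
    }

    /// Stands in for foreign code (such as a ctypes call) that releases the
    /// lock without going through the tracker.
    fn external_release(&mut self) {
        self.held = false;
    }
}

/// Acquires an outer and an inner guard, optionally with an untracked release
/// in between, and returns the result of releasing the inner guard first.
fn nested(untracked_release: bool) -> Result<(), &'static str> {
    let mut gil = ToyGil::default();
    let _outer = gil.acquire();
    if untracked_release {
        gil.external_release();
    }
    let inner = gil.acquire();
    gil.release(inner)
}

fn main() {
    println!("plain nesting: {:?}", nested(false));
    println!("with untracked release: {:?}", nested(true));
}
```

In this model the inner guard is released first in both runs, yet the second run is rejected, because the untracked release makes the inner guard look like an outermost one.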
If it's not sound:
- For the foreseeable future, I think that we should document PyO3 is incompatible with ctypes.
- We also need to remove a couple of optimizations based on PyO3's internal tracking:
  - Python::with_gil
  - Clone and Drop for Py<T>
I'll open some issues to track these in the morning.
... I think if we remove Python::acquire_gil (which is the problematic API) so that this particular check is not necessary, along with the optimizations mentioned above, then your example is sound.
This will take at least a couple of deprecation-and-release cycles to allow the ecosystem to adjust.
Thank you for your help; in my actual use case, the callback isn't called directly by ctypes but indirectly by a C library loaded via ctypes, and I was hoping to manipulate certain Python objects which I had stored in a static thread-local variable. Also, I have a few minor questions I hope you could answer:
- [...] Python::with_gil and Python::acquire_gil look like?
- [...] acquire_gil at the start of my program and store an Rc<GILGuard> within my context object and wrapper objects. (My program is entirely single-threaded.) Would it be more correct to re-acquire the GIL at all the points I need to use it? Would it be much more computationally expensive to do it that way?
- [...] Clone and Drop for Py<T>? I currently depend on my wrapper objects being directly Clone, although I suppose I could write a clone function that acquires the GIL.

I think deprecating Python::with_gil and Python::acquire_gil would cause a considerable challenge where we use it as and when it's needed rather than passing the GIL through many, many functions to get there. In this case, it would essentially involve completely re-designing the system due to the lifetime constraints.
When playing around with ctypes with Rust and PyO3, I ended up making the core functionality agnostic and creating separate public APIs for ctypes and PyO3 to avoid crossing between the two boundaries. (Although I can safely say that unless you really, really, really want performance PyPy support, ctypes is considerably more hassle than it's worth in a lot of cases.)
> I think deprecating Python::with_gil and Python::acquire_gil

I think he meant: deprecate acquire_gil and GILGuard, and possibly remove some optimizations in with_gil.
Yep absolutely. Python::with_gil is both incredibly useful and safe, because it doesn't have the drop order problems of acquire_gil.
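The drop-order point can be illustrated without PyO3 at all. The sketch below is a std-only toy, and Guard, with_scope and the event log are hypothetical names rather than PyO3 APIs. It shows that a closure-scoped helper can only release in reverse order of acquisition, whereas free-standing guard values can be dropped in whatever order the caller chooses.

```rust
use std::cell::RefCell;

thread_local! {
    // Event log so the acquire/release order can be inspected.
    static EVENTS: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn log(event: String) {
    EVENTS.with(|e| e.borrow_mut().push(event));
}

/// Returns the recorded events and clears the log.
fn take_events() -> Vec<String> {
    EVENTS.with(|e| std::mem::take(&mut *e.borrow_mut()))
}

/// Guard-style API: release happens whenever the value is dropped.
struct Guard(&'static str);

impl Guard {
    fn acquire(name: &'static str) -> Self {
        log(format!("acquire {}", name));
        Guard(name)
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        log(format!("release {}", self.0));
    }
}

/// Closure-style API: the release is tied to the end of the call, so nested
/// calls necessarily unwind in reverse order of acquisition.
fn with_scope<R>(name: &'static str, f: impl FnOnce() -> R) -> R {
    let _guard = Guard::acquire(name);
    f()
}

fn guards_out_of_order() -> Vec<String> {
    let outer = Guard::acquire("outer");
    let inner = Guard::acquire("inner");
    drop(outer); // nothing stops the caller from dropping the outer guard first
    drop(inner);
    take_events()
}

fn scopes_nested() -> Vec<String> {
    with_scope("outer", || with_scope("inner", || ()));
    take_events()
}

fn main() {
    println!("{:?}", guards_out_of_order());
    println!("{:?}", scopes_nested());
}
```

With the guard style the ordering is a convention the caller has to follow; with the closure style it falls out of the call structure, so no runtime check is needed.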
Do we want to get rid of acquire_gil usage in pyo3, like in doc examples, tests, internal code etc? I can do some of that next week, I think.
> Do we want to get rid of acquire_gil usage in pyo3

Yes, especially removing it from examples and the guide would be a great first step towards its deprecation. Thanks ☺️
For a personal project, I'm trying to provide a Rust callback function to a Python module as a raw function pointer. However, if I attempt to acquire the GIL within the callback function, it causes the program to panic due to out-of-order GIL dropping. As a minimal example:
The program panics before "end callback" can be printed:
However, it can clearly be seen that the GILGuards are dropped in the correct order. Is this panic caused by a bug in PyO3, or am I doing something incorrectly?