Open adr26 opened 1 year ago
Thanks for the detailed analysis (and PR), @adr26! I'll take a look.
Thank you @ericsnowcurrently:
I set the PR up as draft, as I wasn't sure whether this would be the right way to fix the issue — I would be very open to any direction on this.
I have re-run the test_threading
mentioned in bpo-9901 a couple of times locally (on Windows) and it passed, so there is some evidence that this provides a fix that doesn't regress the reason that the call to _PyEval_FiniGIL()
was deferred in the first place. I would value any comments on whether any further directed test coverage is needed to ensure specifically that we don't regress the original reason this change was put in.
Just to be clear about my wording above: the sequence of Py_Initialize()
/ Py_FinalizeEx()
calls fails on multiple 3.x versions I've tested with. In each case, there's a lack of GIL cleanup before it's re-initialised. The only difference with the per-interpreter GIL changes is the mechanism that _PyEval_FiniGIL()
uses to conclude it doesn't need to call destroy_gil()
—
_PyEval_FiniGIL()
called gil_created()
and saw gil->locked == uninitialized
and so bailed before calling destroy_gil()
,_PyEval_FiniGIL()
sees interp->ceval.gil == NULL
and bails before calling destroy_gil()
.In both cases, the result was the same (incorrect) behaviour of destroy_gil()
not getting called in the deferred _PyEval_FiniGIL()
. Given you'd noted this deferred call to _PyEval_FiniGIL()
was suspect, I thought you would be the right person to ask, so this isn't impugning gh-104210 — the problem was there both before and after it 😄!
Finally, if the code seems logically correct, then there is a stylistic question as to whether now that _PyEval_FiniGIL()
is local to ceval_gil.c
, its name should be changed and the external declaration in pycore_ceval.h
removed... I'd welcome feedback on that too.
Thanks, Andrew R
@adr26, I started taking a look at this, but realized that there have been quite a few changes in this area recently for the free-threading build. Could you take a look and see if you can still reproduce the problem using the 3.13 branch?
Hi @ericsnowcurrently, I'm on a vacation just now but will have a look at this once I'm back & let you know if I can reproduce the problem still: thank you for looking into this!
The following simple balanced calls to
Py_Initialize()
/Py_FinalizeEx()
trigger an error when running under Microsoft's Application Verifier on Windows:The above sequence causes there to be two calls to
create_gil()
for the main GIL, without an intervening call todestroy_gil()
: the mismatched use ofPyMUTEX_INIT()
/PyMUTEX_FINI()
(create_gil()
callsPyMUTEX_INIT()
anddestroy_gil()
callsPyMUTEX_FINI()
) translates into mismatched calls to Win32 APIsInitializeCriticalSection()
andDeleteCriticalSection()
, which are then detected by Application Verifier.[Note, however, that this error is not specific to Windows and the GIL is being incorrectly re-initialised without a call to
PyMUTEX_FINI()
on all platforms.]The first
Py_Initialize()
causes a call tocreate_gil()
oninterp->ceval.gil
for the main interpreter (_PyRuntime._main_interpreter
), but as a result of bpo-9901,finalize_interp_delete()
(as called fromPy_FinalizeEx()
) currently defers calling_PyEval_FiniGIL()
(and therebydestroy_gil()
) until the nextPy_Initialize()
:https://github.com/python/cpython/blame/41aff464cef83d2655029ddd180a51110e8d7f8e/Python/pylifecycle.c#L1740
Unfortunately, the second call
Py_Initialize()
doesn't calldestroy_gil()
before callingcreate_gil()
a second time on the main interpreter's GIL, causing this error.@ericsnowcurrently's comment on
init_interp_create_gil()
('XXX This is broken with a per-interpreter GIL
'):notes that the call to
_PyEval_FiniGIL()
ininit_interp_create_gil()
doesn't work with per-interpreter GIL, but it's worse than this as it's broken with just re-creating the main interpreter!The reason for all this appears to be that the GIL is re-initialised by
_PyRuntime_Initialize()
, before_PyEval_FiniGIL()
has a chance to get its hands on it. Prior to GH-104210 from @ericsnowcurrently landing yesterday (and on previous Python versions),_PyEval_FiniGIL()
would detect that the GIL is reinitialised by a prior call to_gil_initialize()
within_PyRuntime_Initialize()
. However, now the main interpreter (_PyRuntime._main_interpreter
) has a GIL pointer (interp->ceval.gil
) which is cleared when_PyRuntimeState_Init()
resets the runtime to_PyRuntimeState_INIT
:I have reproduced this with recent cpython commit
bf89d4283a28dd00836f2c312a9255f543f93fc7
and have attached a log of the callstacks (avrf-gil.txt).Linked PRs