Open EthanSteinberg opened 1 month ago
PyDict_SetDefaultRef appears to be the key for making this actually safe. It enables a conditional set on a dictionary.
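To illustrate what a "conditional set" buys here, below is a minimal standalone sketch in plain C++. It does not use the CPython API: `SharedDict` and `set_default` are hypothetical names, and the mutex-guarded map merely stands in for the shared state dictionary. The point is only that lookup and insertion happen as one step, so exactly one caller's value ends up stored and every caller gets that same value back.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a shared dictionary; not pybind11 or CPython code.
class SharedDict {
public:
    // Store `candidate` under `key` only if the key is absent.
    // Returns {value now stored under key, true if this call inserted it}.
    std::pair<int, bool> set_default(const std::string &key, int candidate) {
        assert(!key.empty());  // this sketch assumes non-empty keys
        std::lock_guard<std::mutex> lock(mutex_);
        // emplace() leaves an existing entry untouched and reports that
        // through the bool in its return value.
        auto result = map_.emplace(key, candidate);
        return {result.first->second, result.second};
    }

private:
    std::mutex mutex_;
    std::unordered_map<std::string, int> map_;
};
```

A caller that gets `false` back knows another caller won and should discard its own candidate in favour of the returned value.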
Extra background (to the best of my knowledge, please correct if necessary):
get_internals() is called from here (PYBIND11_MODULE() macro) during module import (which is still protected by the GIL, even with free-threading enabled): https://github.com/pybind/pybind11/blob/fc59f4e6e508f28c0e2896937660b9eae0f5ea51/include/pybind11/detail/common.h#L489
I believe it would be good to work this into the PR description, and to explain the situations in which get_internals() might be called without being protected either by a mutex or the module import protections.
Required prerequisites
What version (or hash if on master) of pybind11 are you using?
a1d00916b26b187e583f3bce39cd59c3b0652c32
Problem description
https://github.com/pybind/pybind11/blob/a1d00916b26b187e583f3bce39cd59c3b0652c32/include/pybind11/detail/internals.h#L498 is not thread safe under free-threading.
internals is a global singleton struct shared across all pybind11 modules. get_internals() either retrieves that current global singleton or creates a new one.
The problem is double initialization, where two copies of internals would be created by multiple modules being imported/initialized in different threads.
https://github.com/pybind/pybind11/blob/a1d00916b26b187e583f3bce39cd59c3b0652c32/include/pybind11/detail/internals.h#L520-L524 are the key problematic lines where we retrieve the current global, and initialize it if necessary.
The problem is that if two threads hit those lines at the same time, both threads could see that internals is null and both threads could initialize internals, which would lead to two copies.
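One way to close that window, sketched here in standalone C++ and not taken from pybind11, is to let every thread build a candidate but publish it with a single compare-and-exchange, so only the first one wins. `Internals`, `g_internals` and `get_internals_sketch` are hypothetical names for illustration. The real code keeps its pointer in an interpreter-level dictionary instead of a C++ global, so this shows only the shape of the fix.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical placeholder for the real internals struct.
struct Internals {
    int creator_id;
};

// Hypothetical global slot; starts out empty.
std::atomic<Internals *> g_internals{nullptr};

Internals &get_internals_sketch(int caller_id) {
    // Fast path: someone already published a singleton.
    Internals *current = g_internals.load(std::memory_order_acquire);
    if (current != nullptr) {
        return *current;
    }

    // Slow path: build a candidate, then try to publish it atomically.
    Internals *candidate = new Internals{caller_id};
    Internals *expected = nullptr;
    if (g_internals.compare_exchange_strong(expected, candidate,
                                            std::memory_order_acq_rel,
                                            std::memory_order_acquire)) {
        return *candidate;  // this caller won the race
    }

    // Another thread published first: discard our copy and use theirs.
    // On failure, compare_exchange_strong wrote the winner into `expected`.
    assert(expected != nullptr);
    delete candidate;
    return *expected;
}
```

The cost of this pattern is that a losing thread briefly constructs and destroys a second object, which is only acceptable if construction has no side effects visible elsewhere. If that does not hold, serializing the whole check-and-create step under a mutex is the simpler alternative.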
Currently this is protected via the GIL (https://github.com/pybind/pybind11/blob/master/include/pybind11/detail/internals.h#L505), but that is problematic with the new Python free-threading setup.
In the short term this is still OK because the GIL is re-enabled during module import, but that is not a good long term strategy.
Reproducible example code
No response
Is this a regression? Put the last known working version here if it is.
Not a regression