Open ngoldbaum opened 1 month ago
Thanks for moving this forward. I will pick this up next week so we can merge that MR.
I think the closure needs to be similar to `for_each`, which takes each element in turn, so that we get to call `PyDict_Next` within our own code and can be sure that users can't nest critical sections.
E.g. something like this:
fn locked_for_each<F>(&self, closure: F) -> PyResult<()>
where
F: Fn(Bound<'py, PyAny>, Bound<'py, PyAny>) -> PyResult<()>
... that said, now that I say this, I realise that what we should probably do is override `Iterator::fold` and `Iterator::try_fold` to lock the critical section inside of them, which would automatically optimize many uses of PyO3 iterators (e.g. `.iter().map(...).sum()`).
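To make the `fold` idea concrete, here is a minimal sketch that does not use PyO3 at all. `SharedList`, `SharedListIter`, and `lock_count` are hypothetical stand-ins, and a `std::sync::Mutex` plays the role of the critical section. It only illustrates that overriding `fold` lets adapter chains such as `.iter().map(...).sum()` take the lock once, while plain `next` calls lock per element.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for a Python list: data behind a lock, plus a
/// counter so we can observe how often the lock is taken.
pub struct SharedList {
    items: Mutex<Vec<i64>>,
    lock_count: AtomicUsize,
}

impl SharedList {
    pub fn new(items: Vec<i64>) -> Self {
        Self { items: Mutex::new(items), lock_count: AtomicUsize::new(0) }
    }

    /// Read one element, holding the lock only for this read.
    fn get(&self, index: usize) -> Option<i64> {
        self.lock_count.fetch_add(1, Ordering::Relaxed);
        self.items.lock().unwrap().get(index).copied()
    }

    pub fn iter(&self) -> SharedListIter<'_> {
        SharedListIter { list: self, index: 0 }
    }

    pub fn lock_count(&self) -> usize {
        self.lock_count.load(Ordering::Relaxed)
    }
}

pub struct SharedListIter<'a> {
    list: &'a SharedList,
    index: usize,
}

impl<'a> Iterator for SharedListIter<'a> {
    type Item = i64;

    // Slow path: one lock acquisition per call.
    fn next(&mut self) -> Option<i64> {
        let item = self.list.get(self.index)?;
        self.index += 1;
        Some(item)
    }

    // Fast path: lock once and walk the remaining elements directly.
    // Caveat of this stand-in: `Mutex` is not re-entrant, so `f` must not
    // touch the same list again while the lock is held.
    fn fold<B, F>(self, init: B, mut f: F) -> B
    where
        F: FnMut(B, i64) -> B,
    {
        self.list.lock_count.fetch_add(1, Ordering::Relaxed);
        let guard = self.list.items.lock().unwrap();
        let mut acc = init;
        for &item in guard.iter().skip(self.index) {
            acc = f(acc, item);
        }
        acc
    }
}

fn main() {
    let list = SharedList::new(vec![1, 2, 3]);
    // `Map::fold` and the integer `Sum` impl both delegate to the inner `fold`.
    let doubled_sum: i64 = list.iter().map(|x| x * 2).sum();
    println!("sum = {doubled_sum}, locks taken = {}", list.lock_count());
}
```

A plain `for` loop over the same iterator still goes through `next`, so it keeps the per-element locking behaviour.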
Hmm, can we actually use `try_fold`? I tried to write an implementation for `PyList` and I think in order to actually implement it, you need to use the `Try` trait, which isn't stable yet.
Doh, of course not. We can still implement `fold`, and maybe on our `nightly` feature we could implement `try_fold`.
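One way to get short-circuiting under a single lock on stable Rust, without overriding `try_fold`, is a closure-taking method whose return type is fixed to `Result`, along the lines of the `locked_for_each` signature earlier in the thread. The sketch below is only an illustration under assumptions: `SharedList` is a hypothetical stand-in guarded by a `std::sync::Mutex`, and no PyO3 types are involved.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for a Python list guarded by a lock.
pub struct SharedList {
    items: Mutex<Vec<i64>>,
}

impl SharedList {
    pub fn new(items: Vec<i64>) -> Self {
        Self { items: Mutex::new(items) }
    }

    /// Hypothetical helper: hold the lock for the whole traversal and stop
    /// at the first error. Because the return type is a concrete `Result`,
    /// the unstable `Try` trait never has to be named.
    pub fn locked_for_each<E, F>(&self, mut closure: F) -> Result<(), E>
    where
        F: FnMut(i64) -> Result<(), E>,
    {
        let guard = self.items.lock().unwrap();
        for &item in guard.iter() {
            closure(item)?; // short-circuits; the guard is released on return
        }
        Ok(())
    }
}

fn main() {
    let list = SharedList::new(vec![1, 2, -3, 4]);
    let mut visited = Vec::new();
    let outcome = list.locked_for_each(|item| {
        if item < 0 {
            return Err(format!("negative element: {item}"));
        }
        visited.push(item);
        Ok(())
    });
    println!("visited = {visited:?}, outcome = {outcome:?}");
}
```

The trade-off is that this is an inherent method rather than part of `Iterator`, so iterator adapters cannot pick it up automatically the way they would an overridden `fold`.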
See https://github.com/PyO3/pyo3/pull/4439 and https://github.com/PyO3/pyo3/pull/4539 for context.
PyO3 follows Python's behavior for multithreaded dict and list iteration and allows race conditions (see my experiment here: https://github.com/PyO3/pyo3/pull/4539#discussion_r1763722120).
Unfortunately, as a consequence, in order to preserve this behavior on the free-threaded build we will need to use slower owned-reference APIs for lists (#4539) and apply a critical section for dicts in each loop iteration (#4439), since there are no equivalent locking owned-reference iteration APIs for dicts.
It would be nice in both cases if we could simply lock the dict or list while we are iterating over it, which would allow us to use the faster APIs that access list and dict internals. However, this would make the semantics of iteration via PyO3 different from iteration via Python.
Instead, I think we should add a new `locked_iter` function to `PyDictMethods` and `PyListMethods`. Maybe `PyAnyMethods` too, but users can add their own locking if they want to for the generic case; we need to do it for dict and list in PyO3 for performance reasons.
Instead of directly returning an iterator, users would pass in a closure that accepts an iterator and work with the iterator inside the closure. My understanding from @davidhewitt is that using a closure would ensure exactly one `Py_END_CRITICAL_SECTION` follows each `Py_BEGIN_CRITICAL_SECTION`, even if a panic happens and even if the critical sections are recursive.
I'm not terribly experienced with writing Rust APIs that take closures, but I think the API I'm looking for is this?
There, `PyDictLockedIterator` would be like `PyDictIterator`, but its use would imply that a critical section is held and the dict is locked. And of course a similar API for lists.
This is something we could add for PyO3 0.24, and for 0.23 we'd merge the two open PRs related to this and only have the slow, safe iteration for the free-threaded build.
Ping @bschoenmaeckers