Closed PeterJCLaw closed 3 months ago
Thanks :D
For clarity: I'm keeping this on hold as I've started seeing test failures on my python3.13
branch having rebased on top of this branch (and the changes here now rebased on master
). I'm not sure of the cause just yet as the failures are new and didn't show up in https://github.com/PeterJCLaw/jedi/pull/3 despite that PR ought to having been functionally the same code. The only difference between the two I think is my linting PR #2006, which I hadn't expected to change anything.
I'll hopefully have some time this week to pin down the issue.
For clarity: I'm keeping this on hold as I've started seeing test failures on my
python3.13
branch having rebased on top of this branch (and the changes here now rebased onmaster
). I'm not sure of the cause just yet as the failures are new and didn't show up in PeterJCLaw#3 despite that PR ought to having been functionally the same code.
I believe these are down to some changes in Python 3.13.0-beta.3 (where previously CI was using beta.2), and I'm happy that that's not related to the changes in this PR.
That makes sense. The pull request itself feels very simple and I am not surprised it's not your fault :)
Summary
There was a race condition due to the combination of Python's object ids being re-usable and Jedi persisting such ids beyond the real lifeteime of some objects. This could lead to the subprocess' view of the lifetime of
InferenceState
contexts getting out of step with that in the parent process and resulting in errors when removing them. It is also possible that this could result in erroneous results being reported, however this was not directly observed.The race was specifically:
InferenceState
A created, gets id 1InferenceStateSubprocess
A' created, usesInferenceState
A which it stores as a weakref and an idInferenceStateSubprocess
A' is used, the sub-process learns about anInferenceState
with id 1InferenceState
A goes away,InferenceStateSubprocess
A' is not yet garbage collectedInferenceState
B created, gets id 1InferenceStateSubprocess
B' created, usesInferenceState
B which it stores as a weakref and an idInferenceStateSubprocess
B' is used, the sub-process re-uses its entry for anInferenceState
with id 1At this point the order of operations between the two
InferenceStateSubprocess
instances going away is immaterial -- both will trigger a removal of a state with id 1. As long as B' doesn't try to use the sub-process again after the first removal has happened then the second removal will fail.This change resolves the race condition by coupling the context in the subprocess to the corresponding manager class instance in the parent process, rather than to the consumer
InferenceState
.See inline comments for further details.
Code review
Suggest review by commit. There's some small refactors in the early commits. The actual fix is the last commit on the branch.
Testing
I'm open to ideas on how to add a test for this.
Currently this is relying on Python 3.13 tests reproducing the issue fairly easily, though not entirely consistently.
Links
Currently builds on https://github.com/davidhalter/jedi/pull/2003 (as it was discovered there), though it doesn't really need to be part of the upgrade process.