python / cpython

The Python programming language
https://www.python.org/
Other
60.06k stars 29.09k forks source link

gh-118362: Fix thread safety around lookups from the type cache in the face of concurrent mutators #118454

Closed DinoV closed 2 weeks ago

DinoV commented 3 weeks ago

Lookups from the type cache can be exposed to some races. One of those is that _PyType_Lookup isn't increfing the result, so the resulting value can become invalid at any point. This adds a new _PyType_Fetch API which does the incref. Most places are updated to use the new API except for the specializer which isn't safe in free-threaded builds yet.

But there are also issues around when we do a store into the dict and signal that the type is modified. A race can occur where the old value is removed from the dict and before we invalidate the type version. This can be seen today w/ the GIL when the value in the dict has a finalizer - accessing the type cache and the type's mapping proxy return inconsistent results.

This fixes these issues by making the update to the dict and the type modified update an atomic operation. We also capture the previous value in the dictionary and don't free it until after we've made the atomic update. This basically inlines a much simplified version of _PyObject_GenericSetAttrWithDict into type_setattro which can lock mutation against types as well as against the type's dict.

When updating we first clear the type version so concurrent reads of the type cache won't hit. We then replace the existing value and finish the type modified process.

DinoV commented 2 weeks ago

Some questions comments below. Biggest questions is that now that we are accessing the dict directly instead of through _PyObject_GenericSetAttrWithDict, how do we know that the other cases handled by _PyObject_GenericSetAttrWithDict are not necessary in _Py_type_getattro?

This is part of the reason why I assert:

    assert(!_PyType_HasFeature(metatype, Py_TPFLAGS_INLINE_VALUES));
    assert(!_PyType_HasFeature(metatype, Py_TPFLAGS_MANAGED_DICT));

In type_setattro. Type objects will always have a dictptr, so we'll either go to _PyObjectDict_SetItem if the dict doesn't exist yet or do the get/set if they do. _PyObjectDict_SetItem is just going to do a get/set as well. And because types don't have a managed dict they don't have shared keys.

Also, I think this merits a NEWS entry.

DinoV commented 2 weeks ago

Also, I think this merits a NEWS entry. I opened #118492 to track the issue that is fixed when the GIL is present and blurb'd it.

bedevere-bot commented 2 weeks ago

:warning::warning::warning: Buildbot failure :warning::warning::warning:

Hi! The buildbot iOS ARM64 Simulator 3.x has failed when building commit 5a1618a2c8c108b8c73aa9459b63f0dbd66b60f6.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/1380/builds/238) and take a look at the build logs.
  4. Check if the failure is related to this commit (5a1618a2c8c108b8c73aa9459b63f0dbd66b60f6) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/1380/builds/238

Failed tests:

Failed subtests:

Summary of the results of the build (if available):

==

Click to see traceback logs ```python-traceback Traceback (most recent call last): File "/Users/buildbot/Library/Developer/XCTestDevices/2E72F7FC-E4ED-40AC-B022-F9C50E4B0A1E/data/Containers/Bundle/Application/4CB60694-FBAA-49BE-969E-073C108C659C/iOSTestbed.app/python/lib/python3.13/test/test_free_threading/test_type.py", line 37, in test_attr_cache with Pool(NTHREADS) as pool: ~~~~^^^^^^^^^^ File "/Users/buildbot/Library/Developer/XCTestDevices/2E72F7FC-E4ED-40AC-B022-F9C50E4B0A1E/data/Containers/Bundle/Application/4CB60694-FBAA-49BE-969E-073C108C659C/iOSTestbed.app/python/lib/python3.13/multiprocessing/dummy/__init__.py", line 124, in Pool return ThreadPool(processes, initializer, initargs) File "/Users/buildbot/Library/Developer/XCTestDevices/2E72F7FC-E4ED-40AC-B022-F9C50E4B0A1E/data/Containers/Bundle/Application/4CB60694-FBAA-49BE-969E-073C108C659C/iOSTestbed.app/python/lib/python3.13/multiprocessing/pool.py", line 930, in __init__ Pool.__init__(self, processes, initializer, initargs) ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/buildbot/Library/Developer/XCTestDevices/2E72F7FC-E4ED-40AC-B022-F9C50E4B0A1E/data/Containers/Bundle/Application/4CB60694-FBAA-49BE-969E-073C108C659C/iOSTestbed.app/python/lib/python3.13/multiprocessing/pool.py", line 196, in __init__ self._change_notifier = self._ctx.SimpleQueue() ~~~~~~~~~~~~~~~~~~~~~^^ File "/Users/buildbot/Library/Developer/XCTestDevices/2E72F7FC-E4ED-40AC-B022-F9C50E4B0A1E/data/Containers/Bundle/Application/4CB60694-FBAA-49BE-969E-073C108C659C/iOSTestbed.app/python/lib/python3.13/multiprocessing/context.py", line 113, in SimpleQueue return SimpleQueue(ctx=self.get_context()) File "/Users/buildbot/Library/Developer/XCTestDevices/2E72F7FC-E4ED-40AC-B022-F9C50E4B0A1E/data/Containers/Bundle/Application/4CB60694-FBAA-49BE-969E-073C108C659C/iOSTestbed.app/python/lib/python3.13/multiprocessing/queues.py", line 361, in __init__ self._rlock = ctx.Lock() ~~~~~~~~^^ File "/Users/buildbot/Library/Developer/XCTestDevices/2E72F7FC-E4ED-40AC-B022-F9C50E4B0A1E/data/Containers/Bundle/Application/4CB60694-FBAA-49BE-969E-073C108C659C/iOSTestbed.app/python/lib/python3.13/multiprocessing/context.py", line 67, in Lock from .synchronize import Lock File "/Users/buildbot/Library/Developer/XCTestDevices/2E72F7FC-E4ED-40AC-B022-F9C50E4B0A1E/data/Containers/Bundle/Application/4CB60694-FBAA-49BE-969E-073C108C659C/iOSTestbed.app/python/lib/python3.13/multiprocessing/synchronize.py", line 17, in import _multiprocessing ModuleNotFoundError: No module named '_multiprocessing' ```