python / cpython

The Python programming language
https://www.python.org
Other
63.58k stars 30.46k forks source link

Abort in free-threaded build due to mutation of `ChainMap` of a `Counter` in threads #126366

Open devdanzin opened 2 weeks ago

devdanzin commented 2 weeks ago

What happened?

The code below, in a no-gil debug build with PYTHON_GIL=0, results in the following abort:

python: ./Include/internal/pycore_stackref.h:99: _PyStackRef_FromPyObjectSteal: Assertion `obj != NULL' failed.
Aborted
from threading import Thread
from collections import ChainMap, Counter

counter = Counter(range(100))
chainmap = ChainMap(counter)
chainmap2 = ChainMap(counter)

def mutate_removing():
    for x in range(10):
        chainmap.pop(list(chainmap.keys()).pop())
        chainmap2.pop(list(chainmap2.keys()).pop())

def mutate_adding():
    for x in range(50):
        chainmap[max(chainmap) + x + 1] = x
        chainmap2[max(chainmap2) + x + 1] = x

alive = []

for x in range(55):
    alive.append(Thread(target=mutate_removing, args=()))
    alive.append(Thread(target=mutate_adding, args=()))

for obj in alive:
    try:
        print('START', obj)
        obj.start()
    except Exception:
        pass

Found using fusil by @vstinner.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.14.0a1+ experimental free-threading build (heads/main:556dc9b8a7, Nov 3 2024, 10:09:47) [GCC 11.4.0]

Linked PRs

ZeroIntensity commented 2 weeks ago

It looks like this is some sort of 3.14 regression. On 3.13, it doesn't crash; instead, it just spams RuntimeErrors (which might also be a bug, but let's focus on this first).

This is probably a data race, but that scares me because nearly everything here is pure Python. Valgrind is pointing at native generators being the culprit, I'm looking into it.

ZeroIntensity commented 2 weeks ago

Yeah, there's a thread safety issue with native generators (or possibly with list). Here's a smaller reproducer:

from threading import Thread

def my_generator():
    for i in range(100000):
        yield i

def main(gen):
    list(gen)

gener = my_generator()

ts = [Thread(target=main, args=(gener,)) for _ in range(1000)]
for t in ts:
    t.start()

for t in ts:
    t.join()
ZeroIntensity commented 2 weeks ago

I had to split the fix up into two PRs because of backporting: gh-126369 and gh-126371

vstinner commented 2 weeks ago

See also https://github.com/python/cpython/issues/120321 issue and https://github.com/python/cpython/pull/120327 fix.

ZeroIntensity commented 2 weeks ago

Yeah, Sam mentioned that in my (closed) PR. The main concern with merging gh-120327 is that it will have a heavy performance hit on generators. IIRC, the relation with generators in this issue is that collections.abc.Iterable (which ChainMap inherits from) uses a generator in its __iter__.