Open efimov-mikhail opened 1 month ago
Backtrace on MacOS:
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x0000000000000000
frame #1: 0x00000001001d0c5c python.exe`_PyEval_EvalFrameDefault(tstate=<unavailable>, frame=0x0000000100bcf898, throwflag=0) at generated_cases.c.h:3628:36 [opt]
frame #2: 0x00000001000a48e0 python.exe`gen_send_ex2 [inlined] _PyEval_EvalFrame(tstate=0x00000001005a8738, frame=<unavailable>, throwflag=0) at pycore_ceval.h:119:16 [opt]
frame #3: 0x00000001000a48d4 python.exe`gen_send_ex2(gen=0x0000000100bcf850, arg=0x0000000000000000, presult=0x000000016fdfe378, exc=0, closing=<unavailable>) at genobject.c:245:24 [opt]
frame #4: 0x00000001000a2f28 python.exe`gen_iternext(self=<unavailable>) at genobject.c:612:9 [opt]
frame #5: 0x00000001000ba0a4 python.exe`list_extend_iter_lock_held(self=0x00000001010ca490, iterable=0x00000001010ca490) at listobject.c:1194:26 [opt]
frame #6: 0x00000001000b6fb8 python.exe`_list_extend(self=<unavailable>, iterable=<unavailable>) at listobject.c:1367:15 [opt] [artificial]
frame #7: 0x00000001000bd578 python.exe`list___init___impl(self=0x00000001010ca490, iterable=0x0000000100bcf850) at listobject.c:3391:13 [opt]
frame #8: 0x00000001000b8ef0 python.exe`list_vectorcall(type=<unavailable>, args=0x00000001009dc690, nargsf=<unavailable>, kwnames=<unavailable>) at listobject.c:3415:13 [opt]
frame #9: 0x00000001000855f8 python.exe`_PyObject_VectorcallTstate(tstate=0x00000001005a8738, callable=0x0000000100530f30, args=0x00000001009dc690, nargsf=9223372036854775809, kwnames=0x0000000000000000) at pycore_call.h:167:11 [opt]
The crash is in the FOR_ITER instruction; it blindly accesses tp_iternext
without verifying that the object being iterated over has this attribute.
The repro case is a bit dubious as it's mutating f_locals, but probably we should indeed support this. The fix would be as simple as inserting a NULL check.
Thank you for such a concrete comment! It seems like I made fix by myself after it.
I think there are two reasonable approaches to fixing this:
Given that setting an internal variable like .0
is obviously a suspect thing to do, I'd be fine with (2). However, it might break code that stores additional, possibly non-identifier, keys in the locals()
dict.
I can segfault 3.12 and earlier with a variant of this.
g = (x for x in range(10))
gen_expr_func = types.FunctionType(g.gi_code, {})
list(gen_expr_func(range(20)))
print("No segfault")
The root cause is the same: the generator expression does not check that the value passed to it is an iterator.
So it looks like (1) is the correct fix. See https://github.com/python/cpython/pull/125178#issuecomment-2402332695 for a way to do this.
IMHO, additional GET_ITER before FOR_ITER for genexpr is a best way to prevent all such crashes. I will try to provide code changes by myself.
The initial fix is merged into main, though we'll want to eventually implement a different fix.
I thought it was worth investigating whether a similar crash was possible with async genexps. However, it seems that the GET_ANEXT impression is implemented more robustly and I couldn't find a crash by manipulating the locals of an asyncgen frame.
However, it seems that the GET_ANEXT impression is implemented more robustly and I couldn't find a crash by manipulating the locals of an asyncgen frame.
Yes, it seems so.
Actually, function _PyEval_GetANext
from Python/ceval.c
contains NULL
check for type->tp_as_async->am_anext
method. TypeError will be raised if no such method presents.
This code looks very similar to one of the previous variants of fixing crash in my PR.
Maybe, we can remove NULL
check from the GET_ANEXT
bytecode implementation.
And place GET_AITER
instruction before GET_ANEXT
, in similarity with "sync" generators.
This change could possible improve perfomance a little bit.
But I'm not sure that it's worth it.
On the other side, TypeErrors in GET_AITER
and GET_ANEXT
slightly differs in case of am_anext == NULL
.
Maybe we should use the same text in those functions.
cc @JelleZijlstra, @markshannon
See #125723 for another generator f_locals
related crash that may be confounding the investigation attempts here (I don't think it's a duplicate, but it's a segfault where we have no idea of the root cause, so that assessment is purely based on the reproducer having a very different structure from the ones described here)
Crash report
What happened?
There is a segmentation fault with simple code snippet:
There is a SIGSEGV on my linux machine (Debian GNU/Linux 10) with both main branch Python and 3.13 version. Message "No segfault" is printed on Python 3.7.3 (default, Oct 31 2022, 14:04:00).
CPython versions tested on:
3.13, CPython main branch
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
Python 3.14.0a0 (heads/peg_parser_remove_redundant_functions:e1b62e5cf79, Oct 3 2024, 14:38:46) [GCC 8.3.0]
Linked PRs