PyO3 / pyo3

Rust bindings for the Python interpreter
https://pyo3.rs
Apache License 2.0

`python: Objects/unicodeobject.c:15052: intern_static: Assertion '_Py_IsImmortal(s)' failed.` in various projects with pyo3-0.21.2 and Python 3.13.0b3 #4309

Closed mgorny closed 3 months ago

mgorny commented 3 months ago

Bug Description

After upgrading to Python 3.13.0b3, various projects that previously worked with Python 3.13.0b2 suddenly started crashing with an assertion error. For example, pyproject-fmt-rust:

$ python -c 'import pyproject_fmt_rust'
python: Objects/unicodeobject.c:15052: intern_static: Assertion `_Py_IsImmortal(s)' failed.
Aborted (core dumped)

I've been able to bisect it to python/cpython@9769b7ae064a0546a98cbcbec2561dbaba20cd23. Unfortunately, I've only been able to test it against pyo3-0.21.2, as porting that package to 0.22.0 is beyond my current skills. However, a quick search through the repository doesn't reveal anything obviously relevant. If I missed something, I'm really sorry about that.

Steps to Reproduce

  1. Build Python 3.13.0b3 --with-assertions.
  2. git clone https://github.com/tox-dev/pyproject-fmt-rust/
  3. cd pyproject-fmt-rust
  4. Create and enter a venv (e.g. ~/git/cpython/python -m venv .venv, . .venv/bin/activate).
  5. pip install -e .
  6. python -c "import pyproject_fmt_rust"

Backtrace

(gdb) bt
#0  0x00007fc5112935ac in ?? () from /usr/lib64/libc.so.6
#1  0x00007fc51123c816 in raise () from /usr/lib64/libc.so.6
#2  0x00007fc5112248fa in abort () from /usr/lib64/libc.so.6
#3  0x00007fc51122481e in ?? () from /usr/lib64/libc.so.6
#4  0x00007fc511234cd6 in __assert_fail () from /usr/lib64/libc.so.6
#5  0x000055748fc9903e in intern_static (interp=<optimized out>, s=0x557490060910 <_PyRuntime+36400>) at Objects/unicodeobject.c:15052
#6  0x000055748fc99181 in intern_common (interp=0x55749006d340 <_PyRuntime+88160>, s=0x557490060910 <_PyRuntime+36400>, 
    immortalize=immortalize@entry=false) at Objects/unicodeobject.c:15146
#7  0x000055748fccafb1 in _PyUnicode_InternMortal (interp=<optimized out>, p=p@entry=0x7ffeea2375d8) at Objects/unicodeobject.c:15267
#8  0x000055748fc55531 in PyObject_SetAttr (v=0x7fc51057cfe0, name=<optimized out>, value=0x7fc510577000) at Objects/object.c:1330
#9  0x00007fc5102efc51 in <pyo3::instance::Bound<pyo3::types::any::PyAny> as pyo3::types::any::PyAnyMethods>::setattr::inner ()
   from /tmp/pyproject-fmt-rust/src/pyproject_fmt_rust/_lib.abi3.so
#10 0x00007fc5102e64d8 in <pyo3::instance::Bound<pyo3::types::module::PyModule> as pyo3::types::module::PyModuleMethods>::index ()
   from /tmp/pyproject-fmt-rust/src/pyproject_fmt_rust/_lib.abi3.so
#11 0x00007fc5102ef614 in <pyo3::instance::Bound<pyo3::types::module::PyModule> as pyo3::types::module::PyModuleMethods>::add::inner
    () from /tmp/pyproject-fmt-rust/src/pyproject_fmt_rust/_lib.abi3.so
#12 0x00007fc5102e6734 in <pyo3::instance::Bound<pyo3::types::module::PyModule> as pyo3::types::module::PyModuleMethods>::add_function
    () from /tmp/pyproject-fmt-rust/src/pyproject_fmt_rust/_lib.abi3.so
#13 0x00007fc5101a9057 in _lib::<impl _lib::_lib::MakeDef>::make_def::__pyo3_pymodule ()
   from /tmp/pyproject-fmt-rust/src/pyproject_fmt_rust/_lib.abi3.so
#14 0x00007fc51017cd0b in pyo3::sync::GILOnceCell<T>::init () from /tmp/pyproject-fmt-rust/src/pyproject_fmt_rust/_lib.abi3.so
#15 0x00007fc5102ea771 in pyo3::impl_::pymodule::ModuleDef::make_module ()
   from /tmp/pyproject-fmt-rust/src/pyproject_fmt_rust/_lib.abi3.so
#16 0x00007fc5101a8ef9 in PyInit__lib () from /tmp/pyproject-fmt-rust/src/pyproject_fmt_rust/_lib.abi3.so
#17 0x000055748fd63629 in _PyImport_RunModInitFunc (p0=p0@entry=0x7fc5101a8e10 <PyInit__lib>, info=info@entry=0x7ffeea237b40, 
    p_res=p_res@entry=0x7ffeea237ab0) at ./Python/importdl.c:423
#18 0x000055748fd60306 in import_run_extension (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, 
    p0=p0@entry=0x7fc5101a8e10 <PyInit__lib>, info=info@entry=0x7ffeea237b40, spec=spec@entry=0x7fc51055cf50, modules=<optimized out>)
    at Python/import.c:1951
#19 0x000055748fd60fb5 in _imp_create_dynamic_impl (module=module@entry=0x7fc5111ac9a0, spec=0x7fc51055cf50, file=<optimized out>)
    at Python/import.c:4683
#20 0x000055748fd6102d in _imp_create_dynamic (module=0x7fc5111ac9a0, args=args@entry=0x7fc510580658, nargs=nargs@entry=1)
    at Python/clinic/import.c.h:485
#21 0x000055748fc501a9 in cfunction_vectorcall_FASTCALL (func=0x7fc5111aed90, args=0x7fc510580658, nargsf=<optimized out>, 
    kwnames=<optimized out>) at Objects/methodobject.c:425
#22 0x000055748fc05489 in _PyVectorcall_Call (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, 
    func=0x55748fc5014b <cfunction_vectorcall_FASTCALL>, callable=callable@entry=0x7fc5111aed90, tuple=tuple@entry=0x7fc510580640, 
    kwargs=kwargs@entry=0x7fc510577100) at Objects/call.c:273
#23 0x000055748fc05788 in _PyObject_Call (tstate=0x55749009cb80 <_PyRuntime+282784>, callable=callable@entry=0x7fc5111aed90, 
    args=args@entry=0x7fc510580640, kwargs=kwargs@entry=0x7fc510577100) at Objects/call.c:348
#24 0x000055748fc057c4 in PyObject_Call (callable=callable@entry=0x7fc5111aed90, args=args@entry=0x7fc510580640, 
    kwargs=kwargs@entry=0x7fc510577100) at Objects/call.c:373
#25 0x000055748fd175cf in _PyEval_EvalFrameDefault (tstate=0x55749009cb80 <_PyRuntime+282784>, frame=0x7fc5114e76c0, throwflag=0)
    at Python/generated_cases.c.h:1353
#26 0x000055748fd23e26 in _PyEval_EvalFrame (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, frame=<optimized out>, 
    throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:119
#27 0x000055748fd23f49 in _PyEval_Vector (tstate=0x55749009cb80 <_PyRuntime+282784>, func=0x7fc5111d0400, locals=locals@entry=0x0, 
    args=0x7ffeea237ee0, argcount=2, kwnames=0x0) at Python/ceval.c:1819
#28 0x000055748fc0393c in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, 
    kwnames=<optimized out>) at Objects/call.c:413
#29 0x000055748fc03c6d in _PyObject_VectorcallTstate (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, 
    callable=callable@entry=0x7fc5111d0400, args=args@entry=0x7ffeea237ee0, nargsf=nargsf@entry=2, kwnames=kwnames@entry=0x0)
    at ./Include/internal/pycore_call.h:168
#30 0x000055748fc04a4d in object_vacall (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, base=base@entry=0x0, 
    callable=0x7fc5111d0400, vargs=vargs@entry=0x7ffeea237f60) at Objects/call.c:819
#31 0x000055748fc04b81 in PyObject_CallMethodObjArgs (obj=0x0, name=<optimized out>) at Objects/call.c:880
#32 0x000055748fd5e65e in import_find_and_load (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, 
    abs_name=abs_name@entry=0x7fc510576bf0) at Python/import.c:3651
#33 0x000055748fd620e3 in PyImport_ImportModuleLevelObject (name=name@entry=0x7fc5105803f0, globals=<optimized out>, 
    locals=locals@entry=0x7fc5104f4600, fromlist=fromlist@entry=0x7fc510576d40, level=level@entry=1) at Python/import.c:3733
#34 0x000055748fd0fac7 in import_name (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, frame=frame@entry=0x7fc5114e7378, 
    name=0x7fc5105803f0, fromlist=fromlist@entry=0x7fc510576d40, level=level@entry=0x55749005b0a8 <_PyRuntime+13768>)
    at Python/ceval.c:2675
#35 0x000055748fd1bdc7 in _PyEval_EvalFrameDefault (tstate=0x55749009cb80 <_PyRuntime+282784>, frame=0x7fc5114e7378, throwflag=0)
    at Python/generated_cases.c.h:3199
#36 0x000055748fd23e26 in _PyEval_EvalFrame (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, frame=<optimized out>, 
    throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:119
#37 0x000055748fd23f49 in _PyEval_Vector (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, func=func@entry=0x7fc51055ccc0, 
    locals=locals@entry=0x7fc5104f4600, args=args@entry=0x0, argcount=argcount@entry=0, kwnames=kwnames@entry=0x0)
    at Python/ceval.c:1819
#38 0x000055748fd24012 in PyEval_EvalCode (co=co@entry=0x7fc510506930, globals=globals@entry=0x7fc5104f4600, 
    locals=locals@entry=0x7fc5104f4600) at Python/ceval.c:599
#39 0x000055748fd0c693 in builtin_exec_impl (module=module@entry=0x7fc511196e80, source=0x7fc510506930, globals=0x7fc5104f4600, 
    locals=0x7fc5104f4600, closure=0x0) at Python/bltinmodule.c:1145
#40 0x000055748fd0c7e3 in builtin_exec (module=0x7fc511196e80, args=<optimized out>, args@entry=0x7fc51054e658, nargs=nargs@entry=2, 
    kwnames=kwnames@entry=0x0) at Python/clinic/bltinmodule.c.h:556
#41 0x000055748fc5003c in cfunction_vectorcall_FASTCALL_KEYWORDS (func=0x7fc511197380, args=0x7fc51054e658, nargsf=<optimized out>, 
    kwnames=0x0) at Objects/methodobject.c:441
#42 0x000055748fc05489 in _PyVectorcall_Call (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, 
    func=0x55748fc4ffe5 <cfunction_vectorcall_FASTCALL_KEYWORDS>, callable=callable@entry=0x7fc511197380, 
    tuple=tuple@entry=0x7fc51054e640, kwargs=kwargs@entry=0x7fc51054c900) at Objects/call.c:273
#43 0x000055748fc05788 in _PyObject_Call (tstate=0x55749009cb80 <_PyRuntime+282784>, callable=callable@entry=0x7fc511197380, 
    args=args@entry=0x7fc51054e640, kwargs=kwargs@entry=0x7fc51054c900) at Objects/call.c:348
#44 0x000055748fc057c4 in PyObject_Call (callable=callable@entry=0x7fc511197380, args=args@entry=0x7fc51054e640, 
    kwargs=kwargs@entry=0x7fc51054c900) at Objects/call.c:373
#45 0x000055748fd175cf in _PyEval_EvalFrameDefault (tstate=0x55749009cb80 <_PyRuntime+282784>, frame=0x7fc5114e72f0, throwflag=0)
    at Python/generated_cases.c.h:1353
#46 0x000055748fd23e26 in _PyEval_EvalFrame (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, frame=<optimized out>, 
    throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:119
#47 0x000055748fd23f49 in _PyEval_Vector (tstate=0x55749009cb80 <_PyRuntime+282784>, func=0x7fc5111d0400, locals=locals@entry=0x0, 
    args=0x7ffeea2387d0, argcount=2, kwnames=0x0) at Python/ceval.c:1819
#48 0x000055748fc0393c in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, 
    kwnames=<optimized out>) at Objects/call.c:413
#49 0x000055748fc03c6d in _PyObject_VectorcallTstate (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, 
    callable=callable@entry=0x7fc5111d0400, args=args@entry=0x7ffeea2387d0, nargsf=nargsf@entry=2, kwnames=kwnames@entry=0x0)
    at ./Include/internal/pycore_call.h:168
#50 0x000055748fc04a4d in object_vacall (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, base=base@entry=0x0, 
    callable=0x7fc5111d0400, vargs=vargs@entry=0x7ffeea238850) at Objects/call.c:819
#51 0x000055748fc04b81 in PyObject_CallMethodObjArgs (obj=0x0, name=<optimized out>) at Objects/call.c:880
#52 0x000055748fd5e65e in import_find_and_load (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, 
    abs_name=abs_name@entry=0x7fc51054d270) at Python/import.c:3651
#53 0x000055748fd620e3 in PyImport_ImportModuleLevelObject (name=name@entry=0x7fc51054d270, globals=<optimized out>, 
    locals=locals@entry=0x7fc5104f4580, fromlist=fromlist@entry=0x5574900330e0 <_Py_NoneStruct>, level=level@entry=0)
    at Python/import.c:3733
#54 0x000055748fd0fac7 in import_name (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, frame=frame@entry=0x7fc5114e7020, 
    name=0x7fc51054d270, fromlist=fromlist@entry=0x5574900330e0 <_Py_NoneStruct>, level=level@entry=0x55749005b088 <_PyRuntime+13736>)
    at Python/ceval.c:2675
#55 0x000055748fd1bdc7 in _PyEval_EvalFrameDefault (tstate=0x55749009cb80 <_PyRuntime+282784>, frame=0x7fc5114e7020, throwflag=0)
    at Python/generated_cases.c.h:3199
#56 0x000055748fd23e26 in _PyEval_EvalFrame (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, frame=<optimized out>, 
    throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:119
#57 0x000055748fd23f49 in _PyEval_Vector (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, func=func@entry=0x7fc5111d14e0, 
    locals=locals@entry=0x7fc5104f4580, args=args@entry=0x0, argcount=argcount@entry=0, kwnames=kwnames@entry=0x0)
    at Python/ceval.c:1819
#58 0x000055748fd24012 in PyEval_EvalCode (co=co@entry=0x7fc510549b50, globals=globals@entry=0x7fc5104f4580, 
    locals=locals@entry=0x7fc5104f4580) at Python/ceval.c:599
#59 0x000055748fd85e74 in run_eval_code_obj (tstate=tstate@entry=0x55749009cb80 <_PyRuntime+282784>, co=co@entry=0x7fc510549b50, 
    globals=globals@entry=0x7fc5104f4580, locals=locals@entry=0x7fc5104f4580) at Python/pythonrun.c:1292
#60 0x000055748fd8601e in run_mod (mod=mod@entry=0x5574cae39fb8, filename=filename@entry=0x7fc5104f46f0, 
    globals=globals@entry=0x7fc5104f4580, locals=locals@entry=0x7fc5104f4580, flags=flags@entry=0x7ffeea238e00, 
    arena=arena@entry=0x7fc51111bcf0, interactive_src=0x7fc5104f83f0, generate_new_source=0) at Python/pythonrun.c:1377
#61 0x000055748fd8678e in _PyRun_StringFlagsWithName (str=str@entry=0x7fc5104f4690 "import pyproject_fmt_rust\n", 
    name=name@entry=0x7fc5104f46f0, start=start@entry=257, globals=globals@entry=0x7fc5104f4580, locals=locals@entry=0x7fc5104f4580, 
    flags=flags@entry=0x7ffeea238e00, generate_new_source=0) at Python/pythonrun.c:1176
#62 0x000055748fd87ffc in _PyRun_SimpleStringFlagsWithName (command=0x7fc5104f4690 "import pyproject_fmt_rust\n", 
    name=name@entry=0x55748fe14bd3 "<string>", flags=flags@entry=0x7ffeea238e00) at Python/pythonrun.c:516
#63 0x000055748fdab3f5 in pymain_run_command (command=<optimized out>) at Modules/main.c:252
#64 0x000055748fdabfb6 in pymain_run_python (exitcode=exitcode@entry=0x7ffeea238ea4) at Modules/main.c:631
#65 0x000055748fdac346 in Py_RunMain () at Modules/main.c:719
#66 0x000055748fdac3c0 in pymain_main (args=args@entry=0x7ffeea238f00) at Modules/main.c:749
#67 0x000055748fdac497 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:773
#68 0x000055748fba68e6 in main (argc=<optimized out>, argv=<optimized out>) at ./Programs/python.c:15

Your operating system and version

Gentoo Linux amd64

Your Python version (python --version)

Python 3.13.0b3

Your Rust version (rustc --version)

rustc 1.79.0 (129f3b996 2024-06-10) (gentoo)

Your PyO3 version

0.21.2

How did you install python? Did you use a virtualenv?

Reproduced both with Python built from source with --with-assertions (as part of git bisect) and with 3.13.0b3 from the Gentoo ebuild. A virtualenv was used for testing, as noted above.

Additional Info

No response

mgorny commented 3 months ago

CC @encukou, @ericsnowcurrently in case they have any insight into what's wrong.

encukou commented 3 months ago

Thank you for testing with assertions! It looks like Rust code is increasing the refcount of a static immortal string object (the Python string "__all__"). Not being familiar with Rust, I haven't found the source; in GDB you can find it like this:

(gdb) watch ((PyObject*)(&_PyRuntime.static_objects.singletons.strings.identifiers._py___all__)).ob_refcnt
Hardware watchpoint 1: [...]

(gdb) run
Starting program: /venv/bin/python -c import\ pyproject_fmt_rust
[...]
Hardware watchpoint 1: ((PyObject*)(&_PyRuntime.static_objects.singletons.strings.identifiers._py___all__)).ob_refcnt

Old value = 4294967295
New value = 4294967296
0x00007f9072ab1cc5 in <pyo3::instance::Bound<pyo3::types::module::PyModule> as pyo3::types::module::PyModuleMethods>::index ()
   from /pyproject-fmt-rust/src/pyproject_fmt_rust/_lib.abi3.so

An extension built against the stable ABI is allowed to do that (when it targets a stable ABI version older than 3.12; this one targets 3.8, AFAIK). The assertion in CPython is wrong. I'll send a CPython PR soon.
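To see why the assertion trips, it may help to model the two watchpoint values above. The sketch below is plain Rust, not CPython or PyO3 code. It assumes the 64-bit, GIL-enabled layout of CPython 3.12/3.13 as I understand it, where the immortal refcount is `UINT_MAX` and `_Py_IsImmortal` tests whether the low 32 bits, read as a signed integer, are negative. The names `IMMORTAL_REFCNT` and `is_immortal` are illustrative only.

```rust
// Simplified model of the refcount values seen in the GDB watchpoint.
// Assumption: a 64-bit, GIL-enabled build where the immortal refcount is
// u32::MAX and immortality is tested on the low 32 bits read as signed.

/// Hypothetical stand-in for CPython's immortal refcount (4294967295).
const IMMORTAL_REFCNT: i64 = u32::MAX as i64;

/// Hypothetical stand-in for `_Py_IsImmortal`: truncate to 32 bits and
/// check the sign.
fn is_immortal(refcnt: i64) -> bool {
    (refcnt as i32) < 0
}

fn main() {
    let old_value = IMMORTAL_REFCNT;
    // An unconditional increment, as an extension targeting a pre-3.12
    // stable ABI may perform.
    let new_value = old_value + 1;

    println!("old = {old_value}, immortal = {}", is_immortal(old_value));
    println!("new = {new_value}, immortal = {}", is_immortal(new_value));
}
```

Under these assumptions, the old value counts as immortal and the incremented one does not, which is consistent with the failed `_Py_IsImmortal(s)` assertion in `intern_static`.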

davidhewitt commented 3 months ago

Thanks for investigating @encukou!

I think your analysis is correct: we are calling Py_INCREF on the __all__ singleton (via our generic code for __getattr__, it looks like). That we do it at all is a performance issue we have yet to resolve.

PyO3 0.21.2 doesn't support the version-specific Python 3.13 API, so this must have been built using ABI3 "forward compatibility".

encukou commented 3 months ago

Immortal objects are an implementation detail, so ideally you should call Py_IncRef on "__all__". But you should call it as a library function, rather than increasing ob_refcnt directly.

Except if you're using the stable ABI. Then you can adjust ob_refcnt directly, for backwards compatibility reasons. (Don't do it unless you have to, though...)

(Outside the stable ABI, you could instead reimplement the Py_INCREF macro in Rust, and call that. But in that case include the immortality check, and note that it might change in the next feature release.)
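As a rough illustration of that last option, here is what an immortality-aware increment might look like. This is a self-contained model, not PyO3's code and not the real `PyObject` layout: `MockObject`, `incref_unchecked` and `incref_checked` are hypothetical names, and the check assumes a 64-bit scheme in which "low 32 bits negative" means immortal, which may change between feature releases.

```rust
/// Hypothetical stand-in for a Python object header; only the refcount
/// field is modelled.
struct MockObject {
    ob_refcnt: i64,
}

/// Assumed 64-bit immortality test: low 32 bits, read as signed, are negative.
fn is_immortal(obj: &MockObject) -> bool {
    (obj.ob_refcnt as i32) < 0
}

/// Increment without any immortality check (the pre-3.12 behaviour).
fn incref_unchecked(obj: &mut MockObject) {
    obj.ob_refcnt += 1;
}

/// Increment that leaves immortal objects untouched.
fn incref_checked(obj: &mut MockObject) {
    if is_immortal(obj) {
        return;
    }
    obj.ob_refcnt += 1;
}

fn main() {
    let mut mortal = MockObject { ob_refcnt: 1 };
    let mut immortal = MockObject { ob_refcnt: u32::MAX as i64 };

    incref_checked(&mut mortal);
    incref_checked(&mut immortal);
    println!("mortal refcnt after checked incref: {}", mortal.ob_refcnt);
    println!("immortal refcnt after checked incref: {}", immortal.ob_refcnt);

    // The unchecked variant pushes the value past the immortal marker.
    incref_unchecked(&mut immortal);
    println!("still immortal after unchecked incref: {}", is_immortal(&immortal));
}
```

The checked variant is the one that would have to track any future change to the immortality scheme, which is the maintenance cost mentioned above.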

mgorny commented 3 months ago

Thank you for investigating, and for the pull request. I'm going to test it today.

encukou commented 3 months ago

The CPython PR is up at https://github.com/python/cpython/pull/121358. It fixes the reproducer above, at least on my machine :)

davidhewitt commented 3 months ago

> Immortal objects are an implementation detail, so ideally you should call Py_IncRef on "__all__". But you should call it as a library function, rather than increasing ob_refcnt directly.
>
> Except if you're using the stable ABI. Then you can adjust ob_refcnt directly, for backwards compatibility reasons. (Don't do it unless you have to, though...)
>
> (Outside the stable ABI, you could instead reimplement the Py_INCREF macro in Rust, and call that. But in that case include the immortality check, and note that it might change in the next feature release.)

Yep, that's pretty much exactly the situation. By "optimization" I meant that in this case PyO3 shouldn't need to call Py_INCREF on the attribute name passed to PyObject_GetAttr (as that API doesn't steal a reference). But that is unrelated to the main point here regarding the assertion, which PyO3 will still hit in the general case.
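The difference between an API that borrows its argument and one that steals a reference can be sketched with `Rc` as a loose analogy. This is not PyO3 or CPython code; `get_attr_borrowed` and `get_attr_stealing` are hypothetical helpers, and `Rc::strong_count` stands in for a Python refcount.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Borrowing lookup: analogous to an API that does not steal a reference,
/// so the caller has no reason to bump the count first.
fn get_attr_borrowed(attrs: &HashMap<Rc<str>, i32>, name: &Rc<str>) -> Option<i32> {
    attrs.get(name).copied()
}

/// Consuming lookup: analogous to an API that steals a reference. The
/// handle is dropped when the function returns.
fn get_attr_stealing(attrs: &HashMap<Rc<str>, i32>, name: Rc<str>) -> Option<i32> {
    attrs.get(&name).copied()
}

fn main() {
    let name: Rc<str> = Rc::from("__all__");
    let mut attrs = HashMap::new();
    attrs.insert(Rc::clone(&name), 1);

    let before = Rc::strong_count(&name);

    // Borrowing call: no clone, so the count never changes.
    let borrowed = get_attr_borrowed(&attrs, &name);
    println!("borrowed: {:?}, count = {}", borrowed, Rc::strong_count(&name));

    // Stealing call: the caller clones (the analogue of an incref) to keep
    // its own handle; the callee drops the clone on return.
    let stolen = get_attr_stealing(&attrs, Rc::clone(&name));
    println!("stolen: {:?}, count = {}", stolen, Rc::strong_count(&name));

    assert_eq!(Rc::strong_count(&name), before);
}
```

In the borrowing case the extra clone is pure overhead, which is the sense in which skipping the Py_INCREF would be an optimization rather than a correctness fix.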

davidhewitt commented 3 months ago

I think I will close this one, as we understand the fix to be coming in 3.13.0b4. Thanks @mgorny @encukou!