python / cpython

Spurious "Type does not define the tp_name field" in 3.13.0rc2t #124768

Open dpdani opened 2 weeks ago

dpdani commented 2 weeks ago

Bug report

Bug description:

I'm hitting what I believe are spurious errors when using my C extension with the latest release candidate. Given the nature of the exception, I would not expect the build type to affect it, but it does:

====================== test session starts =======================
platform linux -- Python 3.13.0rc1, pytest-7.4.2, pluggy-1.5.0
rootdir: /home/dp/repos/thesis/cereggii
plugins: reraise-2.1.2
collected 96 items                                               

tests/test_atomic_dict.py .............s....s.....ss...... [ 33%]
.                                                          [ 34%]
tests/test_atomic_int.py ................................. [ 68%]
.......................                                    [ 92%]
tests/test_atomic_ref.py ......                            [ 98%]
tests/test_basics.py .                                     [100%]

================= 92 passed, 4 skipped in 0.60s ==================

The free-threaded build is using the deadsnakes action, and I believe that's CPython's main from yesterday.
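When comparing results across builds, it can help to record which kind of interpreter actually ran the tests. The snippet below is a minimal sketch, not part of the original report: `describe_build` is a hypothetical helper name, and it relies on the `Py_GIL_DISABLED` config variable and on `sys._is_gil_enabled()`, which is only available on 3.13 and later (hence the `getattr` guard).

```python
import sys
import sysconfig


def describe_build():
    """Return a small summary of the running interpreter's build type.

    Hypothetical helper for illustration; not taken from the issue.
    """
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() only exists on 3.13+; older versions always
    # run with the GIL, so default to True when it is missing.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = gil_check() if gil_check is not None else True

    return {
        "version": sys.version.split()[0],
        "free_threaded_build": free_threaded,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    print(describe_build())
```

Running this under each interpreter used in CI would show whether the failing and passing runs really differ only in the free-threaded flag.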

After double-checking, I'm pretty sure I'm setting the tp_name field correctly everywhere; apologies in advance if I made a mistake.

CPython versions tested on:

3.13, CPython main branch

Operating systems tested on:

Linux

ZeroIntensity commented 2 weeks ago

At a quick glance, it looks like this type object is missing tp_name: https://github.com/dpdani/cereggii/blob/e887f1f060ad94ceb8e16b6ab2f62f29c7a1d8ea/src/include/atomic_dict_internal.h#L47

Though, I'm not sure why this only occurs on the free-threaded build.

dpdani commented 1 week ago

After removing it (commit), it's still failing with the same error message. (It wasn't being used.)

Thanks for the catch anyways, that's something I needed to clean up.

ZeroIntensity commented 1 week ago

Could you narrow this down to what class is causing this? I'm hesitant to mark this as a blocker because we don't have a proper reproducer yet.

dpdani commented 1 week ago

I've tried commenting out everything except for:

Both are failing in exactly the same way.

I haven't tried every single type, but I suspect they would behave the same.

ZeroIntensity commented 1 week ago

Sorry, I don't know how to reproduce this. Trying to use CMake builds for locally installed Python versions is a PITA, and I wasn't able to get it working with setuptools. If you could come up with a more standalone reproducer, that would be useful.

Another possibility is memory corruption: whatever is being passed to PyType_Ready could be arbitrary memory, and if that memory happens to hold a NULL at the offset of tp_name, it would trigger this error. I suggest running the tests under valgrind to rule that out.
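To illustrate the point about a NULL at the tp_name offset, here is a toy model using ctypes. It is only a sketch: `ToyType` and `toy_ready` are hypothetical stand-ins, the two-field layout is not the real PyTypeObject layout, and using SystemError with the message from the issue title is an assumption about what the real check does.

```python
import ctypes


class ToyType(ctypes.Structure):
    # Simplified, hypothetical layout: NOT the real PyTypeObject.
    _fields_ = [
        ("refcnt", ctypes.c_ssize_t),
        ("name", ctypes.c_char_p),  # stands in for tp_name
    ]


def toy_ready(t):
    """Mimic a readiness check that rejects a NULL name pointer."""
    # ctypes returns None when a c_char_p field holds a NULL pointer.
    if t.name is None:
        # Message copied from the issue title; exception type is assumed.
        raise SystemError("Type does not define the tp_name field.")
    return t.name.decode()


# A properly initialised object passes the check.
good = ToyType(1, b"demo.Good")
toy_ready(good)

# Zeroed memory reinterpreted as the same struct has a NULL name,
# so it fails the check even though no code "forgot" to set it.
zeroed = ToyType.from_buffer_copy(bytes(ctypes.sizeof(ToyType)))
try:
    toy_ready(zeroed)
except SystemError as exc:
    print("rejected:", exc)
```

The same failure would occur for any memory that happens to contain zero bytes at that offset, which is why a correctly written type definition does not by itself rule this out.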

dpdani commented 1 week ago

Understood, no problem.

Will try to make a minimal repro, or will confirm it's some bug with my code, as soon as I can.