python / cpython

The Python programming language
https://www.python.org

Optimise the way tracemalloc and PyRefTracer hooks work #125790

Open pablogsal opened 3 weeks ago

pablogsal commented 3 weeks ago

In https://github.com/python/cpython/issues/125703, @markshannon raised concerns about the performance implications of where these hooks are placed. In a call, we discussed his ideas for making them more performant by moving them elsewhere or adapting them.

I am opening this issue to track and sync about these improvements for 3.14 and beyond.

markshannon commented 2 weeks ago

I think we can fix the performance issues by raising the level at which allocation/free goes through a function pointer.

Instead of a malloc-like interface, void *malloc(size_t size), we should be returning partially initialized objects. PyObject *obj_malloc(PyTypeObject *tp, size_t size, size_t presize) would allocate a chunk of size + presize bytes and return a PyObject * pointing presize bytes into that chunk, with the ob_type field set to tp and the reference count (ob_refcnt) set to one.

This is low-enough level to be fully general, but with enough context to support tracemalloc.
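For illustration, here is a minimal sketch of what such an interface could look like. It does not use CPython's real headers: sketch_type and sketch_object are simplified, hypothetical stand-ins for PyTypeObject and PyObject, and sketch_obj_malloc / sketch_obj_free are made-up names that only mirror the proposed signature. It also assumes presize is a multiple of the platform's maximum alignment so that the returned header stays aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for PyTypeObject and PyObject. */
typedef struct sketch_type {
    const char *tp_name;
} sketch_type;

typedef struct sketch_object {
    intptr_t ob_refcnt;
    sketch_type *ob_type;
} sketch_object;

/* Allocate presize + size bytes and return a pointer to the object
 * header, which starts presize bytes into the chunk. The header comes
 * back partially initialized: type set, reference count set to one.
 * Assumption: presize is a multiple of the maximum alignment. */
static sketch_object *
sketch_obj_malloc(sketch_type *tp, size_t size, size_t presize)
{
    assert(size >= sizeof(sketch_object));
    char *chunk = malloc(presize + size);
    if (chunk == NULL) {
        return NULL;
    }
    sketch_object *op = (sketch_object *)(chunk + presize);
    op->ob_refcnt = 1;
    op->ob_type = tp;
    return op;
}

/* Release an object; the caller passes the same presize that was used
 * at allocation so the start of the chunk can be recovered. */
static void
sketch_obj_free(sketch_object *op, size_t presize)
{
    if (op != NULL) {
        free((char *)op - presize);
    }
}
```

Because the allocator receives the type and the total size, a tracing implementation placed behind the same signature would see enough context per allocation without a separate hook at each call site.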

I think we would need the following implementations, switchable at runtime:

We don't need (or want) to switch between the free-threading and default allocators, but it keeps the rest of the code simpler if they have the same interface.
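As a rough sketch of "switchable at runtime" with a shared interface, the implementations could sit behind a small table of function pointers. Everything below is hypothetical: demo_type, demo_object, demo_allocator, and the plain/traced implementations are illustrative names, and the counter is a stand-in for real tracing, not how tracemalloc records allocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for PyTypeObject and PyObject. */
typedef struct demo_type { const char *name; } demo_type;
typedef struct demo_object { ptrdiff_t refcnt; demo_type *type; } demo_object;

/* One interface shared by every implementation. */
typedef struct demo_allocator {
    demo_object *(*alloc)(demo_type *tp, size_t size, size_t presize);
    void (*dealloc)(demo_object *op, size_t presize);
} demo_allocator;

/* Plain implementation: allocate and partially initialize the header. */
static demo_object *
plain_alloc(demo_type *tp, size_t size, size_t presize)
{
    char *chunk = malloc(presize + size);
    if (chunk == NULL) {
        return NULL;
    }
    demo_object *op = (demo_object *)(chunk + presize);
    op->refcnt = 1;
    op->type = tp;
    return op;
}

static void
plain_dealloc(demo_object *op, size_t presize)
{
    if (op != NULL) {
        free((char *)op - presize);
    }
}

/* Tracing implementation: same interface, plus a live-object counter. */
static size_t traced_live = 0;

static demo_object *
traced_alloc(demo_type *tp, size_t size, size_t presize)
{
    demo_object *op = plain_alloc(tp, size, presize);
    if (op != NULL) {
        traced_live++;
    }
    return op;
}

static void
traced_dealloc(demo_object *op, size_t presize)
{
    if (op != NULL) {
        traced_live--;
    }
    plain_dealloc(op, presize);
}

static const demo_allocator plain_allocator = { plain_alloc, plain_dealloc };
static const demo_allocator traced_allocator = { traced_alloc, traced_dealloc };

/* The single indirection point; callers always go through this pointer. */
static const demo_allocator *active_allocator = &plain_allocator;

static void
demo_set_tracing(int enabled)
{
    active_allocator = enabled ? &traced_allocator : &plain_allocator;
}
```

In this sketch, an object allocated while tracing is off and freed while tracing is on would make the counter underflow, so a real design would need a policy for objects that outlive a switch.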