python / cpython


API for setting the memory allocator used by Python #47579

Closed 8c96fb5b-ff12-473c-b9d8-80a9399b2a1c closed 11 years ago

8c96fb5b-ff12-473c-b9d8-80a9399b2a1c commented 16 years ago
BPO 3329
Nosy @warsaw, @rhettinger, @gpshead, @jcea, @amauryfa, @ncoghlan, @pitrou, @kristjanvalur, @vstinner, @jszakmeister, @tpn, @miss-islington
PRs
  • python/cpython#24795
  • python/cpython#24799
  • python/cpython#24800
  • Files
  • pymem.h: locally patched version of pymem.h
  • ccpmem.h
  • Capture.JPG: Profiling email
  • py_setallocators-filename.patch
  • pybench.txt
  • benchmarks.txt
  • py_setallocators-9.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:
    ```python
    assignee = None
    closed_at =
    created_at =
    labels = ['interpreter-core', 'type-feature']
    title = 'API for setting the memory allocator used by Python'
    updated_at =
    user = 'https://bugs.python.org/jlaurila'
    ```

    bugs.python.org fields:
    ```python
    activity =
    actor = 'miss-islington'
    assignee = 'none'
    closed = True
    closed_date =
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation =
    creator = 'jlaurila'
    dependencies = []
    files = ['30451', '30452', '30496', '30537', '30559', '30560', '30753']
    hgrepos = []
    issue_num = 3329
    keywords = ['patch']
    message_count = 48.0
    messages = ['69482', '69483', '69484', '69494', '69497', '69499', '69511', '78995', '79309', '91957', '142981', '183587', '183590', '183591', '183947', '183950', '183951', '190429', '190528', '190529', '190534', '190539', '190741', '190937', '190940', '190951', '190962', '191029', '191030', '191049', '191050', '191074', '191077', '191165', '191184', '191379', '191436', '191508', '192220', '192504', '192506', '192508', '192509', '192570', '192576', '388351', '388352', '388356']
    nosy_count = 18.0
    nosy_names = ['barry', 'rhettinger', 'gregory.p.smith', 'jcea', 'amaury.forgeotdarc', 'ncoghlan', 'Rhamphoryncus', 'pitrou', 'kristjan.jonsson', 'vstinner', 'jszakmeister', 'tlesher', 'jlaurila', 'trent', 'neilo', 'pjmcnerney', 'python-dev', 'miss-islington']
    pr_nums = ['24795', '24799', '24800']
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue3329'
    versions = ['Python 3.4']
    ```

    8c96fb5b-ff12-473c-b9d8-80a9399b2a1c commented 16 years ago

    Currently Python always uses the C library malloc/realloc/free as the underlying mechanism for requesting memory from the OS, but especially on memory-limited platforms it is often desirable to be able to override the allocator and to redirect all Python's allocations to use a special heap. This will make it possible to free memory back to the operating system without restarting the process, and to reduce fragmentation by separating Python's allocations from the rest of the program.

    The proposal is to make it possible to set the allocator used by the Python interpreter by calling the following function before Py_Initialize():

    void Py_SetAllocator(void* (*alloc)(size_t), void* (*realloc)(void*,
    size_t), void (*free)(void*))

    Direct function calls to malloc/realloc/free in obmalloc.c must be replaced with calls through the function pointers set through this function. By default these would of course point to the C stdlib malloc/realloc/free.
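
The indirection proposed above can be sketched in plain C without any Python headers. This is only an illustration under stated assumptions: every `demo_*` and `counting_*` name is hypothetical and is not part of CPython or of any patch attached to this issue.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names: nothing here is CPython API. */
typedef void *(*demo_malloc_fn)(size_t);
typedef void *(*demo_realloc_fn)(void *, size_t);
typedef void (*demo_free_fn)(void *);

/* By default the pointers refer to the C library functions. */
static demo_malloc_fn  cur_malloc  = malloc;
static demo_realloc_fn cur_realloc = realloc;
static demo_free_fn    cur_free    = free;

/* Intended to be called before the first allocation is made. */
void demo_set_allocator(demo_malloc_fn m, demo_realloc_fn r, demo_free_fn f)
{
    cur_malloc = m;
    cur_realloc = r;
    cur_free = f;
}

/* Internal call sites go through the pointers instead of calling libc. */
void *demo_alloc(size_t n)           { return cur_malloc(n); }
void *demo_resize(void *p, size_t n) { return cur_realloc(p, n); }
void  demo_release(void *p)          { cur_free(p); }

/* Example replacement: counts live blocks, then defers to the C library. */
static long live_blocks = 0;

static void *counting_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        live_blocks++;
    return p;
}

static void *counting_realloc(void *p, size_t n)
{
    void *q = realloc(p, n);
    if (p == NULL && q != NULL)
        live_blocks++;          /* realloc(NULL, n) behaves like malloc(n) */
    return q;
}

static void counting_free(void *p)
{
    if (p != NULL)
        live_blocks--;
    free(p);
}

long demo_live_blocks(void) { return live_blocks; }

void demo_install_counting(void)
{
    demo_set_allocator(counting_malloc, counting_realloc, counting_free);
}
```

An embedding application would call `demo_install_counting()` (or pass its own heap functions to `demo_set_allocator()`) once at startup. The cost is one pointer load per allocation call.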

    brettcannon commented 16 years ago

    Is registering pointers to functions really necessary, or would defining macros work as well? From a performance perspective I would like to avoid having a pointer indirection step every time malloc/realloc/free is called.

    I guess my question becomes, Jukka, is this more for alternative implementations of Python where changes to source are already expected, or for apps that embed Python where a change of malloc/realloc/free varies from app to app that dynamically loads Python?

    8d8a8db7-faf5-4c09-a2a3-2697dbaf0735 commented 16 years ago

    How would this allow you to free all memory? The interpreter will still reference it, so you'd have to have called Py_Finalize already, and promise not to call Py_Initialize afterwards. This further supposes the process will live a long time after killing off the interpreter, but in that case I recommend putting Python in a child process instead.

    8c96fb5b-ff12-473c-b9d8-80a9399b2a1c commented 16 years ago

    Brett, the ability to define the allocator dynamically at runtime could be a compile time option, turned on by default only on small memory platforms. On most platforms you can live with plain old malloc and may want to avoid the indirection. If no other platform is interested in this, we can just make it a Symbian-specific extension but I wanted to see if there's general interest in this.

    The application would control the lifecycle of the Python heap, and this seemed like the most natural way for the application to tell the interpreter which heap instance to use.

    Adam, the cleanup would work by freeing the entire heap used by Python after calling Py_Finalize. In the old PyS60 code we made Python 2.2.2 clean itself completely by freeing the Python-specific heap and making sure all pointers to heap-allocated items are properly reinitialized.

    Yes, there are various static pointers that are initially set to NULL, initialized to point at things on the heap and not reset to NULL at Py_Finalize, and these are currently an obstacle to calling Py_Initialize again. I'm considering submitting a separate ticket about that since it seems like the ability to free the heap combined with the ability to reinitialize the static pointers could together make full cleanup possible.

    ncoghlan commented 16 years ago

    Given where we are in the release cycle, I've bumped the target releases to 2.7/3.1. So Symbian are probably going to have to do something port-specific anyway in order to get 2.6/3.0 up and running.

    And in terms of hooking into this kind of thing, some simple macros that can be overridden in pyport.h (as Brett suggested) may be a better idea than baking any specific approach into the core interpreter.

    rhettinger commented 16 years ago

    I think it is reasonable to get a macro definition change into 2.6. The OP's request is essential for his application (running Python on Nokia phones) and it would be a loss to wait two years for this. Also, his request for a macro will enable another important piece of functionality -- allowing a build to intercept and instrument all calls to the memory allocator.

    Barry, can you rule on whether to keep this open for consideration in 2.6. It seems daft to postpone this discussion indefinitely. If we can agree to a simple, non-invasive solution while there is still yet another beta, then it makes sense to proceed.

    8d8a8db7-faf5-4c09-a2a3-2697dbaf0735 commented 16 years ago

    Basically you just want to kick the malloc implementation into doing some housekeeping, freeing its caches? I'm kinda surprised you don't add the hook directly to your libc's malloc.

    IMO, there's no use-case for this until Py_Finalize can completely tear down the interpreter, which requires a lot of special work (killing(!) daemon threads, unloading C modules, etc), and nobody intends to do that at this point.

    The practical alternative, as I said, is to run python in a subprocess. Let the OS clean up after us.

    a8ccac6d-a65f-45b0-a1d5-055b71f01f04 commented 15 years ago

    I'll be in agreement here. I integrated Python into a game engine not too long ago, and had to do a fair chunk of work to isolate Python into its own heap, given that fragmentation on low-memory systems can be a bit of a killer. It would also make future upgrades a heck of a lot easier, as there'd be no need to search for all references and carefully replace them all.

    8c96fb5b-ff12-473c-b9d8-80a9399b2a1c commented 15 years ago

    Brett is right. Macroing the memory allocator is a better choice than forcing indirection on all platforms. We did this on Python for S60, using the macros PyCore_{MALLOC,REALLOC,FREE}_FUNC for the interpreter's allocations, and then redirected those to a mechanism that allows setting the allocator at runtime.
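
A rough sketch of that layering, assuming hypothetical `DEMO_*_FUNC` macros rather than the real `PyCore_*_FUNC` definitions: generic code calls through a macro, a platform that needs nothing special gets a direct libc call, and a port header can point the macro at functions that read a runtime-settable pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* --- What a port-specific header might contain (hypothetical) --- */
static void *(*port_malloc_ptr)(size_t) = malloc;
static void (*port_free_ptr)(void *) = free;

static void *port_malloc(size_t n) { return port_malloc_ptr(n); }
static void port_free(void *p)     { port_free_ptr(p); }

/* Lets the embedding application pick the allocator at runtime. */
void port_set_malloc(void *(*m)(size_t)) { port_malloc_ptr = m; }

#define DEMO_MALLOC_FUNC port_malloc
#define DEMO_FREE_FUNC   port_free

/* --- Generic core code --- */
/* Platforms that do not override the macros call libc directly,
 * so they pay no indirection cost. */
#ifndef DEMO_MALLOC_FUNC
#define DEMO_MALLOC_FUNC malloc
#endif
#ifndef DEMO_FREE_FUNC
#define DEMO_FREE_FUNC free
#endif

void *core_alloc(size_t n) { return DEMO_MALLOC_FUNC(n); }
void core_free(void *p)    { DEMO_FREE_FUNC(p); }

/* A deliberately failing allocator, handy for testing error paths. */
void *demo_always_fail(size_t n)
{
    (void)n;
    return NULL;
}
```

Removing the port-specific section leaves `core_alloc()` compiling to a plain `malloc()` call, which is the property Brett asked for.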

    Sorry we don't have a clean patch at present for this change only, but in case anyone's interested the full source is at https://garage.maemo.org/frs/?group_id=854

    7a430b3c-e8f2-4b25-b6c7-62266260c629 commented 15 years ago

    Has the ability to set the memory allocator been added to Python 2.7/3.1?

    Thanks, PJ

    pitrou commented 13 years ago

    All this needs is a patch. Note that there are some places where we call malloc()/free() without going through our abstraction API. This is not in allocation-heavy paths, though.

    vstinner commented 11 years ago

    I attached a patch that I wrote for Wyplay: py_setallocators.patch. The patch adds two functions:

    PyAPI_FUNC(int) Py_GetAllocators(
        char api,
        void* (**malloc_p) (size_t),
        void* (**realloc_p) (void*, size_t),
        void (**free_p) (void*)
        );
    
    PyAPI_FUNC(int) Py_SetAllocators(
        char api,
        void* (*malloc) (size_t),
        void* (*realloc) (void*, size_t),
        void (*free) (void*)
        );

    Where api is one of these values:

    These functions are used by the pytracemalloc project to hook PyMem_Malloc() and PyObject_Malloc() API. pytracemalloc traces all Python memory allocations to compute statistics per Python file. https://pypi.python.org/pypi/pytracemalloc

    Wyplay is also using Py_SetAllocators() internally to completely replace the system allocators *before* Python is started. We have another private patch on Python adding a function. This function sets its own memory allocators; it is called before Python starts thanks to an "__attribute__((constructor))" attribute.

    --

    If you use Py_SetAllocators() to completely replace a memory allocator (any memory allocation API), you have to do it before the first Python memory allocation (before Py_Main()) *or* your memory allocator must be able to recognize that a pointer was not allocated by it and pass the operation (realloc or free) to the previous memory allocator.

    For example, PyObject_Free() is able to recognize that a pointer is part of its memory pool, or fallback to the system allocator (extract of the original code):

        if (Py_ADDRESS_IN_RANGE(p, pool)) {
            ...
            return;
        }
        free(p);
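
The "recognize your own pointers, otherwise delegate" rule can be illustrated with a toy bump allocator. This is a sketch under stated assumptions, not obmalloc: the `pool_*` names are hypothetical, ownership is a simple address-range check, and realloc is left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_SIZE  4096
#define POOL_ALIGN 16

/* Backing storage; the union forces suitable alignment. */
static union {
    max_align_t align;
    unsigned char bytes[POOL_SIZE];
} pool;
static size_t pool_used = 0;

/* Returns nonzero if p points into our private pool. */
int pool_owns(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)pool.bytes;
    return addr >= base && addr < base + POOL_SIZE;
}

void *pool_malloc(size_t size)
{
    size_t rounded;
    void *p;

    if (size == 0)
        size = 1;
    if (size > POOL_SIZE)
        return malloc(size);            /* too large: use the previous allocator */
    rounded = (size + POOL_ALIGN - 1) & ~(size_t)(POOL_ALIGN - 1);
    if (rounded > POOL_SIZE - pool_used)
        return malloc(size);            /* pool exhausted: fall back */
    p = pool.bytes + pool_used;
    pool_used += rounded;
    return p;
}

void pool_free(void *p)
{
    if (p == NULL)
        return;
    if (pool_owns(p))
        return;     /* bump allocator: pool blocks are not freed individually */
    free(p);        /* not ours: hand it to the previous allocator */
}
```

Because `pool_free()` checks ownership first, it stays safe to call on blocks that were handed out by `malloc()` before the pool was installed.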

    --

    If you use Py_SetAllocators() to hook memory allocators (do something before and/or after calling the previous function, *without* touching the pointer nor the size), you can do it anytime.

    --

    I haven't run a benchmark yet to measure the overhead of the patch on Python performance.

    New functions are not documented nor tested yet. If we want to test these new functions, we can write a simple hook tracing calls to the memory allocators and call the memory allocator.

    vstinner commented 11 years ago

    To be exhaustive, another patch should be developed to replace all calls to malloc/realloc/free with PyMem_Malloc/PyMem_Realloc/PyMem_Free. PyObject_Malloc() is still using mmap() or malloc() internally, for example.

    Other examples of functions calling malloc/realloc/free directly: _PySequence_BytesToCharpArray(), block_new() (of pyarena.c), find_key() (of thread.c), PyInterpreterState_New(), win32_wchdir(), posix_getcwd(), Py_Main(), etc.

    amauryfa commented 11 years ago

    Some customizable memory allocators I know have an extra parameter "void *opaque" that is passed to all functions:

    OTOH, expat, libxml, libmpdec don't have this extra parameter.

    84401114-8e59-4056-83cb-632106c0b648 commented 11 years ago

    At ccp we have something similar. We are embedding python in the UnrealEngine on the PS3 and need to get everything through their allocators. For the purpose of flexibility, we added an api similar to the OPs, but more flexible:

    /* Support for custom allocators */
    typedef void *(*PyCCP_Malloc_t)(size_t size, void *arg, const char *file, int line, const char *msg);
    typedef void *(*PyCCP_Realloc_t)(void *ptr, size_t size, void *arg, const char *file, int line, const char *msg);
    typedef void (*PyCCP_Free_t)(void *ptr, void *arg, const char *file, int line, const char *msg);
    typedef size_t (*PyCCP_Msize_t)(void *ptr, void *arg);
    typedef struct PyCCP_CustomAllocator_t
    {
        PyCCP_Malloc_t  pMalloc;
        PyCCP_Realloc_t pRealloc;
        PyCCP_Free_t    pFree;
        PyCCP_Msize_t   pMsize;    /* can be NULL, or return -1 if no size info is avail. */
        void            *arg;      /* opaque argument for the functions */
    } PyCCP_CustomAllocator_t;
    
    /* To set an allocator!  use 0 for the regular allocator, 1 for the block allocator.
     * pass a null pointer to reset to internal default
     */
    PyAPI_FUNC(void) PyCCP_SetAllocator(int which, const PyCCP_CustomAllocator_t *);

    For a module to install itself as a "hook" at runtime, this approach can be extended by querying the current allocator, so that such a hook can then delegate calls to the previous allocator.

    The "block" allocator here, is intended as the underlying allocator to be used by obmalloc.c. Depending on platforms, this can then allocate aligned virtual memory directly, which is more efficient than layering that on-top of a malloc-like allocator.

    There are areas in CPython that use malloc() directly. Those are actually not needed in all cases, but to cope with them we change them all to new RAW API calls (using preprocessor macros). Essentially, malloc() maps to PyCCP_RawMalloc() or PyMem_MALLOC_INNER() (both local additions) based on whether the particular site using malloc() requires a truly GIL-free malloc or not.

    For this reason, the custom allocators mentioned cannot be assumed to be called with the GIL held. However, it is easily possible to extend the system above so that there is a GIL and a non-GIL version for the 'regular' allocator.

    I'll put details of the stuff we have done for EVE Online / Dust 514 on my blog. It is this, but much much more too.

    Hopefully we can arrive at a way to abstract memory allocation away from Python in a flexible and extendible manner :)

    ncoghlan commented 11 years ago

    Note that I'm definitely open to including extra settings to set up custom allocators as part of Py_CoreConfig in PEP-432 (http://www.python.org/dev/peps/pep-0432/#pre-initialization-phase).

    I don't really want to continue the tradition of additional PySet_* APIs with weird conditions on when they have to be called, though (trying to prevent more of that kind of organic growth in complexity is why I wrote PEP-432 in the first place)

    84401114-8e59-4056-83cb-632106c0b648 commented 11 years ago

    Absolutely. Although there is a very useful scenario where this could be considered a run-time setting:

      # turboprofiler.py
      # Load up the memory hooker which will supply us with all the info
      import _turboprofiler
      _turboprofiler.hookup()

    Perhaps people interested in memory optimizations and profiling could hook up at pycon? It is the most common regular query I get from people in my organization: How can I find out how python is using/leaking/wasting memory?

    vstinner commented 11 years ago

    typedef void *(*PyCCP_Malloc_t)(size_t size, void *arg, const char *file, int line, const char *msg);

    I don't understand the purpose of the filename and line number. Python does not have such information. Is it just to have the API expected by Unreal engine?

    What is the message? How is it filled?

    --

    I'm proposing a simpler prototype:

    void* (*malloc) (size_t);

    Simply because Python does not have or use anything more than that. I'm not against adding an arbitrary void* argument; it should not hurt, and may be required by some other applications or libraries.

    @kristjan.jonsson: Can you adapt your tool to fit the following API?

    PyAPI_FUNC(int) Py_SetAllocators(
        char api,
        void* (*malloc) (size_t size, void *data),
        void* (*realloc) (void* ptr, size_t size, void *data),
        void (*free) (void* ptr, void *data)
        );

    --

    My pytracemalloc project hooks allocation functions and then uses Python C functions to get the current filename and line number. No need to modify the C code to pass __FILE__ and __LINE__.

    It can produce such summary:

    2013-02-28 23:40:18: Top 5 allocations per file

    1: .../Lib/test/regrtest.py: 3998 KB

    2: .../Lib/unittest/case.py: 2343 KB

    3: .../ctypes/test/__init__.py: 513 KB

    4: .../Lib/encodings/__init__.py: 525 KB

    5: .../Lib/compiler/transformer.py: 438 KB

    other: 32119 KB

    Total allocated size: 39939 KB

    You can also configure it to display the line number.

    https://pypi.python.org/pypi/pytracemalloc

    84401114-8e59-4056-83cb-632106c0b648 commented 11 years ago

    Hi. The file and line arguments are for expanding from macros such as PyMem_MALLOC. I had them added because they provide the features of a comprehensive debugging API.
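
How a macro can supply the call site is sketched below, assuming a hypothetical `DEMO_MALLOC` macro and `demo_malloc_at()` function. This is not the CCP patch itself; it only shows the `__FILE__`/`__LINE__` expansion the comment refers to.

```c
#include <stddef.h>
#include <stdlib.h>

/* Most recent call site seen by the allocator (hypothetical bookkeeping). */
static const char *last_file = NULL;
static int last_line = 0;

/* Extended entry point: receives the C call site explicitly. */
void *demo_malloc_at(size_t size, const char *file, int line)
{
    last_file = file;
    last_line = line;
    return malloc(size);
}

/* The macro expands at the call site, so __FILE__ and __LINE__
 * describe the caller rather than this header. */
#define DEMO_MALLOC(size) demo_malloc_at((size), __FILE__, __LINE__)

const char *demo_last_file(void) { return last_file; }
int demo_last_line(void)         { return last_line; }
```

A real allocator would forward `file` and `line` to its profiler instead of storing them in globals. The globals here only make the effect observable.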

    Of course, I'm not showing you the entire set of modifications that we have made to the memory allocation scheme. They include more extensive versions of the memory allocation tools, in order to more easily monitor memory allocations from within C.

    For your information, I'm uploading pymemory.h from our 2.7 patch. The extent of our modifications can be gleaned from there.

    Basically, we have layered the macros into outer and inner versions, in order to better support internal diagnostics.

    I'm happy with the api you provide, with a small addition:
    PyAPI_FUNC(int) Py_SetAllocators(
        char api,
        void* (*malloc) (size_t size, void *data),
        void* (*realloc) (void* ptr, size_t size, void *data),
        void (*free) (void* ptr, void *data),
        void *data
        );

    The 'data' pointer is pointless unless you can provide it as part of the api. This sort of extra indirection is necessary for C callbacks to provide instance specific context to statically compiled and linked callback functions.
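
To illustrate why the `data` pointer has to be registered together with the callbacks, here is a small sketch in which the same statically compiled functions serve two independent instances. The `demo_allocator` and `demo_stats` types are hypothetical and only mirror the shape of the proposed API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Callbacks plus the context pointer that is handed back on every call. */
typedef struct {
    void *(*malloc_fn)(size_t size, void *data);
    void (*free_fn)(void *ptr, void *data);
    void *data;
} demo_allocator;

/* Per-instance state reached through the data pointer. */
typedef struct {
    size_t calls;
    size_t bytes;
} demo_stats;

static void *stats_malloc(size_t size, void *data)
{
    demo_stats *stats = data;
    stats->calls++;
    stats->bytes += size;
    return malloc(size);
}

static void stats_free(void *ptr, void *data)
{
    (void)data;     /* unused here, but part of the callback signature */
    free(ptr);
}

/* Builds an allocator whose callbacks update the given stats block. */
demo_allocator demo_make_stats_allocator(demo_stats *stats)
{
    demo_allocator a = { stats_malloc, stats_free, stats };
    return a;
}
```

Without the registered `data` value, `stats_malloc()` would have no way to tell which `demo_stats` instance it should update, short of using a global.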

    84401114-8e59-4056-83cb-632106c0b648 commented 11 years ago

    Also, our ccpmem.h, the interface to the ccpmem.cpp, internal flexible memory allocator framework. Again, just FYI. There are no trade secrets here, so please ask me for more details, if interested. One particular trick we have been using, which might be of interest, is to be able to tag each allocation with a "context" id. This is then set according to a global sys.memcontext variable, which the program will modify according to what it is doing. This can then be used to track memory usage by different parts of the program.
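
The context-tagging trick might look roughly like the following. This is a guess at the general technique, not the ccpmem implementation: all `demo_*` names are hypothetical, and a plain C variable stands in for `sys.memcontext`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_CONTEXTS 8

/* Stand-in for the global "memory context" the program switches. */
static int current_context = 0;
static size_t bytes_by_context[DEMO_MAX_CONTEXTS];

/* Header stored in front of every block; the union keeps the
 * user pointer suitably aligned. */
typedef union {
    struct {
        int context;
        size_t size;
    } info;
    max_align_t align;
} demo_header;

void demo_set_context(int id)
{
    if (id >= 0 && id < DEMO_MAX_CONTEXTS)
        current_context = id;
}

void *demo_tagged_malloc(size_t size)
{
    demo_header *h;

    if (size > SIZE_MAX - sizeof(demo_header))
        return NULL;                        /* avoid size overflow */
    h = malloc(sizeof(demo_header) + size);
    if (h == NULL)
        return NULL;
    h->info.context = current_context;      /* tag with the active context */
    h->info.size = size;
    bytes_by_context[current_context] += size;
    return h + 1;
}

void demo_tagged_free(void *ptr)
{
    demo_header *h;

    if (ptr == NULL)
        return;
    h = (demo_header *)ptr - 1;
    /* Charge the release to the context that made the allocation. */
    bytes_by_context[h->info.context] -= h->info.size;
    free(h);
}

size_t demo_context_bytes(int id) { return bytes_by_context[id]; }
```

Storing the tag in a per-block header means a block is always credited back to the context that allocated it, even if the active context has changed by the time it is freed.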

    vstinner commented 11 years ago
    """
    I'm happy with the api you provide, with a small addition:
    PyAPI_FUNC(int) Py_SetAllocators(
        char api,
        void* (*malloc) (size_t size, void *data),
        void* (*realloc) (void* ptr, size_t size, void *data),
        void (*free) (void* ptr, void *data),
        void *data
        );
    """

    Oops, I forgot "void *data". Yeah, each group of allocator functions (malloc, free and realloc) will get its own "data" pointer.

    vstinner commented 11 years ago

    New patch (version 2), more complete:

    Main API:

    #define PY_ALLOC_MEM_API 'm'      /* PyMem_Malloc() API */
    #define PY_ALLOC_OBJECT_API 'o'   /* PyObject_Malloc() API */
    
    PyAPI_FUNC(int) Py_GetAllocators(
        char api,
        void* (**malloc_p) (size_t size, void *user_data),
        void* (**realloc_p) (void *ptr, size_t size, void *user_data),
        void (**free_p) (void *ptr, void *user_data),
        void **user_data_p
        );
    
    PyAPI_FUNC(int) Py_SetAllocators(
        char api,
        void* (*malloc) (size_t size, void *user_data),
        void* (*realloc) (void *ptr, size_t size, void *user_data),
        void (*free) (void *ptr, void *user_data),
        void *user_data
        );
    
    PyAPI_FUNC(void) Py_GetBlockAllocators(
        void* (**malloc_p) (size_t size, void *user_data),
        void (**free_p) (void *ptr, size_t size, void *user_data),
        void **user_data_p
        );
    
    PyAPI_FUNC(int) Py_SetBlockAllocators(
        void* (*malloc) (size_t size, void *user_data),
        void (*free) (void *ptr, size_t size, void *user_data),
        void *user_data
        );

    I see the following use cases using allocators:

    "Hook" means adding extra code before and/or after calling the original function.

    The final API should allow hooking the APIs multiple times and replacing allocators. So it should be possible to track memory leaks, detect buffer overflows, and use your own allocators. This is not yet possible with patch 2, because _PyMem_DebugMalloc() calls malloc() directly.

    _PyMem_DebugMalloc is no longer used by PyObject_Malloc. This code should be rewritten to use the hook approach instead of replacing memory allocators.

    Example tracing PyMem calls using the hook approach:

    typedef struct {
        void* (*malloc) (size_t, void*);
        void* (*realloc) (void*, size_t, void*);
        void (*free) (void*, void*);
        void *data;
    } allocators_t;

    allocators_t pymem, pyobject;

    void* trace_malloc (size_t size, void* data)
    {
        /* data points to the saved original allocators */
        allocators_t *alloc = (allocators_t *)data;
        printf("malloc(%zu)\n", size);
        return alloc->malloc(size, alloc->data);
    }
    
    void* trace_realloc (void* ptr, size_t size, void* data)
    {
        allocators_t *alloc = (allocators_t *)data;
        printf("realloc(%p, %zu)\n", ptr, size);
        return alloc->realloc(ptr, size, alloc->data);
    }
    
    void trace_free (void* ptr, void* data)
    {
        allocators_t *alloc = (allocators_t *)data;
        printf("free(%p)\n", ptr);
        alloc->free(ptr, alloc->data);
    }
    
    void hook_pymem(void)
    {
       Py_GetAllocators(PY_ALLOC_MEM_API, &pymem.malloc, &pymem.realloc, &pymem.free, &pymem.data);
       Py_SetAllocators(PY_ALLOC_MEM_API, trace_malloc, trace_realloc, trace_free, &pymem);
    
       Py_GetAllocators(PY_ALLOC_OBJECT_API, &pyobject.malloc, &pyobject.realloc, &pyobject.free, &pyobject.data);
       Py_SetAllocators(PY_ALLOC_OBJECT_API, trace_malloc, trace_realloc, trace_free, &pyobject);
    }

    I didn't try the example :-p It is just to give you an idea of the API and how to use it.

    84401114-8e59-4056-83cb-632106c0b648 commented 11 years ago

    I'd like to add some argument to providing a "file" and "line number" to the allocation api. I know that currently this is not provided e.g. by the PyMem_Allocate() functions, but I think it would be wise to provide a "debug" version of these functions that pass in the call sites. An allocator api that then also allows for these values to be provided to the malloc/realloc/free routines is then future-proof in that respect.

    Case in point: We have a memory profiler running which uses an allocator hook system similar to what Victor is proposing. But in addition, it provides a "file" and "line" argument to every function.

    Now, the profiler is currently not using this code. Here is how the "malloc" function looks:

    static void *
    PyMalloc(size_t size, void *arg, const char *file, int line, const char *msg)
    {
        void *r = DustMalloc(size);
        if (r) {
            tmAllocEx(g_telemetryContext, file, line, r, size, "Python alloc: %s", msg);
            ReportAllocInfo(AllocEvent, 0, r, size);
        }
        return r;
    }

    tmAllocEx calls the Telemetry memory profiler and passes in the allocation site (http://www.radgametools.com/telemetry.htm; see also my blog post about using it: http://cosmicpercolator.com/2012/05/25/optimizing-python-condition-variables-with-telemetry/).

    But our profiler, called with ReportAllocInfo, isn't using this. It relies solely on extracting the python callstack.

    Today, I got this email (see attached file Capture.jpg)

    Basically, the profiler sees a lot of allocated memory with no python call stack. Now it would be useful if we had the C call site information, to know where it came from.

    So: My suggestion is that the allocator api be 1) a struct, which allows for a cleaner api function 2) Include C filename and line number.

    Even though the current Python memory API (e.g. PyMem_Malloc(), PyObject_Malloc()) does not currently support it, this would allow us to internally have _extended versions of these APIs that do support it, and macros that pass in that information. This can be added at a later stage. Having it in the allocator API function would make it more future-proof.

    See also my "pymem.h" and "ccpmem.h" files attached to this defect for examples on how we have tweaked python's internal memory apis to support this information.

    vstinner commented 11 years ago

    py_setallocators-filename.patch: Here is an attempt to define an API providing the filename and line number of the C code. The Py_SetAllocators() API is unchanged:

    PyAPI_FUNC(int) Py_SetAllocators(
        char api,
        void* (*malloc) (size_t size, void *user_data),
        void* (*realloc) (void *ptr, size_t size, void *user_data),
        void (*free) (void *ptr, void *user_data),
        void *user_data
        );

    If Python is compiled with -DPYMEM_TRACE_MALLOC, user_data is not the last parameter passed to Py_SetAllocators() but a pointer to a _PyMem_Trace structure:

    typedef struct {
        void *data;
        /* NULL and -1 when unknown */
        const char *filename;
        int lineno;
    } _PyMem_Trace;

    The problem is that the module using Py_SetAllocators() must be compiled differently depending on PYMEM_TRACE_MALLOC. Example from pytracemalloc, modified for this patch:

        _PyMem_Trace *ctrace;
        trace_api_t *api;
        void *call_data;
        void *ptr;
    #ifdef PYMEM_TRACE_MALLOC
        ctrace = (_PyMem_Trace *)data;
        api = (trace_api_t *)ctrace->data;
        ctrace->data = api->data;
        call_data = data;
    #else
        ctrace = NULL;
        api = (trace_api_t *)data;
        call_data = api->data;
    #endif
        ptr = api->malloc(size, call_data);
        ...

    I didn't like the "ctrace->data = api->data;" instruction: pytracemalloc modifies the input _PyMem_Trace structure.

    pytracemalloc code is a little bit more complex, but "it works". pytracemalloc can reuse the filename and line number of the C module, or of the Python module. It can be configured at runtime. Example of output for the C module:

    2013-06-11 00:36:30: Top 15 allocations per file and line (compared to 2013-06-11 00:36:25)

    1: Objects/dictobject.c:352: size=6 MiB (+4324 KiB), count=9818 (+7773), average=663 B

    2: Objects/unicodeobject.c:1085: size=6 MiB (+2987 KiB), count=61788 (+26197), average=111 B

    3: Objects/tupleobject.c:104: size=4054 KiB (+2176 KiB), count=44569 (+24316), average=93 B

    4: Objects/typeobject.c:770: size=2440 KiB (+1626 KiB), count=13906 (+10360), average=179 B

    5: Objects/bytesobject.c:107: size=2395 KiB (+1114 KiB), count=24846 (+11462), average=98 B

    6: Objects/funcobject.c:12: size=1709 KiB (+1103 KiB), count=11516 (+7431), average=152 B

    7: Objects/codeobject.c:117: size=1760 KiB (+871 KiB), count=11267 (+5578), average=160 B

    8: Objects/dictobject.c:399: size=784 KiB (+627 KiB), count=10040 (+8028), average=80 B

    9: Objects/listobject.c:159: size=420 KiB (+382 KiB), count=5386 (+4891), average=80 B

    10: Objects/frameobject.c:649: size=1705 KiB (+257 KiB), count=3374 (+505), average=517 B

    11: ???:?: size=388 KiB (+161 KiB), count=588 (+240), average=676 B

    12: Objects/weakrefobject.c:36: size=241 KiB (+138 KiB), count=2579 (+1482), average=96 B

    13: Objects/dictobject.c:420: size=135 KiB (+112 KiB), count=2031 (+1736), average=68 B

    14: Objects/classobject.c:59: size=109 KiB (+105 KiB), count=1400 (+1345), average=80 B

    15: Objects/unicodeobject.c:727: size=188 KiB (+86 KiB), count=1237 (+687), average=156 B

    37 more: size=828 KiB (+315 KiB), count=8421 (+5281), average=100 B

    Total Python memory: size=29 MiB (+16 MiB), count=212766 (+117312), average=145 B

    Total process memory: size=68 MiB (+22 MiB) (ignore tracemalloc: 0 B)

    I also had to modify the following GC functions to get more accurate information:

    For example, PyTuple_New() calls PyObject_GC_NewVar() to allocate its memory. With my patch, you get "Objects/tupleobject.c:104" instead of a generic "Modules/gcmodule.c:1717".

    vstinner commented 11 years ago

    New version of the patch, py_setallocators-3.patch:

    This patch does not propose a simple API to reuse internal debug hooks when replacing system (PyMem) allocators.

    amauryfa commented 11 years ago

    I prefer the new version without PYMEM_TRACE_MALLOC :-)

    Can we rename "API" and "api_id" to something more specific? maybe DOMAIN and domain_id?

    vstinner commented 11 years ago

    Amaury Forgeot d'Arc added the comment:

    > I prefer the new version without PYMEM_TRACE_MALLOC :-)

    Well, py_setallocators-filename.patch is more a proof-of-concept showing how to use my Py_SetAllocators() API to pass the C trace (filename/line number), than a real proposition. The patch is very intrusive and huge, I also prefer py_setallocators-3.patch :-)

    > Can we rename "API" and "api_id" to something more specific? maybe DOMAIN and domain_id?

    Something like {PY_ALLOC_MEM_DOMAIN, PY_ALLOC_OBJECT_DOMAIN} or {PYMEM_DOMAIN, PYOBJECT_DOMAIN}?

    There are only two values, another option is to duplicate functions:

    I prefer PyMem_SetAllocators() over PYOBJECT_DOMAIN.

    vstinner commented 11 years ago

    Benchmark of py_setallocators-3.patch:

    If I understood correctly, the overhead is really really low (near zero).

    vstinner commented 11 years ago

    > If I understood correctly, the overhead is really really low (near zero).

    See attached output pybench.txt and benchmarks.txt.

    vstinner commented 11 years ago

    New version (4) of the patch:

    Does the new API look better? py_setallocators-4.patch is ready for a final review. If nobody complains, I'm going to commit it.

    vstinner commented 11 years ago

    py_setallocators-4.patch:

    I just saw that I forgot ".. versionadded:: 3.4" in the doc.

    vstinner commented 11 years ago

    > This patch does not propose a simple API to reuse internal debug hooks when replacing system (PyMem) allocators.

    OK, this is now fixed with the new patch (version 5). Nick does not want a new environment variable, so I instead added a new function, PyMem_SetupDebugHooks(), which reinstalls hooks to detect bugs if allocator functions were replaced with PyMem_SetAllocators() or PyObject_SetAllocators(). The function does nothing if Python is not compiled in debug mode or if hooks are already installed (so the function can be called twice).
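
As an illustration of the kind of bug such a debug hook can catch, the sketch below pads each block with guard bytes and checks them later. It shows a common technique under stated assumptions and is not a description of what the patch does: the `guard_*` names, the marker value, and the guard length are all chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTE 0xFB     /* arbitrary marker chosen for this sketch */
#define GUARD_LEN  8

/* Header holding the requested size; the union keeps the
 * user pointer suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} guard_header;

/* Allocates size bytes followed by GUARD_LEN marker bytes. */
void *guard_malloc(size_t size)
{
    guard_header *h;
    unsigned char *user;

    if (size > SIZE_MAX - sizeof(guard_header) - GUARD_LEN)
        return NULL;
    h = malloc(sizeof(guard_header) + size + GUARD_LEN);
    if (h == NULL)
        return NULL;
    h->size = size;
    user = (unsigned char *)(h + 1);
    memset(user + size, GUARD_BYTE, GUARD_LEN);
    return user;
}

/* Returns 1 if the trailing guard is intact, 0 if it was overwritten. */
int guard_check(const void *ptr)
{
    const guard_header *h = (const guard_header *)ptr - 1;
    const unsigned char *tail = (const unsigned char *)ptr + h->size;
    size_t i;

    for (i = 0; i < GUARD_LEN; i++) {
        if (tail[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

void guard_free(void *ptr)
{
    if (ptr != NULL)
        free((guard_header *)ptr - 1);
}
```

A hook built this way would typically call `guard_check()` from its free and realloc wrappers and report an error when the guard has been overwritten.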

    I also added unit tests for PyMem_SetAllocators() and PyObject_SetAllocators()! And I added "versionadded:: 3.4" to the C API documentation.

    vstinner commented 11 years ago

    > To be exhaustive, another patch should be developed to replace all calls to malloc/realloc/free with PyMem_Malloc/PyMem_Realloc/PyMem_Free.

    I created issue bpo-18203 for this point.

    > PyObject_Malloc() is still using mmap() or malloc() internally for example.

    The arena allocator can be replaced or hooked with PyObject_SetArenaAllocators() from my latest patch.

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 11 years ago

    New changeset 6661a8154eb3 by Victor Stinner in branch 'default': Issue bpo-3329: Add new APIs to customize memory allocators http://hg.python.org/cpython/rev/6661a8154eb3

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 11 years ago

    New changeset b1455dd08000 by Victor Stinner in branch 'default': Revert changeset 6661a8154eb3: Issue bpo-3329: Add new APIs to customize memory allocators http://hg.python.org/cpython/rev/b1455dd08000

    vstinner commented 11 years ago

    Convert changeset 6661a8154eb3 into a patch: py_setallocators-6.patch.

    vstinner commented 11 years ago

    Updated the patch to follow the API described in PEP 445 (2013-06-18 22:33:41 +0200).

    vstinner commented 11 years ago

    Updated the patch according to the latest version of the PEP.

    vstinner commented 11 years ago

    Updated patch (version 9):

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 11 years ago

    New changeset ca78c974e938 by Victor Stinner in branch 'default': Issue bpo-3329: Implement the PEP-445 http://hg.python.org/cpython/rev/ca78c974e938

    vstinner commented 11 years ago

    OK, let's see if the buildbots like PEP 445 (keep this issue open until we have the results from all 3.4 buildbots).

    I created the issue bpo-18392 to document PyObject_Malloc().

    vstinner commented 11 years ago

    It looks like the changeset ca78c974e938 broke the "x86 XP-4 3.x" buildbot: buildbot.python.org/all/builders/x86 XP-4 3.x/builds/8795/

    Traceback (most recent call last):
      File "../lib/test/regrtest.py", line 1305, in runtest_inner
        test_runner()
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\test_tools.py", line 459, in test_main
        support.run_unittest(*[obj for obj in globals().values()
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\support.py", line 1600, in run_unittest
        _run_suite(suite)
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\support.py", line 1566, in _run_suite
        result = runner.run(suite)
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\runner.py", line 175, in run
        result.printErrors()
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\runner.py", line 109, in printErrors
        self.printErrorList('ERROR', self.errors)
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\runner.py", line 117, in printErrorList
        self.stream.writeln("%s" % err)
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\runner.py", line 25, in writeln
        self.write(arg)
    MemoryError
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "../lib/test/regrtest.py", line 1615, in <module>
        main_in_temp_cwd()
      File "../lib/test/regrtest.py", line 1590, in main_in_temp_cwd
        main()
      File "../lib/test/regrtest.py", line 796, in main
        match_tests=match_tests)
      File "../lib/test/regrtest.py", line 998, in runtest
        debug, display_failure=False)
      File "../lib/test/regrtest.py", line 1330, in runtest_inner
        msg = traceback.format_exc()
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\traceback.py", line 254, in format_exc
        return "".join(format_exception(*sys.exc_info(), limit=limit, chain=chain))
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\traceback.py", line 180, in format_exception
        return list(_format_exception_iter(etype, value, tb, limit, chain))
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\traceback.py", line 152, in _format_exception_iter
        yield from _format_list_iter(_extract_tb_iter(tb, limit=limit))
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\traceback.py", line 17, in _format_list_iter
        for filename, lineno, name, line in extracted_list:
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\traceback.py", line 64, in _extract_tb_or_stack_iter
        line = linecache.getline(filename, lineno, f.f_globals)
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\linecache.py", line 15, in getline
        lines = getlines(filename, module_globals)
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\linecache.py", line 41, in getlines
        return updatecache(filename, module_globals)
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\linecache.py", line 127, in updatecache
        lines = fp.readlines()
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\codecs.py", line 301, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    MemoryError

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 11 years ago

    New changeset 51ed51d10e60 by Victor Stinner in branch 'default': Issue bpo-3329: Fix _PyObject_ArenaVirtualFree() http://hg.python.org/cpython/rev/51ed51d10e60

    vstinner commented 11 years ago

    Buildbots are happy, changeset 51ed51d10e60 fixed the memory leak on Windows XP. Let's close this issue, 5 years after its creation!

    84401114-8e59-4056-83cb-632106c0b648 commented 11 years ago

    Well done.

    vstinner commented 3 years ago

    New changeset 0d6bd1ca7c683137d52041194f3a2b02219f225a by Victor Stinner in branch 'master': bpo-3329: Fix typo in PyObjectArenaAllocator doc (GH-24795) https://github.com/python/cpython/commit/0d6bd1ca7c683137d52041194f3a2b02219f225a

    miss-islington commented 3 years ago

    New changeset 5ca02c4799c07dba5192f87f9f2378dc24097166 by Miss Islington (bot) in branch '3.8': bpo-3329: Fix typo in PyObjectArenaAllocator doc (GH-24795) https://github.com/python/cpython/commit/5ca02c4799c07dba5192f87f9f2378dc24097166

    miss-islington commented 3 years ago

    New changeset ea46c7bc503d1103d60c0ec7023bb52c5defa11d by Miss Islington (bot) in branch '3.9': bpo-3329: Fix typo in PyObjectArenaAllocator doc (GH-24795) https://github.com/python/cpython/commit/ea46c7bc503d1103d60c0ec7023bb52c5defa11d