python / cpython

The Python programming language
https://www.python.org/
Other
60.88k stars 29.39k forks source link

[Windows] Embedded Python Memory Leaks #96853 ( Corrected 98%), Need testing #98359

Open AlexSoft73 opened 1 year ago

AlexSoft73 commented 1 year ago

Hi Python team,

I detected python memory leaks when using it embedded on windows platform, unfortunately the issue I opened was closed.

I decided to correct the issue (though it is not 100% corrected), python now returns most of the memory it takes when finished.

I need test to be done with the modifications I did, especially if it requires any modifications for Unix based systems:

pylifecycle.c --> call function __Py_ArenaMemoryRelease obmalloc.c --> Define function __Py_ArenaMemoryRelease pluss the trackig of memory allocations and releases

Pelase, test it and let me know.

Thanks you all @AlexSoft73

iritkatriel commented 7 months ago

@AlexSoft73 I don't understand. Which test needs to be done?

AlexSoft73 commented 7 months ago

I was working in a CPython copy fixing the issue of memory leaks, these memory leaks are not seeing if python is run as the main application since these memory gets reused by the python interpreter, BUT, if you embed python in applications created in any other language to use its benefits (Python is not the mail application in this case) is where you see the memory leaks, like I showed in the samples I put back to those days, iE: (pardon me, don't take literally the name of the python library function names I am writing from the top of my head and from what I get to recall, but the issue can be seeing)

PyInitialize(...) Python gets memory from the system, this is OK! ... ...You run Python code here, OK! ... now you get done with Python and call it to finish: PyFinalize(...) Python never releases the memory it took in the call to PyInitialize, BAD!!!!!!!

So, in my branch, you run Python embed and you can see most of the memory being returned back to the System, the modified CPython code that I wrote explains what I did and where the problem was.

But, back to those days it seemed to me that CPython community was not interested in fixing this. In that sample I showed ther main reason python was not releasing the memory but the only answers I saw were negatives, felt a bit disappointed and I put aside my participation in the project.

But if you are interested in getting CPython improved, here I am and you are very welcomed

Thanks for your mail Alexei


From: Irit Katriel @.> Sent: Friday, December 1, 2023 09:28 PM To: python/cpython @.> Cc: AlexSoft73 @.>; Mention @.> Subject: Re: [python/cpython] [Windows] Embedded Python Memory Leaks #96853 ( Corrected 98%), Need testing (Issue #98359)

@AlexSoft73https://github.com/AlexSoft73 I don't understand. Which test needs to be done?

— Reply to this email directly, view it on GitHubhttps://github.com/python/cpython/issues/98359#issuecomment-1836801453, or unsubscribehttps://github.com/notifications/unsubscribe-auth/A3C7TGHD5LMA3TQOUUAORGLYHJDWBAVCNFSM6AAAAAARHHUALCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMZWHAYDCNBVGM. You are receiving this because you were mentioned.

vstinner commented 7 months ago

I don't understand how to reproduce your issue. Which Python version did you test?

Python 3.12 now releases most memory in Py_Finalize(), but not all stdlib extensions have been updated yet.

vstinner commented 7 months ago

detected python memory leaks when using it embedded on windows platform, unfortunately the issue I opened was closed.

Which issue? Do you have a link?

vstinner commented 7 months ago

Ah, wait, there are 2 other issues:

AlexSoft73 commented 7 months ago

Hi Victor,

If you run Python in the Mac world, probably you are right, I am talking running CPython (embeded ) in Windows Platform, have you tried?

thanks


From: Victor Stinner @.> Sent: Wednesday, December 6, 2023 08:06 PM To: python/cpython @.> Cc: AlexSoft73 @.>; Mention @.> Subject: Re: [python/cpython] [Windows] Embedded Python Memory Leaks #96853 ( Corrected 98%), Need testing (Issue #98359)

I don't understand how to reproduce your issue. Which Python version did you test?

Python 3.12 now releases most memory in Py_Finalize(), but not all stdlib extensions have been updated yet.

— Reply to this email directly, view it on GitHubhttps://github.com/python/cpython/issues/98359#issuecomment-1843609544, or unsubscribehttps://github.com/notifications/unsubscribe-auth/A3C7TGCSYNTI3SIVLQXT5GLYIDF2ZAVCNFSM6AAAAAARHHUALCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNBTGYYDSNJUGQ. You are receiving this because you were mentioned.Message ID: @.***>

AlexSoft73 commented 7 months ago

If you search GitSub for this you will see, I left a copy with the fix made where 98% of memory is released back to the system, once more, CPython embedded on Windows Platform.

Thanks


From: Victor Stinner @.> Sent: Wednesday, December 6, 2023 08:07 PM To: python/cpython @.> Cc: AlexSoft73 @.>; Mention @.> Subject: Re: [python/cpython] [Windows] Embedded Python Memory Leaks #96853 ( Corrected 98%), Need testing (Issue #98359)

detected python memory leaks when using it embedded on windows platform, unfortunately the issue I opened was closed.

Which issue? Do you have a link?

— Reply to this email directly, view it on GitHubhttps://github.com/python/cpython/issues/98359#issuecomment-1843610905, or unsubscribehttps://github.com/notifications/unsubscribe-auth/A3C7TGHD2Y5AP3HQMOXMPATYIDF6XAVCNFSM6AAAAAARHHUALCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNBTGYYTAOJQGU. You are receiving this because you were mentioned.Message ID: @.***>

AlexSoft73 commented 7 months ago

And Yes, you were one of a few closing the tickets I opened and never paid attention to what I was simply asking. And once more, I am taking about CPython embedded on Windows Platform.

Thanks


From: Victor Stinner @.> Sent: Wednesday, December 6, 2023 08:23 PM To: python/cpython @.> Cc: AlexSoft73 @.>; Mention @.> Subject: Re: [python/cpython] [Windows] Embedded Python Memory Leaks #96853 ( Corrected 98%), Need testing (Issue #98359)

Ah, wait, there are 2 other issues:

— Reply to this email directly, view it on GitHubhttps://github.com/python/cpython/issues/98359#issuecomment-1843632972, or unsubscribehttps://github.com/notifications/unsubscribe-auth/A3C7TGCPAMUYXKBSISQP7MDYIDH5JAVCNFSM6AAAAAARHHUALCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNBTGYZTEOJXGI. You are receiving this because you were mentioned.

neonene commented 7 months ago

Repeating OP's example 2000 times:

#include <Python.h>
int main(int argc, char *argv[])
{
    for (int N = 0; N < 2000; N++) {
        printf("%d\n", N);
        Py_InitializeEx(0);
        Py_FinalizeEx();
    }
    return 0;
}

When running this code on an embedded x64 executable built with MSVC v.1936, the following 3 commits from the left seem to cause serious memory consumption (taken from the Task Manager):

N 3.12.0a1+
67807cfc87135fdce4992d38d2ffe3e44747e73b
3.12.0a7+
ea2c0016502472aa8baa3149050ada776d17a009
3.13.0a0+
15d4c9fabce67b8a1b5bd9dec9612014ec18291a
3.11.x
3.12.0a1
500 330MB 770MB 1000MB 4MB
1000 650MB 1500MB 2000MB 4MB
1500 970MB 2240MB 3000MB 4MB
2000 1290MB 3000MB 4000MB 4MB

I'm not sure if they affect each other. The python.org official binaries (.lib/.dll) can also be used to confirm.

cc @ericsnowcurrently @eduardo-elizondo @markshannon

AlexSoft73 commented 7 months ago

Is this test run tuning Peyton as main application or embedded?

Thanks Alexei

On Dec 8, 2023, at 1:08 PM, neonene @.***> wrote:



Repeating OP's examplehttps://github.com/python/cpython/issues/96853#issuecomment-1270515838 2000 times:

include

int main(int argc, char *argv[]) { for (int N = 0; N < 2000; N++) { printf("%d\n", N); Py_InitializeEx(0); Py_FinalizeEx(); } return 0; }

When running this code on an embedded x64 executable built with MSVC v.1936, the following 3 commits from the left seem to cause serious memory consumption (taken from the Task Manager):

N 3.12.0a1+ 67807cfhttps://github.com/python/cpython/commit/67807cfc87135fdce4992d38d2ffe3e44747e73b 3.12.0a7+ ea2c001https://github.com/python/cpython/commit/ea2c0016502472aa8baa3149050ada776d17a009 3.13.0a0+ 15d4c9fhttps://github.com/python/cpython/commit/15d4c9fabce67b8a1b5bd9dec9612014ec18291a 3.11.x 3.12.0a1 500 330MB 770MB 1000MB 4MB 1000 650MB 1500MB 2000MB 4MB 1500 970MB 2240MB 3000MB 4MB 2000 1290MB 3000MB 4000MB 4MB

I'm not sure if they affect each other. The python.org official binaries (.lib/.dll) can also be used to confirm.

cc @ericsnowcurrentlyhttps://github.com/ericsnowcurrently @eduardo-elizondohttps://github.com/eduardo-elizondo @markshannonhttps://github.com/markshannon

— Reply to this email directly, view it on GitHubhttps://github.com/python/cpython/issues/98359#issuecomment-1847609983, or unsubscribehttps://github.com/notifications/unsubscribe-auth/A3C7TGEPTJLE5PB57FCW6HTYINJQJAVCNFSM6AAAAAARHHUALCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNBXGYYDSOJYGM. You are receiving this because you were mentioned.Message ID: @.***>

neonene commented 7 months ago

Is this test run tuning Peyton as main application or embedded?

As embedded.

AlexSoft73 commented 6 months ago

Hi guys and again thanks for your time

I am sending you a little sample (created in MSVS2019 compile for Windows) that shows something, once more, I always repeat it since this cannot be seen when python is used as main application

The following program showed these results:

Memory usage: 4292608------------------------> Before Python was Initialized Memory usage: 10498048----------------------> Python initialized, this is ok, Python got the memory for its interpreter Memory usage: 10002432----------------------> Python is Finalized here, the memory usage should be around 4292608, the usage before Python initializes Memory usage difference between start and end: 5709824, if this is not a leak, then what is it?

thanks Alexei

//C/c++ sample //--------------------------------------------------------------------------------------------------------------------------------------

include

include

//#define ALEXDEB _DEBUG

undef _DEBUG

include

//#define _DEBUG ALEXDEB

void ___print(wchar_t* pToPrint) { OutputDebugString(pToPrint); wprintf(pToPrint); }

float PrintMemoryInfo(DWORD processID) { HANDLE hProcess; PROCESS_MEMORY_COUNTERS pmc; double f;

hProcess = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ,
    FALSE, processID);
if (NULL == hProcess)
    return;

if (GetProcessMemoryInfo(hProcess, &pmc, sizeof(pmc)))
{
    wchar_t Info[1024];

    memset(Info, 0, 1024 * sizeof(wchar_t));
    f = pmc.WorkingSetSize;
    swprintf(Info, 1023, L"Memory usage: %1.0f\n", f);
    ___print((wchar_t*)Info);
}

CloseHandle(hProcess);

return f;

}

/** WinMain() ***

neonene commented 6 months ago

Your example at least needs EmptyWorkingSet() Windows API like:

    hProcess = OpenProcess(
        PROCESS_QUERY_INFORMATION |
        PROCESS_VM_READ |
        PROCESS_SET_QUOTA,  // added
        ...
    );
    ...

    if (EmptyWorkingSet(hProcess)) {  // added
        if (GetProcessMemoryInfo(hProcess, ...)) {
            ...
        }
    }
    CloseHandle(hProcess);

And I repeated the sequence of the main function to see the risk of an out-of-memory. ~Then Python 3.11.7 shows:~ Then the executable embedded with Python 3.11.7 shows:

(1st)
Memory usage: 217088  // BeforePyInit
Memory usage: 212992  // AfterPyInit 
Memory usage: 204800  // AfterPyFinalized 
Memory usage difference between start and end: -12288
(2nd)
Memory usage: 200704
Memory usage: 212992
Memory usage: 196608
Memory usage difference between start and end: -4096
(3rd)
Memory usage: 196608
Memory usage: 212992
Memory usage: 204800
Memory usage difference between start and end: 8192
(4th)
Memory usage: 200704
Memory usage: 212992
Memory usage: 200704
Memory usage difference between start and end: 0
(5th)
Memory usage: 200704
Memory usage: 212992
Memory usage: 200704
Memory usage difference between start and end: 0
...

If there is any leak, EmptyWorkingSet() cannot reduce the usage constantly like that. Note that GetProcessMemoryInfo() is not recommended for checking a memory leak. See:

AlexSoft73 commented 6 months ago

Hi guys and thanks again for your time

Back when I branched the python project, my branch was versioned 3.12.0a0, what version is python currently? I am bit out of Python for work reasons. Maybe the fixes I recommended were included and then there is nothing to argue about here.

Neonene, you were right and I appreciate your suggestion about using , the only thing here is that if you use a function that returns a Boolean value, you should still check the reason of this false, you miss checking for errors with GetLastError.

EmptyWorkingSet can return false and still things be ok.

AlexSoft73 commented 6 months ago

Hi guys,

Neonen,e you seem to work with windows platform and c/c++ coding, here I am sending you a log trace in MSVS2019 while running python embedded, For the sake of simplicity, I only recorded Memory requests to Virtual memory management of the windows operating system, I have commented kind of explain what was going on, also I have used the latest Python available:

'RunScript_x64.exe' (Win32): Loaded 'D:\Programming\VC\2019_COMCLASSES\x64\Debug\PyLoaderx64_d.dll'. Symbols loaded. <-- Proxy Dll To allow late binding Python libraries 'RunScript_x64.exe' (Win32): Loaded 'D:\Python\python3.dll'.                           <-- Python is loading 'RunScript_x64.exe' (Win32): Unloaded 'D:\Python\python3.dll' 'RunScript_x64.exe' (Win32): Loaded 'D:\Python\python3.dll'. 'RunScript_x64.exe' (Win32): Loaded 'D:\Python\python312.dll'. 'RunScript_x64.exe' (Win32): Loaded 'C:\Windows\System32\bcrypt.dll'. 'RunScript_x64.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140.dll'. <-- Python is loaded! 3.12.1 (tags/v3.12.1:2305ca5, Dec 7 2023, 22:03:25) [MSC v.1937 64 bit (AMD64)]           <-- Here I print the python version, this is the only thing I do when I loaded Python 'RunScript_x64.exe' (Win32): Unloaded 'D:\Python\python3.dll'                           <--Python starts to unload, I am finish with Python, remember embedded!      'RunScript_x64.exe' (Win32): Unloaded 'C:\Windows\System32\vcruntime140.dll' 'RunScript_x64.exe' (Win32): Unloaded 'D:\Python\python312.dll' 'RunScript_x64.exe' (Win32): Unloaded 'D:\Programming\VC\2019_COMCLASSES\x64\Debug\PyLoaderx64_d.dll' <-- Python get unloaded!

At this point all virtual memory taken by Python should be releases, I am talking about VirtualAlloc/VirtualFree windows API functions, well, when all this was happening, I was sniffing calls to VirtualAlloc and Virtual Free, here is the result, among the 4 blocks of virtual memory requested Python, only one was return to the operating system (0x2168EBC0000 16384 FV....) AV: VirtualAlloc/ FV: virtualFree

[cid:3a3bc1b0-87b1-4e6e-b7db-4679fd55fb08]

Why the other 3 were never released? If I continue the runs, you will see more records being requested to the virtual memory manager of windows, is this a memory leak Yes or No? Please explain me?

Always welcoming your comments, very constructive, Alexei


From: neonene @.> Sent: Sunday, December 10, 2023 09:48 AM To: python/cpython @.> Cc: AlexSoft73 @.>; Mention @.> Subject: Re: [python/cpython] [Windows] Embedded Python Memory Leaks #96853 ( Corrected 98%), Need testing (Issue #98359)

Your example at least needs EmptyWorkingSet() Windows API like:

hProcess = OpenProcess(
    PROCESS_QUERY_INFORMATION |
    PROCESS_VM_READ |
    PROCESS_SET_QUOTA,  // added
    ...
);
...
neonene commented 6 months ago

Yes. Any tiny leakage from any version of Python will be accumulated in a process if an executable repeats loading/freeing the dynamic DLL which is linked to Python. Currently, embedding does not mean plugging in/out. 3.12 and newer are under reducing global objects, which has another issue to be opened.

AlexSoft73 commented 6 months ago

Thanks for your replay,

Are you a core developer?


From: neonene @.> Sent: Tuesday, December 12, 2023 03:00 PM To: python/cpython @.> Cc: AlexSoft73 @.>; Mention @.> Subject: Re: [python/cpython] [Windows] Embedded Python Memory Leaks #96853 ( Corrected 98%), Need testing (Issue #98359)

Yes. Any tiny leakage from any version of Python is accumulated in a process if an executable repeats loading/freeing the dynamic DLL which is linked to Python. Currently, embedding does not mean plugging in/out. 3.12 and newer are under reducing global objects, which has another issue to be opened.

— Reply to this email directly, view it on GitHubhttps://github.com/python/cpython/issues/98359#issuecomment-1852207593, or unsubscribehttps://github.com/notifications/unsubscribe-auth/A3C7TGCGMC2UMN2JY4UUNFDYJBWRZAVCNFSM6AAAAAARHHUALCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNJSGIYDONJZGM. You are receiving this because you were mentioned.