pyinstaller / pyinstaller

Freeze (package) Python programs into stand-alone executables
http://www.pyinstaller.org

Shared memory leak on Windows? #5123

Open denravonska opened 3 years ago

denravonska commented 3 years ago

We have an issue where we upload 1-3 GB of data to a REST server. When running in native Python it works fine: memory goes up while the file is read, then drops once the data has been uploaded. Once the application is packaged with PyInstaller on Windows, the memory isn't released. Instead, usage keeps climbing with each uploaded file, and what's worse, the memory isn't released even after the application exits.

I have a small example that triggers the issue using either httpx or aiohttp. Each time I run the PyInstaller build I permanently lose as much memory as I upload, until the system starts swapping and finally becomes unusable. The only way to recover is to reboot.

Is this a known issue? It seems like such a big problem that we can't possibly be the only ones who have been bitten by it.

Example:

from pathlib import Path
import asyncio

# Path to a directory of large files. I've tried with ~50MB PCAP files.
p = Path("...")
# URL where POST or PUT is allowed
url = "..." 

async def upload_httpx():
    import httpx

    for file in p.iterdir():
        async with httpx.AsyncClient(verify=False) as client:
            await client.put(
                url,
                data=file.read_bytes(),
                timeout=15,
            )

async def upload_aiohttp():
    import aiohttp
    for file in p.iterdir():
        async with aiohttp.ClientSession() as session:
            await session.put(
                url,
                data=file.read_bytes(),
                timeout=15,
            )

if __name__ == "__main__":
    # Trigger the leak
    asyncio.run(upload_httpx())

This has been tested against a Flask server, but I will double-check with a SimpleHTTPServer and by making the same calls without asyncio.

rokm commented 3 years ago

I did some testing based on the example you provided. The client/uploader code is running in a Windows 10 VM (Python 3.7/3.8), while the test server is running on a Linux host (a simple server based on http.server.BaseHTTPRequestHandler and http.server.HTTPServer).
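For anyone who wants to reproduce this, a test server along those lines might look like the sketch below. It is not the exact server used in the test above; the class name UploadSink, the helper make_server, and the default port are made up for illustration. It only reads and discards the request body and assumes the client sends a Content-Length header (chunked uploads are not handled).

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class UploadSink(BaseHTTPRequestHandler):
    """Accepts PUT/POST uploads, discards the body, replies 200."""

    def _drain(self):
        # Assumes Content-Length is present; chunked bodies are not handled.
        remaining = int(self.headers.get("Content-Length", 0))
        while remaining > 0:
            # Read in 1 MiB chunks so the server itself never buffers a whole file.
            chunk = self.rfile.read(min(remaining, 1 << 20))
            if not chunk:
                break
            remaining -= len(chunk)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_PUT = _drain
    do_POST = _drain

    def log_message(self, format, *args):
        # Silence per-request logging.
        pass


def make_server(host="0.0.0.0", port=8000):
    # Hypothetical helper: host and port are placeholders, adjust as needed.
    return HTTPServer((host, port), UploadSink)
```

Start it with make_server().serve_forever() and point url in the uploader at http://<server-address>:8000/.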

Using aiohttp, I observe the described behavior: memory usage keeps rising, and the memory does not seem to be released when the program ends. However, I observe this both when running from source and when running the pyinstaller'd version. Also, the behavior seems to go away if the data buffer is wrapped in io.BytesIO (both in source and in the pyinstaller'd version), i.e.:

data=io.BytesIO(file.read_bytes()),
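Put into the context of the original aiohttp uploader, the change might look roughly like this. The helper as_stream and the function name upload_aiohttp_stream are made up for illustration, and the directory and URL are passed as parameters instead of the module-level p and url. This is only a sketch of the workaround, not an explanation of why it helps.

```python
import io
from pathlib import Path


def as_stream(path: Path) -> io.BytesIO:
    # Hypothetical helper: wrap the file contents in a file-like object
    # instead of handing aiohttp a raw bytes buffer.
    return io.BytesIO(path.read_bytes())


async def upload_aiohttp_stream(directory: Path, url: str) -> None:
    import aiohttp  # third-party, imported lazily as in the original example

    for file in directory.iterdir():
        # One session per file, mirroring the original example, so the only
        # difference is how the payload is passed.
        async with aiohttp.ClientSession() as session:
            await session.put(
                url,
                data=as_stream(file),
                timeout=aiohttp.ClientTimeout(total=15),
            )
```

Run it with asyncio.run(upload_aiohttp_stream(p, url)), using the same p and url as in the original example.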

Using httpx, I observe the described behavior. Running from source seems to work as expected, but the pyinstaller'd version keeps allocating memory and does not release it at the end. This seems to happen with both the async (httpx.AsyncClient) and the synchronous (httpx.Client) version of the client, so it is unrelated to asyncio.

gSpikey commented 3 months ago

I have a similar issue. My PyInstaller EXE is leaking, but native Python doesn't leak. I'm not using asyncio, but I am using sockets. I'm using Python 3.7 x64 with PyInstaller 4.7 on Windows 11. I'll see if I can make a small sample that reproduces the leak. I made a simple socket send/receive test and saw no leaks, so it's not sockets. I do load a C DLL and call into it, so I'll try that next.