kaliiiiiiiiii / CDP-Socket

socket for handling chrome-developer-protocol connections
MIT License

High memory consumption #13

Closed juhacz closed 7 months ago

juhacz commented 7 months ago

There is something wrong with the class: if a program runs in a loop and downloads data non-stop, its memory consumption increases every second.

Sample program:

from selenium_driverless import webdriver
from selenium_driverless.types.by import By
import asyncio

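async def main():
    options = webdriver.ChromeOptions()
    async with webdriver.Chrome(options=options) as driver:
        # Reload the same page forever; process memory grows on every pass.
        while True:
            await driver.get('https://pudelek.pl')
            await driver.sleep(0.5)
            # `source` is rebound on each pass, so the script itself keeps no history.
            source = await driver.page_source
            await asyncio.sleep(3)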

asyncio.run(main())

After several minutes of running the program, the memory usage of the python.exe process has grown by several MB, and after several dozen minutes it reaches up to 2 GB.

Python 3.10.6, Windows 11, Chrome 119.0.6045.124

kaliiiiiiiiii commented 7 months ago

Uh yeah, there might actually be a memory leak, probably even in the underlying CDP-Socket library. @juhacz Thanks for raising this issue, I'll have a look at it.

juhacz commented 7 months ago

To supplement the problem report: after about 1.5 hours of running the script and calling driver.get about 1,500 times, python.exe already occupies about 1.7 GB of RAM. Then a TimeoutError occurs ('page: "https://xxxxxxxxx" didn\'t load within timeout of 10') and Chrome exits.

A solution might be, for example, closing Chrome via driver.quit() every 100 iterations, but that will not work in my case: often, the first time I call driver.get(), the page I am trying to download data from displays a CAPTCHA (DataDome). After I solve it, the program runs until it takes up a lot of memory and crashes.

kaliiiiiiiiii commented 7 months ago

@juhacz The solution will probably be to implement a time or date buffer instead of storing all messages from/to Chrome forever. I suspect that's the cause. Unless Chrome as a subprocess takes up that much memory, but then there isn't a lot I can do.
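A minimal sketch of what such a buffer could look like, assuming messages are simply appended as they arrive. The class name `BoundedMessageBuffer` and its parameters are hypothetical and are not taken from CDP-Socket; the sketch only illustrates capping stored messages by both count and age.

```python
import time
from collections import deque


class BoundedMessageBuffer:
    """Hypothetical buffer: keeps at most max_items messages, none older than max_age seconds."""

    def __init__(self, max_items=1000, max_age=60.0, clock=time.monotonic):
        # deque(maxlen=...) silently drops the oldest entry once the cap is reached
        self._items = deque(maxlen=max_items)
        self._max_age = max_age
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def _evict_expired(self):
        # Entries are stored oldest-first, so stop at the first one that is still fresh
        cutoff = self._clock() - self._max_age
        while self._items and self._items[0][0] < cutoff:
            self._items.popleft()

    def append(self, message):
        self._evict_expired()
        self._items.append((self._clock(), message))

    def snapshot(self):
        """Return the messages that are still retained, oldest first."""
        self._evict_expired()
        return [message for _, message in self._items]

    def __len__(self):
        self._evict_expired()
        return len(self._items)
```

With both limits in place, memory use is bounded by `max_items` regardless of how long the loop runs, and stale messages disappear even when traffic is low.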

juhacz commented 7 months ago

After downloading the page source, I parse the JSON from it and save it to a file. I don't store the values retrieved on each loop iteration, so the problem isn't on that side. The next time the loop runs, the variables are overwritten.

kaliiiiiiiiii commented 7 months ago

Yeah, I meant more like inside my library, you know :)

kaliiiiiiiiii commented 7 months ago

I can confirm this. Memory increases by about 12.5 MB/minute.

The Google Chrome process seems stable at 300-500 MB.

I'll be doing further testing.
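The thread does not say how these numbers were measured. As one possible way to narrow down a leak like this on the Python side, the standard-library `tracemalloc` module can compare allocation snapshots taken before and after a batch of iterations. The helper name `measure_growth` is hypothetical, and it takes a synchronous `step` callable for simplicity; for the async script above, the same two snapshots could be taken around a block of `await` calls instead.

```python
import tracemalloc


def measure_growth(step, iterations=50, top=5):
    """Run step() repeatedly; return (total growth in bytes, largest per-line diffs)."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        for _ in range(iterations):
            step()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # Group allocations by source line and diff the two snapshots
    stats = after.compare_to(before, "lineno")
    total_growth = sum(stat.size_diff for stat in stats)
    return total_growth, stats[:top]


if __name__ == "__main__":
    retained = []  # deliberately leaks: every call keeps another 10 kB alive
    growth, biggest = measure_growth(lambda: retained.append(bytearray(10_000)))
    print(f"growth: {growth / 1024:.0f} KiB")
    for stat in biggest:
        print(stat)
```

Note that `tracemalloc` only sees memory allocated through Python's allocator, so it would not account for the separate Chrome process.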

kaliiiiiiiiii commented 7 months ago

With https://github.com/kaliiiiiiiiii/CDP-Socket/commit/e657c46cdf8350c88ea45c8c6712991d28e43735, Python is stable at about 20-50 MB for me. @juhacz Can you confirm that? Just install with

pip install https://github.com/kaliiiiiiiiii/CDP-Socket/archive/refs/heads/dev.zip

and then run the script you provided previously.

juhacz commented 7 months ago

I checked this version and unfortunately nothing has changed: after about 3 minutes of running the program, python.exe was already using about 130 MB of memory.

Python 3.10.6 x64, Windows 11 x64, Chrome 119.0.6045.160 x64

kaliiiiiiiiii commented 7 months ago

@juhacz Huh that's weird - can you check if the changes at https://github.com/kaliiiiiiiiii/CDP-Socket/commit/e657c46cdf8350c88ea45c8c6712991d28e43735 actually got applied in your env?

I'll check with Python 3.10 as well then.
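One way to check what is actually installed in an environment is to query the package metadata with the standard-library `importlib.metadata`. This is only a sketch: the helper name `describe_install` is hypothetical, and the distribution name `cdp-socket` used in the example call is an assumption that may not match the real package name. For installs from a URL, pip normally records the source in `direct_url.json` (PEP 610), which can help tell a dev-branch install from a released one.

```python
from importlib import metadata


def describe_install(dist_name):
    """Return version and install-source info for a distribution, or None if absent."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return {
        "name": dist.metadata["Name"],
        "version": dist.version,
        # Raw JSON text recorded for URL/VCS installs; None when the file is missing
        "direct_url": dist.read_text("direct_url.json"),
    }


if __name__ == "__main__":
    # "cdp-socket" is an assumed distribution name, used here for illustration only
    print(describe_install("cdp-socket"))
```

If the reported version or URL is not the expected one, the environment is still using an older install.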

juhacz commented 7 months ago

I checked, and I still had the old version of the class installed. I don't know why, since I ran the command pip install https://github.com/kaliiiiiiiii/CDP-Socket/archive/refs/heads/dev.zip

After executing the command pip install --upgrade --force-reinstall https://github.com/kaliiiiiiiiii/CDP-Socket/archive/refs/heads/dev.zip

the correct version was installed. I checked the program and everything is OK: memory consumption is around 56 MB. Thank you for your great work, it helps me a lot!

kaliiiiiiiiii commented 7 months ago

That's great to hear! I noticed that opening and closing a tab creates further leaks. I'll leave this issue open for now and try to fix those as well.