Closed Romain-P closed 9 months ago
I have a case where I make several requests per minute, and it leaks memory all over the place. I've tried closing pages, closing contexts, closing browsers, catching all the onClose events and re-closing everything together, even closing the main Playwright instance all the time. Nothing works: it still leaks memory, and at some point the GC thrashes completely with the CPU on fire.
Please provide and/or document a way to fully release all resources so that we can clear everything out every now and then. And please don't focus only on testing; there are plenty of other use cases where Playwright needs to run for a long time.
Which utility was used to plot this graph? I would like to reproduce the experiment.
@rigwild :) ?
Which utility was used to plot this graph? I would like to reproduce the experiment.
@rigwild :) ?
JProfiler
This issue is still happening.
Any solution? Why after a year there is still no fix? This problem hinders playwright's usability. There's nothing more annoying than wasting time integrating a third-party package to find out later that it's riddled with bugs and incompetent developers.
I'll have to ditch this tool very soon if this is not resolved, so annoying.
@gigitalz
I'll have to ditch this tool very soon if this is not resolved, so annoying.
you can always ask for your money back :dancers:
I resolved my issue by doing the following.
- Saved the browser's state to a local file (session, local storage, etc) after creating the browser/context and performing the actions required to meet my needs:
context.StorageState("state.json")
- Close the browser and context and kill all node.exe processes every 30 minutes (this is where the memory leak exists for me). If you don't kill them, a separate node.exe process is created every time, and the previous process remains in memory taking up space.
- Create a new browser/context, load in the saved state, and navigate back to where you need to be.
context, err := browser.NewContext(playwright.BrowserNewContextOptions{
    StorageStatePath: playwright.String("state.json"),
})
While this won't help with infinite scroll or other scenarios it might help some of you. A good example where this would work fine is creating a session with a QR code (for my situation) or after a simple login.
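The save-and-restart cycle described above can be sketched with the Playwright calls stubbed out as injected functions, so the restart logic itself is testable in isolation. `createResource`/`destroyResource` are placeholders I made up, not Playwright APIs; in a real setup they would wrap `chromium.launch()` plus `browser.newContext({ storageState: 'state.json' })`, and `browser.close()` respectively.

```javascript
// Periodically tears down and recreates a resource (e.g. a browser plus a
// context restored from state.json), following the workaround above.
// createResource/destroyResource are caller-supplied stubs, NOT Playwright APIs.
class RecyclingResource {
  constructor(createResource, destroyResource, maxAgeMs) {
    this.create = createResource;
    this.destroy = destroyResource;
    this.maxAgeMs = maxAgeMs;
    this.resource = null;
    this.bornAt = 0;
  }

  // Returns a live resource, recreating it once it is older than maxAgeMs.
  async acquire(now = Date.now()) {
    if (this.resource === null || now - this.bornAt >= this.maxAgeMs) {
      if (this.resource !== null) {
        await this.destroy(this.resource); // e.g. close browser, kill node.exe
      }
      this.resource = await this.create(); // e.g. relaunch + load state.json
      this.bornAt = now;
    }
    return this.resource;
  }
}
```

With `maxAgeMs` set to 30 minutes this reproduces the "recycle every 30 minutes" schedule from the comment above.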
Kind of did the same thing, except I didn't have to save any state, just killed the whole playwright process tree and relooped to create a new spin of the same stuff. Cringe AF.
@gigitalz
I'll have to ditch this tool very soon if this is not resolved, so annoying.
you can always ask for your money back 👯
Very funny, I can't because I can't go back in time, moron.
👋 @gigitalz
Ignore @dgtlmoon - he's very opinionated about open source since he released his "paid hosted service"
The sooner someone with interpersonal skills forks his project the better
Regards
Looks like they're not going to fix this https://github.com/microsoft/playwright/issues/17736
Edit: There have been other threads that have been closed in 2020 https://github.com/microsoft/playwright/issues/4511 https://github.com/microsoft/playwright/issues/4549
At least give us an option to clear the garbage that's collected. I tried gc.collect in Python, but this doesn't release it, and it wouldn't clear what's built up in the node process anyway.
I also encountered this memory leak, which caused the server to crash every hour. I had no choice but to switch to Puppeteer; not only was the memory stable, but page loading was also faster.
Switching to Puppeteer was the best workaround for me. I used the following Dockerfile:
FROM node:19.6.0-alpine
# Installs latest Chromium package.
RUN apk add --no-cache \
chromium \
nss \
freetype \
harfbuzz \
ca-certificates \
ttf-freefont \
dumb-init
# Tell Puppeteer to skip installing Chrome. We'll be using the installed package.
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser
# Copy the files necessary for the build
# ...
# Install and build
RUN npm install
RUN npm run build
# Expose and run the server
EXPOSE 8080
CMD ["node", "build"]
With Node.js v16.17.1 & Playwright v1.26.1, this problem no longer seems to be an issue. I ran it for about 60 minutes, with 2000 pages loaded. Repro:
import { chromium } from 'playwright'

const setup = async () => {
  const browser = await chromium.launch({ headless: false })
  let page = await browser.newPage()
  let j = 0
  while (true) {
    for (let i = 0; i < 20; i++) {
      console.log(j, i)
      await page.goto('https://httpbin.org/delay/1')
    }
    j++
    console.log('Trying to create a new context, does not fix the leak!')
    await page.close()
    page = await browser.newPage()
  }
}

setup()
@gigitalz
I have the same issue. In my use case I navigate to 100+ URLs every minute. I reach a gig of memory usage in 15 minutes, and running overnight leads to 15 GB+ of usage. I've tried closing the page, browser, browser context and Playwright instance. I've even tried nulling the objects for the GC to free up any resources. I have been wrestling with this issue for a few weeks now and am at the point of considering dumping it. My current solution is to force-restart the process every time it reaches a memory limit, but this is an awful, hacky solution. Has anyone found a fix for this issue?
If one of the libraries does not have this leak, I would re-write my program in that language. Currently I am using Java's Playwright.
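The "force-restart the process when it reaches a limit" workaround can at least be automated instead of watched by hand. A minimal Node.js sketch (the callback and any threshold value are my assumptions; note this only observes the client process, while the Playwright driver and browser processes grow separately):

```javascript
// Checks the Node process's resident set size and invokes a caller-supplied
// callback once it crosses a threshold. Passing rssBytes explicitly makes
// the decision logic testable; by default it reads process.memoryUsage().
function makeMemoryWatchdog(limitBytes, onLimitExceeded) {
  return function check(rssBytes = process.memoryUsage().rss) {
    if (rssBytes > limitBytes) {
      onLimitExceeded(rssBytes);
      return true;
    }
    return false;
  };
}
```

Typical use would be calling `check()` on a `setInterval`, and having `onLimitExceeded` close the browser and `process.exit(1)` so a supervisor (Docker restart policy, pm2, systemd) respawns the worker.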
I deactivated JavaScript loading and execution, and my memory leak seems to be gone. Or it is still there, but growing so slowly that it's fine. If JavaScript is not important to you, you might want to deactivate it.
Resources on how to do it:
(deactivating Javascript execution) https://stackoverflow.com/questions/65958243/disable-javascript-in-playwright
(blocking loading Javascript) https://scrapingant.com/blog/block-requests-playwright
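The request-blocking variant linked above boils down to aborting requests whose resource type is `script`. Keeping that decision as a pure predicate makes it easy to test; the `page.route()` wiring shown in the comment is the standard Playwright pattern, but the handler here is only a sketch:

```javascript
// Decide whether a request should be aborted to keep JavaScript from loading.
// resourceType values ('script', 'document', 'stylesheet', ...) follow
// Playwright's request.resourceType() vocabulary.
function shouldAbort(resourceType, blockedTypes = ['script']) {
  return blockedTypes.includes(resourceType);
}

// Wiring sketch (assumes a Playwright `page` object is in scope):
//   await page.route('**/*', (route) =>
//     shouldAbort(route.request().resourceType())
//       ? route.abort()
//       : route.continue()
//   );
```

As the comment notes, this only helps when the pages you scrape don't need client-side JavaScript to render their content.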
I think it was user error. I ended up creating a bare bones pool where I created Browsers, Browser Pools, Contexts in a pool and cleared them each time I navigated to a new URL. Using a profiler, this is what my resource usage looks like:
If you have issues with these browsers, I recommend trying to build barebones architecture with multithreading, then porting over to larger use case. Cheers.
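A "bare bones pool" in the spirit of the comment above might look like the following. The names and the recycle-after-N-uses policy are my assumptions, not the linked project's actual code; the point is that each pooled item (browser or context) is disposed and recreated before it can accumulate unbounded state.

```javascript
// Minimal object pool that hands out items round-robin and recreates an
// item after it has served maxUses acquisitions, bounding per-item growth.
class BareBonesPool {
  constructor(factory, size, maxUses) {
    this.factory = factory;   // e.g. async () => browser.newContext()
    this.size = size;
    this.maxUses = maxUses;
    this.slots = [];          // each slot: { item, uses }
    this.next = 0;
  }

  async acquire() {
    const i = this.next;
    this.next = (this.next + 1) % this.size;
    let slot = this.slots[i];
    if (!slot || slot.uses >= this.maxUses) {
      // Dispose the worn-out item if it exposes a close() method.
      if (slot && typeof slot.item.close === 'function') await slot.item.close();
      slot = { item: await this.factory(), uses: 0 };
      this.slots[i] = slot;
    }
    slot.uses++;
    return slot.item;
  }
}
```

Calling `acquire()` before each navigation then transparently recycles contexts every `maxUses` pages.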
@CrazedCoderNate: What tool do you use to see this? Do you have some examples for creating a bare bones pool?
@LeMoussel
Sure thing! I used VisualVM. It is free and it works great for what I needed it for!
I uploaded the project here for this bare bones browser pool.
Why does the browser process still exist after calling page.close()? This causes very serious memory consumption; is there any solution?
Any update?
I would love a solution as well. I am using const context = await browser.newContext();
and then const page = await context.newPage();
and still getting this. Trying to do infinite scroll for certain pages. Even after finishing that the memory usage just keeps increasing...
I would love a solution as well. I am using
const context = await browser.newContext();
and then const page = await context.newPage();
and still getting this. Trying to do infinite scroll for certain pages. Even after finishing that the memory usage just keeps increasing...
I raised an issue for this, it got closed, then a dev responded somewhere for me to remake the ticket. I got lazy as I put time into it.
Anyway, make a loop spamming about:blank and watch the memory usage for the Python/Node processes (both increase pretty much in tandem). If you do it with a bigger page, like amazon.com, it goes up much faster.
A solution for this would be to allow us to clear the memory out. I tried forcing garbage collection in Python, but nothing changes.
If this would ruin the traces or something I'd understand, but we should still be able to manually clear it, with a disclaimer on the method or something.
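For what it's worth, Node does expose a manual GC hook (via the real `--expose-gc` flag), but it only collects the client-side JS heap; like `gc.collect` in Python, it cannot release handles retained inside the separate Playwright driver process, which matches what people report above. A guarded wrapper:

```javascript
// Triggers V8 garbage collection if Node was started with --expose-gc.
// Returns true if a collection was actually requested, false otherwise.
// Note: this cannot free memory held by the driver/browser processes.
function tryForceGc() {
  if (typeof global.gc === 'function') {
    global.gc();
    return true;
  }
  return false;
}
```

Run the app as `node --expose-gc app.js` for `global.gc` to be defined.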
For anyone doing this in TypeScript/JavaScript, you can use my code. I have my Node app running in a Docker container. To clean up Chromium browsers that are left open, I run the code below whenever I get an error trying to open another browser, and after each time I close a browser in the for loop I run over a list of URLs to scrape.
import { exec } from 'child_process';

// Function to execute the killall node command
export function killAllNode() {
  exec('killall node', (error, stdout, stderr) => {
    if (error) {
      console.error(`Error executing killall node: ${error.message}`);
      return;
    }
    if (stderr) {
      console.error(`killall node stderr: ${stderr}`);
      return;
    }
    console.log(`killall node stdout: ${stdout}`);
  });
}
You can see the processes you are running with ps -e and can choose other processes to shut down. Doing this works and keeps up the Express server where I send the requests that run my scraper.
still same issue
still same issue
Using the code I use in my project?
I think it's a stupid design in playwright. Any programmer will hate the memory problem and that is not in his control.
@vonkoff It's not Windows compliant.
On a Windows machine, to kill a Node.js server when you don't have any other Node processes running, you can tell the machine to kill all processes named node.exe. That would look like this:
taskkill /im node.exe
And if the processes still persist, you can force them to terminate by adding the /f flag:
taskkill /f /im node.exe
Besides not being a proper solution, I would just like to make it clear that restarting everything is simply not feasible in some cases, like when you are working on an infinite-scroll page. If you restart, you are doomed.
I thought I would reiterate this just so that no one thinks there is a good way to work around this issue. The bug should be fixed.
I think it's a stupid design in playwright. Any programmer will hate the memory problem and that is not in his control.
@tzbo tough comment, why don't you make something better?
@tzbo tough comment, why don't you make something better?
This is the usual dumb f. take people come up with when they have zero priority to solve a ticket.
It's tough, but it's also a long-standing ticket. There has been no better solution to the memory problem in two years. Do you think that's good design? Maybe the original intention was to simplify usage, and of course it's simpler than Puppeteer in some scenarios. But I think it should at least keep some memory-release functions, where calling them means I promise I will not use any previous response and so on.
context.close()? No, it cannot release the memory; I used it in a previous version. Besides, I don't want to close the context. So I migrated to Puppeteer.
I hope Playwright gets better soon; it does support more browsers.
👋 @gigitalz,
dgtlmoon isn't well socialised
Ignoring is the only language he speaks
Regards
🥩
This is also a very confusing issue for us. As long as we keep opening new pages within the loop, the memory keeps increasing continuously. We have encountered several OOM errors recently.
We have all the feedback we need for this issue and it is currently pending due to prioritization. I'll disable the comments since they no longer add actionable details to the issue.
Unbounded heap growth should be mitigated by https://github.com/microsoft/playwright/commit/ffd20f43f8ee1a7a016cd9b29c372e25ec685a62. The heap will still saturate to a certain size (1K handles per object type, ~50-100Mb on average), but will stop growing afterwards.
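The mitigation described caps how many stale object handles the client keeps per type, evicting the oldest ones instead of letting the heap grow without bound. The idea can be illustrated with a small bounded FIFO registry; this is an illustration of the concept only, not Playwright's actual implementation (class and field names are invented):

```javascript
// Keeps at most `capacity` handles per object type; registering beyond the
// cap evicts (collects) the oldest handle of that type, bounding heap growth.
class BoundedHandleRegistry {
  constructor(capacity) {
    this.capacity = capacity;
    this.byType = new Map(); // type -> array of handles, oldest first
  }

  register(type, handle) {
    let list = this.byType.get(type);
    if (!list) this.byType.set(type, (list = []));
    list.push(handle);
    if (list.length > this.capacity) {
      const evicted = list.shift();
      // Using an evicted handle later would surface an
      // "object has been collected" style error.
      evicted.collected = true;
    }
  }

  count(type) {
    return (this.byType.get(type) || []).length;
  }
}
```

This also explains the trade-off reported further down: code that holds on to old pages or requests past the cap starts seeing "The object has been collected" errors instead of unbounded memory growth.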
Apart from making demands of people I don't know, like some previous commenters, I would like to thank Pavel for adding that heap stack test to npm; that's a super cool idea. And generally, thanks to the maintainers for their incredible work here, it's a highly complex project!
I face this issue in an Azure DevOps agent pipeline. Locally all tests run fine. We have 200+ tests, with 5-6 running in parallel at a time. After 8 minutes of running, the build fails with:
<--- Last few GCs --->
[12856:000002C6125D7660] 111 ms: Scavenge 7.9 (9.1) -> 7.6 (9.8) MB, 0.9 / 0.0 ms (average mu = 1.000, current mu = 1.000) allocation failure;
[12856:000002C6125D7660] 142 ms: Scavenge 8.4 (9.8) -> 8.0 (10.1) MB, 0.8 / 0.0 ms (average mu = 1.000, current mu = 1.000) allocation failure;
[12856:000002C6125D7660] 192 ms: Scavenge 8.7 (10.1) -> 8.2 (10.1) MB, 35.7 / 0.0 ms (average mu = 1.000, current mu = 1.000) allocation failure;

<--- JS stacktrace --->

FATAL ERROR: NewSpace::Rebalance Allocation failed - JavaScript heap out of memory
 1: 00007FF621C4194F
 2: 00007FF621BC6026
 3: 00007FF621BC7D10
 4: 00007FF6226721F4
 5: 00007FF62265D582
After updating from 1.29.1 to the latest 1.40.0, I started to receive the error: page.goto: The object has been collected to prevent unbounded heap growth. And yes, I know that I'm not using page.close() to close the window after each test, because that's necessary to speed up the tests. With closing the page after each test, the error about memory usage disappeared, but other problems appeared which I do not want to have to solve.
So for now I decided to downgrade to the previous version. Any fix to the lib will be appreciated. Let me know if it's done in some new release. Thanks.
PS. I have about 800 scenarios, each of them with about 15 separate steps. On average, they start to fail around the 300-350th test.
We need unfortunately a reproduction case in order to debug issues like that. A small repository would be ideal. I'm going to lock this for now, so others can re-file and we can work on the missing scenarios, thanks!
Context:
Describe the bug
I'm watching full-JS apps (e.g. React/Angular websites). I initialize one instance, 1 browser and 1 page. I keep the page in cache & retrieve content every 2 seconds.
After 1-2 hours, the memory goes crazy. I tried to reload() the page every 30 minutes. It doesn't free the memory. The only way to free the memory is closing the page and recreating a new one. What could be the source of this memory leak? I suppose reload() frees the JavaScript VM, so it must be a leak internal to the page.