Closed zqjimlove closed 1 year ago
13.4.12 - still high number of processes and bigger memory load
13.4.12
13.2.4
Just re-read every single post in this issue and there's a common topic of sharing screenshots. In order for us to investigate what your application is running into we need access to the code. Without this we can't verify what you're running into as you can understand.
I've posted an update around production memory usage on this issue: https://github.com/vercel/next.js/issues/49929#issuecomment-1649637524. At this point we are fairly certain there is no memory leak in production. We're still working on bringing down the number of processes to 2 instead of 4. We haven't investigated development yet as we've focused on slowdowns first in #48748.
@w7br you're talking about 13.2.4 and 13.1.6. Those versions are from months ago, and many optimizations have landed since then. I would recommend using the latest version first. Either way, please provide access to the application on which you're seeing memory issues so that we can confirm what you're seeing and investigate.
As we've shown in #49929 and #48748 we've dedicated significant engineering time towards investigating and improving these, however, the only way we can do this for memory usage issues is by having access to your code and running it ourselves.
As you can see in my previous updates on #49929, we had to run the lowest-level tools, dumping V8 memory allocations, to investigate these. For slowdowns we luckily don't need access because you can share a CPU profile, but for investigating memory allocation a heap dump is not enough.
Also please make it clear what you're running. I.e. @sedlukha is that development? I guess so?
Hey @timneutkens, I appreciate all the work in debugging the memory issues.
As you indicated in https://github.com/vercel/next.js/issues/45508#issuecomment-1637831723 this particular issue is about processes being retained after a build exits, as opposed to memory usage issues.
We are currently also running into this problem when next dev crashes, or when running in production in standalone mode and the main server.js process crashes (which only happens during some rare issues during startup, like EADDRINUSE, so is probably less relevant).
Do you still need a reproduction case for this lingering worker process problem? I will be happy to see if I can provide one.
@timneutkens I am in the same boat as @hnsr: any time the next build fails for any reason, starting the server (even from the command line with npm start) fails with something like: Error: Could not find a production build in the '/app/.next' directory.
(In my case the build is now currently failing because of #53086) but it failed for many other reasons before as this is part of a deploy process.
To make this worse, I am using pm2, which tries to restart the app frequently, so it quickly ends up with a lot of worker processes, and they are doing something, using memory and everything. It is as if the worker thread loses its parent process, so even though npm start finishes with an error, the process sticks around.
Do you still need a reproduction case for this lingering worker process problem? I will be happy to see if I can provide one.
This would be helpful indeed if you're able to provide that, saves me significant amounts of time trying to figure out how this is being run into. I guess the "next start without a build" case @hanoii is talking about is a good start if that reproduces though 👍
I just checked with @ijjk and as it turns out he saw something similar and fixed it in a recent refactor: https://github.com/vercel/next.js/blob/46677ccda6a62203d7a7ae359c1020780aeccee5/packages/next/src/server/lib/router-server.ts#L247-L262. Could you try with next@canary?
@timneutkens I tried it locally, as I was able to reproduce it as well, and yes, next@canary at least doesn't leave the process behind on an outright start failure:
I am getting a different error:
[Error: ENOENT: no such file or directory, open '/var/www/html/next/.next/BUILD_ID'] {
but I guess that's ok.
Maybe this fixes it.
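For setups where a process manager such as pm2 keeps restarting a server that has no build, one option is to check for the build marker before launching at all. The following is only a minimal sketch: it assumes the .next/BUILD_ID file named in the error above is a usable signal that a production build exists, and the helper names (buildIdPath, hasProductionBuild) are hypothetical, not part of Next.js.

```typescript
// Hypothetical helper names; only the ".next/BUILD_ID" location comes from
// the error message quoted above.
type FileExists = (path: string) => boolean;

// Build the path of the BUILD_ID marker inside a dist directory.
function buildIdPath(distDir: string): string {
  // Drop trailing slashes so "/app/.next/" and "/app/.next" behave the same.
  return distDir.replace(/\/+$/, "") + "/BUILD_ID";
}

// Return true only when the marker file is present. This sketch treats the
// marker as evidence of a finished build, which is an assumption.
function hasProductionBuild(distDir: string, fileExists: FileExists): boolean {
  return fileExists(buildIdPath(distDir));
}
```

In a Node wrapper script, fileExists could be fs.existsSync, and the wrapper would skip running npm start when hasProductionBuild returns false, so a failed deploy does not turn into a restart loop.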
@timneutkens
Also please make it clear what you're running. I.e. @sedlukha is that development? I guess so?
No, this is prod. I run it for 17 apps.
And i've tried canary, now even worse memory usage, 4.9G (13.4.13-canary.6) vs 2.4G (v13.2.4) vs 3.16G (v13.4.12)
@timneutkens seems that experimental.appDir: false might disable next-render-worker-app process and solve the problem for those, who use only pages routing.
I would be happy to test it, but I can't do it on my real apps because of next issue https://github.com/vercel/next.js/issues/52875
@sedlukha Seems what you're reporting is exactly the same as #49929 in which I've already explained the memory usage, there is no leak, it's just using multiple processes and we're working on optimizing that: https://github.com/vercel/next.js/issues/49929#issuecomment-1637185156
Setting appDir: false is not supported and that option will go away in a future version; we just haven't gotten around to removing the feature flag.
@hanoii thanks for checking 👍
Same here.. my macbook crashed when I used Nextjs latest with turbo repo.. multiple child processes were running in the background even after terminating the server...
FYI: experimental: {appDir: false} does not work anymore on 13.4.13 for me (page rendered, but url changes failed to load json and triggering ssr), and now spawning 3 processes apart from main process.
next-router-worker
next-render-worker-app
next-render-worker-pages
I have same issue as well. version 13.4.8
@Nirmal1992 @S-YOU @space1worm I'm surprised you did not read my previous comment. I thought it was clear that these types of comments are not constructive? https://github.com/vercel/next.js/issues/45508#issuecomment-1653226340
@space1worm I'm even more surprised you're posting "same issue" without trying the latest version of Next.js...
@timneutkens Hey, yeah sorry I missed it, here I made my test repo public.
You can check this commit tracer
I had a memory usage problem on version 13.4.8: after navigating to any page my pod's memory would skyrocket for some reason, and after that the whole app was breaking and becoming unresponsive.
not sure, if this problem is related to my codebase or not, would love to hear what is the problem!
one more thing, I tried to increase resources but the application was still unresponsive after breaking.
Here as a reference
Application is still not using the latest version of Next.js, same in the commit linked: https://gitlab.cern.ch/nzurashv/tracer/-/blob/master/package-lock.json#L4673
@timneutkens I have updated to latest version, created new branch tracer/test
The issue still persists.
here you can check this link as well
Additionally, I inquired with the support team regarding the cause of the failure, and they provided me with the following explanation.
FYI: experimental: {appDir: false} does not work anymore on 13.4.13 for me (page rendered, but url changes failed to load json and triggering ssr), and now spawning 3 processes apart from main process.
next-router-worker
next-render-worker-app
next-render-worker-pages
I have a question about these child processes: currently it seems they open random ports, which broke my application behind a WAF in Azure, because we only open certain ports. Is there any way for me to force the ports these child processes are going to use? I am on the latest next release.
FYI: With 13.4.11 we were unable to start our app in Kubernetes. We received a spawn process E2BIG at jest-worker. This only happened when our rewrites (regex path matching) were above a certain length (although still below max).
Downgrading back to 13.2.4 resolved the issue.
FYI: the main process started with node server.js is now gone in Next.js 13.4.15, and next-router-worker's parent PID has become 1 (init). This could probably use less memory since it uses one less process.
1362416 1 00:00:02 next-router-worker
1362432 1362416 00:00:00 next-render-worker-app
1362433 1362416 00:00:05 next-render-worker-pages
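To share a listing like this as text rather than a screenshot, the lines can be parsed into a small process tree. This is a rough sketch that assumes the columns are PID, PPID, TIME and COMMAND, in that order, as in the three lines above; the type and function names are made up for illustration. Note that a parent PID of 1 is not proof of a leak on its own, since the observation above is that next-router-worker is parented to init in 13.4.15.

```typescript
interface ProcInfo {
  pid: number;
  ppid: number;
  time: string;
  command: string;
}

// Parse lines shaped like "PID PPID TIME COMMAND" (assumed column order).
function parsePsLines(text: string): ProcInfo[] {
  const procs: ProcInfo[] = [];
  for (const line of text.split("\n")) {
    const match = line.trim().match(/^(\d+)\s+(\d+)\s+(\S+)\s+(.+)$/);
    if (!match) continue; // skip headers, blank lines, anything unexpected
    procs.push({
      pid: Number(match[1]),
      ppid: Number(match[2]),
      time: match[3] ?? "",
      command: match[4] ?? "",
    });
  }
  return procs;
}

// next-* processes whose parent is PID 1 (init).
function nextProcsUnderInit(procs: ProcInfo[]): ProcInfo[] {
  return procs.filter((p) => p.ppid === 1 && p.command.startsWith("next-"));
}

// All direct and indirect children of rootPid, found breadth-first.
function descendantsOf(rootPid: number, procs: ProcInfo[]): ProcInfo[] {
  const result: ProcInfo[] = [];
  const seen = new Set<number>([rootPid]);
  const queue: number[] = [rootPid];
  while (queue.length > 0) {
    const parent = queue.shift() as number;
    for (const p of procs) {
      if (p.ppid === parent && !seen.has(p.pid)) {
        seen.add(p.pid);
        result.push(p);
        queue.push(p.pid);
      }
    }
  }
  return result;
}
```

Running parsePsLines on the three lines above, then nextProcsUnderInit and descendantsOf on the result, gives the router worker as the root and the two render workers as its children.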
@timneutkens, sorry, I probably misread it. I do not mean to claim or anything. I am just sharing what I've observed in the version I am using (which supposed to be latest release).
In 13.4.15 (but really upgrade to 13.4.16 instead) this PR has landed to remove one of the processes indeed: https://github.com/vercel/next.js/pull/53523
@sladg is this a joke...?
@timneutkens I've tried v13.4.20-canary.2.
It was expected that https://github.com/vercel/next.js/pull/53523 and https://github.com/vercel/next.js/pull/54143 would reduce the number of processes, resulting in lower memory usage.
Yes, the number of processes has been reduced; after the update, I see only two processes. However, memory usage is still higher than it was with v.13.2.4.
node v.16.18.1 (if it matters)
v13.4.20-canary.2
13.2.4
It's entirely unclear what you're running / filtering by, e.g. you're filtering by next- but 13.2.4 doesn't set process.title to anything specific.
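To illustrate why a next- filter gives incomparable numbers across versions, here is a small sketch of a matcher that accepts both the next- titles listed earlier in this thread and the jest-worker child script path from the original report. Whether 13.2.4 workers show up under exactly that path is an assumption, and looksLikeNextWorker is a hypothetical name.

```typescript
// Title prefix seen in the 13.4.x listings in this thread.
const TITLE_PREFIX = "next-";
// Script path from the original bug report; assumed to also apply to
// versions that do not set process.title.
const JEST_WORKER_PATH = "next/dist/compiled/jest-worker/processChild.js";

// Return true when a command line looks like a Next.js worker under either
// naming scheme.
function looksLikeNextWorker(command: string): boolean {
  const trimmed = command.trim();
  return trimmed.startsWith(TITLE_PREFIX) || trimmed.includes(JEST_WORKER_PATH);
}
```

Filtering both versions with the same matcher at least keeps the two measurements counting the same kind of process.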
Sharing screenshots is really not useful, I keep having to repeat that in every single comment around these memory issues.
Please share code, I can't do anything to help you otherwise.
@timneutkens I've tried v13.4.20-canary.2.
It was expected that #53523 and #54143 would reduce the number of processes, resulting in lower memory usage.
Yes, the number of processes has been reduced; after the update, I see only two processes. However, memory usage is still higher than it was with v.13.2.4.
node v.16.18.1 (if it matters)
v13.4.20-canary.2
13.2.4
The memory looks stable - but it is really hard to see anything in the screen shots.
I'm seeing this behaviour running next dev starting on 13.3 and newer versions (13.4 included). This isn't happening on 13.2. It looks like it happens whenever files are added to or removed from the FS (not sure, due to my current use case) while the dev script is running.
Even after closing next dev, orphaned jest processes are left over.
I'm amazed by how often my comments are flat out ignored the past few weeks on various issues. We won't be able to investigate/help based on comments saying the equivalent of "It's happening". Please share code, I can't do anything to help you otherwise.
I'll have to close this issue when there is one more comment without a reproduction as I've checked multiple times now and the processes are cleaned up correctly in the latest version.
By latest version you mean the latest RC @timneutkens ? Sorry can't help with steps to reproduce, this is happening inside a spawn call in a very specific use case so best I can do is confirm that it happens.
Verify canary release
Provide environment information
Which area(s) of Next.js are affected? (leave empty if unsure)
CLI (create-next-app)
Link to the code that reproduces this issue
https://github.com/vercel/next.js/files/10565355/reproduce.zip
To Reproduce
reproduce.zip
This problem reproduces on next@12.0.9 and above; 12.0.8 was fine.
Alternatively, removing getInitialProps from _app.tsx avoids the problem on next@12.0.9 and above.
Describe the Bug
A high number of /next/dist/compiled/jest-worker/processChild.js processes are still alive after next build
Expected Behavior
Kill all child processes.
Which browser are you using? (if relevant)
No response
How are you deploying your application? (if relevant)
No response
NEXT-1348