Closed jflayhart closed 1 year ago
Curious if there are any known dependencies that Next requires to be updated for Next12+? What changed specifically in >=12.0.2 that could be causing growing Promises and array objects on a node server?
I could provide our package.json deps if that helps, but I am not exactly sure if it would be the fault of prod deps or devDeps due to the build process possibly causing issue.
Possibly related
As a temporary workaround you can disable esmExternals
What does this mean exactly? What is this disabling as a workaround? I'd prefer a real fix rather than a workaround :D What are the tradeoffs with disabling esmExternals
?
@timneutkens sorry to directly ping you, but es-modules-support definitely seems to be the culprit and I don't understand why. Could you please shed some light onto what changed in Next12? Y'all said Next12 is "backwards compatible" but it seems that is not the case with enabling by default instead of opt-in:
ES Modules support includes backward compatibility to support the previous import behavior of CommonJS. In Next.js 12,
esmExternals: true
will become the default
That doesn't seem like a backwards compat change given the amount of mem leaks people are running into š¤·š» But maybe I am misunderstanding. Setting to false "works" so I can now run Next >12.0.1 (including latest), but is that the right thing to do? Again, what are we disabling exactly and why do we want to do this? I don't want to do this just cause "now it works" I'd like to understand any tradeoffs. Thanks!
given the amount of mem leaks people are running into
As far as we've seen on the other issues the webpack config was customized through next-transpile-modules which is not officially supported as it's a community customization of webpack config, could you confirm you're also using next-transpile-modules? If you have a reproduction that'd be great given that the other issues don't have a reproduction provided which makes it impossible to investigate further, so far my suspicion is that next-transpile-modules has a memory leak but we couldn't reproduce it in a new app so it might be related to specific combination of imports.
We're planning to build something like next-transpile-modules into Next.js to have an official supported way to do something like that: #35150
Y'all said Next12 is "backwards compatible" but it seems that is not the case with enabling by default instead of opt-in:
It is backwards compatible. It passes all integration tests and we test Next.js against the internal apps that Vercel has like the vercel.com app. These are among the largest apps we've seen to date. What you can't expect from customizing webpack config is that your customization of the build pipeline is backwards compatible between versions as you get full control over the webpack config.
What are the tradeoffs with disabling esmExternals?
Resolving of ESM packages in node_modules will be incorrect / not work. It won't resolve packages with "type": "module"
. More here: https://nextjs.org/blog/next-11-1#es-modules-support
@timneutkens awesome thanks so much for that explanation! I trust that y'all have the right tests in place and it's hard to know what enterprise, custom apps are doing with your framework. I get it.
We do NOT use next-transpile-modules
Considering that the esmExternals: false
change seems to help most people out of this OOM issue, including me, should we add it to docs as a last resort "backwards compat" config change? For example, in Next12 upgrade guide:
"if you experience [this OOM error] first try setting esmExternals: false
since it was turned on by default in Next12. Doing this will...."?
What do you think?
If you have a reproduction that'd be great given that the other issues don't have a reproduction provided which makes it impossible to investigate further, so far my suspicion is that next-transpile-modules has a memory leak but we couldn't reproduce it in a new app so it might be related to specific combination of imports.
I've spent so much time already, so I may as well work on a solid reproduction outside a large enterprise app š The problem is I don't know what in our npm spaghetti is causing the conflict with Next12, so hard to reproduce with only a handful of deps in an isolated sandbox. Will try my best...
@timneutkens awesome thanks so much for that explanation! I trust that y'all have the right tests in place and it's hard to know what enterprise, custom apps are doing with your framework. I get it.
Considering that the
esmExternals: false
change seems to help most people out of this OOM issue, should we add it to docs as a last resort "backwards compat" config change? For example, in Next12 upgrade guide:"if you experience [this OOM error] first try setting
esmExternals: false
since it was turned on by default in Next12. Doing this will...."?What do you think?
I'd prefer to fix the OOM even if it's not specifically in Next.js through a combination of deps. That flag will definitely go away in the future. Main problem so far is that there is no reproduction provided that would allow us to spend time on it as I'd be happy to have someone on our team look at it if one is provided.
I've spent so much time already, so I may as well work on a solid reproduction outside a large enterprise app š The problem is I don't know what in our npm spaghetti is causing the conflict with Next12, so hard to reproduce with only a handful of deps in an isolated sandbox. Will try my best...
That'd be great! š
@timneutkens I can start to reproduce what may eventually lead to a OOM server kill in my sandbox env, but it is so small of an app I can't get it to crash (yet). It makes sense why a large, enterprise site would crash more quickly due to its size and need for more resources.
That being said, the memory retained size trends are the same in heap snapshots between our proprietary enterprise site and my personal sandbox site, where obviously the latter is far less resource intensive. But I have reproduced the same memory leak!
To reproduce, I installed deps based on what I assumed to be the main offenders from preliminary R&D on the enterprise site:
yarn install
yarn build
yarn start
At its core, my sandboxed env is barebones version of our enterprise app, where the memory leak is mainly around Promises being held in the global namespace:
You can imagine this is orders of magnitude worse for an enterprise app where, assuming the culprit is styled-components, there are thousands more lines of css-in-jss.
This is all a working theory, but it is fact that Promises and other constructors are retaining memory over time (see promises start growing up to the top in the last screenshot):
Could this mean something is wrong with _document's initialProps
usage with styled-component? My working theory is this is part of the issue
UPDATE: Setting esmExternals: false
on sandbox project still shows growing retained memory size, but at a MUCH slower rate and after a while most of it gets GCd correctly:
not much additional info to provide, but we had to downgrade from 12.1.4 to 12.0.7 to fix a memory leak crash in the node process. can say that it was unrelated to which compiler we used, tried both. 12.0.7 not having the issue though
I've just tested and downgrading to 12.0.1 with babel fallback didn't solve issue for us. Issue described in #32314 still occurs (FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory).
For now we use dynamic imports for reducers as workaround.
not much additional info to provide, but we had to downgrade from 12.1.4 to 12.0.7 to fix a memory leak crash in the node process. can say that it was unrelated to which compiler we used, tried both. 12.0.7 not having the issue though
@nol13 could you please list out your deps as well? I'm trying to see if there's a pattern here, for example, do you have styled-components perhaps? Axios? react-query? What could be causing Node's Promises leak (eg. https://github.com/nodejs/node/issues/34328)?
none of those
"next": "12.0.7", "next-connect": "0.10.1", "react": "17.0.2", "react-redux": "7.2.4", "request": "^2.88.2", "send": "^0.17.1", "async-retry": "1.2.3",
Hmm is redux the common thread here? @jchatrny what do you mean when you say "we use dynamic imports for reducers as workaround"?
I have been fighting with heap memory issues for hours now, and nothing seemed to help. It started randomly this morning, and have been driving me crazy since. The only thing that helped is wiping .next directory and starting the server again. Maybe it helps someone with the knowledge of internals to figure out what's happening. Seems that webpack is trying to do something it shouldn't. I tried to inspect with node --inspect, but the traces were quite useless.
We are also experiencing this issue on our enterprise grade app, where we see exactly the same behavior as @jflayhart. When trying to reproduce it on a small project, the node process doesn't get killed, because it never reaches the critical memory point. However, for our large project, memory usage skyrockets to about 5GB before ending on heap errors.
Turning off esmExternals
solved the issue.
It seems to be happening after the compilation has been done (the server logs that the compilation of the page has been done).
These are the dependencies we use:
Hi, we recently landed some changes (https://github.com/vercel/next.js/pull/37397) to canary
that might help fix this issue, without setting esmExternals: false
.
Please try it out by installing next@canary
and let us know!
It seemed to help a lot of people in this thread: https://github.com/vercel/next.js/issues/32314
@balazsorban44 seems to have fixed the issue! I'll check later for a 100% confirm, because I had to disable middleware because of breaking changes in the latest canary. If the memory leak was because of the middleware, it might not have been fixed.
Thanks for checking. I would say the memory issue is unlikely to be middleware related, but happy to get a 100% confirmation. š
@jflayhart could you check canary
as well?
I was lurking on here and want to confirm that switching to canary seems to have gotten rid of the memory leak in our enterprise-grade app as well! š
@balazsorban44 I just pulled down v12.1.7-canary.42 and it's working great! Not seeing the same memory leaks š Certainly not around Promises. Great work!
Curious what was the exact issue in webpack (presuming since that was updated)?
I was lurking on here and want to confirm that switching to canary seems to have gotten rid of the memory leak in our enterprise-grade app as well! š
Follow up to my lurk, turns out the memory leak just changed š We are using next in 2 repos: one with a lot of files and one with LONG files. Prior to the canary, long files had the memory leak and lots of files did not, whereas with the canary our repo with long files is now fine and our lots of files repo is getting OOM. I know this is a vague report but wanted to pass this insight along before we file an issue through vercel!
This issues is (tentatively) fixed for us in 12.2.6-canary.8. Guessing it might be https://github.com/vercel/next.js/pull/39902 that did the trick
Confirmed here early reports are showing that 12.3.0
has fixed the leak our end š
Cool we might have to test out 12.3.0 then. Was worried we'd have to go with the canary since #39902 got reverted.
Based on multiple reports, going to close this for now. Please open a new issue (testing next@canary
) with a new reproduction if you have similar issues.
This closed issue has been automatically locked because it had no new activity for a month. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you.
Verify canary release
Provide environment information
What browser are you using? (if relevant)
Chrome 100
How are you deploying your application? (if relevant)
custom express server running on GCP distributed across multiple containers running 1GB memory
Describe the Bug
We're in a bit of a catch22 because we tried to upgrade to 12.0.0, which works, but then our Lighthouse scores started failing due to a security vulnerability in 12.0.0. I don't know if y'all ever fixed that but something was done in 12.0.5 it seems based on the change log, so we tried that, which created another issue: memory leaks.
After upgrading to 12.0.5 a new issue occurred where the prod optimized build app simply crashes due to OOM errors:
I tried 12.1 and even canary releases to see if anything worked itself out. Same problem. I tracked down a memory leak with growing Promises and Arrays within webpack runtime and cjs-loader and tslib (whatever that means) when building prod build locally and running
node --inspect
:In the end, I found 12.0.1 to be the only version after 12.0.0 with no leak; going to 12.0.2 starts the memory leak and crashes the app in the same way.
The prime suspects are the following deps we have installed based on their interaction with promises and arrays:
Already tried upgrading node to latest LTS (16), still doesn't work. Local dev runs "fine" albeit slow, but local builds don't crash. Only prod optimized builds
Expected Behavior
App is expected to NOT crash, so I upgraded my personal website to 12.1, which is hosted on Vercel and has less dependencies than the enterprise, e-commerce site I am working on. Well, it is fine and running great, so makes me think the issue is more complex possibly around the use of a custom server in the enterprise app and/or Next12 isn't playing nice with some node deps.
When I ran this in 12.0.0 and 12.0.1 the memory leak is non-existent:
It is not until 12.0.2 that we experience the memory leak! and I've gone through your change logs and nothing looks suspicious.
To Reproduce
Unable to reproduce with custom server next example, so it's likely because we're using an enterprise app and something (ie other dependencies) isn't playing nice with next12 and would like to understand what could be causing that.
I can only assume something around SWC would have changed the build process so when it runs in node it gets wonky? Just speculating at this point because we can't upgrade to next12 at all.
But in our enterprise app, I simply do the following to reproduce consistently every time: