cloudflare / next-on-pages

CLI to build and develop Next.js apps for Cloudflare Pages
https://www.npmjs.com/package/@cloudflare/next-on-pages
MIT License

[🐛 Bug]: Failed: build exceeded memory limit and was terminated #309

Closed anonymouscatcher closed 1 year ago

anonymouscatcher commented 1 year ago

next-on-pages environment related information

node v: v18.16.0

Description

I have a large project that I am migrating to Cloudflare. I am trying to deploy it to Cloudflare Pages, but while building the application I get this error:

Failed: build exceeded memory limit and was terminated

I also get this message when I build the application locally with npx @cloudflare/next-on-pages@latest:

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

Full Log:

▲  ℇ  (Streaming)  server-side renders with streaming (uses React 18 SSR streaming or Server Components)
▲  ○  (Static)     automatically rendered as static HTML (uses no initial props)
▲  Traced Next.js server files in: 27.112ms
▲  Created all serverless functions in: 1:02.051 (m:ss.mmm)
▲  Collected static files (public/, static/, .next/static): 37.616ms
▲  Warning: Node.js functions are compiled from ESM to CommonJS. If this is not intended, add "type": "module" to your package.json file.
▲  Compiling "middleware.js" from ESM to CommonJS...
▲  Build Completed in .vercel/output [4m]
⚡️ Completed `yarn vercel build`.
<--- Last few GCs --->
[22466:0x7fd95284d000]   267301 ms: Mark-sweep (reduce) 4073.1 (4143.3) -> 4072.7 (4143.6) MB, 2833.6 / 0.0 ms  (average mu = 0.107, current mu = 0.002) allocation failure; scavenge might not succeed
[22466:0x7fd95284d000]   270200 ms: Mark-sweep (reduce) 4073.9 (4143.6) -> 4073.4 (4144.3) MB, 2893.0 / 0.0 ms  (average mu = 0.056, current mu = 0.002) allocation failure; scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
 1: 0x101b8ceb5 node::Abort() (.cold.1) [/usr/local/bin/node]
 2: 0x100602a69 node::Abort() [/usr/local/bin/node]
 3: 0x100602c4e node::OOMErrorHandler(char const*, bool) [/usr/local/bin/node]
 4: 0x10078e653 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 5: 0x100957305 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/bin/node]
 6: 0x100955ce2 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
 7: 0x10094801a v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
 8: 0x100948995 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
 9: 0x10092ae0e v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/usr/local/bin/node]
10: 0x100d60c6c v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/usr/local/bin/node]
11: 0x1011567f9 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit [/usr/local/bin/node]
12: 0x10619d871
Done in 275.52s.

Is there any way to increase the memory available while building in Pages? Or any other suggestions?

Reproduction

No response

Pages Deployment Method

None

Pages Deployment ID

No response

Additional Information

Would you like to help?

anonymouscatcher commented 1 year ago

@dario-piotrowicz any thoughts on this issue?

anonymouscatcher commented 1 year ago

I just noticed that passing NODE_OPTIONS solves the issue, but this is not possible on Cloudflare :(

dario-piotrowicz commented 1 year ago

@anonymouscatcher I'm not really sure, it's strange, especially since from your console output it looks like this happens after the Vercel build completes, but there shouldn't be any logic there that can fill the JS heap 😕

The only things that come to mind that could potentially cause this are either when we recursively visit the functions directory or some external library logic (esbuild, acorn, etc.). I'd be interested to see your application and try to debug where it fails (but since you didn't share it, I suspect you can't).

Have you tried passing --max-old-space-size to the command? (not via NODE_OPTIONS but directly)
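To check whether a memory setting actually reached a given Node.js process, one option is to print the V8 heap limit from inside that process. The sketch below is only an illustration and is not part of next-on-pages; the helper name `heapLimitMb` is made up for this example. It relies on `getHeapStatistics()` from Node's built-in `node:v8` module. For reference, the GC lines in the log above show the heap stuck at roughly 4073 MB of about 4143 MB just before the crash.

```typescript
import { getHeapStatistics } from "node:v8";

// Hypothetical helper: returns the V8 heap size limit of the current
// Node.js process, converted from bytes to megabytes.
function heapLimitMb(): number {
  return getHeapStatistics().heap_size_limit / (1024 * 1024);
}

// Print the limit so runs with and without a memory flag can be compared.
console.log(`V8 heap limit: ${heapLimitMb().toFixed(0)} MB`);
```

Running this with something like `NODE_OPTIONS=--max-old-space-size=8192` set in the environment should report a correspondingly larger limit. Environment variables are inherited by child processes, whereas a flag passed to a single `node` invocation applies only to that process. That may be relevant if the build spawns further Node processes, though whether it explains this particular case is an open question.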

anonymouscatcher commented 1 year ago

Indeed it's very strange, I don't know what's happening exactly. I tried with --max-old-space-size as well, but it seems the runtime is not node, so passing these options won't help. @dario-piotrowicz

dario-piotrowicz commented 1 year ago

but if the runtime is not node, how can NODE_OPTIONS actually be helping? 😕 (is there some node code that runs node code? 🤔)

anyways, are you sure you can't pass NODE_OPTIONS in pages?

Regardless, if that makes your app build locally, then you can manually deploy it via wrangler pages deploy .vercel/output/static, no? (and automate the whole thing via actions)

PS: I am not saying that this is the long term solution, just trying to see if we can find a temporary solution for your case

anonymouscatcher commented 1 year ago

Yes, I mean it seems that on Cloudflare the runtime is not node, at least this is how a Cloudflare dev responded to me when I asked on Discord, so passing those parameters won't help.

It seems this failure happens when I have more than a specific number of routes. Probably somewhere in the code it is trying to read a very big string. Do you think it could be in the cloudflare/next-on-pages lib? @dario-piotrowicz

11: 0x10cfdca31 v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
12: 0x10cfa9ac7 v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/usr/local/bin/node]
13: 0x10d3615be v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/usr/local/bin/node]
14: 0x10d70a319 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit [/usr/local/bin/node]
15: 0x10d77fc4a Builtins_StringSlowFlatten [/usr/local/bin/node]
16: 0x10d77fd61 Builtins_StringIndexOf [/usr/local/bin/node]
Done in 231.37s.

dario-piotrowicz commented 1 year ago

@anonymouscatcher yes, Cloudflare workers/pages don't run on node, however the issues you're experiencing are during build, aren't they? (sorry, that was my understanding based on your logs, please let me know if I'm wrong 😓)

So I'd imagine that as long as you can build your app (locally or in the Pages pipeline) it should be possible to deploy it successfully

__

Regarding your question I'm sorry I am not really sure...

anyways first of all I'd like to understand if it's a build issue or a runtime one as that would be quite good to clear up 🙂

anonymouscatcher commented 1 year ago

No worries :) I also tried to deploy the app with wrangler, but it fails while uploading _routes.json.

- __next-on-pages-dist__/functions/video/[id].func.js (esm)
:sparkles: Compiled Worker successfully
:sparkles: Uploading Worker bundle
:sparkles: Uploading _routes.json

✘ [ERROR] A request to the Cloudflare API (/accounts/xxx/pages/projects/appname/deployments) failed.

  An unknown error occured. Contact your account team or Cloudflare support: https://cfl.re/3WgEyrH.
  [code: 8000000]

 If you think this is a bug, please open an issue at:

🤦‍♂️🤦‍♂️🤦‍♂️🤦‍♂️ I'm starting to think Cloudflare is not ready to support Next apps yet.

dario-piotrowicz commented 1 year ago

😢

The last unknown error is unrelated to next-on-pages, I think there's something going wrong with wrangler and/or your project 😓

have you set the nodejs_compat flag in the dashboard for your Pages project? (Also note that you cannot manually deploy an application to a Pages project set up to work with the github integration)

Could you try to create a new Pages project, manually upload your app there and make sure to set the nodejs_compat flag? (and re-deploy after having done that)

Steps

What I would basically do is create a new Pages app by clicking on Upload assets in the application creation screen:

![Screenshot 2023-06-15 at 11 33 30](https://github.com/cloudflare/next-on-pages/assets/61631103/9db5724d-be1a-4714-9211-41c53b50fd7b)

Give the project a name, create it, and go back without uploading anything yet:

![Screenshot 2023-06-15 at 11 35 18](https://github.com/cloudflare/next-on-pages/assets/61631103/df13cdb6-a8a8-4716-a9c0-d9235bff6c60)
![Screenshot 2023-06-15 at 11 35 30](https://github.com/cloudflare/next-on-pages/assets/61631103/40254a7f-2bd1-4294-ade5-2bf98bfb154d)

Go into the settings page -> functions and set the flag:

![Screenshot 2023-06-15 at 11 37 14](https://github.com/cloudflare/next-on-pages/assets/61631103/c4821501-9cc5-4ff8-9d1a-1e3e47790dc9)
![Screenshot 2023-06-15 at 11 37 06](https://github.com/cloudflare/next-on-pages/assets/61631103/cb07d42b-aaa5-4d15-bce3-3c66467861c3)

Then try again with wrangler, but select this new project for your deployment (you might need to delete the `node_modules/.cache/wrangler/pages.json` file to reset the project you're deploying to).

Sorry that it is a bit cumbersome 😓

anonymouscatcher commented 1 year ago

No worries,

Yes, I just did it manually by uploading a zip file, and now the app is deployed, but it shows a 404 page generated by Next.js pages, not the app dir. How is that possible? Do you think it might be because I have an API running in the pages dir?

I have app dir + pages/api/

image

dario-piotrowicz commented 1 year ago

> do you think might be because I have api running in pages dir?

no I don't think so... 😕

does the app work locally? (if you run wrangler pages dev .vercel/output/static --compatibility-flag=nodejs_compat)

PS: I've never actually tried uploading a next-on-pages zip, so I can't guarantee that that works nicely, might be worth trying to do the deployment via wrangler instead

anonymouscatcher commented 1 year ago

image

anonymouscatcher commented 1 year ago

It seems that the worker size is huge (CF staff checked the logs). Maybe this is also the reason for the "build exceeded memory limit" error. Do you have any suggestions on how to debug this? @dario-piotrowicz

dario-piotrowicz commented 1 year ago

Sorry @anonymouscatcher I am not sure, I have seen this sort of issue popping up recently as well: https://github.com/cloudflare/workers-sdk/issues/3391

It might be related 😕

Unfortunately I must admit that I haven't tried next-on-pages with very large applications. We might have some scaling issues, so I'll have to try that soon (the problem is that people with very large Next applications usually can't share their code, so I'll have to find or create such an app, hoping to hit the issue people are facing)


Regarding suggestions on how to debug it, I am not really sure about that either 😢. Checking the produced code in .vercel/output/static/_worker.js/__next-on-pages-dist__ would be a start, but the code there is minified so it's not going to be very simple...

One possible point of failure is also where we deduplicate the webpack chunks: https://github.com/cloudflare/next-on-pages/blob/783dc7993da06db7bd2f15c09b633b42ba597092/src/buildApplication/generateFunctionsMap.ts#L388 You could check whether there is duplicated code that the deduplication logic fails to capture (for example, you could build your app with --disable-chunks-dedup and compare: if the deduplication works as expected, you should see a much larger worker with the flag set)

But these are all quite involved and would take significant time and effort to look into (but if you look into this and need any clarification I'm always here)

By the way, as you can see here: https://blog.cloudflare.com/making-cloudflare-for-web/ we are increasing the application size to 10mb (gzipped), so I'd also check how big your application is when gzipped, maybe when the increase kicks in your app will fit?
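As a rough way to check how big the output is when gzipped, the sketch below sums the raw and gzipped sizes of every file under a directory, using only Node's built-in `node:fs`, `node:path`, `node:os` and `node:zlib` modules. The helper names `listFiles` and `measureDir` are hypothetical and not part of next-on-pages or wrangler. Gzipping each file separately is an assumption made for simplicity and may not match how Cloudflare measures the bundle, so treat the result as an estimate only.

```typescript
import { mkdtempSync, readdirSync, readFileSync, rmSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { gzipSync } from "node:zlib";

// Hypothetical helper: recursively collects the paths of all files under `dir`.
function listFiles(dir: string): string[] {
  return readdirSync(dir).flatMap((name) => {
    const full = join(dir, name);
    return statSync(full).isDirectory() ? listFiles(full) : [full];
  });
}

// Hypothetical helper: sums the raw and per-file gzipped byte sizes under `dir`.
function measureDir(dir: string): { raw: number; gzipped: number } {
  let raw = 0;
  let gzipped = 0;
  for (const file of listFiles(dir)) {
    const contents = readFileSync(file);
    raw += contents.length;
    gzipped += gzipSync(contents).length;
  }
  return { raw, gzipped };
}

// Demo on a throwaway directory so the sketch runs anywhere.
// For a real check, call measureDir(".vercel/output/static/_worker.js") instead.
const demoDir = mkdtempSync(join(tmpdir(), "size-check-"));
writeFileSync(join(demoDir, "chunk.js"), "export const x = 1;\n".repeat(1000));
const demo = measureDir(demoDir);
rmSync(demoDir, { recursive: true, force: true });
console.log(`raw: ${demo.raw} bytes, gzipped: ${demo.gzipped} bytes`);
```

The same helper could be run on a build made with and without --disable-chunks-dedup to compare the two outputs.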

dario-piotrowicz commented 1 year ago

@anonymouscatcher I think PR #347 should solve your issues, if you get the chance please try the prerelease and let me know if it works for you 🙂

dario-piotrowicz commented 1 year ago

I'm closing this issue as I believe that the problems mentioned here have been solved (we no longer produce huge build outputs)

There is still the issue with large applications potentially hitting the message too large error, but that's already recorded in https://github.com/cloudflare/workers-sdk/issues/3391 (and I'm looking into it and trying to find a solution)

I hope you don't mind @anonymouscatcher