vikejs / vike

🔨 Flexible, lean, community-driven, dependable, fast Vite-based frontend framework.
https://vike.dev
MIT License
4.14k stars, 348 forks

Writing to disk earlier when pre-rendering to reduce memory usage #580

Closed d3x7r0 closed 10 months ago

d3x7r0 commented 1 year ago

Description

Similarly to what was described in #134, I found myself hitting memory limits on my CI runner when trying to build a rather large website (it's a blog with over 20 years of content).

While I can try to reduce the size of pageContext, this only buys me a small amount of time until I hit the same limit again.

The problem seems to stem from these lines of code: https://github.com/brillout/vite-plugin-ssr/blob/96d9d892b752ebd50347ccece0b7ba25fddf0d2f/vite-plugin-ssr/node/prerender/runPrerender.ts#L175-L200

Since all the pages stay resident in memory until they are written to disk, the memory consumption skyrockets.

If it were possible to write the HTML files to disk earlier (either by chunking them or by writing them as they are pre-rendered), this would dramatically cut down on memory usage.
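A minimal sketch of the "write as they are pre-rendered" idea, not the plugin's actual code. `prerenderIncrementally`, `renderPage`, `writeToDisk`, and the `index.html` output layout are hypothetical names chosen for illustration:

```ts
import { mkdirSync, writeFileSync } from "node:fs";
import { dirname, join } from "node:path";

// Hypothetical renderer: stands in for whatever turns a URL into HTML.
type RenderPage = (url: string) => string;
// Hypothetical sink: receives the target file path and the rendered HTML.
type WritePage = (file: string, html: string) => void;

// Default sink: create the parent directory and write the file synchronously.
const writeToDisk: WritePage = (file, html) => {
  mkdirSync(dirname(file), { recursive: true });
  writeFileSync(file, html);
};

// Render and write one page at a time instead of collecting all HTML first.
function prerenderIncrementally(
  urls: string[],
  outDir: string,
  renderPage: RenderPage,
  write: WritePage = writeToDisk,
): number {
  let written = 0;
  for (const url of urls) {
    const html = renderPage(url); // only this page's HTML is referenced here
    write(join(outDir, url, "index.html"), html);
    written++;
    // `html` is no longer referenced after this iteration, so it can be garbage collected.
  }
  return written;
}
```

Memory then scales with the largest single page rather than with the whole site, at the cost of doing I/O between renders.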

brillout commented 1 year ago

Up for a PR? Could we make Node.js garbage collect pages already written to disk?

samuelstroschein commented 1 year ago

Seems like we are running into a similar issue but without pre-rendering.

[Screenshot: memory usage chart (CleanShot 2023-02-01 at 18 40 07@2x)]

Building the production site of https://github.com/inlang/inlang/tree/main/source-code/website requires us to deploy the site with a 2GB instance to handle the short peak.

brillout commented 1 year ago

@samuelstroschein It's probably unrelated to OP; I'd open another ticket/discussion for it. What makes you believe it's related to VPS and not to user code?

samuelstroschein commented 1 year ago

@brillout

What makes you believe it's related to VPS and not to user code?

The fact that this issue exists. In other words, I don't know whether this is VPS-related or not.

it's probably unrelated to OP; I'd open another ticket/discussion for it.

I don't want to flood your repo with new issues that might be related to VPS

We are fine with paying for a larger machine for now, and investigating the issue is not a priority for us. Hence, I'll leave our comment as is. Maybe someone else will find it, which would indicate that our problem is VPS-related.

brillout commented 1 year ago

@samuelstroschein OP is about memory usage while pre-rendering, whereas your issue is about the SSR runtime. So, yeah, I'd say the discussion should live in another ticket.

I don't want to flood your repo with new issues that might be related to VPS

No problems. We can close the ticket until someone finds a reproduction while also gathering further insights. It's always good to spark the conversation.

paying for a larger machine for now.

To be clear: should it be because of VPS, then I'll take this very seriously. VPS can and should be light on memory.

samuelstroschein commented 1 year ago

OP is about memory usage while pre-rendering, whereas your issue is about the SSR runtime. So, yeah, I'd say the discussion should live in another ticket.

Oh wait, no. The peak happens during the build step, not during runtime. The chart may be a bit misleading. That's why I thought it was related to this issue. We also import markdown files. But only 10 or so.

brillout commented 1 year ago

Ok, then yes it's probably related. And you're using pre-rendering, correct? We can explore how to reduce memory usage, let me know when it becomes urgent.

Also, make sure to check out the parallel option. Both @d3x7r0 and @samuelstroschein.
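For reference, a minimal sketch of how the option might be set. It assumes the plugin is registered in vite.config.ts and that `prerender.parallel` accepts a number capping concurrent pre-render jobs; check the pre-rendering docs for the exact shape in your version:

```ts
// vite.config.ts
import ssr from "vite-plugin-ssr/plugin";
import type { UserConfig } from "vite";

const config: UserConfig = {
  plugins: [
    // Assumption: `parallel: 1` renders one page at a time instead of one job per CPU.
    ssr({ prerender: { parallel: 1 } }),
  ],
};

export default config;
```

Lowering concurrency limits how many pages are being rendered at once. It does not change when the rendered HTML is written to disk.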

samuelstroschein commented 1 year ago

We are not using pre-rendering.

let me know when it becomes urgent.

Will do. For now, we just pay $15 a month more for a 2GB machine.

d3x7r0 commented 1 year ago

Ok, then yes it's probably related. And you're using pre-rendering, correct? We can explore how to reduce memory usage, let me know when it becomes urgent.

Also, make sure to check out the parallel option. Both @d3x7r0 and @samuelstroschein.

Sorry for not replying. I've been super busy with work and haven't had time to sit down and look at this since I'm just using it in a personal side project.

Sadly the parallel option has no noticeable effect on memory usage for my scenario (SSR a large number of pages).

My hypothesis is, again, related to the code I quoted initially. You can see that during the pre-render process all the pages are first processed, their HTML is generated and stored in a JS variable (in memory), and only in the subsequent step are they written to disk.

https://github.com/brillout/vite-plugin-ssr/blob/96d9d892b752ebd50347ccece0b7ba25fddf0d2f/vite-plugin-ssr/node/prerender/runPrerender.ts#L175

This variable will eventually contain the entire generated website, and nothing will be flushed to disk before all pages are processed. This is, obviously, faster than doing I/O, but it comes at the cost of increased memory consumption.

My quick look at it also tells me that it might not be a simple one-afternoon fix, which is why I didn't tackle it immediately. I can try to take a look at it, but I don't know when I'll have the time. It would mean a rather large rework of that part of the pre-render function, either to operate in chunks, in parallel as a whole, or both.
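A rough sketch of the chunked variant, under stated assumptions: `prerenderInChunks`, `render`, and `write` are hypothetical names, and this is not how runPrerender.ts is structured. Each chunk is rendered in parallel, flushed, and released before the next chunk starts:

```ts
// Hypothetical async renderer and writer, injected so the sketch stays self-contained.
type RenderPageAsync = (url: string) => Promise<string>;
type WritePageAsync = (url: string, html: string) => Promise<void>;

async function prerenderInChunks(
  urls: string[],
  chunkSize: number,
  render: RenderPageAsync,
  write: WritePageAsync,
): Promise<{ written: number; peakInMemory: number }> {
  if (chunkSize < 1) throw new Error("chunkSize must be >= 1");
  let written = 0;
  let peakInMemory = 0; // largest number of rendered pages held at once
  for (let start = 0; start < urls.length; start += chunkSize) {
    const chunk = urls.slice(start, start + chunkSize);
    // Render the chunk in parallel; only these pages are held in memory.
    const pages = await Promise.all(
      chunk.map(async (url) => ({ url, html: await render(url) })),
    );
    peakInMemory = Math.max(peakInMemory, pages.length);
    // Flush the whole chunk before rendering the next one.
    await Promise.all(pages.map((page) => write(page.url, page.html)));
    written += pages.length;
  }
  return { written, peakInMemory };
}
```

With this shape, `chunkSize` trades speed for memory: a chunk size of 1 behaves like fully sequential rendering, while a chunk size equal to the number of pages behaves like the all-in-memory approach.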

EDIT: minor note, I agree that @samuelstroschein's issue might not be related to this. If it isn't pre-rendering, I imagine it's a separate memory issue from this one.

brillout commented 1 year ago

@d3x7r0 Makes sense. If you wouldn't mind sharing your project's code with me, then I'd have a look at it (or a reproduction if you prefer). That said, I've a couple of urgent things to work on right now (the V1, and a production ready VPS framework demo) so this may take a little while until I do. But feel free to let me know if this is/becomes a major blocker and you can't find a workaround in user land.

Yes, @samuelstroschein's issue is probably unrelated and more likely to be related to Vite.

d3x7r0 commented 1 year ago

I'll try to get a repro going this weekend. It should be enough to just generate an increasing number of random pages until you see memory usage balloon. Though you might need to generate random data for the memory to really fill up.
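A sketch of what such a generator could look like, assuming a `prerender()` hook that returns `{ url, pageContext }` entries. `generateEntries`, the `/page/<n>` URLs, and the `pageProps.body` field are hypothetical choices for illustration:

```ts
import { randomBytes } from "node:crypto";

// Assumed entry shape for a pre-render hook: a URL plus the data for that page.
type PrerenderEntry = {
  url: string;
  pageContext: { pageProps: { body: string } };
};

// Generate `count` pages, each carrying `payloadBytes` of random data (hex-encoded).
function generateEntries(count: number, payloadBytes: number): PrerenderEntry[] {
  return Array.from({ length: count }, (_, i) => ({
    url: `/page/${i}`,
    pageContext: {
      pageProps: { body: randomBytes(payloadBytes).toString("hex") },
    },
  }));
}

// Assumed wiring: exported from a `.page.server` file as the pre-render hook.
export function prerender(): PrerenderEntry[] {
  return generateEntries(1000, 1024);
}
```

Raising `count` or `payloadBytes` should make memory usage grow until the build hits the heap limit.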

brillout commented 1 year ago

That'd be great.

d3x7r0 commented 1 year ago

Btw this isn't a particularly huge blocker for me. I'm just using it for a personal site, not mission-critical stuff. And I can still do it on my local machine, I just can't really use CI because it would be too costly :)

brillout commented 1 year ago

👍. Although having a reproduction is always great and much welcome.

brillout commented 1 year ago

Also check out https://vite-plugin-ssr.com/disableAutoFullBuild to break down the mem spike into two/three smaller spikes.
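A minimal sketch, assuming `disableAutoFullBuild` is passed as a plugin option in vite.config.ts; see the linked page for the exact usage in your version:

```ts
// vite.config.ts
import ssr from "vite-plugin-ssr/plugin";
import type { UserConfig } from "vite";

const config: UserConfig = {
  plugins: [
    // Assumption: with this set, `vite build` no longer chains the other build steps.
    ssr({ disableAutoFullBuild: true }),
  ],
};

export default config;
```

The steps would then be run as separate processes, for example `vite build`, then `vite build --ssr`, then `vite-plugin-ssr prerender`, so each one starts with a fresh heap.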

d3x7r0 commented 1 year ago

Ok, got a repro:

https://github.com/d3x7r0/vite-ssr-memory-issue

Just adjust the number of entries in pages/page/index.page.server.js to increase/decrease memory usage.

On my machine (MacBook Pro, Intel, 2020, 16 GB RAM, Node v18.12.1) the build crashes at around 4 GB of RAM used by the Node process.
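One way such numbers could be collected from inside the Node process, as a sketch. `createRssTracker` and `formatMiB` are hypothetical helpers; `process.memoryUsage().rss` is the resident set size reported by Node:

```ts
// Format a byte count as mebibytes with one decimal place.
function formatMiB(bytes: number): string {
  return `${(bytes / 1024 / 1024).toFixed(1)} MiB`;
}

// Track the highest resident set size seen across manual samples.
function createRssTracker() {
  let peak = 0;
  return {
    // Take a sample now and update the recorded peak.
    sample(): number {
      const rss = process.memoryUsage().rss;
      peak = Math.max(peak, rss);
      return rss;
    },
    // Highest value observed by `sample()` so far.
    peak(): number {
      return peak;
    },
  };
}

const tracker = createRssTracker();
tracker.sample(); // e.g. call this after every N rendered pages
console.log(`peak RSS so far: ${formatMiB(tracker.peak())}`);
```

Sampling only sees memory at the moments `sample()` is called, so short spikes between samples can be missed.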

I know 25k URLs is a lot; I deliberately overshot to trigger the issue. I haven't done the math to see how many I have in my project, but I wouldn't be surprised if it was around the 10k mark. It's a long-running blog with over 10 years of content. In tag pages alone it's easy to reach those high numbers.

brillout commented 1 year ago

Thanks, that's very helpful to have (rough) numbers.

Incrementally pre-rendering should do the trick. I'll have a stab at it when I've a little less on my plate.

As always: if this becomes a blocker for you (or for future users), definitely let me know.

brillout commented 1 year ago

@d3x7r0 Did you find a workaround?

As always: let me know if this is a blocker, and/or PR welcome.

d3x7r0 commented 1 year ago

Hi there. Sorry, I haven't been able to come back to this. No workaround from my end aside from tweaking Node to use more memory. I did want to try and tackle this one but haven't found the time and, realistically, probably won't for the next month or so :-/
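For context, the usual way to give Node more memory is the V8 flag `--max-old-space-size` (value in MiB), for example via `NODE_OPTIONS=--max-old-space-size=8192`. The value 8192 is only an example. A small sketch to check which heap limit the process actually runs with (`heapLimitMiB` is a hypothetical helper name):

```ts
import { getHeapStatistics } from "node:v8";

// Report the V8 heap size limit of the current process in MiB.
function heapLimitMiB(): number {
  return Math.round(getHeapStatistics().heap_size_limit / 1024 / 1024);
}

console.log(`V8 heap limit: ${heapLimitMiB()} MiB`);
```

Running this with and without the flag shows whether the setting reached the build process. Raising the limit only postpones the crash if the number of pages keeps growing.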

brillout commented 1 year ago

No problems. FYI an alternative, in case you think you can convince your company, is sponsoring (https://github.com/sponsors/brillout) as I considerably increase feature priority for my sponsors.

But still let me know if this becomes a blocker.

brillout commented 10 months ago

Done in #1262 by @nitedani.