Add incremental static regeneration

Nick-Mazuk commented 3 years ago

Overview

One of the best features of Next.js is incremental static regeneration. It would be great to have something similar to it.

https://nextjs.org/docs/basic-features/data-fetching#incremental-static-regeneration

The core features being:

Pages can be dynamically generated
Resulting page is cached for X time
After X time, the page is dynamically regenerated again

In the ideal world, there would also be some way to invalidate the cache. https://github.com/vercel/next.js/discussions/16488

As I understand Sveltekit, currently, this is not possible.

Proposed solution

Extend the prerender API with a revalidate parameter.

<script context="module">
    export const prerender = true;
    export const revalidate = 900; // 900 seconds, or 15 minutes
</script>

Conduitry commented 3 years ago

This can essentially be done now by returning cache headers and having a proxy sitting in front of the Kit app that understands those cache headers. I'm not entirely convinced this needs its own implementation solely in Kit.

Rich-Harris commented 3 years ago

Your pages can return cache headers via the maxage property: https://kit.svelte.dev/docs#loading-output-maxage

Cache invalidation is always going to be platform-specific. We've vaguely talked about adapters being able to do cache invalidation for you (i.e. adapter-vercel would know how to invalidate URLs in the Vercel cache) but I imagine it will always be a bit of an inexact science.

lukasIO commented 3 years ago

If I understand correctly, the difference from what @Nick-Mazuk is proposing would be that one unlucky user has to wait for the server-side requests to complete every time the cache header expires. In the worst-case scenario, that unlucky user would experience this on every single (unique) route while navigating the website.

skrhlm commented 3 years ago

I must say that I like the proposed non-solution from @Conduitry. It seems better to leave this kind of implementation to the developer, hence avoiding SvelteKit turning into yet another unconfigurable one-size-fits-all monolith, *cough* nextjs *cough*

@lukasIO what you're mentioning is actually one of the worst things with the implementation in NextJS. It essentially makes an environment dependent on itself since it prerenders these pages in the build-process.

lukasIO commented 3 years ago

@skrhlm I get your point that this might not be suitable in all situations, or rather it's probably only suitable in a few situations, but as far as I understood, static prerendering/generation is exactly the point of this proposal.

I could see the value for this proposal for less frequented sites, where the proposed cache-header solution would have a rather negligible effect if every second or third user hits the site with an expired cache header (and thus has to wait for the server side requests to complete instead of being served a static build).

benmccann commented 3 years ago

If there aren't many users then you also don't need to worry about performance very much

Nick-Mazuk commented 3 years ago

This can essentially be done now by returning cache headers and having a proxy sitting in front of the Kit app that understands those cache headers. I'm not entirely convinced this needs its own implementation solely in Kit.

That would work if you controlled the production environment. I'm deploying on Vercel, so I'm not entirely sure that would be possible. Could be wrong, though.

If I understand correctly, the difference from what @Nick-Mazuk is proposing would be that one unlucky user has to wait for the server-side requests to complete every time the cache header expires. In the worst-case scenario, that unlucky user would experience this on every single (unique) route while navigating the website.

I'm not terribly concerned with this, though I can see how it would be important to others.

I'm actually more concerned with high-traffic sites where every single user visits either an uncached page or an out-of-date page.

skrhlm commented 3 years ago

@lukasIO Good point! Actually my main problem is with the implementation of it in next, not precisely the idea of it. But as @benmccann said the problem diminishes with the size of the crowd.

Nick-Mazuk commented 3 years ago

@skrhlm, out of curiosity, what's your issue with the implementation of it in Next.js?

lukasIO commented 3 years ago

Imho the point about performance impact for less frequented sites boils down to what you are doing in load. Once load requests data from a headless CMS, static pregeneration would mean a noticeable performance improvement even if there are only a few users. But I fully accept that this kind of scenario is probably not in the focus at all 😃

Rich-Harris commented 3 years ago

The thing I'm having a hard time getting my head round is what is doing the regeneration? E.g. you can't have long-lived timeouts in a serverless environment.

Or are we just talking about adding a stale-while-revalidate cache control directive? Is that how Next works? In that case, this...

<script context="module">
  export async function load({ page }) {
    const props = await whatever(page.params);

    return {
      props,
      maxage: 1,
      revalidate: 59
    };
  }
</script>

...could translate to Cache-Control: public, max-age=1, stale-while-revalidate=59 (or private for pages that use user data).
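To make that translation concrete, here is a minimal sketch. The `cacheControl` helper and its `isPrivate` flag are hypothetical names used only for illustration; they are not part of SvelteKit.

```javascript
// Hypothetical helper: turns the `maxage` and `revalidate` values from the
// snippet above into a Cache-Control header value. Not a SvelteKit API.
function cacheControl({ maxage, revalidate, isPrivate = false }) {
  // Pages that use user data should be `private`; everything else `public`.
  const parts = [isPrivate ? 'private' : 'public', `max-age=${maxage}`];
  if (revalidate !== undefined) {
    parts.push(`stale-while-revalidate=${revalidate}`);
  }
  return parts.join(', ');
}

console.log(cacheControl({ maxage: 1, revalidate: 59 }));
// public, max-age=1, stale-while-revalidate=59
```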

Rich-Harris commented 3 years ago

Next's docs suggest that's not what's happening in their case, but I don't yet understand how what they describe could work in an environment-agnostic way. Is this actually a Next+Vercel feature? And if so what advantages does it have over stale-while-revalidate?

lukasIO commented 3 years ago

...could translate to Cache-Control: public, max-age=1, stale-while-revalidate=59 (or private for pages that use user data).

At least for the use cases that I had in mind, this doesn't sound like a bad option...

For argument's sake: the difference I can see (while I'm not familiar with how Next implements the static regeneration feature) is that with stale-while-revalidate different routes could get out of sync.

In the case that /foo is visited quite frequently but /foo/bar only less so, the content in /foo/bar could end up significantly older in comparison.

edit:

The thing I'm having a hard time getting my head round is what is doing the regeneration?

Apart from the node-adapter, I have no idea how that could work. Thinking out loud: optionally expose a server route to trigger regeneration and leave it up to the app developer how to trigger it regularly or on demand (e.g. cronjobs, a separate service, CMS content update hooks etc.).

Nick-Mazuk commented 3 years ago

After some research, I'm guessing that incremental static regeneration (ISR) might have originally been just a Next.js on Vercel thing, but other platforms are adopting it.

For instance, the original next-on-netlify plugin did not allow for ISR.

https://github.com/netlify/netlify-plugin-nextjs/issues/151

But in December, Netlify announced a new plugin, @netlify/plugin-nextjs, which allows for ISR.

Announcement: https://www.netlify.com/blog/2020/12/07/announcing-one-click-install-next.js-build-plugin-on-netlify/

Repo: https://github.com/netlify/netlify-plugin-nextjs

Code for ISR (potentially): https://github.com/netlify/netlify-plugin-nextjs/tree/main/src/lib/pages/getStaticPropsWithRevalidate

So it seems like ISR is possible in a serverless environment outside of Vercel, but perhaps the implementation will be platform-specific… and not all platforms will support it.

The thing I'm having a hard time getting my head round is what is doing the regeneration? E.g. you can't have long-lived timeouts in a serverless environment.

@Rich-Harris With Next.js, the server is doing the regeneration. I could be wrong, but I think this is how Next.js works for static pages with this option.

1. When a visitor visits the page, the server will dynamically generate the page SSR style.
2. The server will cache the file. I assume it uploads it to the build cache or something to prevent long timeouts. It also adds a tag of when the page becomes invalid (e.g., 15 minutes from now).
3. When another person views the page, the server will check the build cache to see if the page already exists in the cache.
- If it does, check if the page is expired. If expired, regenerate the page as before. If it isn't expired, serve the static page from the cache.
- If it doesn't exist, generate the page SSR style.
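The flow above can be sketched as a small in-memory cache with an expiry per page. This is only an illustration of the described steps, not how Next.js is actually implemented; `createPageCache`, `render`, and the injectable `now` clock are hypothetical names.

```javascript
// Minimal sketch of the described flow. All names are hypothetical.
function createPageCache(render, ttlMs, now = () => Date.now()) {
  const cache = new Map(); // path -> { html, expiresAt }

  return function get(path) {
    const entry = cache.get(path);
    if (entry && now() < entry.expiresAt) {
      return entry.html; // cached and not expired: serve the static copy
    }
    // Missing or expired: render SSR style and store with a new expiry.
    const html = render(path);
    cache.set(path, { html, expiresAt: now() + ttlMs });
    return html;
  };
}

// Usage with a fake clock so the expiry is easy to follow.
let clock = 0;
let renders = 0;
const getPage = createPageCache(
  (path) => `<h1>${path} v${++renders}</h1>`,
  900_000, // 15 minutes
  () => clock
);

getPage('/foo');      // first visit: rendered
getPage('/foo');      // within 15 minutes: served from the cache
clock = 900_000;      // 15 minutes later
getPage('/foo');      // expired: rendered again
console.log(renders); // 2
```

Note that in this sketch the visitor who hits an expired page waits for the render, which matches the steps as described rather than a stale-while-revalidate approach.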

...could translate to Cache-Control: public, max-age=1, stale-while-revalidate=59 (or private for pages that use user data).

I don't think this will do what I was asking for. I believe this will cache the HTML on the client, not on the server?

Rich-Harris commented 3 years ago

I believe this will cache the HTML on the client, not on the server?

Or the CDN — if HTML were suitable for caching by the server, it would be served with a public cache control header which means your CDN takes care of that for you. If it uses user data somehow then it should only be cached in the client. Either way, the server doesn't need to retain anything

skrhlm commented 3 years ago

@skrhlm, out of curiosity, what's your issue with the implementation of it in Next.js?

Well, it's the lack of environment agnosticity ( is it a word !? ) that bothers me. If you're using getStaticPaths to limit the URLs for which static generation can happen, then Next tries to build these pages at build time, which both takes a lot of time and forces you to either:

Build the pages from your dev environment, which you probably do not want.
Build the pages from the live environment, which means the build depends on the live environment. If, for example, the site is down, not released yet or broken somehow, you can't possibly prepare a deploy before the backend structure is live.

In conclusion, I really do not see the reasoning behind even optimizing the pages at build time. Doing it on the first request at runtime and caching the result makes a lot more sense.

LuudJanssen commented 3 years ago

Or the CDN — if HTML were suitable for caching by the server

I'm an avid user of Next.js and it has always bothered me that Next.js can't share its ISR cache between nodes in a container-like architecture. I like the idea of letting a CDN (or any reverse proxy for that matter) handle the caching and just sending the right cache headers for pages that can be cached.

Maybe to avoid some confusion in @Nick-Mazuk's answer: Next.js just keeps a rendered version of the page on disk and serves it whenever a user requests the page, so it's platform independent. (I do think they use another tactic for hosting on Vercel, which uses serverless functions per page, so there the CDN might actually handle it, don't know).

I do think it might be worth spending some time on this use case, because Next.js users might be looking for this when trying out SvelteKit (at least I was, hence why I ended up in this thread). Maybe an easy way to add the cache headers to a page like suggested in this comment and explaining that using a CDN in between will probably yield the same result as Next.js's ISR with the added benefit of a shared cache?

kaleabmelkie commented 3 years ago

I do think it might be worth spending some time on this use case, because Next.js users might be looking for this when trying out SvelteKit (at least I was, hence why I ended up in this thread).

I too am a recent migrant from Next.js and was looking for this feature. This issue thread explains it, and adding a quick way for setting the cache headers per page (i.e. a revalidate option in LoadOutput besides maxage, as Rich mentioned in this comment) would be nice.

matindow commented 3 years ago

I commented regarding this on a different issue about editing headers, because I didn't know there was an established name for it, but in my view @Rich-Harris is correct in that cache headers probably should be enough to address this on their own, and I think would be a totally acceptable stance for svelte to take, but given the "adapter" design, and the individual strengths of the deployment platforms you are building adapters for, there are opportunities for specific improved user experiences that people are right to investigate.

For cloudflare in particular, because of the very large number of edge servers in their network, if we rely on "the one unlucky user" to rebuild cache, you are actually relying on hundreds of different unlucky users, multiplied by the number of now stale pages. Or, to put it another way, a huge strength of the cloudflare workers architecture, totally separate from any potential request logic, is that all of your static assets (including html files) are replicated to their entire network, and so all requests are served directly from the edge, avoiding any interaction not just with an origin server, but also with their (region specific) cache API. This "just works" for static sites, but with svelte's load function, even with appropriate cache headers, you are breaking that perk.

To me, this sounds like something that should be configurable as part of the individual adapter rather than svelte itself, as the particulars are likely to be different for different platforms. (eg cloudflare workers provides cron triggers that could be used to compare and rerender the SSR pages, but this may not be possible or the most efficient route on another platform.)

olimination commented 2 years ago

I still have found this one here: https://github.com/sveltejs/kit/issues/2369 Not sure if this is kind of similar to this issue. I think if we solve https://github.com/sveltejs/kit/issues/2369 then this one is obsolete or do I misunderstand something?

Or is the difference that this issue is more about the "runtime-oriented" approach and #2369 is more about the "build-time-oriented" approach?

What do you think?

MarcGodard commented 2 years ago

Is there a way to break cache of previously built files for static?

Example: I build the site, deploy, visit (this caches the pages), then make a change, re-build, deploy, re-visit, but don't see the change without reloading. Sorry, this is a slightly different issue, but I can't seem to find some sort of build versioning.

Maus3rSR commented 2 years ago

Hello,

Just leaving here a possible use case with ISR : https://www.youtube.com/watch?v=-_3gqy7U9zE&ab_channel=Delba

ISR with next kinda looks like a "Continuous Delivery" (update on demand) feature handled by vercel

Best regards

git-no commented 2 years ago

Just leaving here a possible use case with ISR : https://www.youtube.com/watch?v=-_3gqy7U9zE&ab_channel=Delba ISR with next kinda looks like a "Continuous Delivery" (update on demand) feature handled by vercel

It is NextJS On-Demand Revalidation. It works with a hook to a Next API, so NextJS no longer has to poll or work with expiring cache headers. Here is an example where changing an issue on GitHub immediately (300ms) triggers a rebuild of a page: the NextJS On-Demand Demo (including process explanation and setup instructions).

This is something Svelte does not have but would be worth having.

reesericci commented 2 years ago

I was looking for a use case where at build time SK would prerender the page and then every X minutes (or on a webhook) SK would re-render the page in the background while still serving the static page until the re-render is done. (My renders take a good 15-30 mins - web scraping)

magne4000 commented 2 years ago

I would like to add some details and updated information regarding the state of ISR with the current Vercel (v3) API. I created vite-plugin-vercel, which supports ISR, and things have changed since API v2.

In v2:

ISR endpoints were generated on filesystem at build time
When expired, they were updated on filesystem
They were cached on the Edge Network via stale-while-revalidate and other cache headers.

The benefits compared to just playing around with stale-while-revalidate were:

First hit was fast, because it was prerendered
When revalidation occurs:
- The first Edge Network connection from any region triggers a call to the Serverless Function
- While revalidating, each Edge Network region still serves its cached version, or asks for the filesystem one
- This results in the Serverless Function being called only once per revalidation. Using only stale-while-revalidate would call the Serverless Function as many times as there are Edge Network regions

Now in v3:

No filesystem generation anymore
Results are cached on the Edge Network based on different cache rules, probably still the same logic as with v2

So, what's the difference now between ISR and just playing around with Cache header?

It's not clear whether the Edge Network regions keep each other updated when an ISR endpoint is updated. But let's assume not.

Some benefits still remain:

On demand ISR is still a good thing for some use cases
bypassToken to be used as a Preview Mode feature. This one could still be implemented manually I think
fallback can serve a static file so that first hit is still fast
allowQuery allows to serve different URLs behind the same cache entry, contrary to stale-while-revalidate which would always store each URL in a different cache entry

To me, this sounds like something that should be configurable as part of the individual adapter rather than svelte itself, as the particulars are likely to be different for different platforms. (eg cloudflare workers provides cron triggers that could be used to compare and rerender the SSR pages, but this may not be possible or the most efficient route on another platform.)

I agree with that. I'm not familiar with SvelteKit bundle API yet, but having any exported const in any .svelte file (e.g. export const revalidate = 900;) can probably be read at some point by the vercel adapter plugin and do its magic with it right? The way I see that is that exporting a revalidate const doesn't impact in any way the code base. It just allows the adapter to do some specific work.

And if exports are not easily accessible, perhaps a mapping in the adapter configuration based on the route id or path could do it?

impactvelocity commented 2 years ago

Maybe the build output API from vercel can offer insights on how to do this with SK

https://vercel.com/docs/build-output-api/v3 https://vercel.com/blog/build-your-own-web-framework

:)

rbenzazon commented 2 years ago

I've got a project where I need to build static pages from a headless CMS (Prismic). The pages will initially get created often, then maintained occasionally. An optimal solution would let us build all the existing pages and deploy them; then each update on Prismic could trigger a rebuild and redeploy of an individual page. Even when building everything (when the app changes), only the modified content should be deployed (why transfer unchanged files?). If we reach this, we'll get the best reactivity and less hosting usage.

jdgamble555 commented 2 years ago

I don't think you need just Vercel to do this. Someone wrote a package for Angular that does this by in-memory caching: ngx-isr.

Basically, the build routes would have to have a cache.

If that is the case, someone could look at this package and copy the ideas for sveltekit.

J

kazzkiq commented 1 year ago

ISR is no simple task. So until it gets added to SvelteKit, here's how to achieve a "close" approach if you just fell onto this Issue:

Vercel

Just use stale-while-revalidate and proper caching with max-age params.

Deno Deploy

AFAIK, no solution yet.

AWS CloudFront

AFAIK, no solution yet

VPS or physical server (DigitalOcean, Vultr, AWS EC2/ECS/Fargate, any VM provider, etc)

Add NGINX as reverse-proxy and enable microcaching.

SvelteKit 1.0 is a feat we just can't get used to being so hyped about, really incredible work. And I feel like both "native" i18n implementation and ISR are probably the most anticipated post-1.0 features around.

jdgamble555 commented 1 year ago

The real issue is on-demand revalidation. Adding headers for max-age is easy to do with setHeaders on server endpoints. SvelteKit could do the on-demand revalidation in Vercel or Cloudflare directly, but I am not aware of any other CDNs that support on-demand revalidation at this point.
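As a rough sketch of the setHeaders part: the function below has the shape of a server `load` function, where SvelteKit passes `setHeaders` on the event. The specific max-age and stale-while-revalidate values and the returned data are assumptions chosen for illustration, and in a real +page.server.js the function would be exported.

```javascript
// Sketch of a server `load` function (it would be `export`ed in +page.server.js).
// The header values and the returned data are placeholders, not recommendations.
function load({ setHeaders }) {
  setHeaders({
    'cache-control': 'public, max-age=60, stale-while-revalidate=600'
  });
  return { renderedAt: new Date().toISOString() };
}

// Stand-in for the framework call, only to show which headers would be set.
const seen = {};
load({ setHeaders: (headers) => Object.assign(seen, headers) });
console.log(seen['cache-control']);
// public, max-age=60, stale-while-revalidate=600
```

Whether a CDN in front of the app honours stale-while-revalidate depends on the platform, which is the part this thread is really about.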

https://vercel.com/docs/concepts/functions/serverless-functions

J

multipliedtwice commented 1 year ago

ISR is no simple task. So until it gets added to SvelteKit, here's how to achieve a "close" approach if you just fell onto this Issue:

Vercel

Just use stale-while-revalidate and proper caching with max-age params.

Deno Deploy

AFAIK, no solution yet.

AWS CloudFront

AFAIK, no solution yet

VPS or physical server (DigitalOcean, Vultr, AWS EC2/ECS/Fargate, any VM provider, etc)

Add NGINX as reverse-proxy and enable microcaching.

SvelteKit 1.0 is a feat we just can't get used to being so hyped about, really incredible work. And I feel like both "native" i18n implementation and ISR are probably the most anticipated post-1.0 features around.

Can you please show an example with stale-while-revalidate? It doesn't make any difference for me: the initial response time still takes ages. Google doesn't even want to crawl my website because of the poor IRT. I'm probably doing something wrong.

Rich-Harris commented 1 year ago

Thought I'd share an update since this is a highly anticipated feature: #8740 will implement ISR for people deploying to Vercel.

The underlying mechanism (route-level config) is platform-agnostic, but since Vercel exposes a dead simple mechanism for ISR via the Build Output API it's a no-brainer to use route-level config to implement ISR.

It's possible that we'd one day have ISR as a framework primitive, but the challenges of designing it in a platform-agnostic way are substantial. So for now an adapter-centric approach makes more sense, if only to make some cowpaths that we can later pave. I'd be very happy if other adapters enabled ISR as well!

In the meantime, if you'd like to use ISR in your apps, sign up for a Vercel account 😀

jdgamble555 commented 1 year ago

Thought I'd share an update since this is a highly anticipated feature: #8740 will implement ISR for people deploying to Vercel.

The underlying mechanism (route-level config) is platform-agnostic, but since Vercel exposes a dead simple mechanism for ISR via the Build Output API it's a no-brainer to use route-level config to implement ISR.

It's possible that we'd one day have ISR as a framework primitive, but the challenges of designing it in a platform-agnostic way are substantial. So for now an adapter-centric approach makes more sense, if only to make some cowpaths that we can later pave. I'd be very happy if other adapters enabled ISR as well!

In the meantime, if you'd like to use ISR in your apps, sign up for a Vercel account 😀

Nice! Would this include the on-demand option that NextJS + Vercel has?

Thanks!

J

Rich-Harris commented 1 year ago

Yes! The docs are here, and we could probably do a better job of describing on-demand ISR, but: if you specify a bypassToken...

// +page.server.js
export const config = {
  isr: {
    expiration: false, // only revalidate manually
    bypassToken: 'xyz123'
  }
}

...then issuing a GET or HEAD request to the route in question with a x-prerender-revalidate: xyz123 header will forcibly revalidate it.
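For example, a revalidation request could be built like this. This is a sketch assuming a runtime with the global `Request` and `fetch` (such as Node 18+); the URL and the `revalidationRequest` helper are hypothetical, and the token has to match the `bypassToken` in the route's config.

```javascript
// Hypothetical helper: builds the revalidation request described above.
// The URL passed in below is a placeholder.
function revalidationRequest(url, bypassToken) {
  return new Request(url, {
    method: 'HEAD',
    headers: { 'x-prerender-revalidate': bypassToken }
  });
}

const request = revalidationRequest('https://example.com/blog/my-post', 'xyz123');
console.log(request.method, request.headers.get('x-prerender-revalidate'));
// HEAD xyz123

// To actually trigger the revalidation, send it: await fetch(request);
```

In practice the token would come from an environment variable rather than being hard-coded, and the request would be sent from something like a CMS webhook handler.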

eecue commented 1 year ago

This would be incredibly useful in the static file generator! I currently have a site that has roughly 1M pages and it takes over 2 hours to build, I frequently update it and that's a lot of build time/load. Ideally it would just update the files I have changed.

npm run build 1686.10s user 755.32s system 37% cpu 1:48:02.08 total

Antonio-Bennett commented 1 year ago

@eecue I think your issue is more closely related to https://github.com/sveltejs/kit/issues/8430. There's a suggestion by @Rich-Harris in there as well: https://github.com/sveltejs/kit/issues/2369#issuecomment-1101788463. Hopefully there's something helpful for you for now.

caoimhebyrne commented 1 year ago

@Rich-Harris

It doesn't look like ISR is working. Visiting a website deployed on Vercel with the following config gives an error in the function logs.

export const config = {
    isr: {
        expiration: 60,
    },
};

[GET] /
2023-02-13T04:45:17.515Z   dabb066b-5685-4a28-907b-cc1e1df327b3   ERROR   Error: Not found: /fn-0
    at resolve (file:///var/task/vercel/path0/.svelte-kit/output/server/index.js:3246:18)
    at resolve (file:///var/task/vercel/path0/.svelte-kit/output/server/index.js:3113:34)
    at #options.hooks.handle (file:///var/task/vercel/path0/.svelte-kit/output/server/index.js:3290:59)
    at respond (file:///var/task/vercel/path0/.svelte-kit/output/server/index.js:3111:43)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

kevbook commented 1 year ago

Getting the same error

Rich-Harris commented 1 year ago

Apologies, I meant to update this thread — there's an issue with ISR that didn't surface during testing. We've been working on a fix (on the Vercel side) over the last week and it should land very soon. Bear with us!

caoimhebyrne commented 1 year ago

Thanks for the update @Rich-Harris :)

elliott-with-the-longest-name-on-github commented 1 year ago

@Rich-Harris can probably be closed -- congrats on finishing one of our older issues 😁

KodingDev commented 1 year ago

@tcc-sejohnson Still encountering the same issue as of 2 hours ago on Vercel deployments, will just wait for Rich to provide an update

elliott-with-the-longest-name-on-github commented 1 year ago

@KodingDev

Have you updated adapter-vercel to 2.1.0? ISR should be working.

schwartzmj commented 1 year ago

@KodingDev

Have you updated adapter-vercel to 2.1.0? ISR should be working.

I'm getting the following error during build:

Error: Could not find target Lambda at path "fn"
--
08:32:19.594 | at qg (/var/task/sandbox.js:248:4167)

adapter-vercel version: 2.1.0 @sveltejs/kit version: 1.8.3

dummdidumm commented 1 year ago

Could you please open a new issue for that with a reproduction?

KodingDev commented 1 year ago

@tcc-sejohnson Yep, working for me now. pnpm had a moment I guess haha. Appreciate it :)

kevbook commented 1 year ago

Maybe I'm completely off, so any clarity would be appreciated. I thought with ISR, a page is pre-rendered into HTML during build (let's say the page calls an external API via the load function on +page.server.js). With setting the below config, every 60 seconds (and when a user visits the page 1st time), a background function would run and re-call the external API via load function to regenerate the HTML again (so if the API returned new data, it would be reflected in the HTML served statically)

export const config = {
  isr: { expiration: 60, allowQuery: ['search'] },
};

tonprince commented 1 year ago

Maybe I'm completely off, so any clarity would be appreciated. I thought with ISR, a page is pre-rendered into HTML during build (let's say the page calls an external API via the load function on +page.server.js). With setting the below config, every 60 seconds (and when a user visits the page 1st time), a background function would run and re-call the external API via load function to regenerate the HTML again (so if the API returned new data, it would be reflected in the HTML served statically)
export const config = {
  isr: { expiration: 60, allowQuery: ['search'] },
};

That is exactly what I also expect to happen, but I am currently not able to achieve this behavior with the latest versions of SvelteKit (1.8.4) and adapter-vercel (2.1.1). It would be great if someone from the SvelteKit dev team could shed some light on this, show all possible/impossible use cases, and describe how to invalidate the ISR cache manually.

z-x commented 1 year ago

Read through all of the posts and I think I am missing a key workflow. Back in the days I've implemented something like this in PHP:

User creates a new post in the admin panel
Clicks 'Save'
The data goes to the database
A background job is triggered that renders the page and saves it as a static file (to be precise here - it was just the content part that was rendered and then included in the main layout file)
When the visitor enters the page, they basically get a pre-rendered page
User edits the post
Data goes to the database
The background job recreates that particular static page

This was, unsurprisingly, super responsive for the end user as they just get the end result. I saw some questions in this thread about what triggers the regeneration and I think the optimal way would be to trigger those on particular events on the site (comment added, post created, post edited etc.). And regenerate just the corresponding path (or even just the components?). There are of course some use cases for the incremental regeneration in an interval, but I would say for most of the cases having control over when this happens would be optimal.

I have no idea how difficult this would be to implement in the modern stack, as I am just learning the whole backend-in-JS thing, but if there were a way, I imagine having a function to call with a path as an argument? So after doing all the database requests, I just call regenerate('/blog/my-first-post')? It should work in a separate thread/background so as not to block the saving process, so the admin/editing panel would still feel responsive. Or maybe there would be a more intelligent way. Solving this would be a great feature: hosting is cheap, but databases not so much.

Or maybe those latest changes are exactly this? I have to admit I haven't yet played with the latest changes and just gone through the documentation and this thread.

subhasishdas159 commented 1 year ago

Need this feature in our project as well. Thanks in advance!
