bfcache - Githubissues

Rich-Harris commented 1 year ago

Describe the problem

SvelteKit always runs any load functions that have been invalidated when you navigate, regardless of the type of navigation.

This is deliberate: caching the result of a load function and reapplying it when navigating is a bad default:

data might be stale
if the data was mutated inside the component (which we don't discourage, because there are cases where that's useful!), bugs are likely
trading memory for performance is something that should be done consciously, not automatically

Browsers don't work that way, however. When you navigate between documents using back/forward buttons, the browser will restore the entire document from the bfcache. For some apps (but definitely not all), that's desirable behaviour to emulate.

Describe the proposed solution

There's a few ways we could go about this. The simplest would be a bfcache export alongside load:

// +page.server.js
export async function load() {
  const { posts } = await db.get_posts();
  return { posts };
}

export const bfcache = true;

This has the virtue of simplicity, though it might be too blunt a tool (you couldn't, for example, use it to cache the returned value for a set amount of time). Exporting a function instead of a boolean seems like an obvious solution, but it doesn't really work because the client needs to know if it can reuse cached data without going back to the server.

Another possibility is to add a new method to load:

// +page.server.js
export async function load({ setTtl }) {
  setTtl(Infinity);

  const { posts } = await db.get_posts();
  return { posts };
}

With this, it's no longer specifically related to back/forward navigations, which could be good or bad depending on your viewpoint.

A third option was suggested by @dummdidumm, and it came up because of a real world use case we're currently looking at — tie it to snapshots. The rationale is that snapshots are already related to bfcache emulation.

The use case I'm dealing with right now is that I'm using snapshots as a cheat code for infinite pagination:

<script>
  import InfiniteScroller from '$lib/components/InfiniteScroller.svelte';

  export let data;

  let loading = false;

  export const snapshot = {
    capture: () => ({
      data,
      scroll: scroller.capture()
    }),
    restore: (values) => {
      data = values.data;
      scroller.restore(values.scroll);
    }
  };
</script>

<InfiniteScroller
  bind:this={scroller}
  items={data.items}
  on:loadmore={async () => {
    if (loading || !data.next) return;
    loading = true;

    const { items, next } = await load_more_items(data.next);

    data.items = [...data.items, ...items];
    data.next = next;

    loading = false;
  }}
>
  <!-- ... -->
</InfiniteScroller>

This is neat because I can scroll down to the 400th item (loading a bunch of stuff on the way), click on it, click back, and my position will be perfectly preserved along with all the stuff I already loaded. I don't need any fancy client-side data management to achieve this, it's all just regular load functions with an API route that I call inside the loadmore event.

The trouble is that when I navigate back, I still have to call load, only for the resulting data to immediately be nuked by the restore function.

If I could express my intent to use the existing data object, that wouldn't be necessary:

export const snapshot = {
+ bfcache: true,
  capture: () => ({
    data,
    scroll: scroller.capture()
  }),
  restore: (values) => {
    data = values.data;
    scroller.restore(values.scroll);
  }
};

Combined with a way to associate snapshots with a key other than the history index (https://github.com/sveltejs/kit/issues/9313), this would give us a lot of flexibility.

It does throw up one interesting question though: are we caching the return value from load, or the data object that is produced by combining that value with the return values from parent layout load functions? If we cached data itself then it would respect any mutations that occurred, and the example above could just be this:

export const snapshot = {
  bfcache: true,
  capture: () => scroller.capture(),
  restore: (scroll) => scroller.restore(scroll)
};

However, any data that changed in a parent layout load function would not be respected. I'm not sure which behaviour is most universally desirable/understandable.

A snapshot.bfcache boolean would have the same limitation as export const bfcache, namely that we couldn't express (for example) the desire to cache data for a set amount of time. We could make it a function or something else this time though, since it will be available to the client.

In short, I'm not sure what the right answer is, but I wanted to do a braindump while it's fresh.

Alternatives considered

No response

Importance

nice to have

Additional Information

No response

brandtam commented 1 year ago

I like the idea of adding a new method to load as it could be used to set a pattern in the eventual case where other features need to be added or the possibility that even the method could take arguments.

sserdyuk commented 1 year ago

Thank you for mentioning this ticket in the Summit video. I have been thinking about this for a while. We actually avoid using the load method on slow loading pages because the back navigation takes forever. Also, and this is a personal opinion, I find the current snapshot API is feeling unnaturally. So, here's what I'd propose the snapshot API work like:

page.svelte

<script lang="ts">
    import { saveSnapshot } from '$app/navigation';
    import type { PageData } from './$types';

    export let data: PageData;

    saveSnapshot(() => ({ someData: data.someData, moreData: ... }))
</script>

<p>{data.someData}</p>

page.ts

import type { PageLoad } from './$types';

export const load = (async ({params, fetch, snapshot}) => {
    if (snapshot) {
        return { someData: snapshot.someData, moreData: ... };
    } else {
        return { someData: await fetch(`/api/someData?id={params.id}`) };
    }
}) satisfies PageLoad;

The snapshot value in the loadmethod would only be present if the back-forward navigation is detected by Sveltekit. Unlike the TTL approach, this one allows the load method to decide whether to use the snapshot or discard it based on any complex logic.

Overall, it seems to be a more organic way to integrate snapshots into the framework. I hope this is not too late to open this can of worms.

quentin-fox commented 1 year ago

Agreed that it makes the snapshots a bit more well-integrated, but as one downside, it does make it less ergonomic when you have a page that needs snapshots which doesn't have a load function. Not sure how often that comes up, but if we forced restoring to happen with a load, then it does add a bit of extra overhead.

It also would be a bit more awkward to perform browser-only actions on restore, such as restoring scroll position, if we made restoring from a snapshot have to happen through the load function.

We could instead just have the snapshot argument to load as you suggested, but otherwise keep the snapshot API the same as it is now? That way we could use it to prevent load functions from running, but otherwise keep the same behaviour as before.

The snapshot would have to be a union of the snapshots created by the page itself and all the parent layouts' snapshots, but SvelteKit already has that nuance for load when merging resulting data fields together from nested layouts, so that would likely be okay.

sserdyuk commented 1 year ago

@quentin-fox Thank you for pitching in. Yes, it does require a +page.ts file for the loadfunction. For me, it is already there most of the times.

I think there are two different goals here. First is to speed up the BF navigation by avoiding calling the server-side endpoints (both internal and 3rd party). Second is to restore UI state after BF navigation. The current snapshot handling was made only with the 2nd goal in mind . Maybe we need two completely different solutions here.

quentin-fox commented 1 year ago

For the infinite scroll restoration example given above, it does seem like that the two should potentially be integrated?

I guess we could also have some sort of cacheData function that's exported from the +page.svelte in the same way that the snapshot function object is exported, and would be automatically typed as being:

import type { PageData } from './$types';
type CacheData = (callback: () => Partial<PageData>);

Or potentially the type wouldn't have to necessarily have to be the same as the PageData type? Or maybe it wouldn't be a Partial<>, but it would instead be the exact return type of the page's load page? Instead of including all the page data from parent layouts which were merged in?

That cache would be then available in the load function, which it could use to potentially bypass some of the asynchronous tasks which could slow down page loads. bypass any potential fetches.

A few additional thoughts:

Probably as a separate addition, we might want the ability to both restore from a cache in a load function, and then also re-run the load function client-side after a first render, and then update the content after that load? This might make it possible to immediately render the previous page, but also get updated content (if it's present).
Would we want the cacheData to be keyed to a history entry? Or would this be something that could be used a bit more generally for an ISR setup as proposed in the original body of this issue?
Would we want to opt for a super-flexible API versus something simpler to write, but with less customization?

sserdyuk commented 1 year ago

The problem here is that we're trying to marry two realms together. The server side processing and the client ephemeral UI state management. Note that in the original example by RH the load function is in the +page.server.js file. The server doesn't know anything about the client's UI so the TTL solution is the only one feasible there.

The only place you can intelligently control BF data caching is the +page.js file IF you can control whether the call to the server side is made. Right now it only works if you're calling fetch on the endpoint like in my example. The +page.server.js still gets called regardless though, and in order to prevent this call you need to do one of the things- (1) change +page.server.js to be a custom endpoint so you can fetch it from +page.js or (2) introduce a breaking change to the API to make data a function rather than piece of data.

Here's what I mean for scenario 1:

// src/routes/one/[id]/+page.ts
import type { PageLoad } from './$types';

export const load = (async ({params, fetch, snapshot}) => {
    if (snapshot) {
        return { someData: snapshot.someData, moreData: ... };
    } else {
                const serverResponse = await fetch(`/api/one/{params.id}`);
                const serverData = await serverResponse.json(); // data is not typed unfortunately
        return { someData: serverData.someData }
    }
}) satisfies PageLoad;

// src/routes/api/one/[id]/+page.server.ts *used to be* src/routes/one/[id]/+page.server.ts
export const load = (async ({params}) => {
        const record = db.fetchById(params.id); // loading from database as an example
    return { someData: record }
}) satisfies PageServerLoad;

The visible downsides here are: two endpoints instead of one, data typing isn't working automatically when the fetch is made.

And the scenario 2:

// src/routes/one/[id]/+page.ts
import type { PageLoad } from './$types';

export const load = (async ({data, snapshot}) => {
    if (snapshot) {
        return { someData: snapshot.someData, moreData: ... };
    } else {
                // API breaking change. data is a function. 
                // +page.server.ts doesn't execute if data() is not called
                const serverData = await data(); 
        return { someData: serverData.someData }
    }
}) satisfies PageLoad;

// src/routes/one/[id]/+page.server.ts
export const load = (async ({params}) => {
        const record = db.fetchById(params.id); // loading from database as an example
    return { someData: record }
}) satisfies PageServerLoad;

The downside is that it is a breaking change after all.

Sorry for the long read here. Hopefully it makes sense.

quentin-fox commented 1 year ago

No that all makes perfect sense.

The breaking change makes sense and feels like it would help solve the problem in an elegant way. Right now the docs don't say much about having a +page.server.ts alongside +page.ts, but the caching would definitely be another use case.

One downside to the breaking change (aside from it being breaking) is that you then have different ux with data vs. parent parameters to load. You are never required to await parent(), since it's merged into the final data prop passed down, but with data you would always have to await it if you wanted the +page.server.ts load to run + have its data passed down to the component. It's not a huge issue, but always good to make the behaviour seem sane + reduce the number of gotchas that add up over time.

The docs aren't quite clear, but it seems like if there is a +page.server.ts, and a +page.ts, then the type of PageData is just the return type from the +page.ts load - the only way to get data from the +page.server.ts to the component is to forward it in the load function from +page.ts. However, if you delete the +page.ts, then the PageData is equal to the return type from +page.server.ts.

Maybe the sveltekit maintainers can jump in here re: the likelihood of someone using a +page.server.ts and +page.ts alongside each other, but I think reserving the cache functionality to +page.ts wouldn't be the worst thing? And then it wouldn't require the breaking change to require awaiting the data from +page.server.ts in +page.ts, at the cost of ergonomics for (hopefully) a small % of users?

sserdyuk commented 1 year ago

Yes, parent is merged in automatically. data is not. So, if your +page.js load function isn't taking data from the call parameters and using it somehow, the +page.server.js function is executed but the results aren't passed to +page.svelte. In reality, +page.server.js does't have to return data, it may be throwing a redirect to prevent unauthorized access or do server side logging.

quentin-fox commented 1 year ago

Following the API you showed in scenario 2), If we didn't do an await data() in +page.ts load, were you thinking that the +page.server.ts load would still run?

To your point, if we did a server redirect in the +page.server.ts, even if we had the ability to not require the output of +page.server.ts in +page.ts, we'd still need it to run to re-compute the redirect on back navigation - we wouldn't be able to skip the +page.server.ts load, even if we had the data to skip the asynchronous work in the +page.ts

sserdyuk commented 1 year ago

In the scenario 2, if both +page.js and +page.server.js files are present, .+page.js has to explicitly execute +page.server.js by calling data() or it won't run otherwise. If +page.js isn't there, +page.server.js runs automatically just like now. That's the breaking change.

quentin-fox commented 1 year ago

Cool, that all makes sense, thanks for clarifying.

The breaking API change seems clean to me, but I'd be a bit surprised if they went for it, as there have been no breaking changes since 1.0.0 - probably ideal to get the maintainers to chime in?

sveltejs / kit

bfcache #9822

Describe the problem

Describe the proposed solution

Alternatives considered

Importance

Additional Information