Multiple `useFragment` uses for same fragment data creates large transport size

maciesielka commented 2 months ago

👋🏻 We're working on optimizing our Next.js App Router usage as much as we can and ran into some possibly unexpected behavior related to the streaming support offered by this package. If this turns out to be expected behavior, that'd still be useful for us to know 🙂

Overview

We rely on useFragment to deliver cache updates for singleton data (a setup similar to the one described by AllProducts in this documentation) to many different components that are rendered at once. Even though every call uses the same fragment, each usage of useFragment seems to contribute linearly to the size of the script tag that manages the ApolloSSRDataTransport in the browser as part of the SSR streaming process. Since we are subscribing to updates for the same location in the cache, we'd expect some kind of deduplication or normalization to prevent so much redundancy.

Example

You can find a reproducing project forked from this repo here.

In order to visualize the problem, follow these steps:

1. Run the project with yarn dev
2. Navigate to the new page that uses just one useFragment call at: http://localhost:5001/cc/use-fragment/1
   1. Inspect the page source / ApolloSSRDataTransport script tag for this page

      A pretty-printed example is shown below:
```html ```
3. In a new tab, navigate to the new page that uses 24 of the same useFragment call at: http://localhost:5001/cc/use-fragment/24
   1. Inspect the page source / ApolloSSRDataTransport script tag for this page. Compare it to the results in step 2.i and see that it's considerably larger and full of mostly redundant information

      Find a pretty-printed example below that includes 23 more duplicate entries for the same Poll data
```html ```
Feel free to continue testing with as many fragments as you'd like to see how it affects the transport. The number of rendered items is configurable in the URL:
```
/cc/use-fragment/[num-items]
```
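To make the "contributes linearly" observation concrete, here is a rough model of it. This is only a sketch: the payload shape, the `hookId` and `rehydrate` field names, and the Poll fields are all hypothetical stand-ins, not the actual ApolloSSRDataTransport format.

```typescript
// Hypothetical Poll shape; the real fields in the reproduction may differ.
type PollSnapshotModel = { __typename: "Poll"; id: string; question: string; totalVotes: number };

// Hypothetical model of the streamed payload: one full snapshot entry per hook call,
// even though every entry carries the same Poll data.
function buildModelPayload(hookCount: number): string {
  const snapshot: PollSnapshotModel = {
    __typename: "Poll",
    id: "1",
    question: "What is your favorite language?",
    totalVotes: 42,
  };
  const entries = Array.from({ length: hookCount }, (_, i) => ({
    hookId: `hook-${i}`, // assumed per-hook identifier
    result: { data: snapshot, complete: true },
  }));
  return JSON.stringify({ rehydrate: entries });
}

const sizeForOne = buildModelPayload(1).length;
const sizeFor24 = buildModelPayload(24).length;
console.log({ sizeForOne, sizeFor24, ratio: sizeFor24 / sizeForOne });
```

In this model, every additional hook adds roughly one full copy of the snapshot to the payload, which matches the growth described above.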

phryneas commented 2 months ago

I fear there's no real way around that :(

Let me try to explain:

What you see here is not us transporting cache data - it's something different: we transport a snapshot of your hook, to prevent React rehydration errors.

The problem here is twofold:

- Hooks that render on the server don't immediately render in the browser; they can wait (for a very long time) for a hydration boundary to finish.
- Apollo Client is a normalized cache, so responses to other queries or mutations can influence the cache value of a different hook.

So there is a possible scenario where your useFragment renders on the server (so the HTML is already generated and flushed to the browser), but doesn't get rehydrated and added to your visible browser DOM because something else in the same tree is suspending for a long time. Meanwhile, a cache update happens in the browser that would give that useFragment call a different result. Once your suspended tree finally finishes on the server, the HTML is added to the visible DOM and React re-runs the component to see if it would actually render the same. You get a hydration mismatch error. And while usually a hydration mismatch is something React can recover from by throwing away all work from the server and restarting everything on the client, I've seen cases where it completely crashed the page. Also, depending on your architecture, that client-side restart can be a lot of work.

The whole thing looks like this (taken from my RFC).

sequenceDiagram
  participant GQL as Graphql Server

  box gray Server
  participant SSRCache as SSR Cache
  participant SSRA as SSR Component A
  end
  participant Stream

  box gray Browser
  participant BCache as Browser Cache
  participant BA as Browser Component A
  end

  participant Data as External Data Source

  SSRA ->> SSRA: render
  activate SSRA
  SSRA -) SSRCache: query
  activate SSRCache
  Note over SSRA: render started network request, suspend
  SSRCache -) GQL: query A
  GQL -) SSRCache: query A result
  SSRCache -) SSRA: query A result
  SSRCache -) Stream: serialized query A result
  deactivate SSRCache
  Stream -) BCache: add query A result to cache
  SSRA ->> SSRA: render
  Data -) BCache: cache update
  SSRA ->> SSRA: other children of the suspense boundary still need more time
  Note over SSRA: render successful, suspense finished
  SSRA -) Stream: transport
  deactivate SSRA
  Stream -) BA: restore DOM
  BA ->> BA: rehydration render
  Note over BA: ⚠️ rehydration mismatch, data changed in the meantime

So, to prevent these hydration errors, we essentially snapshot the value a hook had at the time it first rendered on the server, transport that over, render it once with that value, and if it differs from the actual cache contents, we immediately rerender with the current cache contents.
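The "render once with the snapshot, then rerender if the cache differs" idea can be sketched in a few lines. This is a simplified model, not the package's implementation: `renderSequenceOnHydration` and `readCache` are hypothetical names, and the JSON comparison is just a stand-in for whatever equality check is really used.

```typescript
// Returns the sequence of values a hook would render with during hydration.
// `serverSnapshot` is the value captured when the hook first rendered on the server;
// `readCache` is a hypothetical function that reads the current browser cache value.
function renderSequenceOnHydration<T>(serverSnapshot: T, readCache: () => T): T[] {
  // The first render reuses the server value so the output matches the streamed HTML.
  const renders: T[] = [serverSnapshot];

  // If the browser cache has moved on in the meantime, rerender with the current value.
  const current = readCache();
  if (JSON.stringify(current) !== JSON.stringify(serverSnapshot)) {
    renders.push(current);
  }
  return renders;
}

// The browser cache changed between the server render and hydration.
const staleThenFresh = renderSequenceOnHydration({ totalVotes: 42 }, () => ({ totalVotes: 43 }));
console.log(staleThenFresh.length);
```

The first entry is what avoids the hydration mismatch. The optional second entry is the immediate rerender that brings the component back in sync with the cache.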

All to prevent that hydration mismatch 🤦

Now, the thing is: each of those is a snapshot of that individual hook, at that individual point in time. They are not the same object, and they could have slight variances. So deduplicating them is hard, and the deduplication logic would add bundle size that in most use cases outweighs the data you'd actually save. So we don't.

That said, we do have escape hatches you could use to try and add your own deduplication logic (and if you come up with something good, please share it!):

buildManualDataTransport accepts optional stringifyForStream and reviveFromStream callback options, and you could use those to create a modified version of ApolloNextAppProvider. Here's the "normal" implementation without those options: https://github.com/apollographql/apollo-client-nextjs/blob/da4c6f8705bd78a789073521cda81c1a8e5afe01/packages/experimental-nextjs-app-support/src/ApolloNextAppProvider.ts#L48-L66
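As a starting point for such custom logic, here is a minimal sketch of one possible deduplication scheme: structurally identical objects are stored once in a table and replaced by references. Everything in it is an assumption for illustration. `encodeDeduped`, `decodeDeduped`, and the `$ref` marker are hypothetical names, it only handles JSON-compatible values, and it does not show how the result would be wired into stringifyForStream and reviveFromStream, whose exact signatures should be checked in the package source.

```typescript
// JSON-compatible values only; `undefined`, Dates, etc. are out of scope for this sketch.
type JsonValue = null | boolean | number | string | JsonValue[] | { [key: string]: JsonValue };
type DedupedPayload = { table: JsonValue[]; root: JsonValue };

// Hypothetical marker key. Assumes real data never contains an object whose only key is "$ref".
const REF_KEY = "$ref";

// Replaces every object with a reference into a table of unique objects (built bottom-up).
function encodeDeduped(value: JsonValue): DedupedPayload {
  const table: JsonValue[] = [];
  const indexBySerialized = new Map<string, number>();

  const visit = (node: JsonValue): JsonValue => {
    if (node === null || typeof node !== "object") return node;
    if (Array.isArray(node)) return node.map((item) => visit(item));
    const encoded: { [key: string]: JsonValue } = {};
    for (const [key, child] of Object.entries(node)) encoded[key] = visit(child);
    // Objects that serialize identically share one table slot.
    const serialized = JSON.stringify(encoded);
    let index = indexBySerialized.get(serialized);
    if (index === undefined) {
      index = table.push(encoded) - 1;
      indexBySerialized.set(serialized, index);
    }
    return { [REF_KEY]: index };
  };

  return { table, root: visit(value) };
}

// Reverses encodeDeduped by resolving references back into full objects.
function decodeDeduped(payload: DedupedPayload): JsonValue {
  const visit = (node: JsonValue): JsonValue => {
    if (node === null || typeof node !== "object") return node;
    if (Array.isArray(node)) return node.map((item) => visit(item));
    const ref = node[REF_KEY];
    if (typeof ref === "number" && Object.keys(node).length === 1) {
      return visit(payload.table[ref]);
    }
    const decoded: { [key: string]: JsonValue } = {};
    for (const [key, child] of Object.entries(node)) decoded[key] = visit(child);
    return decoded;
  };
  return visit(payload.root);
}

// 24 identical snapshots of a hypothetical Poll object.
const examplePoll = { __typename: "Poll", id: "1", question: "What is your favorite language?", totalVotes: 42 };
const duplicatedSnapshots: JsonValue = Array.from({ length: 24 }, () => ({ ...examplePoll }));
const plainSize = JSON.stringify(duplicatedSnapshots).length;
const dedupedSize = JSON.stringify(encodeDeduped(duplicatedSnapshots)).length;
console.log({ plainSize, dedupedSize });
```

Snapshots that differ even slightly get their own table entries, so the scheme stays correct when hooks captured different values; it just saves less. Objects with the same content but a different key order are also not merged, and every unique object pays a small overhead for its reference, so whether this is a net win depends on how repetitive the payload is.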

maciesielka commented 2 months ago

@phryneas thanks for the quick and super-detailed response! This helps our understanding of this particular problem and the streaming functionality 10x.

I'll close this since there's no action item here.


apollographql / apollo-client-nextjs

Multiple `useFragment` uses for same fragment data creates large transport size #344
