Shopify / hydrogen-v1

React-based framework for building dynamic, Shopify-powered custom storefronts.
https://shopify.github.io/hydrogen-v1/
MIT License
3.75k stars · 327 forks

Integrate support for batching server-side GraphQL requests #53

Closed lancelafontaine closed 2 years ago

lancelafontaine commented 2 years ago

👋 Hi! I've recently been looking into how we can best support third-party integrations in Hydrogen. While Hydrogen is typically focused on serving components from our own first-party storefront API, most headless merchants still end up needing to integrate with third-party services. I've been looking into how we can be as opinionated as possible about this integration so that we can offer much better performance & ergonomics than manually fetching an external API on the client and massaging all the data yourself.

We've been documenting some thoughts about how we can best achieve that over here, but ultimately we think we can add the most value to fetching third-party data with three things: caching, batching, and mapping. Why these three strategies?

Here's a code example illustrating what the ergonomics of all server fetches in Hydrogen could look like, if you squint:

  const query = gql`
    query product($handle: String!) {
      product(handle: $handle) {
        id
      }
    }
  `;
  // These three shop queries are batched across hook calls (or across components within a Suspense boundary) into a single GraphQL request
  const {data: data1} = useBatchedGraphQLQuery(query, { handle });
  // Duplicate query detected while batching, no need to perform additional work & hashes to the same cache key
  const {data: data2} = useBatchedGraphQLQuery(query, { handle });
  // Cache-aware batching crafts the smallest request necessary, incorporates individual cache hits and makes use of H2 caching strategy
  const {data: data3} = useBatchedGraphQLQuery(query, { handle: 'foo' }, {cache: { maxAge: 10 }});

  const selectionSet = gql`{
    title
    upc
  }`;
  // Takes an existing Shopify product record and requests data mapped to it via foreign key in an external CMS
  const {data: foo1} = useBatchedRecordsQuery(data.product, selectionSet, new SanityService(...));
  // Batches many records with the same selection set within the same query,
  // alongside other batched queries to the same service, while caching some records differentially
  const {data: foo2} = useBatchedRecordsQuery(
    {id: 'gid://shopify/Product/673085082834431'},
    selectionSet,
    new SanityService(...),
    { cache: { maxAge: 1} },
  );

There's even a "working" prototype with comments for something similar if you want to just dive right in! But in trying to implement this, a few complications arose, and they mostly center around Hydrogen's use of the react-query library for the useQuery hook.


We're currently using react-query (and previously useServerFetch) to wrap & schedule promises for execution so that we can render React components with the data they need on the server. Rendering React components asynchronously isn't supported, so the need for these sorts of hooks to fetch data on the server before rendering definitely seems reasonable.

Some of the reasons we originally adopted react-query were that it provided a more robust implementation for server-side requests that works with Suspense, that it works on the client, and that it comes with a built-in cache. However, some of the now-known drawbacks of adopting it as a library are that it doesn't support async fetches, and that the headers and other information about the response aren't exposed.

One more drawback that the map+cache+batch prototype uncovered is that useQuery simply won't allow for batching GraphQL requests, or delaying any async operation across hook invocations for that matter. react-query includes its own scheduler to resolve promises as eagerly as possible and offers no configurability for deferring execution across hooks. If proper batching support were considered a priority, our options would be to try to introduce batching support upstream in a library that wasn't designed for it, to maintain our own fork of react-query with batching support, or to return to writing our own implementation à la useServerFetch.

Although things sound dire, being in control of our own custom hook for server fetches (again) could be beneficial for us:

I think it could be interesting to entertain the idea of building our own server-side data-fetching hook. The problem has been solved before: react-frontload is a tiny library that could serve as inspiration for building one of these pseudo-synchronous hooks. Looking for feedback on what folks think about something like this!


All this being said, as we try to converge with RSC, it's unclear whether building a hook would be necessary for server fetches in the future, or if just a general data-fetching library will suffice.

Although data fetching in an RSC world doesn't seem to have been finalized, all examples point to the ability to make use of asynchronous libraries for data fetching in server components (e.g. db.notes.get(props.id) in the official RFC, and a new React server renderer that supports Suspense and waiting on asynchronous data on the server without double-rendering).


I'm relatively new to the world of React so I may be overlooking a couple things. Looking for some feedback on the general appetite for replacing useReactQuery, and also hoping this issue sparks some discussion about our server-side data-fetching patterns, both for now and in the future with RSC.

lemonmade commented 2 years ago

My understanding is that there is no way to implement batching with hooks and suspense, at least not for hooks run in the same component. Asynchronous hooks signal to React that they need to activate the nearest suspense boundary by throwing a promise. This means that, in the actual execution of a component, only the first hook that suspends is ever "seen" — React can’t just move past the error to execute the rest of the component because it may depend on the (non-error) result of the first suspending hook.

I haven't checked, but my guess is that React will at least find all the suspending components that are siblings in a single pass, which should allow for batching across components, but it might be confusing if that same behavior doesn't work within a single component.
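To make the single-pass constraint concrete, here's a small plain-JavaScript simulation of Suspense-style reads (hypothetical `read`/`render` helpers and fake data, not real React): the second read is never reached until the first one resolves and the component re-renders, so the two fetches run in series rather than batching.

```javascript
// Suspense-style read: throws the pending promise; the "renderer" retries
// the whole component function after that promise resolves.
const cache = new Map();

function read(key, fetcher) {
  let entry = cache.get(key);
  if (!entry) {
    entry = {done: false, value: undefined};
    entry.promise = fetcher(key).then(v => { entry.done = true; entry.value = v; });
    cache.set(key, entry);
  }
  if (!entry.done) throw entry.promise; // "suspend"
  return entry.value;
}

const fetchOrder = [];
const fakeFetch = key => { fetchOrder.push(key); return Promise.resolve(key.toUpperCase()); };

function Component() {
  const a = read('a', fakeFetch); // first render pass throws here...
  const b = read('b', fakeFetch); // ...so this fetch can't start until a retry
  return a + b;
}

async function render(component) {
  let attempts = 0;
  for (;;) {
    attempts += 1;
    try {
      return {result: component(), attempts};
    } catch (thrown) {
      await thrown; // wait for the suspended fetch, then re-render
    }
  }
}

render(Component).then(({result, attempts}) => {
  console.log(result, attempts); // "AB" 3: two serial suspensions, then a final pass
});
```

Three render attempts for two fetches is exactly the waterfall this thread is worried about.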

jplhomer commented 2 years ago

Thanks for the deep-dive @lancelafontaine! Super interesting to get all that context. I really like the ergonomics you propose, too.

I think @lemonmade nails it with regard to the hooks run in the same component. Thankfully, it seems unlikely that a developer would query the same data source with different queries in the same component.

However, this raises a bigger issue with how Suspense data fetching works and our decision to use react-query.

As you've unpacked, we recently decided to use react-query instead of our own server-fetching hook. This is because we struggled with our implementation of a cache across suspended render attempts.

Suspense works by throwing pending Promises to the nearest <Suspense> boundary. When those promises resolve, they indicate to the React renderer that it's time to try rendering again. This behavior is abstracted away from the developer and instead lives within data-fetching libraries (like react-query, react-fetch, react-fs, react-pg, etc). This is why asynchronous functions look synchronous when they're really suspending.
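As a minimal sketch of that abstraction (a hypothetical `createResource` helper, not the actual API of any of those libraries):

```javascript
// Wraps a promise so consumers can call a synchronous-looking read():
// while the promise is pending, read() throws it, which React catches
// at the nearest <Suspense> boundary and uses to schedule a retry.
function createResource(promise) {
  let status = 'pending';
  let result;
  const suspender = promise.then(
    value => { status = 'done'; result = value; },
    error => { status = 'error'; result = error; },
  );
  return {
    read() {
      if (status === 'pending') throw suspender; // suspend
      if (status === 'error') throw result;      // surface the failure
      return result; // looks like a plain synchronous return
    },
  };
}

// Usage: component code just calls resource.read() as if it were sync.
const resource = createResource(Promise.resolve({title: 'Snowboard'}));
```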

While I was struggling to find a good solution for an inter-render cache, react-query seemed to handle it perfectly. That's pretty much the only reason I chose to adopt it, beyond having really slick useQuery ergonomics. It sounds like you've been diving into the source code more than I have, so it's possible you know even more about that implementation than I do at this point 😄

It's becoming more apparent that our limited use of react-query for server-side data fetching may not be the best fit for Hydrogen. We'll probably end up writing a custom hook which uses React's new (experimental) built-in Suspense cache, or adopting Suspense-friendly IO helpers like react-fetch.

So, can batching work with Suspense?

Gosh — maybe?

From what I understand, DataLoader manages an internal clock while it accepts keys to fetch from consumer code. After a certain time has passed (e.g. one tick of the Node.js event loop), it executes a batch request and returns the results to the consumer(s).

The consumers, meanwhile, are waiting on a Promise to resolve. This is where we'd need to adjust our expectations for a Suspense world: instead of returning a Promise, DataLoader would need to throw a promise.
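A rough sketch of what that could look like (hypothetical helper names, plain JavaScript rather than the real dataloader package): reads during one synchronous render pass enqueue their keys, a microtask flushes the queue as a single batched request, and callers suspend by throwing a shared promise.

```javascript
// DataLoader-style batching adapted to Suspense: load() throws instead of
// returning a promise, and all keys requested in the same tick share one
// batchFetch call.
function createBatchLoader(batchFetch) {
  const cache = new Map(); // key -> {key, done, value, promise, resolve}
  let queue = [];

  function flush() {
    const batch = queue;
    queue = [];
    batchFetch(batch.map(entry => entry.key)).then(values => {
      batch.forEach((entry, i) => {
        entry.done = true;
        entry.value = values[i];
        entry.resolve(); // wake every render suspended on this entry
      });
    });
  }

  return function load(key) {
    let entry = cache.get(key);
    if (!entry) {
      entry = {key, done: false, value: undefined};
      entry.promise = new Promise(resolve => { entry.resolve = resolve; });
      if (queue.length === 0) queueMicrotask(flush); // one flush per tick
      queue.push(entry);
      cache.set(key, entry);
    }
    if (!entry.done) throw entry.promise; // suspend until the batch lands
    return entry.value;
  };
}

// Simulated render pass over two sibling components: both suspend, but
// their keys end up in a single batched request.
const batches = [];
const load = createBatchLoader(keys => {
  batches.push(keys);
  return Promise.resolve(keys.map(key => `data:${key}`));
});

function tryRender(component) {
  try { return {value: component()}; } catch (thrown) { return {suspended: thrown}; }
}

const first = tryRender(() => load('product-1'));
const second = tryRender(() => load('product-2'));
```

Once the thrown promises resolve, retrying either component reads its value straight from the cache, and `batches` holds a single request containing both keys.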

Some notes, questions, caveats here:

lancelafontaine commented 2 years ago

Thank you both for the context! I'm still trying to wrap my mind around Suspense/React's execution model but this discussion is definitely making things a lot clearer 🙇

I think @lemonmade nails it with regard to the hooks run in the same component. Thankfully, it seems unlikely that a developer would query the same data source with different queries in the same component.

Suspense works by throwing pending Promises to the nearest boundary. When those promises resolve, they indicate to the React renderer that it's time to try rendering again.

Gotcha! This was a great explanation, thanks.

I could imagine a given component making two queries to two separate services. With this existing throwing-promises mechanism, it seems like those couldn't even be run in parallel (unless I'm misunderstanding).

I haven't checked, but my guess is that React will at least find all the suspending components that are siblings in a single pass, which should allow for batching across components

Yep, this would be a good opportunity to batch in "two dimensions":

  1. Distinct queries to the same service (e.g. useBatchedGraphQLQuery) across sibling components. This would allow for a future in which each component could be responsible for encapsulating the data that needs to be fetched for it to render properly, without incurring the cost of an additional query per component.
  2. The same query to the same service but with different records (e.g. useBatchedRecordsQuery) across sibling components. I could imagine this being useful for many identical components with the same selection sets but different records being rendered on the same page (e.g. one card per article in a blog, one card per product in a collection, etc.)

This is where we'd need to adjust our expectations for a Suspense world: instead of returning a Promise, DataLoader would need to throw a promise.

I haven't had the chance to look into the mechanics of this yet, but I'd expect that for batching to occur, these hooks would need to throw a Promise wrapping a literal value that only describes the GraphQL request that would need to be made to fulfill that data (without actually making the fetch calls), and that all dataloading would have to occur at the Suspense boundary where all thrown promises are aggregated and combined into one batched request. Definitely sounds interesting, although I wouldn't be surprised if we needed to dig into React's internals to tweak this behaviour.

E.g. if a server component loads a child server component, the child will not be included in the batch (because the parent will be suspended).

I'll spend more time thinking about this one, but my gut reaction is that without multiple passes, you're probably right that batched data fetching would only be applicable to sibling components and not child components 🤔 It could be worth investigating whether encouraging a pattern that can't fully be batched due to these constraints would result in a better experience than our current hoist-data-fetching-to-the-highest-level pattern.

lancelafontaine commented 2 years ago

Last thought: any idea whether any of this changes in an RSC world? Or any indication as to what data fetching might look like then? Mostly asking because the few resources I could find seem to indicate that this sort of Suspense-bound scheduling might not be necessary then, and that we could just rely on async libraries that might integrate dataloaders more simply (e.g. db.notes.get(props.id) in the official RSC RFC; that RFC even mentions "Waterfalls are still not ideal on the server, so we will provide an API to preload data requests as an optimization.")

jplhomer commented 2 years ago

I could imagine a given component making two queries to two separate services. With this existing throwing-promises mechanism, it seems like those couldn't even be run in parallel (unless I'm misunderstanding).

Yeah agreed that they can't run in parallel.

Last thought: any idea whether any of this changes in an RSC world?

Nothing really changes in an RSC world. React 18 introduces streaming SSR, and RSC is a somewhat separate concept of server/client component separation which still leverages the streaming SSR added in React 18.

FWIW, the db.notes.get(props.id) in the RSC RFC is likely alluding to something like react-pg which is Suspense-based (throws Promises until it resolves).

I hadn't heard of the preload optimization you pulled out. I'll ask the React team about that!

jplhomer commented 2 years ago

Update: I asked about preloading and got an in-depth answer from Sebastian!


There's no automatic framework that does this for you but all the React I/O APIs should expose a preload() method.

E.g. react-fetch has a fetch(...) API and a preload(...) API.

https://github.com/facebook/react/blob/main/packages/react-fetch/src/ReactFetchNode.js#L226

The principle is that most data fetching should be done in two steps:

  1. preload
  2. read

A preload just starts downloading the thing, whereas a read (e.g. fetch(...)) blocks and returns the actual result.

To achieve optimal performance it's often necessary to start the preload as early as you think you might need the data. I.e. it's better to fetch data you might not actually need before you need it, to lower the time spent waiting on it. However, if you do that for everything you might flood the network and incur too high a cost. This is not a problem that can be directly automated, since it's a constraint-solving problem and the ideal solution likely involves statistics. So there's a research/framework space to make this ideal.

However, in simple cases it's easy to just add manual preload.

Often that can be in a component closer to the root rather than deep in the tree. If you're always going to need some piece of data, you might as well start loading it as early as possible. However, it would be unnecessary to block on it until deeper in the tree, where you really need it.

Since preloads don't block, you can parallelize by calling multiple preloads.

function App() {
  preload('a.json');
  preload('b.json');
  preload('c.json');
  return <Stuff><Component /></Stuff>;
}

function Component() {
  let a = fetch('a.json').json();
  let b = fetch('b.json').json();
  let c = fetch('c.json').json();
  return <div>{a + b + c}</div>;
}

Sometimes it might not work and you have to put them in the same component but you should still use the same pattern:

function Component() {
  let items = fetch('items.json');
  for (let item of items) preload(item.url);
  let children = items.map(item => {
    let data = fetch(item.url).json();
    return <Item data={data} />;
  });
  return <div>{children}</div>;
}

It might be tempting to try to unify them into some kind of Promise.all thing, but that pattern doesn't scale up to help you preload earlier in a completely different component. In fact, IMO, once you get your head around thinking this new way, Promise.all starts looking kind of ridiculous.

Even in the example I used, it's often better to do the read within each item rather than in the parent. That allows React to also parallelize the rendering of children, and not block on all of them loading before it can start work on any of them.

function Component() {
  let items = fetch('items.json');
  for (let item of items) preload(item.url);
  let children = items.map(item => {
    return <Item url={item.url} />;
  });
  return <div>{children}</div>;
}

function Item({url}) {
  let data = fetch(url).json();
  // ...
}

I hadn't even considered preload in our world yet. I wonder if we can leverage this aspect in both 3P and 1P queries. Unfortunately, tossing all of our 3P queries into a useQuery type function makes this difficult.

There's also considerations for interacting with a subrequest cache and whether this method is compatible.

Pinging @igrigorik for visibility!

igrigorik commented 2 years ago

First, perhaps as an obvious disclaimer: preloading and batching are entirely different optimizations. I suspect we should spin the preload discussion out into a separate thread. With that as context...

Eager-fetch of critical 1P+3P data is definitely something we should explore. Unlocking this requires either an explicit signal from the developer that a data fetch nested in the component tree should be done early, or automated smarts to learn or heuristically determine these optimizations.

Naively, using Suspense already gets you pretty far by eliminating blocking fetch-and-render-before-proceed boundaries, allowing fetches to be kicked off and streamed in parallel. There might still be room for "start it earlier", but it's not clear how much that will unlock. As an idea, perhaps one thing we could experiment with is adding some form of preload: true option to useQuery, which we could then leverage via static analysis? Plumbing this knowledge into the runtime is another matter, though.
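For illustration, a tiny sketch of what a `preload: true` option could mean (a hypothetical `useQuery` shape and sample data, not Hydrogen's actual API): the preload call warms the cache without suspending, so a later plain read deeper in the tree finds the data already in flight or ready.

```javascript
// Hypothetical cache-backed useQuery with a fire-and-forget preload mode.
const cache = new Map();

function useQuery(key, fetcher, {preload = false} = {}) {
  let entry = cache.get(key);
  if (!entry) {
    entry = {done: false, value: undefined};
    entry.promise = fetcher().then(v => { entry.done = true; entry.value = v; });
    cache.set(key, entry);
  }
  if (preload) return undefined;        // just start the fetch; never suspend
  if (!entry.done) throw entry.promise; // normal read: suspend until ready
  return entry.value;
}

// Near the root: kick off the fetch early without blocking this component.
useQuery('shop', () => Promise.resolve({name: 'Snow Devil'}), {preload: true});
```

A later `useQuery('shop', ...)` without the option then reads from the warmed cache instead of starting a second request.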

In short, I propose we open a new RFC/discussion thread on preload to explore this space, but my intuition is that proper use of Suspense will already get you pretty far and we should first see & gather some telemetry here.

jplhomer commented 2 years ago

Update to interested parties: we shipped preloading support in https://github.com/Shopify/hydrogen/pull/700 using preload: true and preload: '*' 🎉

Additional work could be done to analyze and combine GraphQL queries in a way that doesn't create an even slower or more complex resulting query.

For now, we should see how this works for our use case.