nodejs / undici

An HTTP/1.1 client, written from scratch for Node.js
https://nodejs.github.io/undici
MIT License

fetch: caching #2760

Closed Ethan-Arrowood closed 1 week ago

Ethan-Arrowood commented 8 months ago

Based on the numerous todo: cache comments (such as: https://github.com/nodejs/undici/blob/eb884c5f792345e4f13d0e555a68ebb06020b796/lib/fetch/index.js#L1394) I assume our fetch implementation does not support any kind of caching?

I was reading this MDN documentation: https://developer.mozilla.org/en-US/docs/Web/API/Request/cache and trying to demo the default cache behavior with a test script like:

import http from 'node:http';
import events from 'node:events';

const server = http.createServer((req, res) => {
  console.log('hello from server');
  res.writeHead(200, { 'Content-Type': 'application/json', 'Cache-Control': 'max-age=604800' });
  res.end(JSON.stringify({
    data: 'Hello, World!'
  }));
});

server.listen();

await events.once(server, 'listening');

const address = server.address();

console.log(address);

const r = await fetch(`http://localhost:${address.port}`);
await r.json();

console.log(`status ${r.status}`);

const r2 = await fetch(`http://localhost:${address.port}`);
await r2.json();

console.log(`status ${r2.status}`);

server.close();

But this logs hello from server twice, meaning the second request was not served from any cache.

Ethan-Arrowood commented 8 months ago

Related: https://github.com/nodejs/undici/issues/1146

KhafraDev commented 8 months ago

This is already possible with CacheStorage; it's just not automatic (and it is an in-memory cache right now).

Adapted from MDN:

import { caches, Request, fetch } from 'undici'

async function cacheFetch (url, init) {
  const request = new Request(url, init)

  return caches.match(request).then((response) => {
    if (response !== undefined) {
      return response
    } else {
      return fetch(request)
        .then((response) => {
          // response may be used only once
          // we need to save clone to put one copy in cache
          // and serve second one
          const responseClone = response.clone()

          caches.open('v1').then((cache) => {
            cache.put(request, responseClone)
          })

          return response
        })
        .catch(() => { /* handle error case */ })
    }
  })
}

KhafraDev commented 8 months ago

I don't think automatic caching is a good idea. It's possible via the dispatcher API in a way that is extensible and configurable (you can use whatever storage method you desire). From the issue linked, there are already quite a few ideas on how to handle caching. I don't think it's something we can do in a way that would be acceptable for most people.

Ethan-Arrowood commented 8 months ago

I feel like we should have some path to interoperable caching. I agree it shouldn't be default behavior, but the issue I'm facing is that Next.js is patching the global fetch in order to implement caching roughly as specified. We really shouldn't have to patch it to accomplish this when the global implementation should be spec compliant.

KhafraDev commented 8 months ago

We are spec compliant, caching is completely optional. httpCache can be null and therefore caching is ignored:

If httpCache is null, then set httpRequest’s cache mode to "no-store".

I wouldn't be against it if there was a reasonable suggestion, but I don't think there will be an option that would satisfy everyone. Adding options to RequestInit is out of the question for me, and the only alternative then would be an option on dispatcher. But then you'd still have to rely on undici & our non-standard options to have caching regardless.

I'd prefer if caching was done in a lower level of undici, rather than just being for fetch. It's hard to get right and would be a massive benefit for everyone. That way we won't have to worry about standards while offering options to opt-out/opt-in to it and providing an extensible API.

Ethan-Arrowood commented 8 months ago

I was thinking about something similar to getGlobalDispatcher; we could add enableGlobalDispatcherCaching() or something like that.

Ethan-Arrowood commented 8 months ago

Lower-level would also be supreme

metcoder95 commented 8 months ago

If it happens at the undici level through get/setGlobalDispatcher, I'd suggest creating a dispatcher-like implementation that is interoperable with the Dispatcher base.

For instance, a new Agent-like class, or a new option within Agent, would be a good start, disabled by default.

KhafraDev commented 8 months ago

yeah, that's what I had in mind. Rather than undici implementing it, provide some interface that other caching libraries can use instead.

mcollina commented 8 months ago

I think it would be great to add a caching interface at a lower level, with an optional reference implementation for in memory storage. Practically, this can be a special CachingDispatcher.


My understanding is that Next.js implements request deduplication, and not just caching, and that it overloads fetch even in the browser (but I might be wrong; it would be good to have a link for what it does). @Ethan-Arrowood can you organize a deep-dive session with the Next team to go through what they do?
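As a rough illustration of the CachingDispatcher idea above, here is a hedged sketch of a decorator around an existing dispatcher that memoizes GET responses in a Map. It assumes only the dispatch(opts, handler) shape of undici's Dispatcher API; the class name, cache key, and Map-backed store are illustrative, not an actual undici API, and real freshness logic (RFC 9111) is omitted:

```javascript
// Sketch: a decorating dispatcher with an in-memory reference store.
// Names and key scheme are hypothetical; freshness checks are omitted.
class CachingDispatcher {
  constructor (inner) {
    this.inner = inner       // the wrapped dispatcher, e.g. an Agent
    this.store = new Map()   // in-memory reference store
  }

  dispatch (opts, handler) {
    if (opts.method !== 'GET') {
      return this.inner.dispatch(opts, handler)
    }
    const key = `${opts.origin}${opts.path}`
    const hit = this.store.get(key)
    if (hit) {
      // Cache hit: replay the stored response without touching the network.
      handler.onHeaders(hit.statusCode, hit.headers, () => {})
      handler.onData(hit.body)
      handler.onComplete([])
      return true
    }
    // Cache miss: forward to the inner dispatcher and record the response.
    const chunks = []
    let statusCode
    let headers
    return this.inner.dispatch(opts, {
      onHeaders: (code, h, resume) => {
        statusCode = code
        headers = h
        return handler.onHeaders(code, h, resume)
      },
      onData: (chunk) => {
        chunks.push(chunk)
        return handler.onData(chunk)
      },
      onComplete: (trailers) => {
        this.store.set(key, { statusCode, headers, body: Buffer.concat(chunks) })
        return handler.onComplete(trailers)
      },
      onError: (err) => handler.onError?.(err)
    })
  }
}
```

With this shape, something like setGlobalDispatcher(new CachingDispatcher(new Agent())) would be the explicit opt-in step, which also lines up with the enableGlobalDispatcherCaching() idea discussed above.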

Ethan-Arrowood commented 8 months ago

It is server side only for Next.js : https://github.com/vercel/next.js/blob/0a796d66f395e4eecb0c054aca30b23952fe9414/packages/next/src/server/lib/patch-fetch.ts

Browser is left alone afaik.

Can definitely deep-dive on this and maybe even drive this contribution.

mcollina commented 8 months ago

From a quick look it does not seem compliant with the RFCs; it's also doing quite a lot of things for static generation.

I think that if we do a cache, it would need to be compliant with the HTTP RFCs.

Ethan-Arrowood commented 8 months ago

Yes, agreed. I think the intention is for it to behave like the specified fetch cache behavior. No guarantees, of course. I'm advocating that we move toward a compliant caching implementation.

luiscastro193 commented 7 months ago

Something like enableGlobalDispatcherCaching() would be awesome. It is hard to find a simple solution that takes into account directives like max-age, while in the browser you just take it for granted.
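The max-age handling mentioned above could be sketched roughly like this. This is a hedged, minimal version of the freshness check described in RFC 9111 (it ignores Age, s-maxage, no-cache, and heuristic freshness); the helper names are illustrative, not an undici API:

```javascript
// Sketch: decide whether a stored response is still fresh based on the
// Cache-Control max-age directive. Helper names are hypothetical.
function parseMaxAge (cacheControl) {
  // e.g. 'public, max-age=604800' -> 604800; null if the directive is absent
  const m = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl ?? '')
  return m ? Number(m[1]) : null
}

function isFresh (storedAtMs, cacheControl, nowMs = Date.now()) {
  const maxAge = parseMaxAge(cacheControl)
  if (maxAge === null) return false  // no explicit lifetime: treat as stale
  const ageSeconds = (nowMs - storedAtMs) / 1000
  return ageSeconds < maxAge
}
```

With the Cache-Control: max-age=604800 response from the test script earlier in this thread, a stored response would be considered fresh for up to a week after it was cached.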

mcollina commented 7 months ago

My take is that we'll end up having a new CacheAgent(agent) or some other form of decoration for the agents that will intercept and store the data.

mcollina commented 2 months ago

This should be implemented as an interceptor.
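An undici interceptor is, in shape, a function that receives the next dispatch and returns a wrapped dispatch. The sketch below shows only that shape, exercised with a stand-in dispatch function so it runs standalone; the factory name and callback are illustrative, and caching logic would live inside the wrapper. Treat the exact wiring onto a real dispatcher (e.g. via compose on recent undici versions) as an assumption:

```javascript
// Sketch of the interceptor shape: next dispatch in, wrapped dispatch out.
// A cache would live inside the wrapper; here it only records the
// cache-relevant request metadata, to keep the example self-contained.
function makeInterceptor (onRequest) {
  return function interceptor (dispatch) {
    return function interceptedDispatch (opts, handler) {
      // Observe the request (a real cache would consult its store here).
      onRequest({ method: opts.method, origin: opts.origin, path: opts.path })
      return dispatch(opts, handler) // fall through to the next dispatch
    }
  }
}
```

In undici this kind of interceptor would be attached to a dispatcher rather than called directly, e.g. something along the lines of agent.compose(makeInterceptor(cb)), with the caveat that the composition details here are an assumption, not confirmed by this thread.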