mcollina / async-cache-dedupe

Async cache with dedupe support
MIT License
637 stars 40 forks source link

Using async cache dedupe for http request caching #99

Open ronag opened 1 year ago

ronag commented 1 year ago

I would like to use async cache dedupe for http request caching. However, I'm missing a way to set the stale value on a per entry basis after the request is completed, i.e.

cache.define('request', async url => {
  const { headers, body } = await undici.request(url)
  if (headers['cache-control') {
     // Set the stale/ttl for the return value
  }
  return await body.json()
})

Any ideas whether this would be possible to add?

mcollina commented 1 year ago

If not let's make it so. Wdyt @simone-sanfratello?

simone-sanfratello commented 1 year ago

There are a few decision we should take, let me explain

how to identify the request

The function should identify the request by:

querystring and headers fields need to be selected, that may be fields that can be ignored because don't affect the response (for example user-agent most of the times)

My idea is to provide a default function with options to do so, along with the possibility to use a custom function, where the output must be an object (or a string) that identify the request (note for further optimization, since request has a know structure, should use a custom serializer)

example - pseudocode


export function defineCache ({ cache, options }) {
  cache.define('request', {
      serialize: options.serialize ?? async (request) => {
        // TODO implement an ergonomic and performant way to describe so, to be passed in options

        for(const option in options) {
          if (match(request, option)) {
            return serializeRequest(request, option)
        }

        // don't use cache if no matched
      }
    },
    (request) => undici.request(request)
  })
}

// example of options, there will be probably more ways or different ones, to be defined during implementation
const options = [
  {
    url: /$https:\/\/server\/user\/[0-9a-f]+/,
    method: 'GET',
    headers: ['content-type','authorization'],
    query: [] // ignore querystring
  },
  {
    url: 'https://server/users',
    method: 'GET',
    headers: ['content-type'],
    query: '*' // consider the whole querystring
  },
]

using a custom logic

options.serialize = (request) => {
  if (request.url.match('http://server/users/:id') && request.url == 'GET') {
    return serializeRequest({ 
      method, 
      url, 
      querystring,
      headers: pick(request.headers, ['authorization']) })
  }

  if (request.url == 'http://server/items' && request.url == 'GET') {
    return serializeRequest({ 
      method, 
      url, 
      querystring,
      headers: pick(request.headers, ['content-type']) })
  }

  // else don't use cache
}

how to get and set headers

Imo the function should provide the capabilities for the user to:

  cache.define('request', {
    ttl: 30,
    // ...
  },
    (request) => {
      // HERE request.headers['cache-control'] will be set to 'max-age=30'
      return undici.request(request)
    }
  })

This is not currently supported, should be implemented

This function should support the http caching headers, for example: if etag, last-modified Same for vary and optionally for old http caching headers (pragma, expires)

  cache.define('request', {
    ttl: 30,
    // ...
    onResponse: (response) => {
      // if response.headers['cache-control'] is 'max-age=30'
      if (response.headers['cache-control']) {
        const cacheControl= parseCacheControl(response.headers['cache-control'])
        const ttl = cacheControl['max-age']
        return { ttl, response }
        // same for stale as in ronag example
      }
    }
  },
    (request) => {
      // request here will have here
      // request.headers['cache-control'] = 'max-age=30'
      return undici.request(request)
    }
  })

onResponse is not supported, should be implemented

Same as above: if cache is been used, and server request is not performed, response headers may add caching headers like x-cache: hit from cache, age: 12, cache-control: max-age=60

how to release it / how to use it

Could be releases as a plugin of this library like 'undici-cache' or so, so the usage can be

import cache from 'undici-cache'
import undici from 'undici'

const undiciCached = cache(undici, options)

// undiciCached has the undici methods
undiciCached.request(...)

@mcollina @ronag you opinion?

While it's not a small piece of work, that would be a really cool feature, I'd be very happy to write it, let me write a poc

ronag commented 1 year ago

I think you can be more generic that onResponse, maybe onResult?

simone-sanfratello commented 1 year ago

Sure, it's just a draft to share ideas

mcollina commented 1 year ago

Agreed!