[useResponseCache] Add the computed `scope` in `buildResponseCacheKey` function params

Dunk14 commented 5 months ago

The problem

Currently the scope PUBLIC creates a cache key per user, since I include the session in the key string.

But I'd like to have only ONE key globally shared when it is PUBLIC, and a key per session when it is PRIVATE.

A viable solution

To be able to generate a key that is shared across all users, I'd need a property that tells the final computed scope, named scope or isPrivate.

Maybe I'm missing something, I didn't find any way to reach this goal with the current state of the plugin. Is it something you considered or conversely that you don't want to include in the response-cache ?

To me it appears like a light solution, that could bring more flexibility when building the cache key.

This update should be reflected in @graphql-yoga/plugin-response-cache too.

EmrysMyrddin commented 5 months ago

If I understand your problem properly, you have a custom key builder right ? And it includes the session id in the key ?

The problem is that with current implementation, giving access to the scope is probably not that easy.

The key is built very early in the pipeline, which prevents us from giving access to some things that are computed later, such as the TTL or the scope.

I have to read the code again to verify if it is doable.

Dunk14 commented 5 months ago

Yes we have to build a custom key because of some context variables needed to fluctuate the key, the session id is one of them.

I checked the code and the key is indeed built early, the scope is computed later from the result.data. It is still possible to get it I think considering that scopePerSchemaCoordinate already has the scopes from the schema.

The final scope could be calculated from parsing the query itself before getting the cache key, then getting the object keys in an array and passing them to the isPrivate function. Maybe memoize it to prevent later calculation with the same query.

Dunk14 commented 5 months ago

On apollo-response-cache plugin when the scope is PUBLIC it saves only one key in cache, the parsing behaviour could replicated from their source code.

EmrysMyrddin commented 5 months ago

I have checked, I confirm that this is not possible without sacrificing performance. Our implementation of response-caching is based on the actual response returned after the execution, which means that the cache key can't be based on the scope.

This allows us for example to entirely skip the parse/validation phases, which can be a major performance boost on medium to large requests.

On Yoga, it also allows us to take advantage of the HTTP caching mechanism, meaning we don't even have to actually consume the body of the request to send back the response on cache hit.

The scope is here only to enforce the fact that a response should never be cached if there is no sessionId with a PRIVATE scope.

Dunk14 commented 5 months ago

The main problem we currently have with the response cache plugin is that it creates a cache entry per session even on PUBLIC scopes.

We are migrating from Apollo for the performances of Yoga and the killer feature invalidationViaMutation, but we can't afford to have a query processed by session on these PUBLIC queries.

I understand your concerns about performances issues and latency in general. A potential solution would be to memoize the query or a hash with the final scope in a Map to avoid parsing it again.

Perhaps I don't get the whole picture, tell me if so.

Anyhow know that I'd be delighted to help adding this feature !

EmrysMyrddin commented 5 months ago

Yes that's probably the way to go.

I'm not sure how we should provide this feature.

I see tow approach:

The plugin itself keeps a map of operation hash => { scope, ttl } and provides this to the cache key builder.
The plugin adds some way to hook into the caching mechanism and provide some callback to be called with document and the derived metadata. Then, each users can decide to plug it's own logic, like in your case keep a list of public operation hash

In case of the second solution, we could change the Cache.set signature to also pass the scope to it. This way, you could just make a custom cache store implementation.

Example:

import { createInMemoryCache } from '@envelop/response-cache'

export function createCache(options) {
  const cache = createInMemoryCache()
  const publicDocumentsHash = [] // If you have a lot of different public document, a Set can be more efficient

  return {
    set(responseId, result, collectedEntities, ttl, metadata) {
      const { scope, documentString } = metadata
      if(scope === "PUBLIC"({
        if(!publicDocumentsHash.includes(hash) {
          publicDocumentsHash.push(hash)
        }
      }
      return cache.set(responseId, result, collectedEntities, ttl, metadata)
    },
    get(...args) {
      return cache.get(...args)
    },
    invalidate(...args) {
      return cache.invalidate(...args)
    },
    isPublic(documentString) {
      return publicDocumentsHash.includes(hash256(documentString))
    }
  }
}

EmrysMyrddin commented 5 months ago

@ardatan What do you think ? I tend to prefer the second solution, because it will not cause any memory penalty for users that doesn't need this. In the other hand, the boilerplate to handle this in user land is not that straight forward.

Aloxbro commented 5 months ago

@EmrysMyrddin could you review the job done by @Dunk14 ? It seems to be a working solution.

n1ru4l / envelop

[useResponseCache] Add the computed `scope` in `buildResponseCacheKey` function params #2181

The problem

A viable solution