RFC: Option to invalidate document cache with a query

urql-graphql / urql

The highly customizable and versatile GraphQL client with which you add on features like normalized caching as you grow.

https://urql.dev/goto/docs

MIT License

8.61k stars 449 forks source link

RFC: Option to invalidate document cache with a query #3223

Closed hizo closed 1 year ago

hizo commented 1 year ago

Summary

I found useful invalidating document cache by providing additionalTypenames to mutation. I have a specific usecase, where I'd want to do the same with a query. From default cache exchange source it's clear that queries only "mark" the results with additionalTypenames.

I currently have a working solution, by copying over the exchange's source and tweaking it and I'm interested in creating a PR to contribute back.

Why

when you don't want to switch to graphcache as document cache suits you
when you don't want to use requestPolicy: 'network-first' for the query you need to invalidate as that would require you to implement additional logic when to apply the policy (often complicated)
when you don't want to refetch a query straight away (ie. you might be on the page displaying the query results and don't want to switch to loading again, but you only want to refetch on the next page visit)
when you don't trigger a mutation (see specific usecase below)

Usecase: On a specific event (subscription in my case) I imperatively trigger a query via client.query. This query should in turn invalidate some results in cache (list of items), since it fetches a fresh item from backend. As I use document cache, subscription doesn't invalidate anything in cache and urql doesn't provide a way to access cache. The triggered query following a subscription callback is the only place I can invalidate.

Proposed Solution

By providing an option in OperationContext, eg.

client.query(query, variables, {
  additionalTypenames: ['typename_to_invalidate'],
  invalidateAdditionalTypenames: true
})

I was able to invalidate cache while the request with the provided context option was passing through cache exchange.

The logic to invalidate cache is already present, so I just abstracted it into a function and added

// invalidates query's typenames provided in additionalTypenames param
// when invalidateAdditionalTypenames in the operation context is set to true
if (
  response.operation.kind === 'query' &&
  response.operation.context.invalidateAdditionalTypenames &&
  response.data
) {
  invalidateCache(response.operation.context.additionalTypenames)
}

Would you be interested in this?

kitten commented 1 year ago

We actually explicitly haven't implemented a mechanism for this and there's some older issues around this request.

We do however want to encourage everyone to copy the default cache and modify it as needed, if they get to a point where it doesn't serve their use-case.

This is similar to Apollo's refetchQueries API and we explicitly don't believe in this approach. It basically is equivalent to making a function that just plainly explicitly refetches queries. Similarly, for more complex cases it's intended that Graphcache is sufficient to update or invalidate requests.

I understand that there's a middle ground here and we've discussed in the past how to lower the migration cost of moving to Graphcache, but I still believe this isn't it.

Basically, if your query is just "local" and doesn't affect the other queries, since they're only passively modified, then as you said the side-effect that causes queries to be considered stale comes from something else.

If this is not a mutation but a subscription, I'd actually be more inclined to apply the same invalidation logic that mutations have to subscriptions optionally.

In other words and TL;Dr:

invalidateAdditionalTypenames to add a side-effect to a query is not expected, and I'd always assume that a query (the operation, not the result) is repeatable without side-effects
- i.e. I don't think adding a context option for the cache to turn a query into an invalidating effect makes sense
Recognising additionalTypenames on subscriptions however is something I can get onboard with
- I still don't think we should automatically invalidate based on subscriptions (more on that below)

I'd still enforce that mutations represent changes of data, and queries and subscriptions represent data that is (to some extent) unchanged. Graphcache (and ignoring what other caches/libs consider their normalised caches to be) is basically the cache to go for if you have operations that affect one another in more complex ways

Does that make sense to you? ✌️ Would adding additionalTyoenames invalidation to subscriptions solve your issue mostly as well — in theory, of course, since you have a custom cache now?

hizo commented 1 year ago

Yes, that would do it as well and is indeed a better option.

Do you want me to open a PR for this? It's in fact just about adding || response.operation.kind === 'subscription' in the cache invalidation logic.

I tried switching to Graphcache to get access to cache invalidation. In our case it would be a huge bandwidth saver if I could invalidate a single item in the list instead of the whole list as with document cache.

This however proved challenging because the page is displaying paginated data in the form of relay's Connection.

eg.

query GridData {
  grid(args) {
    data(args) {
      pageInfo {
        endCursor
        hasNextPage
      }
      edges {
        node
      }
    }
  }
}

By grid, data and edges entities not having an ID, all nodes are actually embedded within this structure starting with grid in cache. To make it even more difficult, data's arguments is not a trivial scalar input. To actually update the embedded list in the cache, I'd need to pass all of the arguments used in the query to the subscription in order to construct a correct query to read from the cache, am I right?

Is it possible I was trying to update the cache wrong? Because it seems that every cursor-paginated list must be suffering from this issue. Is there maybe another way how to do cache updates for paginated data like this? Not to mention if you have multiple grid queries with different arguments - maybe it's ok to update cache results of 1 grid query by passing all needed arguments to subscription, but its impossible to know what other grid queries have been made.

kitten commented 1 year ago

@hizo: I just submitted a small PR for this. Basically, I want to make sure that an in-code comment explains why it's limited to additionalTypenames :v:

To actually update the embedded list in the cache, I'd need to pass all of the arguments used in the query to the subscription in order to construct a correct query to read from the cache, am I right?

You don't necessarily have to, no. You can use cache.inspectFields to actually read arguments and fields that have been used on any given entity. This way you can re-discover all the list fields you've got dynamically.

You can also re-fetch the entire query in Graphcache, which is what I was rather referring to, by using cache.invalidate — i.e. if a subscription invalidates your lists, you can delete parts of the data and your queries will re-execute.

hizo commented 1 year ago

@kitten I tried inspectFields as well.

For the above example grid was of __typename: 'Grid' and data of __typename: 'GridConnection'. Running cache.inspectFields({ __typename: 'Grid' }) and cache.inspectFields({ __typename: 'GridConnection' }) respectively just returned an empty array for me. Would you know why that is?

However running cache.inspectFields({ __typename: 'Query' }) did output every query for me. For grid there was this record

{
  arguments : {gridType: '<value>'}
  fieldKey: "grid({\"gridType\":\"<value>\"})"
  fieldName: "grid"
}

Let's say I am able to get arguments from here and run cache.invalidate('Query', 'grid', { gridType: "<value>" }) right?

There is indeed 1 other fact I didn't mention. Before grid data is fetched, I first fetch grid configuration under the same field. So:

query GridConfig {
  grid(args) {
    columns {
      ...
    }
  }
}

For some reasons we don't fetch data and config together as we have a custom utility to fetch data (via imperatively calling client.query) in batches until pagination says there is no more data (we are talking big data sets here, so need to fetch in batches due to response size / backend load).

When I run cache.invalidate('Query', 'grid', { gridType: "<value>" }) I only see the grid config query being triggered again, but not the one that fetched data.

Is this something to do with the commutativity as described in the docs ?

Similarly, I was only able to invalidate with the above call, nothing else worked. I tried:

cache.invalidate('Grid')
cache.invalidate('GridConnection')
cache.invalidate('grid({"gridType":"<value>"})')
cache.invalidate('Grid', 'data')
cache.invalidate('Grid', 'data', {
  gridType: '<value>',
})

kitten commented 1 year ago

I obviously don't know anything about your utility, but if it's using toPromise, or rather, only waits for a single result, there's no way to update that result, because whatever issued this operation is not subscribed to future results from the Client

hizo commented 1 year ago

That must be it yes, it's using toPromise!

hizo commented 1 year ago

However that actually doesn't solve the issue - I don't actually need the operation to refetch straight away, rather on a next page visit, that means the utility will trigger the queries again, but this time it will get results from cache. Why wouldn't data invalidate as well when the config obviously did? Both data and config are in the cache under the same grid({gridType: <value>})" key so I would expect for the data query to hit the backend again after invalidation (on next page visit ofc.).