[Response Cache] Better collection and empty values invalidation workflow

EmrysMyrddin commented 1 year ago

The workflow about how to handle cache invalidation for lists and empty values is not very clear.

Today, once #1961 will be merged, calling cache.invalidate({ typename: 'Type' }) allows to invalidate response containing any entity of the given type. This is for now the most straight forward way to handle cache invalidation of responses containing list or null values.

But doing so also invalidate responses that are actually not affected by the fact that a new entity has been created. Like querying an existing entity.

Example

const cache = createInMemoryCache()

const yoga = createYoga({
    schema: createSchema({
        typeDefs: /* GraphQL */`
            type Query {
                users: [User]!
                user: User
            }

            type Mutation {
                addUser(id: String!, name: String!) User
            }

            type User {
                id:  String!
                name: String!
            }
        `,
        resolvers: {
            Query: {
                users: () => db.listUsers(),
                user: (_, { id }) => db.getUser(id),
            },
            Mutation: {
                addUser: async (_, { id, name }) => {
                    const newUser = await db.createUser({ id, name })
                    cache.invalidate({ typeName: 'User' })
                    return newUser
                }
            }
        },
    }),
    plugins: [
        useResponseCache({
            session: () => null,
            cache,
        })
    ],
})

This allows to correctly invalidate this kind of queries when adding a user:

Query flow examples

With list

query {
    users { id, name }
}

=> { data: { users: [{ id: '1', name: 'User 1' }] } }

mutation {
    addUser(id: '2', name: 'User 2') { id }
}

=> { data: { addUser: { id: '2' } } }

query {
    users { id, name }
}

with invalidation, cache miss => { data: { users: [{ id: '1', name: 'User 1' }, { id: '2', name: 'User 2' ] } }
without invalidation, cache hit =>  { data: { users: [{ id: '1', name: 'User 1' }] } }

With null

query {
    user(id: 2) {   id, name }
}

=> { data: { user: null } }

mutation {
    addUser(id: '2', name: 'User 2') { id }
}

=> { data: { addUser: { id: '2' } } }

query {
    user(id: 2) { id, name }
}

with invalidation, cache miss => { data: { user: { id: '2', name: 'User 2' } } }
without invalidation, cache hit =>  { data: { user: null } }

With an existing entity

query {
    user(id: 1) {   id, name }
}

=> { data: { user: null } }

mutation {
    addUser(id: '2', name: 'User 2') { id }
}

=> { data: { addUser: { id: '2' } } }

query {
    user(id: 1) { id, name }
}

with invalidation, cache miss => { data: { user: { id: '1', name: 'User 1' } } } <= Here we miss the cache, while the response is still valid
without invalidation, cache hit =>  { data: { user: { id: '1', name: 'User 1' } } }

Proposal

We should find a way to improve the API to allow invalidating cache only for responses containing nulls or collections. Something in the line of cache.invalidateCollection('Type').

marhaupe commented 1 year ago

Should we start thinking about pagination? I think this is something I also missed in #1961, I will give it some thought and maybe create another issue.

type UserConnection {
  edges: [UserEdge!]!
  pageInfo: PageInfo!
}

type UserEdge {
  node: User!
  cursor: String!
}

type User {
  id: ID!
  name: String!
}

type Query {
  users(first: Int, after: String): UserConnection!
  user(datoId: String!): User
}

type Mutation {
  addUser(id: String!, name: String!): User
}

We'd need to "unwrap" the UserConnection to be able to determine whether it is a collection or not. I don't really think it's a big issue though because we can just do cache.invalidateCollection('User'); cache.invalidate('UserConnection'), (correct me if I'm wrong), but at some point we should maybe consider dedicating a section for common pitfalls in the documentation.

EmrysMyrddin commented 1 year ago

I see what you mean. Yes this kind of pattern doesn't work well for now with our approach. I'm not sure how we could do better without making it very complex and potentially misleading for users not aware of this kind of advanced usage.

We should probably think more broadly about how cache invalidation works, perhaps take a look how other well established caching solutions handle this kind of things.

ramiel commented 4 months ago

Today I stumble on a related issue. I have two entities like this

Project {
   id: string
   outputs: [Output]
}

I fetch the project and ask for the list of related output as well. The problem is that when a new output is added, the query is not invalidated and I never see the new created output.

In GraphCDN/Stellate, they solved the problem by ignoring this situations and leaving the user the due to invalidate the related entities. In this case, for example I'd need to invalidate the Output entity.

I hope I'm adding some useful informations here, if I'm not ignore this comment :D

EmrysMyrddin commented 4 months ago

Yes, that's the simplest way to do it: let users manually manage cache :-)

But I hope we will find a solution that doesn't require for manually error prone cache management.

By the way, in your example, today, the cache should be invalidated. We kind of eagerly invalidate to avoid cache leaks as you describe. So if it's not the case, can you open an issue ?

ramiel commented 4 months ago

I will open an issue soon with more details and hopefully a reproduction, but not in the next few days because I changed focus at the moment. Thanks. Meanwhile reading the guide I found the following paragraphs

Another solution is to add invalidation logic within our GraphQL mutation resolver. 
By the GraphQL Specification mutations are meant to modify our GraphQL graph.

A common pattern when sending mutations from clients is to select and 
return affected/mutated entities with the selection set.

For our example from above, the following could be a possible mutation 
for adding a new repository to the repositories field on the user entity.

mutation RepositoryAddMutation($userId: ID, $repositoryName: String!) {
  repositoryAdd(userId: $userId, repositoryName: $repositoryName) {
    user {
      id
      repositories
    }
  }
}

Taking the example of my output, does it means I have to forcefully query the related project to have the cache correctly invalidated?

EmrysMyrddin commented 4 months ago

No, the plugin collects every entity present in the response, and invalidate it when a mutation is done involving any of this entity.

But for this to work, you have to actually select the newly created Output from the mutation that is creating it :-) Otherwise, you can manually invalidate by using cache.invalidate('Output') in you mutation resolver

ramiel commented 4 months ago

I see,but in this case how can it invalidate the list of entity B that I fetch when I get entity A? If I create a new entity B I return it. In that case,does the system invalidates all B? Does this affect entity A that carries a list of Bs?

ramiel commented 4 months ago

Also,manually calling cache.invalidate('Output') is exactly what stellate is doing. I think the rationale behind it is that you're not always in control of the mutations sent to your system

EmrysMyrddin commented 4 months ago

Yes, and this is also more robust, because it doesn't depends on what the client is actually selecting when making a mutation. I've discussed with Stellate about this some time ago, and another approach would be to define cache behavior with directives. This would allow for a more declarative and less error prone way to handle this kind of use cases.

ramiel commented 4 months ago

This would be great actually. So at the level of mutation definition, e.g. OutputCreate we can declarative state "this mutation invalidates Output"

n1ru4l / envelop