Problem: @urql/exchange-graphcache updateCacheWithResult reads null after writing link in cache updater.

FreeRiderBysik commented 3 months ago

This is not a bug, because I cannot create reproduce link. Problem is very complicated to create small example and i am not able to share whole project.

Context

We have entities linked with Foreign Keys in schema.

Execution -> Question looks like:

type IdOfEntity {
  _id: ID!
}

type Question {
  _id: ID!
  # ... more fields
}

union QuestionOrId = Question | IdOfEntity

type Execution {
  _id: ID!
  question: QuestionOrId!
  # ... more fields
}

We load this entities separately, and then combine it (add link to real Question instead of IdOfEntity) using URQL "cache updaters" feature. Like this:

const __typename = 'Question';

function executionQuestionUpdater(
  result: DataFields,
  args: Variables,
  cache: Cache,
  info: ResolveInfo
): void {
  const current = (result as any)?.[info.fieldName];

  if (!current || current.__typename === __typename) {
    /* do not need to change link, it is already resolved with entity */
    return;
  }

  const entityLink = cache.keyOfEntity({
    __typename, // new typename
    id: current._id, // current FK id
  });

  // check cache has record with this __typename
  if (cache.resolve(entityLink, '_id')) {
    // link question entity
    cache.link(info.parentKey, info.fieldName, entityLink);
  }
}

Problem

Everything works fine, but we met very specific case when we load all data at same time:

Receive Questions, write it to cache, create layer with commutative flag (not squashes, because it was not queried first)
Receive Executions, and call updateCacheWithResult, which:
1. call initDataState for 'write' (create new layer)
2. call cache updater - executionQuestionUpdater (see details above)
  1. code cache.resolve(entityLink, '_id') CAN find record in previous commutative layer because it has currentOperation === 'write', according skip condition in function getNode:
```
var skip =
!currentOptimistic &&
currentOperation === 'read' &&
currentOptimisticKey &&
currentData.commutativeKeys.has(currentOptimisticKey);
```
  2. cache.link has been called and we updated the link
3. call initDataState for 'read'
4. call _query -> readSelection -> readLink -> getNode.
5. Get node will return undefined because of skipping commutative layer with questions according same skip condition above
6. question field is mandatory, so value will be null and this will break our subsequent logic (parent record optional field with executions will be returned as null)

My solution

I decided to use temporary solution - read record from commutative layer if it not found in other layers or data.base:

var getNode = (map, entityKey, fieldKey) => {

  // .... untouched code

  var commutativeNode;

  // .... untouched code with loop and condition

    } else if (optimistic && currentOperation !== 'write') {
      // save node value from commutative layer to use it instead of undefined
      var possibleNode = optimistic.get(entityKey);
      if (possibleNode && fieldKey in possibleNode) {
        commutativeNode = possibleNode;
      }
    }
  }

  // Otherwise we read the non-optimistic base value
  node = map.base.get(entityKey);
  if (node !== undefined) {
    return node[fieldKey];
  }

  // use node value from commutative layer to use it instead of undefined
  return commutativeNode !== undefined ? commutativeNode[fieldKey] : undefined;
};

It will work fine in our application, but I am not sure, that it is correct solution for everyone, because possible it can break invalidation logic.

Urql version

urql 4.0.6 @urql/core 4.3.0 exchange-graphcache 6.5.0

kitten commented 3 months ago

There's basically some very specific conditions here — as you say — which make this a bit tricky and this "outcome" can only occur when these are true.

Let me list them to make this a little clearer for future reference:

You've got multiple queries that are all queried out of order, basically, but in parallel
You've created a cache updater that writes a field upon receiving a Query result
You're expecting the subsequent query to reuse data of a parallel query, presumably, and instead it cache misses

This is kind of expected behaviour because this breaks several assumptions that we're making:

Queries are loaded in parallel. We don't expect to prevent cache misses to be relevant for parallel queries because:
- the queries can still settle because they're being made against the API anyway
- we'd expect fragment composition and hence there's a way to avoid sending too many queries per page (basically one query per screen)
We don't expect cache updates to be used for query fields (unless there's very specific cases where it's necessary)
- There are cases where it makes sense, but this has specifically not been designed to work for parallel queries

I might be forgetting a couple of things here but to keep it short, this condition is kind of meant to prevent cache misses after writes. After writes, when a layer is still commutative, we read from layers in a very specific manner because we value data integrity above all else.

Cache updaters for result data has come as a feature afterwards, so it can lead to some unexpected cases as you're seeing. Basically, I can't guarantee that it's safe to make a change like you have and that will need some further investigation. But this comes down to the preconditions here, because what you're describing basically doesn't sound like unexpected behaviour per se 🤔

Unless I'm misunderstanding where you've placed your updater in the cache config

FreeRiderBysik commented 3 months ago

Thank you for responding so quick, I appreciate that. 😊

"You've got multiple queries that are all queried out of order, basically, but in parallel" - yes. We use several queries on one page. I can not rely on queries order because it depends on different effects. Different endpoints processes this queries, so result order is unexpected too. "You've created a cache updater that writes a field upon receiving a Query result" - yes. Updater placed on typename Execution field question "You're expecting the subsequent query to reuse data of a parallel query, presumably, and instead it cache misses" - not exactly. The problem is not that we have a cache miss, but that we have both a hit and a miss in one query result processing.

Cache miss scenario itself is ok for me, cache updater will not update link, and it will be updated later, on second query processing. In my scenario on the client side I got null instead of two options of records, which both of which exist in the cache.

If we add tracing, we will see something like this (simplefied):

got query2 result object {__typename: Question, _id: 1}, save to commutative layer (was queried second)
got query1 result object {__typename: Execution, _id: 2, question: {__typename: IdOfEntity, _id: 1}} (execution is consistent)
cache updater triggered, HIT real question (in commutative layer), update link, now execution is: {__typename: Execution, _id: 2, question: {__typename: Question, _id: 1}} (execution is consistent)
call _query in updateCacheWithResult after writing, got MISS for question (which was HIT during writing this query).
query1 result is {__typename: Execution, _id: 2, question: null} (execution is not consistent). And it not updates after all queries resolving. User see wrong data (but cache holds right data)

"Basically, I can't guarantee that it's safe to make a change like you have and that will need some further investigation." - i agree with you. That is why I would like to find a better solution together with you. Can you explain, why it is safe to read links from the commutative layer during write and not during subsequent read? maybe we should allow it when we are in the same updateCacheWithResult?

P.S. When I am talking about write and subsequent read, i mean this:

kitten commented 3 months ago

call _query in updateCacheWithResult after writing, got MISS for question (which was HIT during writing this query).

is this during the first or second write? This seems a little odd to me as I don't quite see where this null value comes from if you're writing a link in both cases (one automatic, one manual, if I'm not mistaken?)

FreeRiderBysik commented 3 months ago

I will try to explain timeline one more time.

send query1 - Execution
send query2 - Question
query2 response come: {__typename: Question, _id: 1} - write it to commutative layer. No updaters, everything ok.
query1 response come: {__typename: Execution, _id: 2, question: {__typename: IdOfEntity, _id: 1}} start processing it in function updateCacheWithResult, as on screenshot in previous comment, it goes throw steps: 4.1. initDataState('write', ... - init write operation 4.2. _write, which will call cache updater on execution.question - executionQuestionUpdater 4.2.1. code cache.resolve(entityLink, '_id') CAN find record in previous commutative layer because it has currentOperation === 'write', according skip condition in function getNode: var skip = ... && currentOperation === 'read' && ... 4.2.2. cache.link has been called and we updated execution link to {__typename: Question, _id: 1}, which exists in previous commutative layer 4.2.3. query1 result written to cache 4.3. initDataState('read', ... - init read operation 4.4. _query data which just was writed 4.4.1. getNode will called on execution.question.xxx fields for entityKey='Question:1' 4.4.2. getNode will return undefined because it will skip commutative layers, because of same skip condition (currentOperation === 'read'`): var skip = ... && currentOperation === 'read' && ... and because 'Question:1' exists only at commutative layer 4.5. We got null instead of question as a result of _query

kitten commented 3 months ago

Hm, I think the main thing I'm struggling with here is the inherent inconsistency of trying to link data from query1 to query2 while also reading the selection set of query1.

In other words, the problem kind of stems from query1 having its own selection set and link to Execution.question, which then differs from query2. Have you considered inverting the write here, i.e. rather than writing Execution.question -> Question:1 with an updater on there, writing Question.id -> Execution.question instead given that the IDs match up?

It does seem like in the schema design there's a missing edge here potentially, but that could be one way of solving this.

Other than that and ignoring the above, this kind of creates an awkward situation, as this behaves “as expected” as per the all the pre-conditions we assume, namely:

Writing within query data inherently can produce situations upon which a cache write leads to an instantaneous miss, destroying the incoming result
Commutative layers guarantee a level of safety, but cache.resolve inherently circumvents that. Of course, we could block reading values of higher commutative layers, but this could cause issues with subscriptions triggering query writes that are then performed on a lower layer potentially
Raising the layer when cache.link is called could cause other issues, as we've never dynamically moved layers, so I'm kind of afraid of considering that too

farin commented 2 months ago

I think I hit same issue in out project.

Running two queries in paralell, both fetches some (different) attributes on same entity. In some approx 20% or cases a race condition occurs and second query subscription returns empty result and nothing else. {data: null, errors: undefined}

On network layer I can see all data are returned from server correctly but quiery.subscribie never emit them (instead null data is emitted once)

Some observations: When cacheExchange is not used then everything is fine. When schema was changed and attributes was moved to different entity then both queries are also always correct. Using latest @urql/exchange-graphcache = 7.0.1

JoviDeCroock commented 2 months ago

Do either of you have a way for us to reproduce this, that would make this easier to debug

farin commented 2 months ago

I spent some time and eventually isolated my error to simple example repo https://github.com/farin/graphcache-bug

I am going to create regular issue ticket, I am still not 100% sure if it is same bug as described here. https://github.com/urql-graphql/urql/issues/3562

urql-graphql / urql