apollographql / apollo-client


Design for new updateQueries functionality #1224

Closed: helfer closed this issue 7 years ago

helfer commented 7 years ago

updateQueries currently doesn't support updating queries that are not actively watched (#1129). The following is a rough proposal for a new design of updateQueries that would sidestep this problem by requiring the developer to provide the query to be updated as a GraphQL query document rather than just a reference by name.

before:
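
(a sketch of the current name-based API; CommentsForPost is a placeholder query name)

mutate({
  variables: ...,
  optimisticResponse: ...,
  updateQueries: {
    // Referenced by name; only works while the query is actively watched.
    CommentsForPost: (previousResult, { mutationResult }) => {
      return nextResult;
    },
  },
});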

after:
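
(a sketch mirroring the proposal; same placeholder query name, full example below)

mutate({
  variables: ...,
  optimisticResponse: ...,
  updateQueries: [{
    // The full query document is provided, so the query need not be watched.
    query: gql`query CommentsForPost { ... }`,
    variables: { ... },
    update: (previousResult, { mutationResult }) => {
      return nextResult;
    },
  }],
});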

The old approach has the downside that queries must be kept around even after there is no more active subscriber for them, because the cache would not be updated otherwise. In the new approach, this problem is avoided by requiring the query document to be provided along with the updateQueries function.

The main challenge of the proposed new approach is that a single field may have been written to the store with different sets of variables. It must therefore be possible to specify which sets of variables the update is for. This was not an issue with the old approach, because all possible sets of variables could simply be remembered, and updateQueries could be called once for each of them.

The new approach requires us to introduce the concept of wildcards, like ALL. Here's an example of what that could look like:

mutate({
  variables: ...,
  optimisticResponse: ...,
  updateQueries: [{
    query: gql`query fatQuery { … }`,
    variables: { x: 15, y: ALL },
    update: (previousResult, { mutationResult, variables, … }) => {
      // … do your stuff here
      return nextResult;
    },
  }],
});

updateQueries would look through the store and find all sets of variables on fields existing in the store that match the specified wildcards (most likely by keeping an index of the set of variables a field in the store has been called with). The update function would be called once for each set of variables, and the results would be written back to the store.
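
For illustration, the matching could conceptually look like the following sketch (not actual Apollo internals; recordedVariableSets stands in for that hypothetical index):

const ALL = Symbol('ALL');
const spec = { x: 15, y: ALL };

// Does a stored variable set satisfy the spec?
function matchesSpec(spec, storedVariables) {
  return Object.keys(spec).every(
    key => spec[key] === ALL || spec[key] === storedVariables[key]
  );
}

// update would then be called once per matching recorded variable set.
const matching = recordedVariableSets.filter(vars => matchesSpec(spec, vars));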

It is up to the developer to ensure that each cache field is only updated once. If for instance an item is appended to a list, care must be taken not to append it once for each set of variables (of which there could be many).
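
For example, a guard along these lines (a sketch; items and addItem are placeholders) would keep an appended item from being duplicated across variable sets:

let appended = false;
const update = (previousResult, { mutationResult }) => {
  // Append the new item only on the first invocation,
  // not once per matching variable set.
  if (appended) return previousResult;
  appended = true;
  return {
    ...previousResult,
    items: [...previousResult.items, mutationResult.data.addItem],
  };
};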

The above approach to updating the store may not always be practical. A less flexible but easier-to-use approach would be to only call update once for the entire set of variables that matched a wildcard. In that case, a specific set of variables would have to be chosen from the set as representative for all possible combinations of variables in order to read the previousResult from the store. When writing the nextResult back into the store, all fields matching the wildcard variables would be updated.

The latter approach sounds like it might be workable, but I think there are too many corner cases that we're not anticipating at the moment, so it seems like it would be more prudent to go with the former approach, even though it is more difficult to use. Maybe there are things we could do to make the former approach more workable? Curious to hear your thoughts about this @calebmer @stubailo.


The reducer option to client.query would be deprecated, and its approximate functionality replaced with a global reducer, which could run on any action and update specific fragments in the store.

Here's what that could look like:

client.setGlobalReducer(action => {
  switch (action.type) {
    case "...":
      client.store.writeNode({
        key: "user:5",
        fragment: gql`fragment xyz on User { name, age }`,
        data: { name: action.result.data.user.name, age: action.result.data.user.age },
      });
      break;
    default:
      // ...
  }
});

Combined with a function that tells you the set of variables a certain field in the store has been written with, this could be a sort of catch-all to update trees in the store that don't necessarily need to start at a root node.
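
As a sketch (getFieldVariableSets is hypothetical, not an existing API):

// Every variable set the field on this node has been written with.
const variableSets = client.getFieldVariableSets('user:5', 'friends');
variableSets.forEach(variables => {
  // ... write the subtree for this particular variable set
});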

booboothefool commented 7 years ago

@helfer

The old approach has the downside that queries must be kept around even after there is no more active subscriber for them, because the cache would not be updated otherwise. In the new approach, this problem is avoided by requiring the query document to be provided along with the updateQueries function.

^ I didn't realize that's how it works. Could the current way, keeping queries around, potentially cause performance issues? I noticed my React Native app actually gets slower as the amount of data I've fetched grows. https://github.com/apollostack/react-apollo/issues/418#issuecomment-273951195


Anyway, this all sounds great. With the current approach, the naming actually confuses me, and I find myself running into issues like the following:

I have a query where I am doing an initial prefetch:

query MyWeapons {
  axes {...}
  swords {...}
  daggers {...}
}

then I have 3 screens: Axes, Swords, Daggers, where I am only interested in the respective weapon type:

query MyAxes {
  axes {...}
}
query MySwords {
  swords {...}
}
query MyDaggers {
  daggers {...}
}

I find myself running into issues where doing stuff like:

graphql(SHARPEN_AXE_MUTATION, {
  ...
  updateQueries: {
    MyAxes: (prev, { mutationResult }) => {

doesn't update the things I expect it to, but:

graphql(SHARPEN_AXE_MUTATION, {
  ...
  updateQueries: {
    MyWeapons: (prev, { mutationResult }) => {

does! So I think just providing the query document, instead of the name, would help me out with this.

(Btw, based on what I just read in this thread, I'm actually holding on to MyWeapons, MyAxes, MySwords, and MyDaggers, when all I really need is to reuse the same query name MyWeapons? So on the respective weapon-type screens, e.g. the axe screen, I should really be doing:

query MyWeapons {
  axes {...}
}

? Reusing the same query name so it doesn't hold on to an extra MyAxes for no reason? That's what I got out of this at least. Hopefully I won't have to wonder anymore once I can just pass in the query document...)

JesperWe commented 7 years ago

Hmm. And for those of us who think this looks like an over-engineered, hard-to-maintain way of going about things, and would prefer to just have refetchQueries come back and work properly, as saner DX: is there any hope of that happening in addition to this?

armstrjare commented 7 years ago

Note that a somewhat related issue occurs with the reducer option for queries. If a query is created from the cache after a mutation has run, it will not apply the reducers. See #951

helfer commented 7 years ago

@booboothefool You should definitely not reuse the same query name for a different query, as that will cause problems if both queries are active at the same time. Also, it's possible that the old approach had some performance issues, but it's hard to tell for sure without measuring; @Poincare is doing some work in that direction.

@JesperWe updateQueries and refetchQueries are not mutually exclusive. We could definitely bring refetchQueries back, but it would have to be in a different form, I guess. Essentially you would provide a fat query (instead of query names) as well as the particular set of variables you want to refetch.
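
That might look something like this (only a sketch of the shape, not a committed API):

mutate({
  // ...
  refetchQueries: [{
    query: gql`query fatQuery { ... }`,
    variables: { x: 15 },
  }],
});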

@armstrjare I'm not exactly sure what you mean, since the issue you referenced is actually this very issue. Maybe you meant #1223?

armstrjare commented 7 years ago

@helfer oops, you're right. I meant #951

calebmer commented 7 years ago

A couple of thoughts.

What’s the advantage of an array here?

updateQueries: [{
  query: gql`query fatQuery { ... }`,
  variables: {
    x: 15,
    y: ALL,
  },
  update: (previousResult, { mutationResult, variables, ... }) => {
    // ... do your stuff here
    return nextResult;
  },
}]

Couldn’t it just be a single updateQuery object? Might be a little more ergonomic.

Also, what if we allowed functions for wildcard matching? The definition of ALL would then be:

const ALL = () => true

This allows users to do some more specific pattern matching against variables:

updateQueries: [{
  query: gql`query fatQuery { ... }`,
  variables: {
    x: value => [15, 30, 45].includes(value),
    y: ALL,
  },
  update: (previousResult, { mutationResult, variables, ... }) => {
    // ... do your stuff here
    return nextResult;
  },
}]
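
Concretely, the matching would just treat any function as a predicate (sketch only; this also covers ALL as defined above):

function matchesSpec(spec, storedVariables) {
  return Object.keys(spec).every(key =>
    typeof spec[key] === 'function'
      ? spec[key](storedVariables[key])
      : spec[key] === storedVariables[key]
  );
}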

However, the more I think about the complexity of implementing wildcard variables like this, the less I like the idea 😖

I knew we would need to track variables, but you introduced a problem I didn’t even think about by adding the requirement to call only once for each variable set.

This feature would require some major changes to the store to support variable tracking and some pattern matching logic. Not to mention that this approach is inefficient in some scenarios. Imagine a query like this:

query ($var: MyEnum) {
  counter
  a {
    b {
      c {
        d(var: $var) {
          e
        }
      }
    }
  }
}

Here only d would change when we provide different variable sets. This may also be harder for a user to understand: if the update function increments the counter field every time it is run, and there are 5 enum variants we know about, then counter may increase 5 times in a non-intuitive way. I agree that in this case we should educate the user not to increase counter 5 times, but how do we let the user know when counter should be updated? Do we provide a firstUpdate argument to the update function? How much code branching would a user then need to add around firstUpdate?
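
As a sketch, that branching might look like this (firstUpdate being the hypothetical argument):

update: (previousResult, { mutationResult, firstUpdate }) => {
  return {
    ...previousResult,
    // Increment counter only for the first matching variable set.
    counter: firstUpdate ? previousResult.counter + 1 : previousResult.counter,
  };
}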

It might be worth exploring more imperative solutions instead of adding features/complexity to a declarative approach like this one.


What if we instead just expose the tools to allow users to interact with the store directly? So whenever a user mutates data they can also write code which looks like:

const fatQuery = gql`query fatQuery { ... }`;

const queryData = client.getQueryData(fatQuery, variables);

// Allow the user to mutate the query data however they want
// (we may want to freeze “escaped” JSON objects, though).
queryData.x = 5;

client.setQueryData(fatQuery, variables, queryData);

We could also call the methods readQueryData and writeQueryData. In addition, we could provide methods like getFragmentData and setFragmentData to allow users to mutate data at arbitrary points in the store. That is not possible at all with the current updateQueries system, and the problem may be exacerbated by allowing arbitrary fat queries (imagine a user trying to update something they have not read into the store).
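
Usage might look roughly like this (method names as sketched above, not an existing API):

const fragment = gql`fragment xyz on User { name, age }`;

// Read and update a User node at an arbitrary point in the store,
// not just at a query root.
const user = client.getFragmentData('user:5', fragment);
client.setFragmentData('user:5', fragment, { ...user, age: user.age + 1 });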

To support optimistic data we could do something like:

mutate({
  ...
  updateStore: (store, { mutationResult }) => {
    store.read(...);
    store.write(...);
  },
});

The methods on store would behave differently depending on whether the update is being applied optimistically or not.

JesperWe commented 7 years ago

Regarding updateQueries and refetchQueries: my worry, as I have mentioned in other places, is that any updateQueries approach forces me to duplicate business logic from the server: the client has to know what data the mutation might have changed, query for all the possibly changed data, and update the cache. This has two problems: it makes the data-access component code rather verbose (even more so with this suggestion), and it couples client code to server code; if I change the server's actions, I also need to update the client code, which is not at all a nice thing.

I did write one part of my app using updateQueries. I look at the code and cringe, thinking "Why am I using GraphQL again? This has less maintainability than REST."

Right now I am thinking: Maybe I should just stop bothering with all of this and implement my own refetchQueries using a couple of Redux props? Then I'd only be using Apollo as a thin GraphQL network layer and have nicely uncoupled code...

calebmer commented 7 years ago

@JesperWe I agree, but I think this is an inherent limitation of GraphQL. Every client needs to consider this case whether it be Apollo Client (web/iOS/Android) or Relay. We can’t get around the limitation without some specification on the server to expose metadata for this information. Yet even that would be a duplication of logic. The duplication would just be on the server 😖

Basically, the only free updates we can get from mutations are node-based updates (and even then we have to make up a specification!). We don’t have the same luxury for arrays without fetching the whole list.

I used to think this was a weakness of GraphQL; after all, Falcor does not have this problem at all. However, I’ve come around to seeing this as a strength of GraphQL (although I have been talking to Facebook/Apollo people about this, so my conversations have been generally biased 😉). Requiring client users to manually write the updates gives them much more power over the data that exists on the client. For example, what if one day a developer gets a new array item from the API and decides that, because the device time is 3 pm, the item should go at the top of the array instead of the bottom? By writing to the store directly, developers have the power to do this. APIs like updateQueries allow product developers to make very specific decisions about how their product interacts with the API, which, while messy, I believe makes for better products in the long term.

I agree that it really sucks for many developers, and it sucks for Apollo Client because it means we have to sacrifice some of our principles. For the majority of developers who just want to get up and running quickly, writing updateQueries is inefficient. However, for a company at Facebook’s scale, this is a desirable property. If you want to get around this limitation, I see a couple of options:

  1. Use Meteor. Everything is taken care of for you in that environment. I feel like I have to plug Meteor given Apollo’s origins 😊
  2. Invest in Netflix’s Falcor instead of GraphQL. This won’t get you very far, though. The properties of GraphQL make it way more powerful and viable long-term than Falcor.
  3. Adopt some patterns and specifications in your GraphQL API that makes updates automatic. This is kind of what Relay did and what Apollo Client did initially. I don’t think it’s a good idea for a general GraphQL client library, but it could be a great idea for your company (Relay mutations could be a great thing for Facebook given their schema conventions, but not so much for everyone else).

Let me know what you think and what your conclusion/solutions are. This is something I’ve thought a lot about in the past 👍

If you want to reach out personally with feedback send me an email: caleb@apollodata.com

JesperWe commented 7 years ago

@calebmer I've been a large-scale Meteor app developer for 3 years, which is why this makes me cringe :-)

But again: the original `refetchQueries` (before #1129) was a working solution for not-so-massive and not-so-performance-sensitive apps. Just fixing that bug while moving forward with this proposal would give "the majority of developers who just want to get up and running quickly" an alternative.

Pajn commented 7 years ago

However, I’ve come around to the point where I start to see this as a strength of GraphQL

It is a strength if you need the power, but cumbersome otherwise. However, with a powerful API at the bottom, you can always build abstractions/helpers on top of it. Three very basic use cases that are common in "get up and running quickly" and smaller apps, where fetching everything is acceptable, are append to this top-level query, prepend to this top-level query, and delete from this top-level query (or from all queries if possible).

This could be implemented as functions, provided by Apollo, that generate the more complex updateQueries configuration. For example:

mutate({
  ...,
  updateQueries: [appendTo('items'), prependTo('latestItems')],
  // or
  updateQueries: [deleteFrom('items'), deleteFrom('latestItems')],
  // or
  updateQueries: [deleteFromAllListings()],
});
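
For instance, the update function that appendTo generates might look roughly like this (a sketch; mutationResult.data.result is a placeholder for the mutation's payload):

const appendTo = fieldName => (previousResult, { mutationResult }) => ({
  ...previousResult,
  [fieldName]: [...previousResult[fieldName], mutationResult.data.result],
});
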
Siyfion commented 7 years ago

Certainly this does need to be solved sooner rather than later, but I'm also very much in favour of a well-thought-out, robust implementation. As @helfer has already stated, the previous two issues (reducers re-running & then reducers never running) simply highlighted a fundamental issue with the way store updates were being triggered.

I don't mind the idea of a "global reducer" that is responsible for all non-trivial store updates; though I agree it perhaps isn't the most elegant solution, it would certainly work and would ensure that the issues of the past don't recur.

@helfer I know you'd originally mentioned having a sort of "lifecycle" timer on a reducer; I'm guessing that idea was either messier or much harder to implement than this "global" suggestion?

wmertens commented 7 years ago

I don't yet have an opinion about which flavor of updateQueries to implement, but I would like to see an invalidateQueries that works like refetchQueries, except that it doesn't fetch until the data is needed.

Furthermore, invalidateQueries should also accept variables and wildcards as you describe here, to specify exactly which data sets to invalidate.

Maybe this functionality could even go into updateQueries: if you simply return false or a special symbol CLEAR, it deletes the cache entry?
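
Sketched against the proposal above, that might look like:

updateQueries: [{
  query: gql`query fatQuery { ... }`,
  variables: { x: ALL },
  // Returning the special CLEAR symbol (hypothetical) would evict
  // the matched entries instead of rewriting them.
  update: () => CLEAR,
}]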

helfer commented 7 years ago

@wmertens yeah, that's a good idea! We might implement some imperative form of cache invalidation first (i.e. you give it a query or a fragment and tell it to invalidate everything touched by that query/fragment).

Siyfion commented 7 years ago

@calebmer's new PR over here: https://github.com/apollographql/react-apollo/pull/462 solved all of my outstanding issues. I would recommend that people pull the change locally and give it a test.

calebmer commented 7 years ago

Closing as we now have a mitigation for the problems with updateQueries (https://github.com/apollographql/react-apollo/pull/462) and a new imperative read/write API! (https://dev-blog.apollodata.com/apollo-clients-new-imperative-store-api-6cb69318a1e3#.33v3kjx3u)