Simplify code and data loading process by removing basic query diffing

apollographql / apollo-client

:rocket: A fully-featured, production ready caching GraphQL client for every UI framework and GraphQL server.

https://apollographql.com/client

MIT License

19.36k stars 2.66k forks source link

Simplify code and data loading process by removing basic query diffing #615

Closed stubailo closed 8 years ago

stubailo commented 8 years ago

Right now, we only have very rudimentary top-level query diffing, which will basically split up your root queries and only send them to the server if they aren't in the cache.

This is a very basic feature that is easily simulated by simply splitting up the query document into multiple queries, when you need them to be cached separately.

There are many benefits to removing this very small feature:

Queries in the code will exactly match what is sent to the server, making persisted queries achievable statically
No need to keep track of "minimized query" anymore, simplifying the redux store
Apollo client code will be much simpler, since it doesn't have to keep track of what data is missing anymore. This also lets us remove a bunch of code about filtering out unused variables, since that situation won't come up anymore.
It's more predictable what queries will get run, since what you see is what you get

This also, as a coincidence, makes it easier to re-implement store reading using graphql-anywhere, but that shouldn't be the only reason we do it.

As a side note, other clients such as Relay are likely to move in this direction (more static queries) as well, because of the tooling and predictability benefits it brings.

stubailo commented 8 years ago

Looking for feedback, @rricard, @jbaxleyiii, @abhiaiyer91, @helfer.

rricard commented 8 years ago

@stubailo this is something I was starting to think about. From our experience until now: query diffing blows up sometimes for some reasons (uncatched errors in a query followed by a mutation for instance, can't quite yet put my finger on it, otherwise I would have created an issue!). Next, the benefits of this feature are quite small in production today. But what I like here is the predictability, I think everything could be much better/efficient by simplifying those internals!

jbaxleyiii commented 8 years ago

@stubailo we also have seen an occasional issue with query diffing:

On the client:

fragment DefinedValues on DefinedValue {
  name: description
  value
  id
  _id
}

query GetCheckoutData($state: Int!, $country: Int!) {
  states: definedValues(id: $state, all: true) {
    ...DefinedValues
  }
  countries: definedValues(id: $country, all: true) {
    ...DefinedValues
  }
  person: currentPerson {
    firstName
    nickName
    lastName
    email
    campus { name, id: entityId }
    home { street1, street2, city, state, zip, country }
  }
  savedPayments {
    name, id: entityId, date,
    payment { accountNumber, paymentType }
  }
  campuses { name, id: entityId }
}

On the server:

query GetCheckoutData($state: Int!, $country: Int!) {
  person: currentPerson {
    firstName
    nickName
    lastName
    email
    campus {
      name
      id: entityId
    }
    home {
      street1
      street2
      city
      state
      zip
      country
    }
  }
}

fragment DefinedValues on DefinedValue {
  name: description
  value
  id
  _id
}

Which will result in Error: Fragment "DefinedValues" is never used.

The quick fix is to split these up into different queries (which would be best anyway) but I do think the diffing isn't nearly as big of a win as other parts of the project. Making it lighter and simpler is a big win in my book

stubailo commented 8 years ago

Yeah I think we started with the assumption that this would be a critical feature but it seems much less important several months later. And it has lots of potential bugs etc.

Jonas are going to work together to do some refactors next week I think.

rricard commented 8 years ago

This is cool, more reliability is always welcome!

deoqc commented 8 years ago

Just want to clear-up:

1 - Exactly the same query will find result in cache, right?;

2 - A subset of some previously query (therefore everything is in cache) will find the data in cache?

3 - A new query that intersects w/ previous query but is not contained (something in cache but not everything) will overfetch quering for everything?

rricard commented 8 years ago

I think doing this could kill an issue like #647 for example.

rricard commented 8 years ago

@deoqc

That would be cool to keep that yes
Not even sure we want that
That would mean this yes, but I'm not sure it'll be that bad, because now you usually only query one root field per query and so this is still clear, the current system does not diff what's non-root

stubailo commented 8 years ago

1 and 2 will happen. I'm 100% sure that it's important to be able to query subsets of queries.

3 is the one thing that will be removed, and that feature wasn't that sophisticated to start with.

deoqc commented 8 years ago

thx @rricard @stubailo

I may be doing things differently from everybody since I rely heavily on query diff. If 2 were dropped would break important things 😅 , while 3 being dropped is not as bad...

I rely on diffing to keep my components as decoupled as possible, especially when navigating between pages (somewhat detailed here).

ps: break refers to strategies for avoid network fetch and other optimizations not working;

ps2: AFAIK Relay have a sophisticated query diffing. Maybe this should come back down the road, but in a not obtrusive way as is now (complicating things and raising bugs all around);

rricard commented 8 years ago

ok, I get it now, yes 2 is quite important indeed

stubailo commented 8 years ago

AFAIK Relay have a sophisticated query diffing

Relay 1 was built around this, but not Relay 2 as far as I know. It's one of the major changes they have been talking about - making queries much more static.

stubailo commented 8 years ago

BTW I made pretty good progress on this task just now by deleting a ton of code: #693