graphql-go / graphql

An implementation of GraphQL for Go / Golang
MIT License

Question: How to do batching? #235

Closed yookoala closed 6 years ago

yookoala commented 7 years ago

Let's say you have a list of movies, and each movie has a list of actors, producers and directors. Each actor in turn has a list of movies they acted in. If we draw this as a tree, it would look like this:

Movies
 | - Movie
 |     | - Actors
 |     |     | - Director
 |     |     |     | - Movies
 |     |     |     |     | - Movie
 |     |     |     |     | - Movie
 |     |     | - Actor
 |     |     |     | - Movies
 |     |     |     |     | - Movie
 |     |     |     |     | - Movie
 |     |     | - Actor
 |     |     |     | - Movies
 |     |     |     |     | - Movie
 |     |     |     |     | - Movie
 |     |     | - Actor
 |     |     |     | - Movies
 |     |     |     |     | - Movie
 |     |     |     |     | - Movie
 | - Movie
 |     | - Actors
 |     | - Actors
 |     | - Actors
 | - Movie
 |     | - Actors
 |     | - Actors
 |     | - Actors

The current implementation resolves each item separately. Suppose each resolver issues its own database query. Let the number of movies be N, let each actor have appeared in M movies on average, and let each director have P movies on average. Then the above graph will consist of:

And the number of queries goes up exponentially.

From what I have read in other sources, NodeJS server implementations usually have a loader layer (such as facebook/dataloader) to help reduce the number of data-fetching queries. This pattern relies on the execution order of NodeJS (all promises are resolved before returning).

So the above graph would result in this:

The key is to gather all the entities to load while resolving, then load them in one single batch. With a layer of caching on top, the number of entities to query can be driven down even further.

The question is, how do you do a similar thing in graphql-go?

wzulfikar commented 7 years ago

.

edgard commented 7 years ago

@yookoala https://github.com/nicksrandall/dataloader

nubbel commented 7 years ago

@edgard Yes, it is possible to use dataloader with this library. However, the current implementation of graphql-go effectively limits the batch size to 1, defeating the purpose of dataloader. This is because the executor resolves the fields sequentially and synchronously.

Here's an example:

Movies
 | - Movie#1
 |     | - Actors: ActorsLoader.loadForMovie(#1)()
 | - Movie#2
 |     | - Actors: ActorsLoader.loadForMovie(#2)()

This would result in two batches ([#1] and [#2]), because resolvers must return a value, not "something that yields a value eventually" (such as a channel or a thunk). Even worse, each call to ActorsLoader.loadForMovie(movieID)() blocks for the specified batch wait duration. It adds up to N*wait.
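The difference can be sketched in plain Go. Note that `recordingLoader` below is a hypothetical, heavily simplified stand-in written for this comment (no timer, no real fetching); it is not the nicksrandall/dataloader API. It only records which keys would end up in which batch when the thunk is called immediately versus when all thunks are collected first.

```go
package main

import "fmt"

// recordingLoader is a hypothetical stand-in for a dataloader.
// It queues keys and records each dispatched batch instead of fetching anything.
type recordingLoader struct {
	pending []int
	batches [][]int
}

// Load queues a movie ID and returns a thunk that dispatches everything queued so far.
func (l *recordingLoader) Load(movieID int) func() {
	l.pending = append(l.pending, movieID)
	return func() {
		if len(l.pending) > 0 {
			l.batches = append(l.batches, l.pending)
			l.pending = nil
		}
	}
}

// resolveEagerly mimics resolvers that must return a value right away:
// each thunk is called before the next sibling field is visited.
func resolveEagerly(movieIDs []int) [][]int {
	l := &recordingLoader{}
	for _, id := range movieIDs {
		l.Load(id)()
	}
	return l.batches
}

// resolveDeferred mimics an executor that collects the thunks of all
// sibling fields first and only then asks for their values.
func resolveDeferred(movieIDs []int) [][]int {
	l := &recordingLoader{}
	var thunks []func()
	for _, id := range movieIDs {
		thunks = append(thunks, l.Load(id))
	}
	for _, t := range thunks {
		t()
	}
	return l.batches
}

func main() {
	fmt.Println(resolveEagerly([]int{1, 2}))  // [[1] [2]]
	fmt.Println(resolveDeferred([]int{1, 2})) // [[1 2]]
}
```

With a real loader that waits for a batch window, the eager variant would additionally pay that wait once per movie.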

We can solve this by either resolving fields in parallel (#132) or asynchronously (#213) as done in the reference implementation https://github.com/graphql/graphql-js.

Please correct me if I'm wrong.

chris-ramon commented 6 years ago

Thanks a lot for the great feedback guys! :+1: — I've created a new repo: graphql-dataloader-sample which shows how to do batching, closing this one.