ghostdogpr / caliban

Functional GraphQL library for Scala
https://ghostdogpr.github.io/caliban/
Apache License 2.0

Support for batching #40

Closed ghostdogpr closed 4 years ago

ghostdogpr commented 4 years ago

This article explains the problem faced by most GraphQL systems: https://blog.apollographql.com/optimizing-your-graphql-request-waterfalls-7c3f3360b051

Interesting libraries to look at:

Some discussions are going on about this topic in the Caliban Discord.
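
To make the request waterfall concrete, here is a minimal sketch contrasting the N+1 pattern described in the article with a single batched call. The repository trait and its methods are made up for illustration; this is not Caliban's API.

import zio.{Task, ZIO}

// Hypothetical repository used only for this illustration.
trait CharacterRepo {
  def statisticsFor(name: String): Task[Int]
  def statisticsForAll(names: List[String]): Task[Map[String, Int]]
}

object WaterfallSketch {
  // Naive resolution: each field triggers its own backend call, i.e. N calls on
  // top of the one that fetched the characters (the "N+1" problem).
  def naive(repo: CharacterRepo, names: List[String]): Task[Map[String, Int]] =
    ZIO.foreach(names)(name => repo.statisticsFor(name).map(name -> _)).map(_.toMap)

  // Batched resolution: all keys collected for one query are sent in a single call.
  def batched(repo: CharacterRepo, names: List[String]): Task[Map[String, Int]] =
    repo.statisticsForAll(names)
}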

fokot commented 4 years ago

Sangria has deferred resolvers https://sangria-graphql.org/learn/#deferred-value-resolution or Fetchers https://sangria-graphql.org/learn/#high-level-fetch-api
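
For reference, the Fetcher approach from the Sangria docs linked above looks roughly like this. This is only a sketch: the Character case class, CharacterRepo, and loadByNames are assumptions for the example, not Sangria or Caliban types.

import sangria.execution.deferred.{DeferredResolver, Fetcher, HasId}
import scala.concurrent.Future

case class Character(name: String, nicknames: List[String])

// Assumed repository exposing a batched lookup.
trait CharacterRepo {
  def loadByNames(names: Seq[String]): Future[Seq[Character]]
}

object FetcherSketch {
  implicit val characterHasId: HasId[Character, String] = HasId(_.name)

  // Every name deferred while resolving a single query is collected and
  // fetched in one call to the repository.
  val characterFetcher =
    Fetcher((repo: CharacterRepo, names: Seq[String]) => repo.loadByNames(names))

  // Passed to the Executor so that deferred values are resolved in batches.
  val resolver = DeferredResolver.fetchers(characterFetcher)
}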

In your example I would like to return CharacterDetail instead of Character. I do not want to calculate statistics every time, only when they are queried, and I do want to calculate the statistics for all returned characters with a single query (avoiding the N+1 problem). Look at Repo.statistics. I found a solution by memoizing the statistics call to the repo. Is there an easier way to do that?

case class CharacterDetail(name: String, nicknames: List[String], origin: Origin, role: Option[Role], statistics: Task[Int])

object CharacterDetail {
  def apply(c: Character, statistics: Task[Int]): CharacterDetail =
    new CharacterDetail(c.name, c.nicknames, c.origin, c.role, statistics)
}

object Repo {
  def statistics(characterNames: List[String]): Task[Map[String, Int]] = ZIO.effect {
    println(s"statistics for ${characterNames.mkString("[", ",", "]")}")
    characterNames.map(n => (n, n.length)).toMap
  }
}

  case class Queries(
    @GQLDescription("Return all characters from a given origin")
    characters: CharactersArgs => URIO[Console, List[CharacterDetail]],
  )

val interpreter = graphQL(
        RootResolver(
          Queries(
            args => service.getCharacters(args.origin).flatMap(characters =>
              Repo.statistics(characters.map(_.name)).memoize
                .map(allStatistics =>
                  characters.map(c => CharacterDetail(c, allStatistics.map(_(c.name))))
                )
            ),
          ),
          Mutations(args => service.deleteCharacter(args.name)),
          Subscriptions(service.deletedEvents)
        )
      )
fokot commented 4 years ago

I'm playing with something like this, but I do not know if it is possible to memoize it so it runs only once. And yes, I know it is not correct as it stands, since the accumulated state is global rather than scoped to a single query resolution.

  // Collect the requested keys in a Ref and return a per-key lookup that feeds
  // them all through `f`. The memoize is applied to a Task created on each call,
  // so every field evaluation still re-runs the batch (see the log below).
  def defer[A, B](f: List[A] => Task[Map[A, B]]): Task[A => Task[B]] =
    for {
      ref <- Ref.make(List.empty[A])
    } yield (a: A) =>
      ref.update(a :: _).flatMap(_ => ref.get.flatMap(f).memoize.flatMap(_.map(_(a))))

  override def run(args: List[String]): ZIO[Environment, Nothing, Int] =
    (for {
      service <- ExampleService.make(sampleCharacters)
      statistics <- defer(Repo.statistics)
      interpreter = graphQL(
        RootResolver(
          Queries(
            args => service.getCharacters(args.origin).map(
                  _.map(c => CharacterDetail(c, statistics(c.name)))
                )
//            ),
//            args => service.findCharacter(args.name).map(_.map(c => CharacterDetail(c, )))
          ),
          Mutations(args => service.deleteCharacter(args.name)),
          Subscriptions(service.deletedEvents)
        )
      )

Now it runs for every character:

statistics for [James Holden]
statistics for [Naomi Nagata,James Holden]
statistics for [Amos Burton,Naomi Nagata,James Holden]
statistics for [Alex Kamal,Amos Burton,Naomi Nagata,James Holden]
statistics for [Chrisjen Avasarala,Alex Kamal,Amos Burton,Naomi Nagata,James Holden]
statistics for [Josephus Miller,Chrisjen Avasarala,Alex Kamal,Amos Burton,Naomi Nagata,James Holden]
statistics for [Roberta Draper,Josephus Miller,Chrisjen Avasarala,Alex Kamal,Amos Burton,Naomi Nagata,James Holden]
ghostdogpr commented 4 years ago

@fokot with the current implementation I don't think it's possible to support batching, because the executor evaluates each field in sequence, so you can't "wait" until all requests have been received before executing them. I plan to change the Executor logic so that it "prepares" an execution plan before running it, making it easy to apply optimizations like batching, caching, etc.
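
Purely to illustrate that direction, here is a rough sketch with hypothetical types (not Caliban's actual Executor): if each field's data requirement is first described as a value, the requirements can be collected up front, grouped by source, and served by one batched effect per source instead of one call per field.

import zio.{Task, ZIO}

// Hypothetical description of one field's data requirement.
final case class Request(source: String, key: String)

// Hypothetical executor core: because requests are plain values, they can all be
// collected before anything runs, then grouped so one batched call serves a
// whole group of fields.
final case class BatchedExecutor(batch: Map[String, List[String] => Task[Map[String, String]]]) {
  def runAll(requests: List[Request]): Task[Map[Request, String]] =
    ZIO
      .foreach(requests.groupBy(_.source).toList) { case (source, reqs) =>
        batch(source)(reqs.map(_.key)).map(results => reqs.map(r => r -> results(r.key)))
      }
      .map(_.flatten.toMap)
}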

yurikpanic commented 4 years ago

A simple resolver-based approach is not sufficient for such use cases. Traversing the query tree just once, top-down, executing resolvers "on the way", is not enough; one may need multiple passes to optimize the underlying query (or queries).

E.g. one top-down pass to gather information about the fields present in the query (e.g. their names and types), one bottom-up pass to build the underlying backend or database query, and perhaps one more top-down pass to extract the information from that optimized query's reply and put it into the corresponding field positions in the response.

I've used Sangria's QueryReducer to extract some information from the whole query (in simple cases), or just traversed the query AST using Matryoshka (in more complex ones). The information extracted from the whole query is then used in the resolvers.

So, perhaps just having a query AST available is enough? Or a query AST + some support code to traverse it top-down and bottom-up?
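
As a toy illustration of those passes (hypothetical AST types, not Caliban's or Sangria's), one function walks the tree top-down to collect what was selected, and another folds it bottom-up into a single backend query.

// A toy selection-set AST, just to illustrate the passes described above.
final case class Field(name: String, children: List[Field])

object QueryPasses {
  // Top-down pass: gather the path of every selected field, e.g. to decide which
  // joins or sub-selects the backend query will need.
  def collectPaths(field: Field, prefix: List[String] = Nil): List[List[String]] = {
    val path = prefix :+ field.name
    path :: field.children.flatMap(collectPaths(_, path))
  }

  // Bottom-up pass: fold the tree into the fully qualified columns of a single
  // underlying database query.
  def toColumns(field: Field): List[String] =
    if (field.children.isEmpty) List(field.name)
    else field.children.flatMap(toColumns).map(col => s"${field.name}.$col")
}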

ghostdogpr commented 4 years ago

@yurikpanic you're right. I'm working on a version that introduces an intermediate data structure that will allow optimizations. The execution will no longer be a single pass but something like what you describe. In parallel, @adamgfraser is working on an encoding for queries that will support caching, batching, etc. I expect my WIP query AST to evolve when we integrate his work. We might be able to expose a hook to let users define their own reducers as well.
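
For readers following the thread, the kind of query encoding being referred to can be sketched roughly like this (a simplified illustration only, not the actual work in progress): a query is a value that is either finished or blocked on a set of requests, and combining two queries merges their pending requests so they can be batched and cached together.

object QueryEncodingSketch {
  // Hypothetical, simplified "query as a value" encoding for illustration.
  sealed trait Query[+A]
  final case class Done[+A](value: A) extends Query[A]
  final case class Blocked[+A](requests: Set[String], cont: Map[String, String] => Query[A]) extends Query[A]

  // Combining two queries merges their pending requests, so a single round
  // trip (and a shared cache of results) can serve both sides.
  def zip[A, B](qa: Query[A], qb: Query[B]): Query[(A, B)] =
    (qa, qb) match {
      case (Done(a), Done(b))        => Done((a, b))
      case (Done(a), Blocked(rs, k)) => Blocked(rs, (res: Map[String, String]) => zip(Done(a), k(res)))
      case (Blocked(rs, k), Done(b)) => Blocked(rs, (res: Map[String, String]) => zip(k(res), Done(b)))
      case (Blocked(r1, k1), Blocked(r2, k2)) =>
        Blocked(r1 ++ r2, (res: Map[String, String]) => zip(k1(res), k2(res)))
    }
}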