morpheusgraphql / morpheus-graphql

Haskell GraphQL Api, Client and Tools
https://morpheusgraphql.com
MIT License

Batching of resolvers / data fetching #348

Closed smatting closed 1 year ago

smatting commented 4 years ago
type Deity {
  name: String!,
  children: [Deity]
}

type Query {
  deities(q: String) : [Deity]
}

Is it possible to implement resolvers that fetch all children for all the deities in one batch? Let's assume that you can resolve the list of all deities in one call, too.

Currently I only know how to write a lazy resolver for children which would be called for every Deity found in the resolver of deities(q: String).

Could it be done by customizing the Monad m, maybe?

nalchevanidze commented 4 years ago

@smatting sorry, I did not get it. Can you give me an example?

nalchevanidze commented 4 years ago
type Query {
  deities(q: String) : [Deity]
}
data Deity = Deity
  { name :: Text   
  , children   :: [Deity]
  } deriving (Generic,GQLType)

importGQLDocumentWithNamespace "schema.gql"

rootResolver :: GQLRootResolver IO () Query Undefined Undefined
rootResolver =
  GQLRootResolver
    {
      queryResolver = Query {queryDeities},
      mutationResolver = Undefined,
      subscriptionResolver = Undefined
    }
  where
    queryDeities QueryDeitiesArgs {queryDeitiesArgsQ} = pure [ Deity {name = "", children = [...Deity...] }, ....]

do you mean this?

smatting commented 4 years ago
type Place {
   name: String!
}

type Deity {
  name: String!,
  placesVisited: [Place]
}

type Query {
  deities(q: String) : [Deity]
}

Let's assume that I have to fetch the placesVisited from a separate database B and the resolver for deities(q: String) fetches 1000 deities from database A with one query. Then to resolve placesVisited I don't want to make 1000 queries to database B.

nalchevanidze commented 4 years ago
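import Data.Map (Map)
import qualified Data.Map as Map
import Data.Maybe (fromMaybe)

data DBDeity = DBDeity
  { dbID :: ID,
    dbName :: Name
  }

-- One query to database A for all deities.
dbDeities :: IO [DBDeity]
dbDeities = ....

-- One query to database B for the places of all given visitors.
dbPlacesByVisitors :: [ID] -> IO (Map ID [Place])
dbPlacesByVisitors ids = ....

buildDeity :: Map ID [Place] -> DBDeity -> Deity
buildDeity pls DBDeity {dbID, dbName} =
  Deity
    { name = dbName,
      -- a deity without an entry in the map has visited no places
      placesVisited = fromMaybe [] (Map.lookup dbID pls)
    }

deitiesResolver :: IORes e [Deity]
deitiesResolver = lift $ do
    dts <- dbDeities
    pls <- dbPlacesByVisitors (map dbID dts)
    pure $ map (buildDeity pls) dts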

That is one way, I think.

nalchevanidze commented 4 years ago

But then you lose laziness: you ask the DB for places even if the query does not need them. Maybe there is an automated way to batch them, but I don't know of one yet.

nalchevanidze commented 4 years ago

Could it be done by customizing the Monad m, maybe?

A monad that bundles all your DB requests into one request could be a solution. I am not quite familiar with Haxl, but maybe it could help.

One more thing I am thinking of is proposal #258 by @theobat; he is writing a library that generates SQL queries from schema + request.

dandoh commented 4 years ago

@smatting I believe we have complete control over this. We can make a Deity resolver:

deityResolver :: (DeityData, Maybe [PlaceData]) -> ResolverM Deity
deityResolver (deity, maybePlaces) =
    pure Deity {..., placesVisited = placesVisitedResolver, ...}
  where
    placesVisitedResolver = do
        places <- case maybePlaces of
            Just ps -> pure ps                 -- already prefetched
            Nothing -> getDBDeityPlaces deity  -- fall back to a per-deity query
        mapM placeResolver places

Then we can prefetch places for deities wherever we see fit:

getDBDeitiesWithPlaces :: IO [(DeityData, [PlaceData])]
getDBDeitiesWithPlaces = ...

deitiesResolver :: ResolverM [Deity]
deitiesResolver = do
    -- `second` (from Data.Bifunctor) wraps each prefetched place list in Just
    deitiesWithPlaces <- map (second Just) <$> lift getDBDeitiesWithPlaces
    mapM deityResolver deitiesWithPlaces

No more N + 1 queries.

In terms of getting additional query information (such as placesVisited above), https://github.com/tomjaguarpaw/haskell-opaleye is a good choice.

smatting commented 4 years ago

Thanks @dandoh! That's a cool trick to add optional prefetching to deityResolver, but deitiesWithPlaces is still loaded in every request, right? Your example solves the N+1 selects problem, similar to @nalchevanidze's example, but it also sacrifices the laziness of placesVisited (it is fetched even if not requested). It would be cool to have both:

  1. placesVisited is resolved lazily (places are fetched only when requested)
  2. No N + 1 selects
smatting commented 4 years ago

I accidentally closed this issue. Re-opening.

dandoh commented 4 years ago
  1. placesVisited is resolved lazily (places are fetched only when requested)
  2. No N + 1 selects

That would be really cool. @nalchevanidze What is the state of #258 ?

Would something like this work?

class InterpreterAST k m e where 
  interpreterAST :: Monad m => (RootResCon m e query mut sub) => 
    GQLRootResolver (ReaderT AST m) e query mut sub -> k

Then with access to query/mutation AST we can decide to prefetch or not.

nalchevanidze commented 4 years ago

That would be really cool. @nalchevanidze What is the state of #258 ?

@dandoh, actually I am planning to add a new feature, getContext, which will give you access to the current internal state of resolving.

deitiesResolver :: IORes e [Deity]
deitiesResolver = do
    Context { selection } <- getContext
    dts <- lift dbDeities
    pls <- lift (dbPlacesByVisitors selection (map dbID dts))
    pure $ map (buildDeity pls) dts

The disadvantage is that the resolver will depend on the internal AST.

nalchevanidze commented 4 years ago

@dandoh, actually I am planning to add a new feature, getContext, which will give you access to the current internal state of resolving. See #372.

AnthonySuper commented 4 years ago

Currently we can sort of do preloading via the internal context. However, as you said, the internal context should be considered "unsafe" as it exposes internal details. Are there any plans for a more public version of this API? The basic case would be to have some sort of recursive query type that told you the fields requested and the arguments passed, so you could do preloading. Unfortunately I am having a lot of trouble coming up with a typesafe API to do so, but in principle one should be possible, I think?

nalchevanidze commented 4 years ago

@AnthonySuper what if we define a function that can search in sub-selections?

path :: Text -> Resolver o m Bool
-- so you can ask
somRes = do
    (shouldPrefetch :: Bool) <- path "field1.field2"
    ...
AnthonySuper commented 4 years ago

That would be a great start! Eventually I'd love to get something where we can get arguments to prefetch too, but messing around with that on my end has given me a nontrivial amount of trouble in figuring out how the types would work.
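
The path lookup proposed above could be sketched roughly as follows, over a toy selection tree. Sel, splitPath, hasPath and example are hypothetical names for illustration, not part of Morpheus; a real version would presumably live in the Resolver monad and read the selection from the resolver context instead of taking it as an argument.

```haskell
-- Toy model of a selection set; NOT Morpheus's internal representation.
data Sel = Sel
  { selName :: String,
    selChildren :: [Sel]
  }

-- Split a dotted path such as "field1.field2" into its segments.
splitPath :: String -> [String]
splitPath s = case break (== '.') s of
  (seg, []) -> [seg]
  (seg, _ : rest) -> seg : splitPath rest

-- True when every segment of the path is selected, in order,
-- starting from the given list of selected fields.
hasPath :: String -> [Sel] -> Bool
hasPath path = go (splitPath path)
  where
    go [] _ = True
    go (seg : segs) sels =
      any (\s -> selName s == seg && go segs (selChildren s)) sels

-- Selection for: { deities { name placesVisited { name } } }
example :: [Sel]
example =
  [ Sel "deities"
      [ Sel "name" [],
        Sel "placesVisited" [Sel "name" []]
      ]
  ]
```

With this, hasPath "deities.placesVisited" example is True and hasPath "deities.children" example is False, so a resolver could use the result to decide whether prefetching is worthwhile. Arguments are not modelled here, which is the harder part mentioned above.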

Herlevsen commented 4 years ago

The way some other GraphQL libraries solve batching is by using the concept of a dataloader. The dataloader can batch requests to the same resources, and it also handles caching to avoid unnecessary lookups. Facebook's implementation for JavaScript/Node is located here: https://github.com/graphql/dataloader. They actually mention Haxl, and it seems to be the exact thing it was created for.
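
As a rough illustration of the dataloader idea, here is a minimal sketch in Haskell. Loader, newLoader and loadMany are hypothetical names, not the API of graphql/dataloader, Haxl, or Morpheus. The sketch only covers the batch function and the cache; the scheduling that collects keys from independent resolvers before dispatching a batch is deliberately left out.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- A hypothetical loader: a batch function plus a cache of loaded keys.
data Loader k v = Loader
  { batchFetch :: [k] -> IO (Map.Map k v),
    cache :: IORef (Map.Map k v)
  }

newLoader :: ([k] -> IO (Map.Map k v)) -> IO (Loader k v)
newLoader f = Loader f <$> newIORef Map.empty

-- Load many keys with at most one call to the batch function.
-- Duplicate keys are collapsed and cached keys are not fetched again.
loadMany :: Ord k => Loader k v -> [k] -> IO (Map.Map k v)
loadMany loader keys = do
  cached <- readIORef (cache loader)
  let wanted = Set.fromList keys
      missing = Set.toList (Set.filter (`Map.notMember` cached) wanted)
  fetched <- if null missing then pure Map.empty else batchFetch loader missing
  modifyIORef' (cache loader) (Map.union fetched)
  pure (Map.restrictKeys (Map.union fetched cached) wanted)
```

For the example in this thread, the batch function would be something like dbPlacesByVisitors, and the keys would be the IDs of the deities returned by the deities resolver.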

AnthonySuper commented 4 years ago

Haxl definitely seems to be a good option. I'm currently toying around with a simpler version that uses laziness (and unsafeInterleaveIO) to rewrite the actions in the graph, which seems to work out okay—I'll probably write a blog post about it if it might help out other people.

nalchevanidze commented 4 years ago

@AnthonySuper great idea :) i would love to read it

theobat commented 4 years ago

@AnthonySuper what if we define function that can search in sub selections.

path :: Text -> Resolver o m Bool
-- so you can ask
somRes = do
    (shouldPrefetch :: Bool) <- path "field1.field2"
    ...

Regarding this, I believe we should simply implement a typeclass instance of a Tree for the SelectionSet type (such as this one for example), this would likely simplify a few internal operations (print the AST back to a doc etc) and give simple (read canonical) utility functions to manipulate the query's AST while retaining flexibility for the concrete (internal) representation. I can probably try a PR for this.

Also it enables not exporting the concrete Rep while giving freedom for the end-user by exposing the Tree typeclass and its operations (which was your concern I believe @nalchevanidze)

nalchevanidze commented 4 years ago

@theobat what about the case of UnionSelection?

https://github.com/morpheusgraphql/morpheus-graphql/blob/68ac1a95d1238db5c23cd18f6ec6b70567c11ff1/morpheus-graphql-core/src/Data/Morpheus/Types/SelectionTree.hs#L35-L37

Maybe we should add:

PS: I think we should rename getChildrenList -> getChildren. It is kind of plural anyway.

theobat commented 4 years ago

Yep, I'll change the name. As far as the union is concerned, is it actually separated in the selection set, or mixed? I think it'd be simpler to just give a getTypeName :: node -> TypeName kind of operation though...

nalchevanidze commented 4 years ago

Sorry, but union types in a selection do not work that way.

nalchevanidze commented 4 years ago

But we can define a virtual type SelectionNode that can support it. Another idea is:

getChildren ::  node -> [(Maybe ConditionTypeName, node)]
nalchevanidze commented 4 years ago

or just

getChildren ::  node -> Either [(ConditionTypeName, [node])] [node]
theobat commented 4 years ago

Sorry, I made a typo I meant getTypeName :: node -> Maybe TypeName. If the use case described is this:

{
  search(text: "an") {
    __typename
    ... on Human {
      name
      height
    }
    ... on Droid {
      name
      primaryFunction
    }
    ... on Starship {
      name
      length
    }
  }
}

Then it should work, and I find it "more general" in the sense that union type or not, every graphql node has a type it inhabits

Note: but maybe the name is not precise enough then, it should probably be called getParentTypeName or something like that

nalchevanidze commented 4 years ago

I meant that internally in Morpheus GraphQL, a union selection is not represented that way.

You should wrap it as:

data SelectionNode = SelectionNode { typeName :: TypeName, selection :: SelectionNode }

And you additionally need the schema to get the selection type.

nalchevanidze commented 4 years ago

Another solution could be:

getChildren ::  Maybe TypeName ->  node -> [node]

where Nothing means: give me the selection without any condition.
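
To make the last proposal concrete, here is a small sketch over a toy node type. Node, leaf and search are hypothetical and do not reflect the actual SelectionTree class in Morpheus; the example selection mirrors the search query quoted earlier in the thread (shortened to two fragments).

```haskell
type TypeName = String

-- Toy selection node; hypothetical, not Morpheus's internal type.
data Node = Node
  { nodeName :: String,
    -- Children paired with an optional type condition:
    -- Nothing for plain fields, Just t for fields under "... on t".
    nodeChildren :: [(Maybe TypeName, Node)]
  }

-- Nothing: children selected without any type condition.
-- Just t:  children selected under the fragment "... on t".
getChildren :: Maybe TypeName -> Node -> [Node]
getChildren cond node = [child | (c, child) <- nodeChildren node, c == cond]

leaf :: String -> Node
leaf n = Node n []

-- search(text: "an") { __typename
--   ... on Human { name height }
--   ... on Droid { name primaryFunction } }
search :: Node
search =
  Node "search"
    [ (Nothing, leaf "__typename"),
      (Just "Human", leaf "name"),
      (Just "Human", leaf "height"),
      (Just "Droid", leaf "name"),
      (Just "Droid", leaf "primaryFunction")
    ]
```

Here getChildren (Just "Human") search yields the nodes name and height, while getChildren Nothing search yields only __typename.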

dandoh commented 4 years ago

@nalchevanidze @theobat As discussed with @nalchevanidze:

How hard would it be to add another interpretation on top of what we have: resolvers by field instead of by type, like that of mu-haskell?

type User {
  name: String
  dog: Dog
}

type Dog {
  name: String
}

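From this we can derive the Selection type family (written here as a data family, with illustrative constructor and field names):

data family Selection a
newtype instance Selection String = SelectString Bool -- For any scalar
data instance Selection User = SelectUser { userName :: Selection String, userDog :: Selection Dog }
data instance Selection Dog = SelectDog { dogName :: Selection String }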

Then each resolver will have the corresponding Selection as an argument. See https://github.com/higherkindness/mu-haskell/issues/190

Edit: Union types should receive product of selection types as arguments I think?

nalchevanidze commented 4 years ago

@dandoh you should show me what you mean. please give a concrete example like @smatting

nalchevanidze commented 4 years ago

@smatting @dandoh @AnthonySuper @Herlevsen I think I will provide a batching API similar to https://github.com/graphql/dataloader

russellmcc commented 4 years ago

Just wanted to chime in and say that I used fraxl, a clone of Haxl with more general types, to implement batching in my morpheus project. This worked really well for the purpose, and fulfills both requirements listed above (queries happen lazily, no N+1 problem).

To make it work, basically you define all your possible db queries in a GADT, then define a fetcher function that executes the queries. The fraxl library then calls the fetcher with groups of queries that can be batched together. fraxl uses the applicative <*> to determine what group of queries can be batched together. For my (simple) program it seemed to give good results without much tweaking.
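
The role of <*> in grouping queries can be illustrated with a very small applicative. This is only a sketch of the general idea, not fraxl's or Haxl's actual API; Fetch, fetch and runFetch are hypothetical names. Each computation carries the keys it needs, and <*> merges the key lists of both sides, so independent requests end up in one batch.

```haskell
import qualified Data.Map.Strict as Map

-- A computation that needs some keys fetched and builds its result
-- from the fetched values. Hypothetical type for illustration only.
data Fetch k v a = Fetch [k] (Map.Map k v -> a)

instance Functor (Fetch k v) where
  fmap f (Fetch ks g) = Fetch ks (f . g)

-- <*> merges the key lists of both sides, so independent requests
-- are served by the same batch.
instance Applicative (Fetch k v) where
  pure x = Fetch [] (const x)
  Fetch ks f <*> Fetch ks' g = Fetch (ks ++ ks') (\m -> f m (g m))

-- Request a single key; the result is Nothing if the batch lacks it.
fetch :: Ord k => k -> Fetch k v (Maybe v)
fetch k = Fetch [k] (Map.lookup k)

-- Run the whole computation with exactly one call to the batch function.
runFetch :: Monad m => ([k] -> m (Map.Map k v)) -> Fetch k v a -> m a
runFetch batch (Fetch ks g) = g <$> batch ks
```

For example, runFetch batch (traverse fetch deityIds) calls batch once with all the IDs. Because this sketch has no Monad instance, it cannot express a fetch that depends on the result of an earlier fetch; libraries such as Haxl handle that case by running dependent fetches in successive rounds.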

I'm not ready to share the whole project at the moment but I can go in to more detail on how this works if people are interested. It also supports optional caching which is a nice speed up if you happen to re-query anything.

nalchevanidze commented 4 years ago

@russellmcc cool. Can you add some example code to this project?

nalchevanidze commented 4 years ago

I know you are not ready to share your project, but if you could come up with a basic example, it would help :)

russellmcc commented 4 years ago

Can do! Look for a PR sometime over the next week

zhujinxuan commented 3 years ago

There is also something related about batching and deferring evaluation. Suppose we have a chained relation in the graphql, we may want to defer the evaluation.

type User {
   name: String!
   articles: [Article!]!
}

type Article {
  content: String!,
  writer: User
}

I think that we can change the generated datatype as

data UserM m = UserM {
   name :: m Text,
   articles :: m [ArticleM m]
}

data ArticleM m = ArticleM {
   content :: m Text,
   writer :: m (UserM m)
}

In this way, we can ensure that the evaluation of one field does not necessarily trigger the evaluation of other fields. For batching, it seems possible to have a dataloader in m, as a ReaderT monad or something similar, to solve the problem.
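
A small sketch of why this keeps fields independent: each field is its own action in m, so only the fields that are actually run perform any work. The resolver below is hypothetical, uses String instead of Text to stay dependency-free, and logs field names to an IORef purely to make the evaluation order visible.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Records whose fields are actions in m, as proposed above.
data UserM m = UserM
  { name :: m String,
    articles :: m [ArticleM m]
  }

data ArticleM m = ArticleM
  { content :: m String,
    writer :: m (UserM m)
  }

-- Hypothetical resolver: each field logs its name when it actually runs.
userResolver :: IORef [String] -> UserM IO
userResolver logRef =
  UserM
    { name = note "name" >> pure "Morpheus",
      articles = note "articles" >> pure []
    }
  where
    note field = modifyIORef' logRef (field :)
```

Running only name (userResolver logRef) leaves "articles" out of the log, so a query that selects just name never triggers the articles fetch.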

nalchevanidze commented 1 year ago

@smatting your original request will be addressed in #786 with named resolvers