graph-gophers / dataloader

Implementation of Facebook's DataLoader in Golang
MIT License
1.2k stars 75 forks source link

Can I use this with gqlgen? #79

Closed frederikhors closed 2 years ago

frederikhors commented 3 years ago

Can I use this with gqlgen?

tonyghita commented 3 years ago

Sure

frederikhors commented 3 years ago

Is there an example? Do I need a middleware?

frederikhors commented 3 years ago

I am talking about this: https://gqlgen.com/reference/dataloaders/

zenyui commented 3 years ago

@frederikhors I just got it working with gqlgen! let me clean up the sample and i'll post it somewhere public.

frederikhors commented 3 years ago

@frederikhors I just got it working with gqlgen! let me clean up the sample and i'll post it somewhere public.

Please! Thanks!

zenyui commented 3 years ago

@frederikhors here it is! https://github.com/zenyui/gqlgen-dataloader. I tried to make the project structure a little more realistic than the standard GqlGen TODO app... let me know if it makes sense to you.

IMO, there is no need for code generation in data loaders when you use gqlgen, so I far prefer using this dataloader implementation.

frederikhors commented 3 years ago

@zenyui I will try it in a few hours. You're amazing! Thanks!

zenyui commented 3 years ago

@zenyui I will try it in a few hours. You're amazing! Thanks!

@frederikhors lmk, and then let's pitch this as an alternative in the GqlGen docs :)

frederikhors commented 2 years ago

@zenyui I'm starting the test in a few hours (really this time). Can you please help me understand why the code generation is not needed here? It's using reflection?

zenyui commented 2 years ago

I think it comes down to separation of concerns:

In my example, I happened to make my dataloader reply with a models.User, so I see why you're asking if I could have generated it.

Consider the following example:

type User {
  id: String!
  name: String!
}

type Todo {
  id: String!
  users: [User!]!
}

Query {
  ListTodos(): [Todo]!
}

And also imagine you had an UserService that returned some alternate RemoteUser struct.

A reasonable data loader interface might be:

type UserServiceDataloader interface {
    // GetUsersByID queries many RemoteUsers by ID
    GetUsersByID(ids []string) []RemoteUser
}

and in your resolver, you would need to implement a transformation function of signature:

func(RemoteUser)models.User
frederikhors commented 2 years ago

I'm studying your example (thanks again!).

One thing I can't find and can't understand is how and if you can handle options like "maxBatch" and "wait" as you can do with dataloaden, example:

func (d Dataloader) LoaderUserByTodoID(ctx context.Context) Loader_User {
    return Loader_User{
        maxBatch: 100,
        wait:     1 * time.Millisecond,
        fetch: func(ids []int) ([]User, []error) {
            // get them...
            return results, []error{err}
        },
    }
}

Is it possible?

zenyui commented 2 years ago

@frederikhors yes, you can pass in Options when you instantiate a new dataloader with NewBatchedLoader (docs). In my sample, I'm invoking it here, but I'm not passing fancy options as this is just a sample.

I think the options you want are:

Here is the implementation: https://github.com/graph-gophers/dataloader/blob/v5.0.0/dataloader.go#L108-L137

frederikhors commented 2 years ago

Ok. Really thanks.

Something that scares me in your example is executing NewDataLoader() on every API request.

Can't it be avoided? In the case of many dataloaders it can be really heavy.

frederikhors commented 2 years ago

And I think we miss ctx: ctx in newDataLoader(), am I right?

https://github.com/zenyui/gqlgen-dataloader/blob/main/graph/dataloader/dataloader.go#L41-L43

zenyui commented 2 years ago

In this case, a DataLoader is not actually heavy, as it's just an adapter to the user storage interface. The point of this thing is for all the resolvers in a single request to consolidate/coordinate their calls for users. If you did have a heavy dependency that you wanted to pass into your DataLoader, though (or something like an API client/DB client connection pool), you could instantiate that first then pass it into your data loader implementation

zenyui commented 2 years ago

And I think we miss ctx: ctx in newDataLoader(), am I right?

https://github.com/zenyui/gqlgen-dataloader/blob/main/graph/dataloader/dataloader.go#L41-L43

https://pkg.go.dev/github.com/graph-gophers/dataloader#NewBatchedLoader

Not sure what you mean.

zenyui commented 2 years ago

@frederikhors let's hop on a call?

frederikhors commented 2 years ago

Not sure what you mean.

func NewDataLoader(ctx context.Context, db storage.Storage) *DataLoader {
    // instantiate the user dataloader
    users := &userBatcher{db: db}
    // return the DataLoader
    return &DataLoader{
        userLoader: dataloader.NewBatchedLoader(users.get),
    }
}

ctx in this func is not assigned, right?

zenyui commented 2 years ago

NewBatchedLoader accepts a BatchFunc, which i implemented in users.get... that is the function that accepts a context, and graph-gophers/dataloader passes it for us.

I realize now that I didn't need a ctx at all on that constructor. I'll remove it.

frederikhors commented 2 years ago

If you don't need it then you don't need to call it like this: loaders := NewDataLoader(r.Context(), db) in:

func Middleware(db storage.Storage, next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        loaders := NewDataLoader(r.Context(), db)
        nextCtx := context.WithValue(r.Context(), loadersKey, loaders)
        r = r.WithContext(nextCtx)
        next.ServeHTTP(w, r)
    })
}

Right?

frederikhors commented 2 years ago

Or at least not on each call... I think.

zenyui commented 2 years ago

yep, it's not used. that ctx argument can be removed from my NewDataLoader function.

frederikhors commented 2 years ago

I removed it here https://github.com/zenyui/gqlgen-dataloader/pull/1.

zenyui commented 2 years ago

@frederikhors I messaged you on the gqlgen discord. might be easier for us to chat there. https://discord.gg/DYEq3EMs4U

zenyui commented 2 years ago

Or at least not on each call... I think.

One thing to keep in mind is you do need a new dataloader for each request so that it is isolated from other requests. if you want to share a cache, etc, between requests, you can share it among many dataloader instances.

frederikhors commented 2 years ago

Ok. So the last PR was wrong, right? I need isolated middlewares...

But what if it's just methods that return results? Even if I instantiate it only once and not at each call, what can happen? In the end I am calling DB methods each time different for each call, nope?

zenyui commented 2 years ago

Ok. So the last PR was wrong, right? I need isolated middlewares...

But what if it's just methods that return results? Even if I instantiate it only once and not at each call, what can happen? In the end I am calling DB methods each time different for each call, nope?

Let's get on a call so I can walk you through it. It took me a while to understand how the dataloader interacts with a request and the resolvers.

frederikhors commented 2 years ago

Ok. So the last PR was wrong, right? I need isolated middlewares... But what if it's just methods that return results? Even if I instantiate it only once and not at each call, what can happen? In the end I am calling DB methods each time different for each call, nope?

Let's get on a call so I can walk you through it. It took me a while to understand how the dataloader interacts with a request and the resolvers.

I cannot call right now. Can we please continue here? Please... 🙏

In general, why should I create a dataloader for each request if all I do is call a method on the db with different parameters each time?

zenyui commented 2 years ago

Ok. So the last PR was wrong, right? I need isolated middlewares... But what if it's just methods that return results? Even if I instantiate it only once and not at each call, what can happen? In the end I am calling DB methods each time different for each call, nope?

Let's get on a call so I can walk you through it. It took me a while to understand how the dataloader interacts with a request and the resolvers.

I cannot call right now. Can we please continue here? Please... 🙏

In general, why should I create a dataloader for each request if all I do is call a method on the db with different parameters each time?

I would prefer at least text chatting on discord.

I think you're missing the core point of a dataloader, as if it was ONLY doing that dynamic parameter injection, you would not need a dataloader at all.

A dataloader consolidates data requests from remote data sources to avoid requesting the same remote entity multiple times. The typical use case is to consolidate data requests for a single graphql query, not to do so across requests (as that would be basically coupling two requests). If that's what you want, I think you're thinking of this more like a cache, which you can indeed share among data loader instances.

Remember, if you have 100 Todo instances that all share the same 10 User instances, the data loader will consolidate all 100 "lookup" calls into a single batched call for the 10 distinct users. The clever thing is that it reduces the number of calls to the user service, not that it parameterized the call.

frederikhors commented 2 years ago

I think we can close this. Thanks again @zenyui!

frederikhors commented 2 years ago

@zenyui I think we should add a LoadMany example in your amazing demo project: https://github.com/zenyui/gqlgen-dataloader.

Example:

  1. Get all Players for a Team ID
  2. Get all Players for many Team IDs
zenyui commented 2 years ago

@frederikhors let's move this to an issue on the sample repo.

frederikhors commented 2 years ago

https://github.com/zenyui/gqlgen-dataloader/issues/8