Pass args to dataloader

NickDubelman commented 5 years ago

What is the best approach if I need to pass arguments to my dataloader?

I've seen 2 issues on here where people suggest using anonymous structs but that feels pretty janky for a few reasons.

I could also see a solution of just throwing it into context but that also doesn't feel right because we lose type safety.

Is there a better officially supported approach?

NickDubelman commented 5 years ago

Ahhh I think I figured out the ideal way to do this.

The basic concept is that when you retrieve the loader, you use a function that binds whatever args you need to the loader that gets created.

Here is my basic setup in case anyone else can benefit from this pattern:

type loaders struct {
        // regular loaders that dont need args
    FamilyByID        *generated.FamilyLoader 
    BugsByProducts    *generated.BugSliceLoader

       // loader that needs args
    BuildsByProjectID func(start, end *time.Time) *generated.BuildSliceLoader
}

func LoaderMiddleware(next http.Handler) gin.HandlerFunc {
    return gin.WrapF(
        http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            ldrs := loaders{}

            // set this to zero what happens without dataloading
            wait := 20 * time.Millisecond

            ldrs.FamilyByID = getFamilyByIDLoader(wait)
            ldrs.BugsByProducts = getBugsByProductsLoader(wait)

            ldrs.BuildsByProjectID = func(
                start, end *time.Time,
            ) *generated.BuildSliceLoader {
                return getBuildsByProjectIDLoader(wait, start, end)
            }

            dlCtx := context.WithValue(r.Context(), ctxKey, ldrs)
            next.ServeHTTP(w, r.WithContext(dlCtx))
        }))
}

func getBuildsByProjectIDLoader(
    wait time.Duration,
    start, end *time.Time,
) *generated.BuildSliceLoader {
    return generated.NewBuildSliceLoader(
        generated.BuildSliceLoaderConfig{
            Wait:     wait,
            MaxBatch: 100,
            Fetch: func(projectIDs []int) ([][]models.Build, []error) {
                builds, err := getBuildsBatch(projectIDs, start, end)
                if err != nil {
                    return nil, []error{err}
                }
                return builds, nil
            },
        },
    )
}

// GetLoadersFromContext gets the loaders object for the given context
func GetLoadersFromContext(ctx context.Context) loaders {
    return ctx.Value(ctxKey).(loaders)
}

then to pass the args:

func (r *systemResolver) Builds(
    ctx context.Context,
    obj *gqlmodels.System,
    start, end *time.Time,
) (*gqlmodels.BuildsConnection, error) {
    builds, err := dataloaders.
        GetLoadersFromContext(ctx).BuildsByProjectID(start, end).Load(obj.ID)

    ret := &gqlmodels.BuildsConnection{PageInfo: &gqlmodels.PageInfo{}}
    for i, b := range builds {
        ret.Edges = append(ret.Edges, &gqlmodels.BuildsEdge{
            Cursor: gqlmodels.EncodeCursor(i),
            Node:   &gqlmodels.Build{Build: b},
        })
    }
    return ret, err
}

NickDubelman commented 5 years ago

Oops, nevermind. With the approach I had, you create a new data loader each time which defeats the point

vektah commented 5 years ago

I think you can create a named struct in the same package and use that as your key type?

NickDubelman commented 5 years ago

Yeah but even if you can, that feels wrong because the parameters are for the entire request, not for each individual key. You could just use the 0th key's params but still doesn't seem like the right way to do it. I was able to get my above solution to work by doing something like this:

func LoaderMiddleware(next http.Handler) gin.HandlerFunc {
    return gin.WrapF(
        http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            ldrs := loaders{}

            // set this to zero what happens without dataloading
            wait := 50 * time.Millisecond

            ldrs.FamilyByID = getFamilyByIDLoader(wait)
            ldrs.BugsByProducts = getBugsByProductsLoader(wait)
            ldrs.BuildsByProjectID = func(
                start, end *time.Time,
            ) *generated.BuildSliceLoader {
                // Check to see if we have already created a loader with the given
                // start and end values for the current request context
                key := ctxKeyType{fmt.Sprintf("BuildsByProjectID-%v-%v", start, end)}

                ldr := r.Context().Value(key)
                if ldr != nil {
                    return ldr.(*generated.BuildSliceLoader)
                }

                ldr = getBuildsByProjectIDLoader(wait, start, end)
                r = r.WithContext(context.WithValue(r.Context(), key, ldr))
                return ldr.(*generated.BuildSliceLoader)
            }

            dlCtx := context.WithValue(r.Context(), ctxKey, ldrs)
            next.ServeHTTP(w, r.WithContext(dlCtx))
        }))
}

I think the ideal solution would be for the CLI to include a flag, like -arg (can use it multiple times), that would generate a dataloader with a fetch function signature that includes the passed args (as opposed to just keys). All of the generated functions (like Load()) would include this arg information.

Does this seem reasonable and feasible? If you think so, I could take a stab at it.

vikstrous commented 4 years ago

I don't think that anyone should want to pass in every data source a loader needs with every use of the loader. That should be one time wiring that's done at the same time and place where the context is bound to the loader: per request. It's unfortunate that the only way to pass the loaders into http handlers is through the context object. The context is the only per-request part of http loaders. With gqlgen, resolvers are created per-request, which makes them a natural place to attach loaders.

NickDubelman commented 4 years ago

I’m not sure what you mean by “every data source a loader needs.” The situation I am describing is as simple as “trying to data load something that needs arguments.” For example, imagine you have a field that you want to dataload, but that field has arguments that it needs in order to resolve. Like a user has events and you want to get all the events between now and next month.

Without even considering the feasibility, I think the most desirable dev experience would be for this CLI to allow the user to pass N “-arg flags” and the generated code will include these args.

vikstrous commented 4 years ago

Ah, I misunderstood the use case. I'm not sure if I fully understand yet though. Loaders are designed to be used when there's a known set of keys to query by. A query like "all events between x and y" is not a fixed data set. Trying to use a loader for that would break caching in the current implementation. I'm trying to wrap my head around the idea of using a loader-like pattern for more complex queries. Do you have some ideas for how caching and deduplication of requests might work in the general case? Or maybe in your specific case of a range query?

dcwangmit01 commented 4 years ago

Being able to pass additional args to the Load() function would enable support for paging or filtering.

@NickDubelman Your suggestion:

Without even considering the feasibility, I think the most desirable dev experience would be for this CLI to allow the user to pass N “-arg flags” and the generated code will include these args.

would certainly help my situation

lacendarko commented 3 years ago

Any updates for this?

Currently I've been doing this, but it's kinda hacky:

go run github.com/vektah/dataloaden ValueLoader 'interface{}' 'modulename/pkg/models.Value'

As interface I use a object with key and custom parameters

kim-hetrafi commented 3 years ago

Here is my solution for anyone who is still looking (too lazy to make a merge request)

NOTE: didn't exhaustively test or think about this, if there are certain conditions where issues such as race conditions could arise, my bad

EDIT this sample code is not example of the use case, a better use case would be where each key is expected to fetch an array on things, and this array might have a desirable filter tied to it

All this code lives in the dataloader package in my case, keep this in mind why this member capitalization works.

changes i made to the generated code (user.generated.go) are to:

The LoaderConfig to allow for Fetch function to accept arguments map[string]interface{} in its arguments
The Loader struct to also allow the fetch function to accept arguments map[string]interface{} in its arguments
The Loader struct to have a member called arguments map[string]interface{} to handle custom dataloader arguments
The end function to send to arguments to the fetch function


// UserLoaderConfig captures the config to create a new UserLoader
type UserLoaderConfig struct {
    // Fetch is a method that provides the data for the loader
    Fetch func(keys []string, arguments map[string]interface{}) ([]*model.User, []error)

    ...
}

...

// UserLoader batches and caches requests
type UserLoader struct {
    // this method provides the data for the loader
    fetch func(keys []string, arguments map[string]interface{}) ([]*model.User, []error)

    // how long to done before sending a batch
    wait time.Duration

    // this will limit the maximum number of keys to send in one batch, 0 = no limit
    maxBatch int

    // add arguments to apply to dataloader
    arguments map[string]interface{}

    ...
}

...

func (b *userLoaderBatch) end(l *UserLoader) {
    b.data, b.error = l.fetch(b.keys, l.arguments)
    close(b.done)
}

dataloader.go

const LoadersKey = "dataloaders"

type Loaders struct {
    Map map[string]interface{} // contains dataloaders by name
}

func For(ctx context.Context) *Loaders {
    // gets Loaders from context
}

user.loader.go

// fetches a specific dataloader from the Loaders context  
func (l UserLoader) UserLoader (arguments map[string]interface{}) (*UserLoader) {
    if value, ok := l.Map["UserLoader"]; ok {

        // apply arguments
        loader := value.(*GamePlatformLoader)
        loader.arguments = arguments

        return loader
    }

    // create new loader if doesn't exist yet
    loader := UserLoader{
        wait: 2 * time.Millisecond,
        maxBatch: 100,
        fetch: func (keys []string, arguments map[string]interface{}) ([]*model.User, []error) {
            // do your data loading shit here
        },
    }

    // apply arguments to loader
    loader.arguments = arguments

    l.Map["UserLoader"] = &loader

    return &loader
}

this code can then be used externally very similar to the original package as such:

import "dataloader"

...

dataloader.For(context).UserLoader(arguments).Load(userID)

ignore that mention vektah, apologies, the error is resolved with a small change how arguments are applied to the Loader

ericraio commented 2 years ago

You could make the key a different struct all together

//go:generate go run github.com/vektah/dataloaden UserSliceLoader  *github.com/vektah/dataloaden/example.UserParams []*github.com/vektah/dataloaden/example.User

import "dataloader"

...
params := &example.UserParams{/* add whatever args you want */}
dataloader.For(context).UserSliceLoader.Load(params)

vektah / dataloaden

Pass args to dataloader #42