Support GetMany federated entities in entity resolvers

MiguelCastillo commented 2 years ago

What happened?

Currently, federated entity resolution is done in go routines, and resolution is 1 entity at a time but in parallel. This optimization works well! But it requires the use of dataloader/batchloader sort of abstraction in order to be able to then process these entities together in client code, which adds complexity in both logic and often tests.

At Khan Academy, for example, we have a lot of places where we need to:

Call to other services (cannot be federated)
ACL checks.
Process datastore calls for many items.

What did you expect?

Entity resolvers are provided the entire list of representations that need to be resolver upfront.

Minimal graphql.schema and models to reproduce

It would be quite helpful to get a list of representations up front to avoid the need for dataloader/batchloader type of abstraction entirely. And ideally, any approach here would be opt-in.

There are multiple ways to tackle this, and I list a couple below. But I'm sure there are others yall can think of and I would love to hear them.

Add option in the qqlgen.yaml configuration to specify which entities support processing in batches. This one can be complicated because the current federated plugin (and plugin system) heavily relies on injecting GraphQL in the build process so that exec.go can create the correct type Entity interface in Go. In order to support this, we would need to add another interface to plugins that get called right around here https://github.com/99designs/gqlgen/blob/59a30919a8d8a9b67972cc7e4dd1e425901f15c2/api/generate.go#L76 where the plugin interface is called with Data so that it can do extra processing before calling codegen.GenerateCode.
Add a directive like @entityResolver where we can specify right in the schema if a particular entity resolver supports a list of representations. This one is the more straightforward approach I found, and I have a working set of code changes.
Others?

Some of the complexities are (unless stated, these have been sorted out in a working implementation):

Ensuring that entity resolvers can handle heterogeneous lists.
Supporting entity resolvers that can take a list of representations and it coexist with entity resolver that only supports single entities the way we have right now including resolving each federated entity in go routines.
And perhaps support returning a list of errors in cases where say 3 of 10 entities fail resolution and each needs its own error. This can be complicated. (NOT currently supported - I'm assuming and all or nothing success/failure for now but will reassess based on need).
Resolve groups of heterogeneous representations in go routines so that there is no blocking when resolving entities at any level.

versions

gqlgen version v0.11.0
go version 1.15
dep or go modules - go modules

If there is interest in something like this, I'm happy to push up to a branch the changes to support option 2 above. Or explore other ideas as well! But roughly, client-side code is more or less like this right now:

schema.graphql

directive @entityResolver(multi: Boolean) on OBJECT

type RaceCar @key(fields: "id") @entityResolver(multi: true) {
    id: ID!
}

Entity resolver

func (r *entityResolver) FindManyRaceCarsByID(
    ctx context.Context,
    reps interface{},
) ([]*graphql.RaceCar, error) {
    recaCarsRepresentations := reps.([]struct {
        ID        string
    })

        // Do the work here and return the list of RaceCars
}

vektah commented 2 years ago

Thanks for writing this up. If you already have something working it would help visualise this.

func (r *entityResolver) FindManyRaceCarsByID(
    ctx context.Context,
    reps interface{},
) ([]*graphql.RaceCar, error) {
    recaCarsRepresentations := reps.([]struct {
        ID        string
    })

This interface is pretty weakly typed. If you know all the keys for the entity why not generate something like:

FindManyRaceCarsById(ctx context.Context, ids []struct { ID string } )
and 
FindManyRaceCarsByPlateNumber(ctx context.Context, plates []struct { PlateNumber string } )

Ensuring that entity resolvers can handle heterogeneous lists.

Do you mean heterogeneous on the return type? I can definitely see uses for this when multiple types share the same id / table, eg a chat event that could be many concrete types, though I'm not sure if I have a concrete example where the id is known but the type isnt. Is this a micro optimisation?

We also want to roll dataloaden into the core execution engine at some point, so we can avoid timeout based dispatch. The interfaces required by that would be very similar to this, theres probably a bit of crossover.

MiguelCastillo commented 2 years ago

@vektah thanks for taking a look! Apologies in advance for repeating stuff you already know, Im just covering as much context as I can for a more complete conversation.

FindManyRaceCarsById(ctx context.Context, ids []struct { ID string } )
and 
FindManyRaceCarsByPlateNumber(ctx context.Context, plates []struct { PlateNumber string } )

Yeah that is absolutely preferable and definitely the API I'm after as well. I didn't find an easy way to generate anonymous structs because entity resolvers are derived from the generated federation/entity.graphql schema here [1] and I am not quite sure yet how to achieve that without adding potentially adding a way for plugins to process codegen Data before calling codegen.GenerateCode. If you have thoughts about implementation, I'd love to hear them! In the meantime I am injecting input types in the federation plugin so that instead of interface{} we get some autogenerated struct like below.

func (r *entityResolver) FindManyRaceCarsByID(
    ctx context.Context,
    reps []*graphql.EntityResolverFindManyRaceCarsByIDInput,
) ([]*graphql.RaceCar, error) {
        // Do the work here and return the list of RaceCars
}

Ensuring that entity resolvers can handle heterogeneous lists.

Yeah that's my bad. What I mean is that representations can be heterogeneous, so the federation runtime needs to call the corresponding entity resolvers with the list of representations for the correct __typename... I will update the description.

We also want to roll dataloaden into the core execution engine at some point, so we can avoid timeout based dispatch. > The interfaces required by that would be very similar to this, theres probably a bit of crossover.

Definitely in principle there is crossover because in the end entity resolvers are just resolvers, and ideally they would have the same signature. Tho I haven't wrapped by head around all the nuances of implementing this for regular field resolvers as well, this has definitely been in my radar because at KA we have several places where field resolvers have dataloaders to address the same fundamental issue. So Im all for also solving for this as well.

Besides perhaps sharing code for generating the structs for the resolvers and ensuring a consistent resolver signature, Im not entirely how much entity resolvers and field resolvers can share right now. But if you have more specific details, I'd love to hear them and discuss them.

https://github.com/99designs/gqlgen/blob/59a30919a8d8a9b67972cc7e4dd1e425901f15c2/plugin/federation/federation.go#L87

MiguelCastillo commented 2 years ago

I will push up my changes to a branch sometime today to have a more concrete space to talk about implementation details.

clayne11 commented 7 months ago

It looks like this is solved, right? Can we close this?

99designs / gqlgen