graphql-nexus / nexus

Code-First, Type-Safe, GraphQL Schema Construction
https://nexusjs.org
MIT License
3.4k stars 275 forks source link

Advanced GroupBy/Aggregation in GraphQL (using Prisma) #1107

Open lloydrichards opened 2 years ago

lloydrichards commented 2 years ago

I have a use case at the moment where I need to build a query that can either return a list of objects or return a list of objects grouped and aggregated for a dashboard. While I could spend a lot of time building a unique -GroupedByObject for each one with its own unique AggregatedObject, I feel like there must be a better way of doing this. I did some research and can see examples of GraphQL schemas that provide a generic AggregationFunction to each class which solves the provided arguments.

In Nexus and code-first this seems to be a much harder ordeal due to the strict typing. I'll show below what my half solution is so far and then outline the problem I'm facing maybe someone knows a better way around this? or maybe I'm missing something

ideal.graphql

type Query {
  devices(..., _groupBy: [String!]): [Device!]
}
type Device {
  _aggregation: AggregationFunction
  ...
}
type AggregationFunction {
  count: Number
  avg(field: String): Aggregation
  max(field: String): Aggregation
  ...
}
type Aggregation {
  field: String
  value: String
}

DeviceType.ts

export const DeviceType = objectType({
  name: "Device",
  definition(t) {
    t.string("device_id");
    t.string("type");
    t.string("version");
    t.field("_aggregation", { type: "AggregationFunction" });
  },
});

export const DeviceQuery = extendType({
  type: "Query",
  definition(t) {
    t.list.field("devices", {
      type: "Device",
      args: {
        limit: intArg(),
        _groupBy: list(nonNull(stringArg())),
      },
      async resolve(_, { limit, _groupBy }, { prisma }) {
        if (_groupBy) {
          const grouped = await prisma.device.groupBy({
            by: _groupBy as Prisma.DeviceScalarFieldEnum[],
            _count: { _all: true },
            _avg: { device_id: true },
            _max: {
              device_id: true,
              type: true,
              version: true,
            },
            _min: {
              device_id: true,
              type: true,
              version: true,
            },
          });
          return grouped.map((g) => ({
            ...g,
            _aggregation: {
              count: g._count._all,
              max: { ...g._max, field: "", value: "" },   // <-- Pass all the fields to the aggregation object❓
              avg: { ...g._avg, field: "", value: "" },   // <-- Pass all the fields to the aggregation object❓
            },
          }));
        }
        const devices = await prisma.device.findMany({
          take: limit || undefined,
        });
        return devices;
      },
    });
  },
});

GenericTypes.ts

export const AggregationFunctionType = objectType({
  name: "AggregationFunction",
  definition(t) {
    t.bigInt("count"); , // <-- Works correctly 👍 
    t.field("max", {
      type: "Aggregated",
      args: { field: stringArg() },  // <-- Select the field from the object and return the value❓ 
      async resolve(par, args) {
        if (args.field) {
          return { field: args.field, value: (par as any).avg[args.field] };  // <-- How to access the full object passed❓
        }
        return null;
      },
    });
    ...
  },
});

export const AggregatedType = objectType({
  name: "Aggregated",
  definition(t) {
    t.nonNull.string("field");
    t.nonNull.string("value");
  },
});

Using Prisma for the database its fairly easy to solve the resolver for when the _groupBy argument is passed, however the issue I'm having is mapping the field to the argument provided to the Aggregated type. I've tried passing through all the values and the breaking all my type safety to access the object from within the resolver but this just ends up returning the empty string I provide in the query.

query Hardware($groupBy: [String!], $field: String) {
  devices(_groupBy: $groupBy) {
    type
    total: _aggregation {
      count
      maxVersion: max(field: $field){
        value
      }
      maxID: max(field: $field){
        value
      }
    }
  }
}
{
  "data": {
    "devices": [
      {
        "type": "LIFE_HIVE",
        "total": {
          "count": 1
          "maxVersion": {
             "value":""
          }
          "maxID": {
             "value":""
          }
        }
      },
      {
        "type": "LIFE_COMB",
        "total": {
          "count": 2
          "maxVersion": {
             "value":""
          }
          "maxID": {
             "value":""
          }
        }
      }
    ]
  }
}

The second issue is that in the Aggregated type I set it to String but in reality the fields could be returning back different types such as number, date, float etc But I'm not sure how to safely pass back the value as a generic 🤷 For the time being, I can work with the successfully returning _sum value but as I work further into this schema I would need to be able to great some additional aggregation functions between different queries.

If you have any ideas, please let me know as I'm loving making my schema with Nexus and Prisma, but am running into a few edge-cases that might make or break my application towards the end ❤️

santialbo commented 2 years ago

GraphQL doesn't have generics but you can probably come up with something quite like what you want building your own nexus plugin. Nexus plugin (among other things) are a way of creating types on the fly.

I have personally used the connectionPlugin as a base for some custom plugins. It's not easy and it's not documented but you can do crazy things.

Think something like

export const DeviceType = objectType({
  name: "Device",
  definition(t) {
    t.string("device_id");
    t.string("type");
    t.string("version");
  },
});

export const DeviceQuery = extendType({
  type: "Query",
  definition(t) {
    // Type DeviceAggregated is created behind the scenes
    t.aggregated("devices", {
      type: "Device",
      resolve: (...)
    });
  }
})
Frosty21 commented 2 years ago

Yeah as @santialbo mentioned this could be put into a extended connectionPlugin type as analyticsDevice multiple-connection-types. For the Aggregated you could use prisma.device.aggregate the problem is aggregate returns a number or null.

I'm kinda confused as to why your Aggregated type would be returning date or float?

lloydrichards commented 2 years ago

Thanks for your feedback, I will have a look later to see what making a nexus plugin will take. I definitely like the idea of being able to pass this between projects, but lack of documentation/examples is always red flag for me before falling down a rabbit hole 🐰

In the meantime I did make a workaround that is actually really nice so maybe if someone else stumbles on this usecase they can help expand it:

GenericTypes.ts

export const AggregationFunctionType = objectType({
  name: "AggregationFunction",
  definition(t) {
    t.bigInt("count");
    t.json("_raw");  // <----- store all the results in this _raw field
    t.field("avg", {
      type: "Aggregated",
      args: { field: stringArg() },
      async resolve(par, args) {
        if (args.field) {
          return { field: args.field, value: par._raw._avg[args.field] };  //  <------ access the field inside the _raw using the parent argument
        }
        return null;
      },
    });
    t.field("max", {
      type: "Aggregated",
      args: { field: stringArg() },
      async resolve(par, args) {
        if (args.field) {
          return { field: args.field, value: par._raw._max[args.field] };  //  <------ access the field inside the _raw using the parent
        }
        return null;
      },
    });
    t.field("min", {
      type: "Aggregated",
      args: { field: stringArg() },
      async resolve(par, args) {
        if (args.field) {
          return { field: args.field, value: par._raw._min[args.field] };  //  <------ access the field inside the _raw using the parent
        }
        return null;
      },
    });
  },
});

export const AggregatedType = objectType({
  name: "Aggregated",
  definition(t) {
    t.nonNull.string("field");
    t.nonNull.string("value");
  },
});

Here in the resolver I pass the results of the groupBy into a _raw property on the _aggregation object which I can then access based on the field argument I pass. It works really well for concisely pulling out just the right data I need on the client.

query Hardware($groupBy: [DeviceFieldEnum!], $field: String) {
  hiveDevices(_groupBy: $groupBy) {
    type
    total: _aggregation {
      count
      _raw
      max(field: $field) {
        field
        value
      }
    }
  }
}
{
  "data": {
    "hiveDevices": [
      {
        "type": "LIFE_HIVE",
        "total": {
          "count": 5,
          "_raw": {    // < ------ the _raw contains the whole aggregation
            "_count": {
              "_all": 1
            },
            "_avg": {
              "device_id": 2
            },
            "_max": {
              "timestamp": "2022-06-29T09:13:11.643Z",
              "version": "MK5"
            },
            "_min": {
              "timestamp": "2022-06-27T12:10:09.345Z",
              "version": "MK4"
            },
            "type": "LIFE_HIVE"
          },
          "max": {
            "field": "version",
            "value": "MK5"
          }
        }
      },
      {
        "type": "LIFE_COMB",
        "total": {
          "count": 2,
          "_raw": ... // <  ------ minimized
            "type": "LIFE_COMB"
          },
          "max": {
            "field": "version",
            "value": "MK1"
          }
        }
      }
    ]
  }
}