Closed: AJMiller closed this 4 years ago
Running into something similar. I need my graphql context to resolve my loader func. Since I'm scoping my loaders to each request, this context would be the same for the lifetime of the loader, so maybe it's something that could be passed in at construction?
I have a similar issue. I want to pass in to the batch function which mysql connection or transaction instance to use, and whether to lock the db rows.
Was anyone able to find a good solution for passing arguments into the DataLoader?
Okay, I was able to solve this in a roundabout kind of way (I'm using typeorm and type-graphql).
I wanted to be able to have pagination support (so skip and limit) plus an additional where query. I also wanted to be able to do it with nested resolvers. So my test case was something like this:
query Dashboard($loginDate: String, $jobStatus:String, $year:Int) {
Users(limit: 10, skip: 5, where: "LastLoginDate > :loginDate", vars: {loginDate: $loginDate}) {
Name
Country
ActiveJobs(limit: 10, skip:0, where:"JobStatus = :jobStatus AND YEAR(DateIn)=:year", vars:{jobStatus:$jobStatus, year:$year}) {
JobNumber
Customer {
Name
PhoneNumber
}
}
}
}
Firstly, I wasn't able to get pagination working with any resolvers that use the DataLoader, unless I can find a way to use typeorm's QueryBuilder to GROUP BY and LIMIT results per group. But I still had support for filtering using a where clause.
So I ended up getting a query like this working:
query Dashboard($loginDate: String, $jobStatus:String, $year:Int) {
Users(limit: 10, skip: 5, where: "LastLoginDate > :loginDate", vars: {loginDate: $loginDate}) {
Name
Country
ActiveJobs(where:"JobStatus = :jobStatus AND YEAR(DateIn)=:year", vars:{jobStatus:$jobStatus, year:$year}) {
JobNumber
Customer {
Name
PhoneNumber
}
}
}
}
I also got this working using a single generic dataloader without having to write batchload functions for every separate resolver. It's not very optimized, but it might help someone:
index.ts
import "reflect-metadata";
import { ApolloServer } from "apollo-server-express";
import * as Express from "express";
import { createConnection } from "typeorm";
import { createSchema } from "./utils/createSchema";
import { loader } from "./loaders/loader";
const main = async () => {
const schema = await createSchema();
const connection = await createConnection();
const apolloServer = new ApolloServer({
context: ({ req, res }: any) => ({
req,
res,
loader: loader()
}),
schema
});
const app = Express();
apolloServer.applyMiddleware({ app });
app.listen(4000, () => {
console.log("Server Started on http://localhost:4000/graphql");
});
};
main();
loader.ts
import * as DataLoader from "dataloader";
import { In, getRepository } from "typeorm";
import * as GraphQLJSON from "graphql-type-json";
interface ArgList {
entity?: any;
key?: string;
where?: string;
vars?: GraphQLJSON;
id: number;
}
const batchLoad = async (args: ArgList[]) => {
const ids = args.map(arg => arg.id);
// Note: this assumes every call batched together shares the same entity, key, where and vars
const { where, key, entity, vars } = args[0];
const results = await getRepository(entity)
.createQueryBuilder()
.where({ [key]: In(ids) })
.andWhere(where ? where : "1=1", vars ? vars : {})
.getMany();
const resultsMap: { [key: number]: any[] } = {};
results.forEach(row => {
if ((row[key] as number) in resultsMap) {
resultsMap[row[key] as number].push(row);
} else {
resultsMap[row[key] as number] = [row];
}
});
return ids.map(id => resultsMap[id]);
};
export const loader = () => {
return new DataLoader<ArgList, any>(batchLoad);
};
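The heart of the batch function is the grouping step: collect the fetched rows by the foreign key, then return one bucket per requested id in the same order DataLoader passed them in. A stdlib-only sketch of just that step (row data is made up for illustration):

```javascript
// Group rows by a foreign-key column and return one bucket per id,
// preserving the input order DataLoader expects. Ids with no rows map
// to undefined, which DataLoader treats as "no value found".
function groupByKey(rows, key, ids) {
  const resultsMap = {};
  for (const row of rows) {
    (resultsMap[row[key]] = resultsMap[row[key]] || []).push(row);
  }
  return ids.map((id) => resultsMap[id]);
}

// Made-up rows standing in for a typeorm getMany() result:
const rows = [
  { userId: 1, job: "A" },
  { userId: 2, job: "B" },
  { userId: 1, job: "C" },
];
console.log(groupByKey(rows, "userId", [1, 2, 3]));
// → [ [ {userId:1, job:'A'}, {userId:1, job:'C'} ], [ {userId:2, job:'B'} ], undefined ]
```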
This is a snippet from my User entity which resolves ActiveJobs:
@Field(() => [Job], {
nullable: true
})
async ActiveJobs(
@Ctx() { loader }: BaseContext,
@Arg("where", { nullable: true }) where: string,
@Arg("vars", () => GraphQLJSON, { nullable: true }) vars: GraphQLJSON
) {
return loader.load({
where,
vars,
id: this.id, // Primary Key
key: "userId", // Foreign Key
entity: Job // Return Entity
});
}
This is a snippet from my Job entity which resolves a single Customer (which is also a User entity):
@Field(() => User)
async Customer(
@Ctx() { loader }: BaseContext,
@Arg("where", { nullable: true }) where: string,
@Arg("vars", () => GraphQLJSON, { nullable: true }) vars: GraphQLJSON
) {
return (await loader.load({
where,
vars,
id: this.id, // Primary Key
key: "customerId", // Foreign Key
entity: User // Return Entity
}))[0]; // The loader still returns an array; we only want the first (and only) element
}
Lastly, if anyone has a better way of doing this, I'd love to hear it.
I went for a solution like this:
In my case I have batch functions that are curried with the database connection to use, like so:
const batchFunctions = {
myCuriedBatchFn: (dbConnection) => ids => { /* db fetching logic with dbConnection */ }
}
Then, on each request, I map the batchFunctions object into an object of DataLoaders with this function:
const createLoadersFromBatchFunctions = (batchFunctions, options) =>
Object.entries(batchFunctions).reduce((ack, [ fnName, fnBody ]) => {
let fnRef; // This function ref will be updated on each call
let dataloader = new DataLoader(ids => fnRef(ids), options);
return {
...ack,
[fnName]: (...params) => {
fnRef = fnBody(...params);
return dataloader;
}
};
}, {});
@RAMPKORV can you explain a bit how it works?
I wasn't able to get it to work.
We could have a set of batch functions like this.
const batchFunctions = {
getBooks: (con) => async (ids) => {
let [ rows ] = await con.query('SELECT * FROM books WHERE id IN (?) ORDER BY FIELD(id, ?)', [ids, ids]);
return rows;
}
}
To turn it into an object of data loaders, we do:
let loaders = createLoadersFromBatchFunctions(batchFunctions, {})
And then we can load books in this manner:
let bookFromOldDb = await loaders.getBooks(legacyDbConnection).load(6);
let bookFromNewDb = await loaders.getBooks(newDbConnection).load(4);
How does it work? When the DataLoader is created, we create a batch function with ids => fnRef(ids), but fnRef is reassigned at runtime whenever we run something like loaders.getBooks(newDbConnection).
@RAMPKORV thank you. lemme try it again.
Why isn't there an option to pass in a contextual object? java-dataloader has both a generalized version that applies to all loads on a data loader, as well as a per-object context that can be passed into a load call.
Why isn't there an option to pass in a contextual object?
Typically DataLoader instances are created as part of a request context, in which case other elements of the context can be referenced directly without the need to pass them in. Passing them in per call would require fairly complex logic to partition batches based on equivalent contexts. Instead it's preferred to just create a new instance per context.
To the original question: DataLoader expects a strict key -> value relationship and does not support additional arguments. If additional arguments are needed, they can be considered part of the key (for example, { id: 1, withCommentType: 'pending' }), so they are treated as different keys (and thus cached differently) from keys that might represent a similar object with different arguments.
Alternatively (as discussed above), if there are a small number of potential values for an argument, multiple DataLoaders can be created, one per potential value. However, this may depend on your application domain.
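To make the composite-key suggestion concrete, here is a minimal sketch of a cacheKeyFn that serializes a flat { id, withCommentType } key deterministically, so structurally equal keys share a cache slot regardless of property order (the key shape is just an example):

```javascript
// Serialize a flat composite key with its property names sorted, so
// { id: 1, withCommentType: 'pending' } and the same object written in a
// different property order produce identical cache keys.
function cacheKeyFn(key) {
  return JSON.stringify(key, Object.keys(key).sort());
}

console.log(cacheKeyFn({ id: 1, withCommentType: "pending" }));
console.log(cacheKeyFn({ withCommentType: "pending", id: 1 }));
// Both print {"id":1,"withCommentType":"pending"}, so a DataLoader created
// with new DataLoader(batchFn, { cacheKeyFn }) would cache them together.
```

This works for flat keys; nested argument objects would need a recursive canonical serializer instead of the replacer-array trick.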
So there is actually a sneaky solution to this :)
Assume in this example that we are using Sequelize and GraphQL.
// The ids here are objects not ids :)
const batchGetStatusById = async ids => {
// The goal was to pass down sql
return ids.map(({ id, sqlArgs }) => models.Status.findByPk(id, sqlArgs));
};
/**
* The input key is a complex object so we can pass down SQL parameters :)
* However... we need to teach the DataLoader that it should cache on the SQL options && the id
*/
const cacheKeyFn = key => {
return JSON.stringify(key);
};
const options = {
cacheKeyFn,
};
const dataloaders = () => {
return {
status: new DataLoader(batchGetStatusById, options),
};
};
Now you have your SQL arguments on a per-request basis, and only if the SQL query and id are the same will it pull from the cache.
I just copy-pasted this from Stack Overflow:
// This function creates unique cache keys for different selected fields
function cacheKeyFn({ id, fields }) {
const sortedFields = [...(new Set(fields))].sort().join(';');
return `${id}[${sortedFields}]`;
}
function createLoaders(db) {
const userLoader = new DataLoader(async keys => {
// Create a set with all requested fields
const fields = keys.reduce((acc, key) => {
key.fields.forEach(field => acc.add(field));
return acc;
}, new Set());
// Get all our ids for the DB query
const ids = keys.map(key => key.id);
// Please be aware of possible SQL injection, don't copy + paste
const result = await db.query(`
SELECT
${[...fields].join()}
FROM
user
WHERE
id IN (${ids.join()})
`);
// Map the rows back to the order of the requested ids
// (assumes id is always among the selected fields)
const rowsById = new Map(result.map(row => [row.id, row]));
return ids.map(id => rowsById.get(id));
}, { cacheKeyFn });
return { userLoader };
}
// now in a resolver
resolve(parent, args, ctx, info) {
// https://www.npmjs.com/package/graphql-fields
return ctx.userLoader.load({ id: args.id, fields: Object.keys(graphqlFields(info)) });
}
I just asked ChatGPT for a solution and it gave me this code.
Here is what I said:
write a graphql server schema and resolvers for
- company that has name, owner
- owner that has name, job title, company
write a resolver to retrieve companies and another one to retrieve owners
you must use DataLoader in a dynamic way and the database must be SQL with the Sequelize lib
can you use a single DataLoader for both company and owner with different batch functions
Here is the important part:
const entityLoader = new DataLoader(async (keys) => {
const entities = await Promise.all([
Company.findAll({
where: { id: keys.filter((key) => key.type === 'Company').map((key) => key.id) },
}),
Owner.findAll({
where: { id: keys.filter((key) => key.type === 'Owner').map((key) => key.id) },
}),
]);
// ...and in a resolver:
return entityLoader.load({ type: 'Owner', id: company.ownerId });
And this was the full answer:
const DataLoader = require('dataloader');
const { Sequelize, DataTypes } = require('sequelize');
const sequelize = new Sequelize('sqlite::memory:'); // placeholder connection for the example
const Company = sequelize.define('Company', {
name: {
type: DataTypes.STRING,
allowNull: false,
},
});
const Owner = sequelize.define('Owner', {
name: {
type: DataTypes.STRING,
allowNull: false,
},
jobTitle: {
type: DataTypes.STRING,
allowNull: false,
},
});
Company.hasOne(Owner);
Owner.belongsTo(Company);
const entityLoader = new DataLoader(async (keys) => {
const entities = await Promise.all([
Company.findAll({
where: { id: keys.filter((key) => key.type === 'Company').map((key) => key.id) },
}),
Owner.findAll({
where: { id: keys.filter((key) => key.type === 'Owner').map((key) => key.id) },
}),
]);
const companies = entities[0];
const owners = entities[1];
return keys.map((key) => {
if (key.type === 'Company') {
return companies.find((company) => company.id === key.id);
} else if (key.type === 'Owner') {
return owners.find((owner) => owner.id === key.id);
}
});
});
const resolvers = {
Query: {
companies: async () => {
const companies = await Company.findAll();
return companies;
},
owners: async () => {
const owners = await Owner.findAll();
return owners;
},
},
Company: {
owner: async (company, args, context) => {
return entityLoader.load({ type: 'Owner', id: company.ownerId });
},
},
Owner: {
company: async (owner, args, context) => {
return entityLoader.load({ type: 'Company', id: owner.companyId });
},
},
};
module.exports = {
typeDefs,
resolvers,
};
In this example, the single DataLoader instance entityLoader takes an array of keys, where each key is an object with two properties: type and id. The type property is used to distinguish between Company and Owner entities, and the id property is used to fetch the entities from the database. In the batch function, keys are filtered based on their type, and Company and Owner entities are fetched separately. Finally, the entities are returned in the same order as the keys, so that the correct entity is matched with the correct key.
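The order-preserving mapping at the end is the easiest part to get wrong, so here is a stdlib-only sketch of just that step (entities are faked in-memory; mapKeysToEntities is an invented name):

```javascript
// Map a mixed list of { type, id } keys back to per-type result sets,
// preserving the original key order as DataLoader requires.
function mapKeysToEntities(keys, companies, owners) {
  const byId = (rows) => new Map(rows.map((row) => [row.id, row]));
  const lookup = { Company: byId(companies), Owner: byId(owners) };
  return keys.map((key) => lookup[key.type].get(key.id));
}

const companies = [{ id: 1, name: "Acme" }];
const owners = [{ id: 7, name: "Ada" }];
const result = mapKeysToEntities(
  [{ type: "Owner", id: 7 }, { type: "Company", id: 1 }],
  companies,
  owners
);
console.log(result);
// → [ { id: 7, name: 'Ada' }, { id: 1, name: 'Acme' } ]
```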
I don't know if that's a good way or not, so I'm waiting for your feedback :)
I have a circumstance where we want to return a modified result based on a GraphQL variable. It appears that the loaders only take in an id argument, leaving no way to pass through other options for the server to consume. Ideally, I'd like to pass the extra options straight into the load call. The way I have solved it now is to create a separate loader for each option, but this solution seems far less scalable.
Are there any plans to allow an options passthrough like this in the future? Or am I missing an alternate solution?