gajus / slonik

A Node.js PostgreSQL client with runtime and build time type safety, and composable SQL.

feat: add @slonik/lazy-dataloader #596

Open gajus opened 5 months ago

gajus commented 5 months ago

Lazy DataLoader

Connection pool wrapper with seamless query batching.

Usage

import { createLazyDataLoader } from '@slonik/lazy-dataloader';
import {
  createPool,
  sql,
} from 'slonik';
import { z } from 'zod';

const pool = await createPool('postgres://');

const lazyDataLoader = createLazyDataLoader(pool);

const results = await Promise.all([
  lazyDataLoader.oneFirst(
    sql.type(
      z.object({
        id: z.number(),
        name: z.string(),
      })
    )`
      SELECT id, name
      FROM person
      WHERE id = ${1}
    `
  ),
  lazyDataLoader.oneFirst(
    sql.type(
      z.object({
        id: z.number(),
        name: z.string(),
        website: z.string().nullable(),
      })
    )`
      SELECT id, name, website
      FROM company
      WHERE id = ${2}
    `
  ),
]);

console.log(results);

In this example, both queries are issued in the same tick, so they are combined into a single round trip to the database.

How it works

Using the same idea as DataLoader, LazyDataLoader batches all queries that are executed in the same tick. It does this by turning every query into a sub-query of a single combined statement. Example:

SELECT
  (
    SELECT json_agg(row_to_json(t))
    FROM (
      SELECT id, name
      FROM person
      WHERE id = 1
    ) t
  ) query_1,
  (
    SELECT json_agg(row_to_json(t))
    FROM (
      SELECT id, name, website
      FROM company
      WHERE id = 2
    ) t
  ) query_2

Use cases

This is an experimental approach to mitigating the N+1 problem that is common in GraphQL APIs.

The same problem can be solved more efficiently by using a DataLoader directly and hand-crafting the queries. That approach is more flexible and efficient, but requires more work: in our example, it would mean writing two separate loaders and invoking them explicitly. This library is a middle ground that can be used in some cases to reduce the impact of the N+1 problem by reducing the number of round trips to the database.
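For contrast, the batch function of a hand-crafted loader might look like the following sketch. `fetchRows` is a hypothetical stand-in for a single slonik query (e.g. one using `WHERE id = ANY(...)`); this code is not part of the PR.

```typescript
type Person = {
  id: number;
  name: string;
};

// A DataLoader-style batch function: collapse N lookups into one
// round trip, then restore key order, which DataLoader requires.
const batchLoadPersons = async (
  ids: readonly number[],
  fetchRows: (ids: readonly number[]) => Promise<readonly Person[]>,
): Promise<readonly (Person | null)[]> => {
  // One round trip for the whole batch, e.g.
  // SELECT id, name FROM person WHERE id = ANY($1)
  const rows = await fetchRows(ids);

  const rowsById = new Map<number, Person>();

  for (const row of rows) {
    rowsById.set(row.id, row);
  }

  // Results must be returned in the same order as the keys,
  // with null for keys that matched no row.
  return ids.map((id) => rowsById.get(id) ?? null);
};
```

Note that this is more work than the lazy approach: each entity type needs its own loader and its own batched query.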

Considerations

I have two primary concerns with this approach:

  1. Queries batched this way will never be as efficient as hand-crafted data loaders.
  2. It makes monitoring individual query performance near impossible.

Regarding the first point, it is conceptually the difference between:

SELECT id, name FROM person WHERE id IN (1, 2, ...) -- one query for all N+1 ids

and a union equivalent to:

SELECT id, name FROM person WHERE id = 1
SELECT id, name FROM person WHERE id = 2
-- ... one sub-query per id, N+1 in total

The latter is still better than doing a round trip for every query, but the former would be far more efficient.

Regarding the second point, because every batch is a unique combination of queries, it is difficult to get query-level performance insights from the monitoring tools that we currently rely on.

changeset-bot[bot] commented 5 months ago

⚠️ No Changeset found

Latest commit: c1239224b2ad99b8c35b10d64e81abbfca5d6ae8

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets. When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types.
