Reusable column definitions

This one has been already discussed in #296 and #352. I wanted to gather the findings, outline the problem and propose what can be done.

It doesn't have to be implemented this way, this is just a start of discussion.

Problem

Developers want to define their own column types and reuse them across projects.
Columns should be decomposable, meaning it shouldn't be simply a huge myColumns collection, but either individual column definitions or at least individual "repos" so to say.
Ideally, the repos/columns should be infinitely extensible, but even 1-tier reusable approach is acceptable for now.

Proposal 1

I think this should be relatively easy addition as it keeps the way the columns are defined in the tables, just decomposes the way to predefine them:

import { defineColumnTypes } from "orchid-orm"
import { uuidv7 } from "uuidv7"

export const baseColumnTypes = defineColumnTypes(t => ({
  // Same API as createBaseTable
  id: () => t.uuid().default(uuidv7).primaryKey(),
  now: () => t.timestamps().createdAt,
}))

import type Decimal from "decimal.js"
import { defineColumnTypes } from "orchid-orm"

export type NumberLike = number | string | Decimal

// I honestly don't like copy/pasting this across projects.
export function parseNumber<T extends string | null>(value: T): null extends T ? number | null : number {
  return (value === null ? null : Number(value)) as any // quickfix
}

export const decimalColumnTypes = defineColumnTypes(t => ({
  // Same API as createBaseTable
  decimal: () => t
    .decimal()
    .encode((value: NumberLike) => value === null ? null : String(value)),
  decimalNumber: () => t
    .decimal()
    .encode((value: NumberLike) => value === null ? null : String(value))
    .parse(parseNumber),
}))

and then consume:

import { baseColumnTypes, decimalColumnTypes } from "@my/columns"
import { createBaseTable } from "orchid-orm"

export const BaseTable = createBaseTable({
  columnTypes: t => ({
    ...t,
    ...baseColumnTypes,
    ...decimalColumnTypes,
  }),
  snakeCase: true,
})

export class UserTable extends BaseTable {
  override readonly table = "user"

  override columns = this.setColumns(t => ({
    id: t.id(),
    balance: t.decimal(), // coming from decimalColumnTypes
  }))
}

Proposal 2

This extends Proposal 1 by exporting the built-in column types into a first class object and allow extending arbitrary sets of column types (examples reduced to show off the difference):

import { columnTypes as t, defineColumnTypes } from "orchid-orm"

export const baseColumnTypes = defineColumnTypes(() => ({
  // t is coming from the import
  id: () => t.uuid().default(uuidv7).primaryKey(),
  now: () => t.timestamps().createdAt,
}))

// Can we perhaps get rid of the outer callback?
// I know it's often needed to break dependency loops and such.
// Anyhow, what if:

export const baseColumnTypes = defineColumnTypes({
  id: () => t.uuid().default(uuidv7).primaryKey(),
  now: () => t.timestamps().createdAt,
})

// or simply:

export const baseColumnTypes = {
  id: () => t.uuid().default(uuidv7).primaryKey(),
  now: () => t.timestamps().createdAt,
}

// or even:

export const id = () => t.uuid().default(uuidv7).primaryKey()
export const now = () => t.timestamps().createdAt

export const decimal = () => t.decimal().encode((value: NumberLike) => value === null ? null : String(value))
// Yay! Finally can reuse without copy/paste!
export const decimalNumber = () => decimal().parse(parseNumber)

and then consume:

import { baseColumnTypes } from "@my/columns"
import { createBaseTable } from "orchid-orm"

export const BaseTable = createBaseTable({
  // Can keep t for simplicity/backwards compatibility I suppose:
  columnTypes: t => ({
    ...t,
    ...baseColumnTypes,
  }),
  snakeCase: true,
})

or more radical approach, if possible:

import { id, now } from "@my/base-columns"
import { decimal, decimalNumber } from "@my/decimal-columns"
import { columnTypes, createBaseTable } from "orchid-orm"

export const BaseTable = createBaseTable({
  columnTypes: {
    ...columnTypes,
    id,
    now,
    decimal,
    decimalNumber,
  },
  snakeCase: true,
})

I don't understand the need for this to be infinitely extensible. I understand you may need a few custom columns, may be up to 10 in rare cases, but what's the use case for having so many of them that you want to reorganize it?

So if it was easy to do, I'd still have some doubts on why you need that if you can define all reusable types in the createBaseTable.

But, there are complications with validators, column types are tightly coupled to validation libs, even if you don't want and use them, it's still coupled. And the way of how the validation libs are integrated should be changed in the future, I'll try to make it less coupled.

Currently, json method's signature depends on which validation lib you use, if any. As well as other signatures. So the reusable set of column types should be able to access the validation lib typing. Validation lib is configured in createBaseTable, and so does the column types.

For the decimal number parsers, if it's not yet included, let's add decimal().asNumber() helper, similarly how there already is timestamp().asNumber().

value === null checks: I'm not sure if there are reasons for null to get into the parser, because for column defaults there is default, so perhaps this should be changed so the parse will never receive null.

// I honestly don't like copy/pasting this across projects.

Maybe the problem why it has to be copy-pasted in monorepo isn't actually that big, this also needs investigation. I'd like use you help for this: could you setup some basic monorepo to demostrate the problem of sharing the types? I'll look what can be done to solve it. I guess, different monorepo setups will cause different problems with this.

I don't understand the need for this to be infinitely extensible.

As I demonstrated in the example:

export const decimal = () => t.decimal().encode((value: NumberLike) => value === null ? null : String(value))
// Yay! Finally can reuse without copy/paste!
export const decimalNumber = () => decimal().parse(parseNumber)

Again, I'm not saying it's a must. I realise there are complications and so if the trade-off is to have this 1-level only, that is also much better than now.

But if the architecture allows, why not? Being flexible often (but of course not always) means it's properly architected.

I'd still have some doubts on why you need that if you can define all reusable types in the createBaseTable.

Because I have multiple createBaseTable in multiple projects. I quickly grep'ped my projects folder and I see 6 projects with orchid-orm:

41:    "orchid-orm": "^1.35.18",
30:    "orchid-orm": "^1.27.5",
62:    "orchid-orm": "^1.32.18",
27:    "orchid-orm": "^1.27.3",
15:    "orchid-orm": "^1.18.0",
11:    "orchid-orm": "~1.28.2",

so now I end up copy/pasting column types between them (and often having multiple variants of the definition of the same column type). If they were trivial one-liners that would be more or less fine, but it's not always the case.

But, there are complications with validators, column types are tightly coupled to validation libs, even if you don't want and use them, it's still coupled.

Oh, I see, right. I never used the built-in Orchid validation (I somehow feel it should be handled on level other than query builder, although I admit that could be just my ignorance lol) so that complication didn't occur to me.

I still think that decoupling types would be a valuable addition, so I propose to have this idea laying around for some time. I will try to use the validation integration and see for myself.

For the decimal number parsers, if it's not yet included, let's add decimal().asNumber() helper

That would be great, but, that was only example after all. I could similarly parse decimals into Decimal objects. The asNumber() shortcut will be helpful in many cases but will not cover all of them.

I'm not sure if there are reasons for null to get into the parser

It actually was like that earlier, but then @bingtsingw elaborated that there was a case when parsers want to handle null to replace it with a dynamic default on the app level, and you made it so, see #96.

As the result, in base table columns, the custom parse callback needs advanced type trickery (similar to what I demonstrated in the examples above) otherwise if we simply return ParsedType | null it makes non-nullable fields in actual tables to be typed ParsedType | null, which is not correct. And if we don't handle null in the base table column parser, that misleads the column parser code. E.g. if we use decimal().parse((value) => Number(value)) in the base table column, value type is inferred to be string, which misleads that null doesn't need to be handled, and when someone adds a nullable field of such type, NULL coming from the database is parsed as Number(null) which is — tada! — 0.

So we end up with the bunch of non-trivial code in base table, and then copy/paste it between projects.

Perhaps we could revisit #96 for that matter and come up with something more robust and tailored for better DX considering base table case.

I'd like use you help for this: could you setup some basic monorepo to demostrate the problem of sharing the types?

It's not a problem with sharing the types in a monorepo, that works just fine. My case is that I have completely unrelated projects using similar custom columns, for example:

decimals accepting Decimal from decimal.js
base58 addresses accepting PublicKey object for input as well as strings (with format validation)
what I call "optional varchar" which uses NULL for empty values in database but "" on the app level.

I could copy/paste the entire base.ts across projects, but that's also not too good, e.g. only 2 projects need the base58 columns and for other its polluting top-level dependencies for no reason. Also, some projects need snakeCase: true and some not, so it can't be literal copy/paste.

The use case became much clearer as you clarified how many projects you manage at the same time, impressive.

The proposed defineColumnTypes can be defined as follows:

export const columnTypes = makeColumnTypes(defaultSchemaConfig);

export const defineColumnTypes = <T>(fn: (t: typeof columnTypes) => T): T => fn(columnTypes)

I still don't want to include it in the official distribution because it ignores validators.

As for decoupling validators from column types, looks like there is no way. The current coupled way is optimized for TS perf. In the past, it was more decoupled and heavily relied on TS types mapping from ORM's columns to validations schemas, and it eventually bumped into "instantiation is too deep" TS error. And in the current way, it's messier from architecture perspective, but works better.

What I generally would suggest, but I see it's not good for your use-case, is to keep this in the createBaseTable:

// probably you could re-use the BaseTable across monorepo projects
export const BaseTable = createBaseTable({
  schemaConfig: zodSchemaConfig, // I'm aiming to support validation schemas
  columnTypes(t) {
    const decimal = () => t.decimal().encode((value: NumberLike) => value === null ? null : String(value))
    // reusing decimal without copy/paste as well
    const decimalNumber = () => decimal().parse(parseNumber)

    t.json // <-- this is t.json overriden with the zod config, it has a different signature than when not using zod
    // all column types have validation schema types bound to them

    return {
      ...t,
      decimal,
      decimalNumber,
    }
  ),
  snakeCase: true,
})

It actually was like that earlier, but then @bingtsingw elaborated that there was a case when parsers want to handle null to replace it with a dynamic default on the app level, and you made it so, see https://github.com/romeerez/orchid-orm/issues/96.

Cool, thanks for pointing.

if we simply return ParsedType | null it makes non-nullable fields in actual tables to be typed ParsedType | null, which is not correct.

This should be fixed then.

It looks like simply ignoring the returned null type is not correct. What if your column is not nullable, but the parser returns, let's say, empty strings as null, so the returning type should affect on the column output type even though the column itself is not nullable.

But if the column itself is not nullable, and the parser returns null as null simply because it has to handle nulls, and it makes non-nullable column to output nullable type, this is not correct and should be fixed.

The parser can't know beforehand if the column will or won't be nullable.

So, there is a need for additional method, let it be parseNull. And let's remove null from parse function argument. This should satisfy all cases.

Closing as not planned.

To sum up, you can define such a helper:

export const columnTypes = makeColumnTypes(defaultSchemaConfig);

export const defineColumnTypes = <T>(fn: (t: typeof columnTypes) => T): T => fn(columnTypes)

It's not going to be included into distribution because it doesn't account for validation libs.

romeerez / orchid-orm