drizzle-team / drizzle-orm

Headless TypeScript ORM with a head. Runs on Node, Bun and Deno. Lives on the Edge and yes, it's a JavaScript ORM too 😅
https://orm.drizzle.team
Apache License 2.0
24.62k stars 650 forks

[Enhancement]: Allow different input/output types on customType #1621

Open SSardorf opened 11 months ago

SSardorf commented 11 months ago

What version of drizzle-orm are you using?

0.29.1

What version of drizzle-kit are you using?

No response

Describe the Bug

In the following code, you should be able to pass either a Vector or a number[] when inserting:

export const customVector = customType<{
    data: Vector|number[];
    driverData: string;
    config: { dimensions: number };
}>({
    dataType(config) {
        if (!config) {
            return `vector(1536)`;
        }
        return `vector(${config.dimensions})`;
    },
    toDriver(value: Vector | number[]): string {
        return toSql(value);
    },

    fromDriver(value: string): Vector {
        return fromSql(value);
    },
});

function fromSql(value: string) {
    return new Vector(
        value
            .substring(1, value.length - 1)
            .split(",")
            .map((v) => parseFloat(v))
    );
}

function toSql(value: Vector | number[]) {
    return JSON.stringify(value);
}

Both of these work:

await db.insert(schema.myTable).values({ id: "1", embedding: [1, 2, 3] });
await db.insert(schema.myTable).values({ id: "1", embedding: new Vector([1, 2, 3]) });

This is great. However, since fromDriver ALWAYS returns a Vector, the select type should be inferred as Vector. Instead, querying the database returns Vector | number[]:

const myEmbeddingQuery = await db.select({ embedding: schema.myTable.embedding }).from(schema.myTable);
//     ^? const myEmbeddingQuery: { embedding: Vector | number[];}[]

It seems to simply use the data: Vector | number[]; type from the customType. However, I've looked in the codebase, and it should be able to infer the type from fromDriver: https://github.com/drizzle-team/drizzle-orm/blob/0a4e3b265ce121675e7baa14f3a39669ea387e6d/drizzle-orm/src/pg-core/columns/custom.ts#L60-L83

Expected behavior

Custom types should be able to have different input/output types. Being able to transform data at the ORM level would be a powerful ability: it would ensure that data is always in the format you want, without having to transform it on every query.
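
To make the request concrete, here is a rough sketch of what separate input and output types could look like at the type level. None of this is Drizzle's API: ColumnCodec, InsertType, SelectType and the Vector class are hypothetical names made up for illustration. The idea is only that the insert type comes from the toDriver parameter and the select type from the fromDriver return value.

```typescript
// Hypothetical stand-in for the Vector class used in this issue.
class Vector {
    constructor(public readonly values: number[]) {}
    // Lets JSON.stringify serialize a Vector like a plain array.
    toJSON(): number[] {
        return this.values;
    }
}

// Hypothetical codec shape with distinct input and output types.
// This is NOT part of drizzle-orm; it only illustrates the requested behavior.
interface ColumnCodec<TInput, TOutput, TDriver> {
    toDriver(value: TInput): TDriver;
    fromDriver(value: TDriver): TOutput;
}

// The insert type is read from the toDriver parameter,
// the select type from the fromDriver return value.
type InsertType<C> = C extends ColumnCodec<infer I, any, any> ? I : never;
type SelectType<C> = C extends ColumnCodec<any, infer O, any> ? O : never;

const vectorCodec: ColumnCodec<Vector | number[], Vector, string> = {
    toDriver: (value) => JSON.stringify(value),
    fromDriver: (value) => new Vector(JSON.parse(value) as number[]),
};

type VectorInsert = InsertType<typeof vectorCodec>; // Vector | number[]
type VectorSelect = SelectType<typeof vectorCodec>; // Vector

// Inserting accepts either form...
const input: VectorInsert = [1, 2, 3];
const stored = vectorCodec.toDriver(input);

// ...while reading always yields a Vector.
const loaded: VectorSelect = vectorCodec.fromDriver(stored);
```

With a shape like this, a select would be typed as Vector only, while an insert would still accept number[].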

Environment & setup

No response

SSardorf commented 11 months ago

Here's a repro: https://github.com/useverk/drizzle-pgvector/blob/e4a3c953f02684054b32922443a5a4f8b603ff04/index.ts

Angelelz commented 11 months ago

The problem is that T['data'] is used for both the insert and select types. As a workaround, you can type it as Vector only, and then if you want to insert a number[], just type-cast it. See the example.
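
A minimal sketch of that workaround, written without drizzle-orm so it runs on its own. The Vector class, the Row type and the encode helper are assumptions standing in for the real column and insert call; toSql and fromSql follow the functions from the issue. The double assertion through unknown is used because TypeScript may reject a direct number[] to Vector cast.

```typescript
// Hypothetical stand-in for the Vector class used in this issue.
class Vector {
    constructor(public readonly values: number[]) {}
    // Lets JSON.stringify serialize a Vector like a plain array.
    toJSON(): number[] {
        return this.values;
    }
}

// Same logic as the toDriver/fromDriver helpers in the issue.
function toSql(value: Vector | number[]): string {
    return JSON.stringify(value);
}

function fromSql(value: string): Vector {
    return new Vector(
        value
            .substring(1, value.length - 1)
            .split(",")
            .map((v) => parseFloat(v))
    );
}

// Stand-in for a column typed with `data: Vector` only, so the same
// type is used for both insert and select (as T['data'] is today).
interface Row {
    id: string;
    embedding: Vector;
}

// Stand-in for what happens on insert: the value goes through toDriver.
function encode(row: Row): { id: string; embedding: string } {
    return { id: row.id, embedding: toSql(row.embedding) };
}

// Workaround: cast number[] at the call site. The runtime value is still
// an array, which toSql handles, so only the type checker is affected.
const fromArray = encode({ id: "1", embedding: [1, 2, 3] as unknown as Vector });
const fromVector = encode({ id: "2", embedding: new Vector([1, 2, 3]) });

// Reads are typed (and returned) as Vector only.
const loaded: Vector = fromSql(fromArray.embedding);
```

The cost is that the cast hides the real input type from the compiler, so toDriver still has to accept both forms at runtime.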

Can you change the title to be "Allow different input/output types on customTypes"?