drizzle-team / drizzle-orm

Headless TypeScript ORM with a head. Runs on Node, Bun and Deno. Lives on the Edge and yes, it's a JavaScript ORM too 😅
https://orm.drizzle.team
Apache License 2.0
21.47k stars 486 forks source link

[BUG]: design: working with dynamic schemas is practically impossible #423

Open mppub opened 1 year ago

mppub commented 1 year ago

What version of drizzle-orm are you using?

0.23.2

Describe the Bug

This is not might be re-labeled as and design/architecture flaw and let me know if I get it wrong but: Imagine the typical multi-tenant app having a separate schema for each customer account - in this case drizzleorm is practically impossible to use since the whole API design revolves around directly including the schema TS code that is static by definition.

I would imagine something like this:

// schema definition
const accountSchema = pgDynamicSchema('account'); // instead of pgSchema, where the string param is just an optional prefix

export const users = accountSchema.table("users", {
//...
}

// some rest endpoint
const currentUserSchema = 'customer_schema_name'
db.select().from(users(currentUserSchema)) // this would then access "account_customer_schema_name" schema

But I understand this might mean to rewrite the whole thing..

AndriiSherman commented 1 year ago

@mppub https://github.com/drizzle-team/drizzle-orm/blob/main/drizzle-orm/src/pg-core/README.md#customizing-the-table-name

AndriiSherman commented 1 year ago

If link don’t work, try to search for “ Customizing the table name” in Postgres docs

mppub commented 1 year ago

@mppub https://github.com/drizzle-team/drizzle-orm/blob/main/drizzle-orm/src/pg-core/README.md#customizing-the-table-name

Thanks I looked at that earlier but that is not what I need + you even write there:

Note:: this feature should only be used to customize the table name. If you need to put the table into a different schema, refer to the Table schemas section.

That is intened for predefined values (eg. environements) that you will have in the code not for dynamically generated ones (eg. I have 1000 customers that would mean I would need 1000 user tables representations in the code).

Technically I could do a factory like this:

const getUsersTable = (userId: string) => {
  const pgTable = pgTableCreator((name) => `${userId}_${name}`);

  return pgTable('users', {
    id: int('id').primaryKey(),
    name: text('name').notNull(),
  });
}

db.select().from(getUsersTable(currentUserId))

// for migrations - I would just need to somehow replicate the generic into every other customer schema
export const genericUsersTable = getUsersTable('generic')

That could work, even with the pgSchema- but it is just seems so excesive - Imagine - everytime user creates an request that access some table you would need to call this factory and create the table representation instead of just specify the schema in a "query". Whats your opinion?

dankochetov commented 1 year ago

@mppub I'm a bit confused. Initially you said that you need to put the tables into different schemas for different customers, but the solution you came up with in the last message uses a dynamic prefix instead of a dynamic schema. If you need the dynamic schema, you can create a factory like this:

function getUsersTable<TSchema extends string>(schemaName: TSchema) {
    return pgSchema(schemaName).table('users', {
        id: integer('id').primaryKey(),
        name: text('name').notNull(),
    });
}

I've tested this locally, and it turned out current types are a bit misconfigured for this function to work, so I've pushed a fix to drizzle-orm@beta. You can update and try it.

Regarding the performance - I doubt there will be any noticeable issues using this method. You can test the performance by comparing the response times between using a table with a static schema and using a table factory.

mppub commented 1 year ago

Perfect! I will try the beta today, yes the example was just an illustration of the method for both schema and table prefix..

maximilianmaihoefner commented 1 year ago

@dankochetov We are using the function you suggested to dynamically create the schema and it works great for querying. We are however unsure how we would use drizzle-kit to generate sql migrations for the schema if it is defined within the function. Executing drizzle-kit generate:pg tells us it didn't find any Tables and that there is nothing to migrate. We need to have the dynamic schema at runtime because we have n-Tenants that we do not know at compile time, do you have a suggestion on how we could generate the Migrations?

aflatoon2874 commented 1 year ago

@mppub I'm a bit confused. Initially you said that you need to put the tables into different schemas for different customers, but the solution you came up with in the last message uses a dynamic prefix instead of a dynamic schema. If you need the dynamic schema, you can create a factory like this:

function getUsersTable<TSchema extends string>(schemaName: TSchema) {
  return pgSchema(schemaName).table('users', {
      id: integer('id').primaryKey(),
      name: text('name').notNull(),
  });
}

I've tested this locally, and it turned out current types are a bit misconfigured for this function to work, so I've pushed a fix to drizzle-orm@beta. You can update and try it.

Regarding the performance - I doubt there will be any noticeable issues using this method. You can test the performance by comparing the response times between using a table with a static schema and using a table factory.

How to reference a table created via factory method in another table created via the same way in the refrences() function to define a foreign key.

rodolfofranco commented 9 months ago

How would this work If I need to define my tables as 1.Users,2.Users....N.Users??

Using export const genericUsersTable = getUsersTable('generic') as shown above would not work for me because my prefixes are not static such as 'generic'.

Is there any way to accomplish this?

aflatoon2874 commented 9 months ago

How would this work If I need to define my tables as 1.Users,2.Users....N.Users??

Using export const genericUsersTable = getUsersTable('generic') as shown above would not work for me because my prefixes are not static such as 'generic'.

Is there any way to accomplish this?

Instead of passing 'generic' as parameter, you would pass the actual schema name at runtime based on some criteria as per your use case.

rodolfofranco commented 9 months ago

How would this work If I need to define my tables as 1.Users,2.Users....N.Users?? Using export const genericUsersTable = getUsersTable('generic') as shown above would not work for me because my prefixes are not static such as 'generic'. Is there any way to accomplish this?

Instead of passing 'generic' as parameter, you would pass the actual schema name at runtime based on some criteria as per your use case.

If the only way is at runtime, how can I use drizzle-kit to generate my tables? Sorry if I seem confused.

aflatoon2874 commented 9 months ago

How would this work If I need to define my tables as 1.Users,2.Users....N.Users?? Using export const genericUsersTable = getUsersTable('generic') as shown above would not work for me because my prefixes are not static such as 'generic'. Is there any way to accomplish this?

Instead of passing 'generic' as parameter, you would pass the actual schema name at runtime based on some criteria as per your use case.

If the only way is at runtime, how can I use drizzle-kit to generate my tables? Sorry if I seem confused.

Good question. Although the docs do not talk of this kind of usage, but there is a way to handle this which I devised, may not be the best way but does the job.

  1. Define your schema as given below:
    
    const userColumns = {
    id: text('id')
    .$defaultFn(() => createId())
    .primaryKey(),
    emailAddress: text('email_address').notNull(),
    ...
    }

export const user = pgTable( 'user', userColumns, // (user) => { // return { // emailIdx: uniqueIndex('email_idx').on(user.emailAddress), // } // } )

export const getUserTable = ( schemaName: TSchema ) => { return pgSchema(schemaName).table( 'user', userColumns ) }


2. In this definition, `user` will satisfy the `drizzle-kit` generate command as well as the type definitions.
3. In your application you will use `getUserTable()` with appropriate schema name to target the correct tenant at runtime.
4. This way you would not be able to use `migrate()` to apply any schema changes to the database after generating the sql script via generate command because the `migrate()` does not know the schema name so it will always target the `public` db schema. I have raised an issue/enhancement request #908 on Github for `migrate()` to handle passing a db schema name so that desired db schema can be targetted. Until then, you need to run the generated script yourself against all the db schemas one by one using external sql client tool.

Hope this helps :)
jangrunicke commented 9 months ago

Question: I tried recreating this with MySQL since I am using planetscale. Found out, that this does not work at all, since schemas are the same as database in mysql.

Will this ever be possible when using mysql?

I do not need the schema use case in particular. Just got the same kind of use case where I would like to have each table for each tenant that I have and then access those depending on the tenant. Currently trying to handle this with table-prefixes, but I am struggling to do this as well.

Sparticuz commented 9 months ago

Table factory seems to work for me, but I'm struggling to understand how references (foreign keys) would work?

For (a partial) example:

export const getEventsTable = <TSchema extends string>(schemaName: TSchema) => {
  return mysqlSchema(schemaName).table("cs_events", {
    id: unsignedInt("id").primaryKey(),
    isProcessed: unsignedTinyint("isProcessed").default(0),
  });
};

export const getNotesTable = <TSchema extends string>(schemaName: TSchema) => {
  return mysqlSchema(schemaName).table("cs_notes", {
    author: varchar("author", { length: 200 }),
    notes: text("notes"),
    eventId: unsignedInt("event_id").notNull().references(() => *WHAT GOES HERE*),
  });
};
aflatoon2874 commented 9 months ago

Table factory seems to work for me, but I'm struggling to understand how references (foreign keys) would work?

For (a partial) example:

export const getEventsTable = <TSchema extends string>(schemaName: TSchema) => {
  return mysqlSchema(schemaName).table("cs_events", {
    id: unsignedInt("id").primaryKey(),
    isProcessed: unsignedTinyint("isProcessed").default(0),
  });
};

export const getNotesTable = <TSchema extends string>(schemaName: TSchema) => {
  return mysqlSchema(schemaName).table("cs_notes", {
    author: varchar("author", { length: 200 }),
    notes: text("notes"),
    eventId: unsignedInt("event_id").notNull().references(() => *WHAT GOES HERE*),
  });
};

Please look at my implementation and explanation here https://github.com/drizzle-team/drizzle-orm/issues/423#issuecomment-1718887839 You need to create a schemaless static definition too for each table. Reference this table instance name in your references clause. Please note that foreign key definition is only used by drizzle-kit to generate sql for migration. For querying data with db.query you need to define relations separately using relations() function.

Sparticuz commented 9 months ago

Got it. It's kindof annoying with the code duplication. It would be nice to just be able to attach the user table directly to the schema factory that way we could just define the table once instead of defining the columns and a schemaless table, then attach the table in the factory. Something like this:

export const user = pgTable(
  'user',
   {
  id: text('id')
    .$defaultFn(() => createId())
    .primaryKey(),
 emailAddress: text('email_address').notNull(),
  ...
},
  // (user) => {
  //   return {
  //     emailIdx: uniqueIndex('email_idx').on(user.emailAddress),
  //   }
  // }
)

export const getUserTable = <TSchema extends string>(
  schemaName: TSchema
) => {
  return pgSchema(schemaName).table(user)
}
aflatoon2874 commented 9 months ago

Got it. It's kindof annoying with the code duplication. It would be nice to just be able to attach the user table directly to the schema factory that way we could just define the table once instead of defining the columns and a schemaless table, then attach the table in the factory. Something like this:

export const user = pgTable(
  'user',
   {
  id: text('id')
    .$defaultFn(() => createId())
    .primaryKey(),
 emailAddress: text('email_address').notNull(),
  ...
},
  // (user) => {
  //   return {
  //     emailIdx: uniqueIndex('email_idx').on(user.emailAddress),
  //   }
  // }
)

export const getUserTable = <TSchema extends string>(
  schemaName: TSchema
) => {
  return pgSchema(schemaName).table(user)
}

True. It would be great. Better open an enhancement request. Hopefully, the drizzle team may include it in their roadmap down the line.

frenzzy commented 8 months ago

Perhaps it would be beneficial to add the support for the second argument in the from function, like so:

from<TFrom, TName>(source: TFrom, schemaName?: TName)

This modification would also enable the utilization of the placeholder feature in the following manner:

from(users, sql.placeholder("dynamic_schema_name"))

It seems that this aligns with the SQL approach that Drizzle aims to adhere to, as per the documentation. What are your thoughts on this suggestion?

cbaehnisch commented 6 months ago

I'm trying to achieve something similar with bun-sqlite. As there are no schemas available, i have to use sqliteTableCreator. As my prefixes are dynamic, I cannot use migrations to create the tables.

How can I create the tables on the fly? Is it possible to execute the sql created with sqliteTableCreator?

d4mr commented 5 months ago

This is how I am doing it:


export const getUserTable = <TSchema extends string>(schemaName: TSchema) => {
  return pgSchema(schemaName).table(
    'user',
    {
      id: text('id')
        .$defaultFn(() => createId())
        .primaryKey(),
      emailAddress: text('email_address').notNull(),
    },
    (user) => {
      return {
        emailIdx: uniqueIndex('email_idx').on(user.emailAddress),
      };
    },
  );
};

export const users = getUserTable(DEFAULT_SCHEMA_NAME);

A benefit of this approach is that I can define the DEFAULT_SCHEMA_NAME to be an environment variable. Then by setting the environment variable, I can also use drizzle-kit to generate correct sql.

However, I do not know how to get this to work with db.query and relational queries. Is it possible at all to use relational queries with this factory method?

frenzzy commented 3 months ago

I am currently employing the following workaround:

import { Table } from 'drizzle-orm'

const useDynamicSchema = <T extends Table>(table: T, schema: string): T => {
  // @ts-expect-error Symbol is @internal in drizzle-orm, see https://github.com/drizzle-team/drizzle-orm/blob/0.30.4/drizzle-orm/src/table.ts#L64-L65
  table[Table.Symbol.Schema] = schema
  return table
}

I find this approach to be more DX-friendly and memory-efficient, especially in the case of a large number of identical schemas, compared to the use of pgSchema:

const users = pgTable('users', {
  id: serial('id').primaryKey(),
  name: text('name'),
})

await db.select({ id: users.id }).from(useDynamicSchema(users, 'dynamic_schema_name'))

The only caveat is that useDynamicSchema must always be used for the same table once it has been applied.

Additionally, this method facilitates easy migration to an official solution, once supported, by simply replacing useDynamicSchema\(([^)]+)\) with $1. For example:

await db.select({ id: users.id }).from(users, 'dynamic_schema_name')
iursevla commented 2 months ago

@frenzzy yeah nice solution. I'm using you solution plus configuration from environment variables (NestJS project) and all works fine.

In my project now I have multiple schemas quite easily. I just set the environment variables to set the configuration for the schema to use.

Abstract dao is an example of an abstract DAO for reusable queries (getAll, getById - UUID, getBySingleKey, etc)

export class AbstractDao<TSchema extends Record<string, unknown>, Entity extends Table, InferEntitySelected> {
  constructor(
    @Inject(PG_CONNECTION) protected readonly db: PostgresJsDatabase<TSchema>,
    private readonly entity: Entity,
    protected readonly dbConfig: DatabaseConfig,
  ) {}

  protected get from() {
    return useDynamicSchema(this.entity, this.dbConfig.schemaName);
  }

  async getAll() {
    return this.db.select().from(this.from).execute();
  }
}
My configuration is like: ```ts @Injectable() export class DatabaseConfig { get schemaName(): string { // Load the schema name from environment variables return process.env.DB_SCHEMA_NAME; } } ```
My migrations file is like: (run_migrations.ts) ```ts async function runMigrations(): Promise { const client = new Client({ connectionString: DatabaseConfig.postgresqlConnection, }); await client.connect(); const db = drizzle(client, { schema }); // Makes sure the connection uses the Schema we want const schemaName = DatabaseConfig.schemaName; console.log(schemaName); await client.query(`CREATE SCHEMA IF NOT EXISTS "${schemaName}"`); await client.query(`SET schema '${schemaName}'`); // This will run migrations on the database, skipping the ones already applied await migrate(db, { migrationsFolder: `${__dirname}/src`, migrationsSchema: schemaName }); await client.end(); } void runMigrations(); ```
My main file: (main.ts) just to test everything ```ts import { INestMicroservice } from '@nestjs/common'; import { NestFactory } from '@nestjs/core'; import { DatabaseModule } from '@core/common/database/database.module'; import { JobDao } from '@core/common/database/entities/job/job.dao'; import { CustomLoggingService } from '@core/logging/custom-logging/custom-logging.service'; async function bootstrap() { const app: INestMicroservice = await NestFactory.createMicroservice(DatabaseModule); const logger = app.get(CustomLoggingService); app.useLogger(logger); await app.init(); logger.log(`Starting Database Module tests`); const jobsDao = app.get(JobDao); const allJobs = await jobsDao.getAll(); console.log(allJobs[0]); } void bootstrap(); ```
Docker compose (docker-compose.yaml) for quick testing ```yaml database: image: postgres:16.2-alpine container_name: my-database-name environment: POSTGRES_USER: db-name POSTGRES_PASSWORD: password ports: - 5432:5432 ```

Now, I can run the migrations/introspect/etc using the schemas.

Example introspect:

env DB_NAME="db-name" env DB_PASSWORD="password" env DB_PORT="5432" env DB_SCHEMA_NAME="db-schema-name" env DB_USER="db-name" drizzle-kit introspect:pg

Example run migrations

env DB_NAME="db-name" env DB_PASSWORD="password" env DB_PORT="5432" env DB_SCHEMA_NAME="db-schema-name" env DB_USER="db-user"  node -r tsconfig-paths/register -r ts-node/register run_migrations.ts
iursevla commented 2 months ago

@frenzzy

Ended up Creating a repo to showcase Drizzle with multiple schemas and added a bunch of examples/tests/etc

bitaccesscomau commented 2 months ago

I am currently employing the following workaround:

import { Table } from 'drizzle-orm'

const useDynamicSchema = <T extends Table>(table: T, schema: string): T => {
  // @ts-expect-error Symbol is @internal in drizzle-orm, see https://github.com/drizzle-team/drizzle-orm/blob/0.30.4/drizzle-orm/src/table.ts#L64-L65
  table[Table.Symbol.Schema] = schema
  return table
}

I find this approach to be more DX-friendly and memory-efficient, especially in the case of a large number of identical schemas, compared to the use of pgSchema:

const users = pgTable('users', {
  id: serial('id').primaryKey(),
  name: text('name'),
})

await db.select({ id: users.id }).from(useDynamicSchema(users, 'dynamic_schema_name'))

The only caveat is that useDynamicSchema must always be used for the same table once it has been applied.

Additionally, this method facilitates easy migration to an official solution, once supported, by simply replacing useDynamicSchema\(([^)]+)\) with $1. For example:

await db.select({ id: users.id }).from(users, 'dynamic_schema_name')

Does this solution only work with db.select and not db.query?

I'm trying to use it in my SvelteKit endpoint as per below, and dynamically pass in the schema_id based on the user making the request, but I also need to only apply the params that are passed into the query and can't see a simpler way than doing it as per below.

+server.ts

import type { RequestHandler } from "@sveltejs/kit";
import { json } from '@sveltejs/kit';
import { db } from "$lib/server";
import { sale, saleLine, salePayment, useDynamicSchema } from "$lib/db/schema";

...

export const GET: RequestHandler = async ({ url }) => {
    const offsetParam = Number(url.searchParams.get('offset')) || 0;
    const limitParam = Number(url.searchParams.get('limit')) || 10;
    const idParam = url.searchParams.get('id');
    const typeParam = url.searchParams.get('type');
    const statusParam = url.searchParams.get('status');
    const handleParam = url.searchParams.get('handle');

    const sales = await db.query.from(useDynamicSchema(sale,'schema_id')).findMany({
        where: (sale, { eq, and }) => and(
            idParam ? eq(sale.id, idParam) : undefined,
            typeParam ? eq(sale.type, Number(typeParam)) : undefined,
            statusParam ? eq(sale.status, Number(statusParam)) : undefined,
            handleParam ? eq(sale.handle, handleParam) : undefined
        ),
        offset: offsetParam,
        limit: limitParam,
        with: {
            saleLine: {
                columns: {
                    id: false
                }
            },
            salePayment: {
                columns: {
                    id: false
                }
            }
        }
    });

    return json(sales);
}
iursevla commented 2 months ago

@bitaccesscomau See this: https://github.com/intruder-detection/nestjs-drizzle-multiple-schemas/blob/main/src/core/common/database/entities/abstract.dao.ts

Might be helpful

bitaccesscomau commented 2 months ago

@bitaccesscomau See this: https://github.com/intruder-detection/nestjs-drizzle-multiple-schemas/blob/main/src/core/common/database/entities/abstract.dao.ts

Might be helpful

Unfortunately your example doesn't address my issues with using db.query, however I did figure out what I think is a very elegant way of supporting multi-tenant (multi-schema) applications on a per request basis. Please see my Vercel DB example, but this will work with any connector (or at least the postgres one) it seems.

I first modify the db function to take in a schemaId when called.

lib/server/index.ts

import { sql } from '@vercel/postgres';
import * as schema from '$lib/db/dynamicSchemaGenerator'; // Pay attention to this path
import { drizzle } from 'drizzle-orm/vercel-postgres';

export function db(schemaId: string) {
    return drizzle(sql, { schema: schema.generateSchema(schemaId) });
}

lib/server/schema.ts

BYO in normal format (note this is not the file imported above, it's used to export into the file below).

lib/server/dynamicSchemaGenerator.ts

import { relations } from "drizzle-orm";
import * as tables from './schema'; // Pulls in all tables from actual schema.
import { Table } from 'drizzle-orm';

// Code snippet from https://github.com/drizzle-team/drizzle-orm/issues/423
export const useDynamicSchema = <T extends Table>(table: T, schema: string): T => {
  // @ts-expect-error Symbol is @internal in drizzle-orm, see https://github.com/drizzle-team/drizzle-orm/blob/0.30.4/drizzle-orm/src/table.ts#L64-L65
  table[Table.Symbol.Schema] = schema;
  return table;
};

export function generateSchema(schemaId: string) {
  const dynamicTables = Object.keys(tables).reduce((acc, tableName) => {
    acc[tableName] = useDynamicSchema(tables[tableName], schemaId);
    return acc;
  }, {});

  // Relationships - must use affix `dynamicTables` instead of `tables` to the table name
  const saleRelations = relations(dynamicTables.sale, ({ many }) => ({
    saleLine: many(dynamicTables.saleLine),
    salePayment: many(dynamicTables.salePayment),
  }));

  const saleLineRelations = relations(dynamicTables.saleLine, ({ one }) => ({
    sale: one(dynamicTables.sale, {
      fields: [dynamicTables.saleLine.saleId],
      references: [dynamicTables.sale.id],
    })
  }));

  const salePaymentRelations = relations(dynamicTables.salePayment, ({ one }) => ({
    sale: one(dynamicTables.sale, {
      fields: [dynamicTables.salePayment.saleId],
      references: [dynamicTables.sale.id],
    })
  }));

  return {
    ...dynamicTables,
    saleRelations,
    saleLineRelations,
    salePaymentRelations
  };
}

You can then call db in you applications as per normal, but now you must specify a schemaId in the db function as per below. This suits my use case as all existing connections are now forced to have a schemaId preventing cross-contamination of joins.

It works with both select and query.

+server.ts

const result = await db(schemaId).query.sale.findMany({
        where: (sale, { eq, and }) => and(
            idParam ? eq(sale.id, idParam) : undefined,
            typeParam ? eq(sale.type, Number(typeParam)) : undefined,
            statusParam ? eq(sale.status, Number(statusParam)) : undefined,
            handleParam ? eq(sale.handle, handleParam) : undefined
        ),
        offset: offsetParam,
        limit: limitParam,
        with: {
            saleLine: {
                columns: {
                    saleId: false
                }
            },
            salePayment: {
                columns: {
                    saleId: false
                }
            }
        },
        orderBy: sale.created
    });

Essentially now you can pass in a schemaId from a cookie or token validator or something like that per request. Haven't noticed any speed issues, but I have lost type safety since I tried to import all the tables and pass them into the dynamicSchemaGenerator. If you butcher your schema.ts file instead and put a list of all the tables in the generator function and apply the dynamicSchema function, type-safety works fine but I wanted it to be compatible with migrations and introspections seamlessly. Someone smarter than me can figure out how to get the schema into its dynamic state without individually itemising each table and keeping type safety I would be very grateful.

I also have no idea if this is bad practice, generating heaps of connections, memory leaks, etc - so YMMV but it's the best I can see for my particular use case at the moment.