kysely-org / kysely

A type-safe typescript SQL query builder
https://kysely.dev
MIT License
10.72k stars 271 forks source link

Multiple migrations are run in a single transaction #1154

Open shoooe opened 1 month ago

shoooe commented 1 month ago

If you have a migration 0001 with this:

export async function up(db: Kysely<unknown>): Promise<void> {
  await db.schema.createType("some_enum").asEnum(["value_1", "value_2"]).execute();
}

and a migration 0002 with this:

export async function up(db: Kysely<unknown>): Promise<void> {
  await sql`ALTER TYPE some_enum ADD VALUE 'value_3';`.execute(
    db
  );
}

and then some migration 0003 that uses some_enum with value value_3 then:

error: unsafe use of new value "value_3" of enum type some_enum

From what I understand from the code this is caused by migrateToLatest running all pending migrations in a single transaction.

Shouldn't each migration be independent and run in its own transaction? IME this is the behaviour that most other migration systems adopt (e.g. Django migrations).

Despite this ^ I'm loving this library more and more every day, so thank you again for this magical tool. ❤️

koskimas commented 1 month ago

Having individual transaction for each migration would certainly make other things easier too. For example we could easily implement disabling transactions for individual migrations. There has been a feature request open for that forever.

However, there was a reason why it was implemented the way it is. I just can't remember it right now.

Changing the behaviour could also be a major breaking change and I don't know if we can do it without a massive deprecation period.

You could already migrate up one migration at a time to get individual transactions.

shoooe commented 1 month ago

Oh true. For reference to others this is what I went with:

  const migrations = await migrator.getMigrations();
  const pendingMigrations = migrations.filter((m) => isUndefined(m.executedAt));

  if (isEmpty(pendingMigrations)) {
    console.info(`✅ Migrations are up to date!`);
  }

  for (const migration of pendingMigrations) {
    const { error, results } = await migrator.migrateTo(migration.name);

    results?.map((it) => {
      if (it.status === "Success") {
        console.info(
          `✅ Migration "${it.migrationName}" was executed successfully`
        );
      } else if (it.status === "Error") {
        console.error(`❌ Failed to execute migration "${it.migrationName}"`);
      }
    });

    if (error) {
      console.error("❌ Failed to migrate");
      console.error(error);
      process.exit(1);
    }
  }
koskimas commented 1 month ago

Let's leave this open. It might make sense to have individual transactions.

igalklebanov commented 1 month ago

Maybe we could ship both and allow enabling this? no breaking change..

koskimas commented 1 month ago

Maybe. But enabling/disabling transaction for a single migration when we run a batch of migrations in one transaction would be horrible to implement. Basically we'd only allow that when the "transaction per migration" setting is enabled.

koskimas commented 1 month ago

Haha, I just had a discord discussion with someone who explicitly wanted all migrations to run in the same transaction. So we probably need to make both possible.

Papooch commented 1 month ago

Running all migrations in a single transaction is IMO the better option if the database engine supports it.

The reason is that if we update an application running in production, and we need to apply multiple migrations (say M1 and M2), then all need to succeed, or none.

If we apply them sequentially, there's a risk of the application entering an inconsistent state -> we apply M1, then M2, which fails, so we try to rollback M1, which also fails. A failing transaction makes the application code rollback to the previous deployment (or rather, not update to the new), but this puts it in a state where it expects the database schema of M0 (before M1), but it is really in M1.

But, it is apparent that not all statements can be run within a single transaction, so an escape hatch is needed.

igalklebanov commented 1 month ago

The reason is that if we update an application running in production, and we need to apply multiple migrations (say M1 and M2), then all need to succeed, or none.

M1 and M2 are two separate atomic change sets that were created in different times, maybe by different developers. But keeping M1 changes while M2 failed, might cause source code misalignment with database schema. So yeah, it's more sensible to wrap both under the same transaction and fail CI for given batch of changes (hopefully there's a merge queue that'll split the batch and try M1 alone later).

But, it is apparent that not all statements can be run within a single transaction

How come?

Papooch commented 1 month ago

@igalklebanov

How come?

It's the entire reason for the original post - apparently, in Postgres, you can't add a new enum value and then use it within the same transaction.

https://stackoverflow.com/questions/65130629/new-enum-values-must-be-committed-before-they-can-be-used