
Add a built in command to `sqlx migrate` to consolidate migrations. #2066

Open · Mythra opened this issue 2 years ago

Mythra commented 2 years ago

Is your feature request related to a problem? Please describe.

In a project I'm working on, our migrations are meant to be fairly small and are usually added alongside very specific features. For example, when adding web-push notification support, we added a reversible migration that adds a column for a list of FCM device tokens and their nicknames. While this workflow is really great and makes rollbacks easy, it also leads to having tons of migrations.

This makes discovery and navigation difficult when not connected to a database (where we can use the comments on tables/columns to navigate). Not to mention it slows down GitHub, editors, etc. whenever you need to open the gigantic migrations folder that has too many files in it.

Describe the solution you'd like

Every so often, when our migrations folder becomes incredibly large, I'd like to be able to collapse its N migrations into a single migration (or a few, if some are still new and we may want to keep their small migrations separate for rollback).

This would allow us to keep the migrations folder small, "stamping" migrations once they become stable enough into a file with the current schema (and its reverse for unapplying), while still maintaining the benefits of the workflow we have currently.
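For illustration, the kind of collapse I mean, before and after (all file names invented):

# Before: one small migration per feature
ls migrations/
#   20220101010101_create_users.sql
#   20220202020202_add_fcm_device_tokens.sql
#   ... dozens more ...
# After: the stable history stamped into one baseline, recent work kept separate
ls migrations/
#   20220601000000_baseline.sql
#   20220615030303_add_push_nicknames.sql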

Describe alternatives you've considered

abonander commented 2 years ago

I, too, have been watching the migrations/ folders in our projects grow without end and wondering if we should add consolidation support and what that might look like.

To support modern deployment practices, I'd like the actual database-side changes to be represented in the migrations/ directory so they can be embedded and executed by the application itself.

This is what I'm thinking:

  1. The user would move the existing migrations to migrations/.old/ and create new migrations in migrations/ that consolidate the changes in the old migrations into however many SQL files they want. We can create a CLI command to help with this (see the sketch after this list).
  2. The user would then create a migrations/.old/BASELINE file (bikeshedding?) containing the highest version number in the new migrations that needs to be applied to match the state of the old migrations.
  3. If used, the sqlx::migrate!() macro would embed both the old migration set and the new migration set.
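For concreteness, a minimal sketch of steps 1 and 2 done by hand, assuming a plain-text BASELINE and 1-based version numbers for the new set (file name, format, and versions are all open to bikeshedding):

mkdir migrations/.old
mv migrations/*.sql migrations/.old/
# Write the consolidated schema as the new, smaller migration set:
$EDITOR migrations/0001_baseline.sql
# Highest new version that corresponds to the final state of the old set:
echo "1" > migrations/.old/BASELINE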

Then the migration machinery would handle it like so:

  1. When invoked either by sqlx-cli or Migrator::run(), the migration machinery would first check if the database is using the old migration set or the new one by comparing the contents of _sqlx_migrations to both migration sets; looking at just the first migration should be sufficient. If neither matches then it returns an error.
  2. If it determines the old migration set is still in use, it first ensures that all of the old migrations have been successfully applied. If not all of them are applied and it is unable to apply them, it returns an error. If successful, it will then wipe the _sqlx_migrations table and record the new migrations up through the version number set in migrations/.old/BASELINE as having been successfully applied.
  3. If step 2 succeeded or step 1 determined the new migration set to be in use, it then applies any remaining migrations in the new set.

Once the database has been successfully migrated, the user may delete the .old folder. The application will then exclusively use the new migration set.

To make this migration minimally error-prone, we can provide a couple of helper CLI commands.

  1. As I alluded to, we can create a command to automatically generate the consolidated migrations. It can apply the old migration set against a provided database, dump the DDL for all the objects in it, and then make the changes to the migrations/ folder. If I had to state a preference, I think it should output a separate file per table/object, as that's normally how I tend to organize baseline migrations, but it could have a flag to consolidate into a single file instead. Or maybe it could start in interactive mode by default and prompt the user for that option.
  2. We should also provide a command to actually check that the new migrations provide the same baseline as the old set. It would probably need to create a new database, run the new migration set against it, and diff the dumped DDL between it and the provided database.
    • I would really like to somehow ensure that the user has run this check before executing the migrations against a production database, but it is a rather heavyweight thing to do automatically without being asked. I could imagine a user being confused and rather miffed if their application failed to start because the migrations got a permission error while trying to execute a CREATE DATABASE statement they didn't ask for.
    • Perhaps BASELINE could also include a checked flag (I would then make the file TOML format) that starts as false; sqlx migrate run could then refuse to run, and sqlx::migrate!() refuse to compile, unless checked = true. The user could either run the command, which would flip the flag on success, or just change it manually to override (see the sketch after this list).
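A manual version of that check for Postgres might look like the following; the database names are illustrative, and the TOML field names (version, checked) are one possible shape for BASELINE, not a settled format:

# Apply the new migration set to a scratch database:
createdb scratch_new
sqlx migrate run --database-url postgres://localhost/scratch_new
# Diff schema-only dumps of the scratch database against the real one:
pg_dump --schema-only postgres://localhost/scratch_new > new-ddl.sql
pg_dump --schema-only postgres://localhost/app_prod > old-ddl.sql
diff old-ddl.sql new-ddl.sql
# If the dumps match, mark the baseline as checked (using the TOML form proposed above):
cat > migrations/.old/BASELINE <<'EOF'
version = 1
checked = true
EOF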
abonander commented 2 years ago

A coworker brought up the potential for ambiguities where the whole new migration set, or part of it, matches the old set (maybe some files were copied verbatim).

I think more robust logic is for the migration machinery to first check whether the current state of the migrations table matches the new migration set and, if so, just treat the new migration set as the canonical one. Only if there's a conflict with the new migration set does it fall back to upgrading from the old migration set. This also produces a nice fast path for databases that have already been upgraded.

Mythra commented 2 years ago

Thanks for the comments here! A lot of this makes sense to me after thinking about it throughout the night. The only thing I'm not quite sold on yet is the need for a migrations/.old/BASELINE file. I would naively have imagined the workflow to be something like:

  1. I notice the migration folder is getting large, so I choose a certain set of migrations to become the baseline. These should be sufficiently old that I can be confident they aren't "in-flight" and at risk of being reverted.
  2. I move the migrations I want to become the baseline into migrations/.old and run (bikeshedding aside): sqlx migrate new-baseline. This gives me a set of migration files (I'm one of those people who wants it all in one file rather than one per table, though I could imagine one-per-table being useful as an option). Its timestamp is the maximum of all the timestamps in .old. There should be nothing left in the migrations folder that has a timestamp lower than my baseline.
  3. The CLI can now remove all the records for old migrations in the database and replace them with the new baseline, comparing along the way.
  4. I could now delete the .old folder if I wanted to and just rely on my SCM for history. I'd probably keep it around until I re-cut again, at which point I'd replace the contents of .old with what I want to be my new baseline, relying on SCM for the extra-old files.

The baseline file becomes obsolete in this model as far as I can tell (though I could very easily be missing something!) and is in general not needed. I guess it saves you from traversing the .old folder to find the maximum timestamp, but I imagine most applications would take the fast path of never needing to look at the old folder at all. And when they do, it would be a one-time thing (they notice the database has recorded a migration timestamp smaller than the oldest one present in the folder).
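Concretely, that workflow might look like the following, where sqlx migrate new-baseline is the proposed (not yet existing) command and all file names are invented:

mkdir -p migrations/.old
# Move everything old and stable enough to consolidate:
mv migrations/2021*.sql migrations/.old/
# Proposed command: emit one consolidated migration whose version is the
# maximum of all the version timestamps found in migrations/.old/:
sqlx migrate new-baseline
# => creates e.g. migrations/20211231235959_baseline.sql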

abonander commented 2 years ago

> Its timestamp is the maximum of all the timestamps in .old.

While this is clever, I don't want to place a restriction on version numbers. In fact, I rather like using integers starting from 1 for the bootstrapping migrations on a project, and then timestamps for post-bootstrap or post-MVP migrations. It can be quite useful for retrospectives to compare the initial vision with where it ended up; a large number of post-MVP migrations might indicate a project with a lot of feature creep or an incomplete original specification. I would similarly use 1-based versions for the consolidated migrations and then go from there.

The BASELINE file is thus necessary if there's no correlation between the old and new version numbers. A command to generate it would be pretty simple, though. I don't want to just assume that the old and new migration sets are always equal, because there could be development concurrent with the consolidation that's adding new features, and the migration machinery would need to know to run those new migrations and not just pretend they've already been run.

Simply concatenating all the existing migrations into a single file is already trivial to do with a few shell commands so I'm not sure a special CLI command is necessary:

mkdir migrations/.old
mv migrations/*.sql migrations/.old/
# This is exactly what the `cat` command was designed for.
cat migrations/.old/*.sql > migrations/12345__new_baseline.sql

I wouldn't be content with that, though, as you'd still be left with a mess of redundant ALTER ... statements that should really be collapsed into their original CREATE ... statements for readability, as well as intermediate DML that may have been required for data consistency in the old migrations but would obviously be wasted on a fresh database.
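As a toy illustration of that kind of collapse (the schema is invented), where the ALTERs fold back into the CREATE and the backfill DML is dropped because it is pointless on a fresh database:

# Old migrations, in sequence:
#   0001: CREATE TABLE users (id BIGINT PRIMARY KEY);
#   0002: ALTER TABLE users ADD COLUMN email TEXT;
#   0003: UPDATE users SET email = '' WHERE email IS NULL;  -- backfill DML
#   0004: ALTER TABLE users ALTER COLUMN email SET NOT NULL;
# Hand-consolidated baseline:
cat > migrations/0001_baseline.sql <<'SQL'
CREATE TABLE users (
    id    BIGINT PRIMARY KEY,
    email TEXT NOT NULL
);
SQL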

Mythra commented 2 years ago

> While this is clever, I don't want to place a restriction on version numbers. In fact, I rather like using integers starting from 1 for the bootstrapping migrations on a project, and then timestamps for post-bootstrap or post-MVP migrations.

Thanks for the follow-up! I've been thinking about this for a few more days, and I'm still slightly confused as to the purpose of the baseline file. Sorry if I'm being slightly dense; I'm going to list out everything I'm thinking, to make sure we're on the same page and to make it easier to point out what I'm missing. Here's my case:

  1. I have lots of migrations all using timestamps.
  2. My folder gets large enough that I want to cut a new baseline.
    • Not all of my migrations should be in the new baseline. Some of them may belong to other teams/areas that are less stable and need very long bake times before I'd consider consolidating them, so that I can easily revert them.
  3. All the files up to a point should give me a new migration file that will serve as the "first" migration.

Your case, on the other hand, sounds more like:

  1. I have lots of migrations during initial development.
  2. I want all my migrations to become the new baseline, and while I'm at it, change versioning schemes.
  3. All files should become one migration file, using the current timestamp, and all new migrations should use timestamps for version numbers.

In either situation, I don't see the benefit of a BASELINE file; I only see potential problems. Mainly because SQLx already has an ordering: it has to know what order to apply migration files in. If I start a new database with a folder of 10 unapplied migrations, SQLx has to use the number at the beginning of each filename (which I've always thought of as a timestamp, but since it can be any number, let's call it a number).

In my case, a baseline file not only adds extra work for me as a maintainer (an extra file I have to deal with throughout the process), but also opens up potential bugs: what happens when I move migrations numbered 1, 2, and 4 into the old folder, but not 3? What happens if the baseline number doesn't exist in the folder at all? Hopefully these cases wouldn't happen, and they seem pretty far-fetched, but why allow the opportunity for them at all? In your case, there's also no need for a baseline file, because all files are becoming the new baseline and you'll add on from there.

Even if we were to combine your requirements with mine, say a series of migrations using version numbers 1, 2, etc. that we want to switch to timestamp versioning without merging all of them, there would still be an explicit ordering before and after. For example, with the migration numbers 1..4 I might end up with 20220716045441_baseline_was123.sql and 20220716045442_previously4.sql.

There is always going to be an ordering. Because of that, regardless of which versioning scheme you use, or whether you change it, we always need to know what order to apply migrations in, and so we can always infer BASELINE (and will have problems creating a baseline that mixes that ordering up). Even a long-lived feature branch doesn't change this when the versioning scheme changes. Say my main branch has migrations 1, 2, and 3, and 4 is in another branch. I create a baseline at 3 and tell SQLx to start using timestamp ordering, so my main branch now has 20220716045441_baseline.sql. My other branch still has 4, which I would need to manually rewrite to a timestamp that comes after the baseline. I gain no benefit here from using a baseline file.

> I wouldn't be content with that, though, as you'd still be left with a mess of redundant ALTER ... statements that should really be collapsed into their original CREATE ... statements for readability, as well as intermediate DML that may have been required for data consistency in the old migrations but would obviously be wasted on a fresh database.

Yeah, it's this point exactly; that nails it on the head. Especially when you consider that I'd have to manually futz with the _sqlx_migrations table on my existing databases to get it to show the proper new version numbers and not the old ones.

kyrias commented 2 years ago

@Mythra You misunderstood them: they don't want to switch to only using timestamp-based versioning; they want to be able to use the same scheme of starting from version 1 for the collapsed migrations as well. So if they currently have a set of migrations [1, 2, 3, ..., 20220909, ...], they want to be able to collapse all of those into a new set of [1, 2, 3, ...], so they can do the same kinds of change comparisons after the collapse.

But even if we don't take that use case into account, a BASELINE file would still be necessary to support collapsing the history into multiple migrations.

If you don't have something that says which of the new migrations constitute the same final schema as the old migrations, there is no way to safely know which migrations the migrator should just add to the migration history table without actually applying them.

Mythra commented 2 years ago

Thank you @kyrias for explaining! I think I roughly understand now; sorry it took me so long to get it, and for polluting this issue 😅!

thedodd commented 1 year ago

It may be sufficient to allow users to manually combine some of their older migration files: folks can use --ignore-missing, and then the only thing that needs to be added is something like an --update-combined or --update-checksums flag on the run command, depending on what folks here would prefer. I'm thinking something like the sketch below.
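A sketch of that flow, where --update-checksums is the proposed flag (--ignore-missing already exists on sqlx migrate run) and the file names are invented:

# Manually combine the two oldest migrations, keeping the first file's version:
cat migrations/0002_add_users.sql >> migrations/0001_init.sql
rm migrations/0002_add_users.sql
# --ignore-missing (existing) tolerates the now-absent version 2;
# --update-checksums (proposed) would rewrite the stored checksum of the
# modified version 1 instead of erroring on the mismatch:
sqlx migrate run --ignore-missing --update-checksums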

I think there are lots of variations on the above, and it is really quite subjective. However, I've used this pattern quite successfully in the past. It would be great to be able to do the same thing with sqlx.