dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License

Unapply a specific list of migrations #35098

Closed jfheins closed 6 days ago

jfheins commented 1 week ago

Hi, we recently encountered an issue with our EF Core migration handling and I'd like to ask for some ideas.

Background

We use a branching strategy where we branch off a release branch and hand it to testing, which can easily take 3 weeks. Meanwhile, development continues on the main branch, and migrations are added to the main branch as normal. Sometimes, however, a feature misses the branch-off, or a change request that needs a DB migration comes in shortly thereafter. In these cases, we develop the change and then cherry-pick it, ensuring that the same migration name occurs in main as well as in the release branch.

Suppose this is the timeline, with changes A through F each representing a code change with a DB migration, where the migrations are independent of one another:

[Image: timeline of changes A through F]

Observe that the main branch contains migrations that are created before the last migration that is copied into the release branch.

Dev environments will typically see the B migration, so renaming it would create issues (EF would run B again). At some later time, we will deploy 2024.4 and it will apply migrations A, C and D to the production database. (All good so far.) When we deploy 2024.5 a few weeks later, EF Core will detect that migrations B, E and F are missing and apply them correctly.

The problem

We end up with a database that is up-to-date with all migrations in the picture but the migrations have not been applied in "chronological order by file name". Now, if anything is wrong we want the ability to roll back. In theory, the down migrations are made just for this use case.

With the existing interface, I found no way to unapply a given list of migrations. I don't want to roll back to migration A, as C and D could have added columns whose data I don't want to lose. I cannot roll back to migration D either, because then migration B is left in the DB and the 2024.4 code might not be compatible with it.
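The two behaviours described above can be modelled in a few lines. This is a simplified sketch, not EF Core's actual implementation: it assumes that pending migrations are those present in the code but not recorded as applied, and that migrating down to a target reverts every applied migration whose ID sorts after the target. The class name, method names and migration IDs are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified model of migration ordering (assumption: IDs are compared as
// case-insensitive strings, as timestamp-prefixed migration IDs would be).
public static class MigrationOrderModel
{
    private static readonly StringComparer ById = StringComparer.OrdinalIgnoreCase;

    // Migrations present in the code but not recorded as applied, oldest ID first.
    public static List<string> Pending(IEnumerable<string> inCode, IEnumerable<string> applied)
    {
        var appliedSet = new HashSet<string>(applied, ById);
        return inCode
            .Where(id => !appliedSet.Contains(id))
            .OrderBy(id => id, ById)
            .ToList();
    }

    // Applied migrations whose ID sorts after the target, newest first.
    public static List<string> RevertedBy(IEnumerable<string> applied, string target) =>
        applied
            .Where(id => ById.Compare(id, target) > 0)
            .OrderByDescending(id => id, ById)
            .ToList();
}
```

With the hypothetical IDs `01_A` through `06_F`, `Pending` returns `02_B`, `05_E`, `06_F` when only A, C and D are applied, which matches the 2024.5 deployment. `RevertedBy` with target `04_D` returns only `06_F`, `05_E` and leaves B in place, while target `01_A` also reverts C and D. Neither target selects exactly B, E and F.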

Thus, I'd like a way to tell EF Core that I want to unapply exactly the three migrations B, E and F.

Possible feature

The existing interface `IMigrator` contains a method `Migrate(string? targetMigration)` that we currently use. I'd like to have a dedicated method for unapplying, to which I can supply a list of migration IDs: `Unapply(IEnumerable<string> migrations)`

Alternatively, I'd also be happy to move to a script-based approach. To enable this, an API could be added that takes a specific list of migrations and creates an up and a down script. As above, having just two fixed points like "fromMigration" and "toMigration" ends up with the same issue.

roji commented 1 week ago

EF does not allow unapplying specific migrations without also unapplying later migrations; this is quite complicated, since migrations can depend on one another (think about a later migration altering a column added by an earlier one, and now you want to roll back the earlier one). The complexity of implementing something like that is simply beyond the scope of what EF can do, at least at the moment.
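To make the dependency concrete, here is a minimal sketch of two migrations where the later one alters a column added by the earlier one. The table, column and class names are hypothetical, and the `[DbContext]` and `[Migration]` attributes that normally live in the generated designer files are omitted for brevity.

```csharp
using Microsoft.EntityFrameworkCore.Migrations;

// Earlier migration (hypothetical): adds a nullable Nickname column.
public partial class AddNickname : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.AddColumn<string>(
            name: "Nickname",
            table: "Users",
            nullable: true);
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.DropColumn(name: "Nickname", table: "Users");
    }
}

// Later migration (hypothetical): alters the column added above.
public partial class LimitNicknameLength : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.AlterColumn<string>(
            name: "Nickname",
            table: "Users",
            maxLength: 50,
            nullable: true,
            oldClrType: typeof(string),
            oldNullable: true);
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        // Assumes the Nickname column still exists when this runs.
        migrationBuilder.AlterColumn<string>(
            name: "Nickname",
            table: "Users",
            nullable: true,
            oldClrType: typeof(string),
            oldMaxLength: 50,
            oldNullable: true);
    }
}
```

Running only `AddNickname.Down` while `LimitNicknameLength` stays applied would drop the column, leaving the history table claiming a length limit on a column that no longer exists. A later `LimitNicknameLength.Down` would then fail against the database.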

> Alternatively, I'd also be happy to move to a script-based approach. To enable this, an API could be added that takes a specific list of migrations and creates an up and a down script. As above, having just two fixed points like "fromMigration" and "toMigration" ends up with the same issue.

This already exists on the command-line, via `dotnet ef migrations script` (docs). But once again, I think you may be missing all the complexity and pitfalls in your approach: if you simply start picking and choosing SQL scripts for arbitrary ranges of migrations in your history, you're very likely to quickly end up in a completely out-of-sync state. In addition, none of EF's tooling will work at that point, as EF assumes the database is at the state which the migrations history table records (i.e. the latest applied migration) - but you'd be doing schema changes independently.
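For reference, a range script can also be produced from code through `IMigrator.GenerateScript`, which takes the same two fixed points as the command-line tool. The following is a minimal sketch; the wrapper class and method names are hypothetical, and it assumes a `DbContext` configured with a relational provider.

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Infrastructure;
using Microsoft.EntityFrameworkCore.Migrations;

public static class MigrationScripts
{
    // Returns the SQL that moves a database from `fromMigration` to `toMigration`.
    // If `toMigration` is older than `fromMigration`, the result is a down script.
    public static string ForRange(DbContext context, string? fromMigration, string? toMigration)
    {
        var migrator = context.GetService<IMigrator>();
        return migrator.GenerateScript(
            fromMigration,
            toMigration,
            MigrationsSqlGenerationOptions.Default);
    }
}
```

As with the command-line tool, the script always covers a contiguous range in migration ID order, so it cannot express "B, E and F but not C and D".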

jfheins commented 1 week ago

Yes, it is clear that rolling back B is impossible to do (correctly) when C depends on it. Yet if B is something like "In Table X, Rename Column Y to Z" and migration C is "In Table T, Add Index to Column CreatedTime", it feels like a vexing restriction.

In the case above, if Migration D depends on B, it would likely not even migrate up properly so it's an early issue to catch in testing.

Note that the Migrate-up path is perfectly fine with applying the migrations out-of-order. This is also helpful when many pull requests are open and they take different amounts of time to complete. In this case, the migration creation date will not match the PR merge date, and the latter dictates the application in our CD environment.

A little out-of-the-box thinking: another option would be rename support, or some other way to split the two questions "which migrations have been applied" and "what is the total ordering across the migrations". If a migration had a unique ID, we could rename the B migration in the top example to have a "later" filename. Dev environments could still recognize that this migration was applied (by using the ID), but for the purposes of the migration timeline, we could move it to be after the D migration (for production).

roji commented 1 week ago

> Note that the Migrate-up path is perfectly fine with applying the migrations out-of-order. This is also helpful when many pull requests are open and they take different amounts of time to complete. In this case, the migration creation date will not match the PR merge date, and the latter dictates the application in our CD environment.

Our general recommendation here is to recreate the migration in the PR as part of rebasing it on top of latest main; this makes sure that the very latest model snapshot is used as a basis for generating the migration. The problem goes far beyond simply out-of-order timestamps: if I add a migration in my PR, and someone else merges another PR that adds some other migration, those two migrations can conflict in various ways.

Overall, I can't really see any scheme here that wouldn't include full awareness of the total dependency graph between the migrations; if you carry it to its logical conclusion, we'd have to build a sort of git system for managing migrations, which simply isn't feasible, and also represents a degree of complexity that goes far beyond what most people require.

I'd highly recommend trying to work with the system in its current form, and not try to roll back selective migrations while keeping later ones.

roji commented 6 days ago

I'm going to go ahead and close this as it isn't something we intend to implement because of the complexity it involves.