Open tumelowill opened 8 months ago
That's correct, this is working as intended. However, I can see this may not be suitable for some applications.
I've used several tools that leave the database in a "dirty" state, and when you're applying migrations on dozens or even hundreds of systems, recovering becomes very challenging.
When we do have to use +goose NO TRANSACTION, we typically isolate those changes to a single file and narrow them down to only the statements that truly must run outside a transaction. In practice, for most databases, the number of statements that need to run outside a transaction is very low. So low that I'd rather not add additional complexity and overhead to handle these edge cases.
Interesting, thanks. I'm curious how you think we should handle a failed long-running migration that involves batch inserting data from one large table into another. Using Goose's API, I don't think we have a way of knowing whether this migration was previously attempted, so we cannot stop ourselves from duplicating work that has already been done. Of course this is something we can write tooling for ourselves to guard against, but it would be nice to get this from Goose directly.
I don't have a good answer for this at the moment.
The times I've had to run migrations that lasted a few hours, we treated those as exceptions, i.e., apply all the migrations up to that point and then run the slow migration in isolation (monitoring it more closely).
When possible, we wrote those migrations to be idempotent, so that if they were interrupted, they could be run again without causing any issues. But obviously, that's not always possible.
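As a rough illustration of what "idempotent" means for a batched copy, here is a minimal in-memory sketch. It is not goose code: `Row`, `Backfill`, and the `failAfter` knob are hypothetical names made up for this example, and the map stands in for a destination table with a primary key and an "insert only if absent" statement.

```go
package main

import "errors"

// Row is a hypothetical record copied from a source table to a destination table.
type Row struct {
	ID   int
	Body string
}

// errInterrupted simulates a crash or timeout part-way through the backfill.
var errInterrupted = errors.New("backfill interrupted")

// Backfill copies src into dst in batches of batchSize and returns how many
// rows it newly copied. dst is keyed by row ID, so a row that was copied by an
// earlier attempt is skipped instead of being inserted a second time.
// A failAfter value greater than zero aborts before that batch index to
// simulate a failure; zero or less disables the simulated failure.
func Backfill(src []Row, dst map[int]Row, batchSize, failAfter int) (int, error) {
	if batchSize <= 0 {
		return 0, errors.New("batchSize must be positive")
	}
	copied := 0
	batch := 0
	for start := 0; start < len(src); start += batchSize {
		if failAfter > 0 && batch == failAfter {
			return copied, errInterrupted
		}
		end := start + batchSize
		if end > len(src) {
			end = len(src)
		}
		for _, r := range src[start:end] {
			if _, done := dst[r.ID]; done {
				continue // already copied by a previous attempt
			}
			dst[r.ID] = r
			copied++
		}
		batch++
	}
	return copied, nil
}
```

If the first call is interrupted, calling `Backfill` again with the same `dst` only copies the rows that are still missing, and a further call copies nothing. Against a real database the same effect would need a unique key on the destination and an insert that ignores existing rows, which is an assumption about your schema rather than something goose provides.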
In other places, I've resorted to doing "data migrations" in out-of-band jobs without goose, e.g., marking a feature as "not ready" until some data migration or backfill is complete. But the downside is that you must write your own logic to track this.
I'd like to keep this issue open for now and do a bit more thinking. Fwiw the is_applied column in goose is a legacy thing that tracked down migrations, but we decided to remove that (initial https://github.com/pressly/goose/issues/121#issuecomment-434717659)
Migrations that do not have a transaction will not be rolled back. We should mark the database as dirty if the migration failed. Migrations on dirty databases should not be allowed.
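To make the proposal concrete, here is a small in-memory sketch of the requested behavior. It is not goose's API or schema: `State`, `Migration`, `Apply`, `ErrDirty`, and `PartialInsert` are hypothetical names used only for illustration, and the `posts` slice stands in for the real table.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDirty is returned when an earlier non-transactional migration failed part-way.
var ErrDirty = errors.New("database is dirty; manual intervention required")

// Migration is a hypothetical migration step that runs outside a transaction.
type Migration struct {
	Version int
	Run     func() error
}

// State models a version table extended with a dirty flag (an assumption,
// not the table goose actually maintains).
type State struct {
	Version int
	Dirty   bool
}

// Apply runs pending migrations in order. The dirty flag is set before each
// migration starts and cleared only after it succeeds, so a failure leaves
// the flag set and every later call is refused with ErrDirty.
func Apply(s *State, migrations []Migration) error {
	if s.Dirty {
		return ErrDirty
	}
	for _, m := range migrations {
		if m.Version <= s.Version {
			continue // already applied
		}
		s.Dirty = true
		if err := m.Run(); err != nil {
			return fmt.Errorf("migration %d failed: %w", m.Version, err)
		}
		s.Version, s.Dirty = m.Version, false
	}
	return nil
}

// PartialInsert returns a migration that inserts one row into posts and then
// fails, imitating a non-transactional migration that breaks after a write.
func PartialInsert(posts *[]string) Migration {
	return Migration{Version: 1, Run: func() error {
		*posts = append(*posts, "hello")
		return errors.New("statement after the insert failed")
	}}
}
```

With this model, the first `Apply` call leaves one row in `posts` and marks the state dirty; a second call returns `ErrDirty` without running the migration again, so the row count stays at one instead of growing on every attempt.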
Example:
If we have a policy of running migrations when the database schema is outdated, we are now filling the posts table with inserts on each migration run, as we have no idea whether the database is dirty.
What we would like to see after attempting to migrate multiple times:
What we actually see after attempting to migrate multiple times: