Open Metamogul opened 10 months ago
Since my team wants to prevent this in the future, it'd be nice to know if the maintainer(s) of the project accepted a fix as proposed above and would merge it, if another fix was preferrable or if you don't agree this should be fixed so we could look for another solution.
@Metamogul so tracking a hash of each of the migrations to verify that people didn't mess around with them? I'm cool with that, that's actually a good addition.
One challenge you'll run into (and it's quite the ironic one) is that we'll need to figure out a way to migrate the migrations table :grin:
But yes, happy to accept that.
Alright, thanks for the Feedback! I like the migration of migrations joke 😆 I'll put this in our projects backlog then and get back to it asap.
Actions performed
A migration SQL script already deployed and applied was altered, so that in the altered migration a
CREATE TABLE
statement was amended with an additional column. That altered migration was deployed later along with the rest of the app using the migrate lib and ingested at app startup usingmigrate.Exec
.Expected outcome:
The
migrate.Exec
call should have panic'ed to indicate an unrecoverable difference of the expected and actual DB schema state, preventing the app from starting at all, so the issues would have been noticed by devops and could have been investigated.Actual outcome:
The execution of the migration didn't neither yield and error that would have been logged, nor did it panic. However from this point on the DB was in a corrupt state.
Proposal to mitigate the issue:
In the process of preparing the execution of migrations in
migrate.go:611 planMigrationCommon
the ID of the migrations is used to identify migrations already applied. An additional integrity check – for each already applied migration in the migrations table comparing a stored content hash to a computed hash of the corresponding script – could be performed here before any migrations are applied. If such a check fails, this could be used to panic at the original caller. An additional flag disabling that behavior by default could be used to ensure backwards compatibility.