Open mfridman opened 2 years ago
Thankyou @mfridman đź‘Ť
I've been in situations where someone has modified an existing migration file, with the unfortunate end result that the state of databases in deployed environments are de-synced from the definition in the migrations, which becomes incredibly difficult to reconcile. Especially when the same set of migrations is meant to be run against different persistent databases.
While this is definitely a mistake on the part of the developer and not the tool, I think having a checksum validate that the defined migrations are the ones that have actually been run would be a really elegant solution to that problem.
There was some discussion around storing a blob of the migration contents .. but this is not feasible I think. The best we can do is store a hash of the contents. There are at least 2 issues to solve:
fwiw I'm in favour of 2, simply trying to understand the UX a bit, specifically this point:
... which becomes incredibly difficult to reconcile.
I could see a goose verify
or goose validate
or even goose status
that returns a status type reporting whether all migrations have been verified against their checksum. Now, say for migration 72 the state is "unverified" .. is it sufficient for the tool to report this and leave the rest to the operator?
I presume one could manually run through all the versions of a given migration and see which one produces a hash that matches?
The crux of the issue I think is the definition of the database's state.
Let's say I have a migration directory with 5 migrations defined. I point goose
at a database, running goose up
to get my database in sync with my migrations file. Assuming I've run some number of migrations already, the database has a table defined something like this:
CREATE TABLE goose_db_version (
id serial NOT NULL,
version_id bigint NOT NULL,
is_applied boolean NOT NULL,
tstamp timestamp NULL default now(),
PRIMARY KEY(id)
);
And in order to check what version we've run, we take a look at the version_id
and is_applied
columns to understand where we are in history between 0 and 5.
The problem lies in that the DB defines the state as "having run migrations 1, 2, and 3", but the end user more likely defines the state as "having run migrations number 1, 2, and 3 from this directory". That is to say, if the definition of migration number 2 has changed after it was run, the state of the DB is out of sync with the state of the migrations directory, but the goose_db_version
table doesn't capture that. And importantly, we run into the issue where goose
will run the next migration in the list, completely unaware of the fact that it's operating against an unknown DB state, which may be incompatible with the next migration.
All of which being a very long way of saying—the more accurate representation of state that Goose could capture would understand not only the version number of the migration that was run, but also the content of the migration.
If we revisit that scenario with an added checksum column next to each migration, we get:
goose up
I would argue that the only correct behavior in that scenario would be to abort the migration, because Goose does not have enough information to know what the correct behavior is. We can't rollback migrations, because there's no guarantee our "down" script matches the "down" script that existed when the "up" script was run. So Goose should say, "the operation you have requested is not possible because someone has run a migration other than the one you expect to have been run."
A user with a VCS can potentially figure out what migration was run, revert the migration to the previous state, and then goose can happily carry on. If that's lost, the best solution would probably be for the user to manually correct the DB state as desired. Once it reflects what the migrations file says it should, goose
can provide a command to replace the history in the DB with the history of the migration files. Flyway (another migration tool) does this and calls it "repair", but I think something like goose rewrite-history
would be a slightly more honest name.
Thanks for the insightful feedback.
I think we both agree a checksum of some kind is useful.
Sometimes we modify/add/remove comments for a given migration file and this would result in a checksum change. Should we instead store the parsed SQL statements? Furthermore, what about .go migrations .. what checksum to store there. I'm just doing a brain dump of some of the things we need to consider.
Also, another challenge is figuring out a migration path towards adding a new checksum
column, whether it should be done in a backwards compatible way and should goose
offer command(s) to help with this transition. Backwards compatibility is something we take very seriously and so would like to ensure this addition causes minimal (read none) disruption to existing users.
Once we have the checksum
in place then we can think about the intended user experience..
The current goose table stores the following pieces of information:
To improve
goose
, we could record a few additional items, such as:I'm not sure we even want to do this, but figured I'd open this issue as a placeholder.
Another thing we've considered previously is removing
is_applied
.. I don't recall why we need this.Some previous requests: #213