pressly / goose

A database migration tool. Supports SQL migrations and Go functions.
http://pressly.github.io/goose/
Other
6.72k stars 504 forks source link

feature: add additional columns to goose table #288

Open mfridman opened 2 years ago

mfridman commented 2 years ago

The current goose table stores the following pieces of information:

id
version_id
is_applied
tstamp

To improve goose, we could record a few additional items, such as:

filename
checksum
duration_in_ms

I'm not sure we even want to do this, but figured I'd open this issue as a placeholder.

Another thing we've considered previously is removing is_applied .. I don't recall why we need this.


Some previous requests: #213

Ganesh-Ponipireddy commented 2 years ago

Thankyou @mfridman đź‘Ť

rliebz commented 2 years ago

I've been in situations where someone has modified an existing migration file, with the unfortunate end result that the state of databases in deployed environments are de-synced from the definition in the migrations, which becomes incredibly difficult to reconcile. Especially when the same set of migrations is meant to be run against different persistent databases.

While this is definitely a mistake on the part of the developer and not the tool, I think having a checksum validate that the defined migrations are the ones that have actually been run would be a really elegant solution to that problem.

mfridman commented 2 years ago

There was some discussion around storing a blob of the migration contents .. but this is not feasible I think. The best we can do is store a hash of the contents. There are at least 2 issues to solve:

  1. How to migrate a migration tool and can this be done in a backwards-compatible way? (the crux of this issue in adding additional tables)
  2. Say we do hash the contents and store that in the database .. how does that help?

fwiw I'm in favour of 2, simply trying to understand the UX a bit, specifically this point:

... which becomes incredibly difficult to reconcile.

I could see a goose verify or goose validate or even goose status that returns a status type reporting whether all migrations have been verified against their checksum. Now, say for migration 72 the state is "unverified" .. is it sufficient for the tool to report this and leave the rest to the operator?

I presume one could manually run through all the versions of a given migration and see which one produces a hash that matches?

rliebz commented 2 years ago

The crux of the issue I think is the definition of the database's state.

Let's say I have a migration directory with 5 migrations defined. I point goose at a database, running goose up to get my database in sync with my migrations file. Assuming I've run some number of migrations already, the database has a table defined something like this:

CREATE TABLE goose_db_version (
                id serial NOT NULL,
                version_id bigint NOT NULL,
                is_applied boolean NOT NULL,
                tstamp timestamp NULL default now(),
                PRIMARY KEY(id)
            );

And in order to check what version we've run, we take a look at the version_id and is_applied columns to understand where we are in history between 0 and 5.

The problem lies in that the DB defines the state as "having run migrations 1, 2, and 3", but the end user more likely defines the state as "having run migrations number 1, 2, and 3 from this directory". That is to say, if the definition of migration number 2 has changed after it was run, the state of the DB is out of sync with the state of the migrations directory, but the goose_db_version table doesn't capture that. And importantly, we run into the issue where goose will run the next migration in the list, completely unaware of the fact that it's operating against an unknown DB state, which may be incompatible with the next migration.

All of which being a very long way of saying—the more accurate representation of state that Goose could capture would understand not only the version number of the migration that was run, but also the content of the migration.

If we revisit that scenario with an added checksum column next to each migration, we get:

  1. A user runs goose up
  2. Goose sees that there are 5 versions in the migrations directory, and 3 in the database
  3. Goose creates a checksum of the v1 migration file, then compares it to the one in the DB. It sees that they match
  4. Goose creates a checksum of the v2 migration file, then compares it to the one in the DB. They do not match, which signals that the migration run in the database was some migration other than the one the end user specified.

I would argue that the only correct behavior in that scenario would be to abort the migration, because Goose does not have enough information to know what the correct behavior is. We can't rollback migrations, because there's no guarantee our "down" script matches the "down" script that existed when the "up" script was run. So Goose should say, "the operation you have requested is not possible because someone has run a migration other than the one you expect to have been run."

A user with a VCS can potentially figure out what migration was run, revert the migration to the previous state, and then goose can happily carry on. If that's lost, the best solution would probably be for the user to manually correct the DB state as desired. Once it reflects what the migrations file says it should, goose can provide a command to replace the history in the DB with the history of the migration files. Flyway (another migration tool) does this and calls it "repair", but I think something like goose rewrite-history would be a slightly more honest name.

mfridman commented 2 years ago

Thanks for the insightful feedback.

I think we both agree a checksum of some kind is useful.

Sometimes we modify/add/remove comments for a given migration file and this would result in a checksum change. Should we instead store the parsed SQL statements? Furthermore, what about .go migrations .. what checksum to store there. I'm just doing a brain dump of some of the things we need to consider.

Also, another challenge is figuring out a migration path towards adding a new checksum column, whether it should be done in a backwards compatible way and should goose offer command(s) to help with this transition. Backwards compatibility is something we take very seriously and so would like to ensure this addition causes minimal (read none) disruption to existing users.

Once we have the checksum in place then we can think about the intended user experience..