mbucc / shmig

Database migration tool written in BASH.
BSD 3-Clause "New" or "Revised" License
461 stars 50 forks source link

Implement system for schema table migrations #47

Closed kael-shipman closed 6 years ago

kael-shipman commented 6 years ago

This is not a common use-case, but we should account for times when we want to add or change columns in the shmig schema table.

In my case, when I run shmig status, I'd like to see the names of migrations together with their version; something like this:

Applied migrations:
20180702_142703 initial_seed_data   2018-07-13 22:12:23
20180702_134037 init            2018-07-13 22:12:11

It's easy to make this change, but of course it would potentially break implementations currently running.

Not sure what the best solution to this might be, but I thought I'd put it out there for us to think about....

kael-shipman commented 6 years ago

Maybe something like this:

To avoid the ridiculous situation of tracking shmig schema migrations in a separate table, we could create a special class of migration called shmig_schema_migration with version 00_[current-shmig-version]. This will permit the following:

  1. We can check to see what the current schema table version is by querying MAX(version) WHERE version like '00_%'
  2. We can filter out shmig schema versions when running shmig status by adding WHERE version NOT LIKE '00_%'
  3. If we encounter errors in an internal sql query, we can run a check to see if the version is has changed, then run the internal migration by calling a new migrate_schema_table function.
mbucc commented 6 years ago

I can think of three more reasonable solutions:

  1. Treat the tuple (smig version table name, it's DDL) as immutable. Create a new shmig schema table name and make it the default.
  2. Screw it. Break thinks and keep the code simple.
  3. Add second table for meta data and point a foreign key to the main, existing table.

Think of the use case where someone has lots of migrations in their shmig schema table. Only option 3 does not affect them (if the query in the code uses a left outer join). Code change is pretty small I think. And it would also allow for further meta data expansion.

kael-shipman commented 6 years ago

I admit, I'm a little tempted by option 2, but it seems easy enough to solve this problem, even if it costs us 60 more sloc.

Regarding option 3, for some reason, having two tables for the migration system irks me.

I guess the reason I went for the solution I proposed was that it only uses one table, it's transparent to the user, and it would be fairly easy to implement. How could I improve the idea in a way that you'd be pleased with? Or can you think of any other one-table solutions?

kael-shipman commented 6 years ago

Actually, one minor improvement: Since the name shmig_schema_migration is uniform, we don't have to prefix the migration version with 00_, we can instead just insert the raw shmig version. That would save a little on text processing on retrieve and would still allow us to separate out shmig schema migrations from regular migrations (... WHERE name != 'shmig_shema_migration').

One more somewhat interesting idea would be to leave the table as is, but include the concept of "hooks". Since the config file is just a shell script that's included in the codebase, we could allow end users to define hook functions in it that we use to alter the way queries are formulated for version bumps and status gets. That would make the updates to shmig core essentially trivial, but it might make config files kinda sloppy if the feature is commonly used, and it also doesn't really solve the problem of schema table updates....

mbucc commented 6 years ago

I don't mind going with option 2 ... less code to maintain. Tag 1.0.0 first and crank to 2.0.0 and provide a migration script. This is not going to happen very often.

You may want to review the meta data Flyway and Liquibase store to see if there are other fields you find useful.

kael-shipman commented 6 years ago

Fair enough. Closing.