bikeshedder / tusker

PostgreSQL migration management tool
The Unlicense
208 stars 17 forks source link

Comparing schema/migration to a DB #1

Closed Swizz closed 4 years ago

Swizz commented 4 years ago

Hi !

Awesome job on Tusker, that looks highly valuable for our needs. But, like discussed in https://github.com/djrobstep/migra/issues/131, if djrobstep/migra allows to generate a migration between two databases ; here, Tusker is meant to generate a new migration by comparing the result of the old ones and the current schema.

Here, we would want to compare a schema (hold by a git repository) to the actual DB to create the migration of what it is needed to be done to apply the current state of the handwritten schema to the live database.

Would you recommend Tusker to achieve this ? I dont know, how we would use/mutate Tusker for this purpose. Or if you would want us to fork Tusker to make a new soft that meet our needs.

Cheers,

bikeshedder commented 4 years ago

This could be a very interesting feature for Tusker.

I usually expect the database to be 100% in sync with the migrations, but if any manual operations have been applied to the DB this no longer holds true. For this use case it would be useful to add a some commands and/or options to create diffs based on the DB rather than the schema file.

I think tusker diff should get three additional flags to support that:

  1. --from or --source which accepts the parameters migrations, schema or database (defaults to migrations)
  2. --to or --target which accepts the parameters migrations, schema or database (defaults to schema)
  3. --reverse inverts the from/source and to/target parameter.

A new command tusker check could go hand in hand with that:

tusker check is basicly a simplified tusker diff which doesn't output the diff but just reports if there are changes and provides an exit code which can be used for CI setups.

tusker check --three-way/-3 does a three-way check and makes sure migrations, schema and database are all in sync.

bikeshedder commented 4 years ago

I just realized I forgot to push the code for the 0.1.2 PyPi release to GitHub. If you're going to work on this feature please make sure to pull the updated code first. It fixes the DB connection close issue that used to be fixed by some rather crude code.

I might be able to take a look at this in the next few days. Though I don't have an urgent need for this myself right now. If you want to speed things up and implement this yourself I'd be more than happy to merge a PR.

bikeshedder commented 4 years ago

2 is now merged and I implemented the check command. I decided against passing --source and --target to the check command but instead opted for a nargs='+' list of so called backends. I'm not too happy about the terminology but couldn't come up with a better name for that. :see_no_evil:

Without any arguments the check command will compare the migrations against the schema:

tusker check

And if you want to compare all three (migrations, schema and database) against each other you can just run:

tusker check all

Btw. I also removed the --from and --to alias and added -s and -t abbreviations instead. Having two names for the same thing would just cause confusion. When proposing the arguments I didn't mean to support both names at the same time. I just couldn't decide which name to use. Since the code already uses source and target I just stuck with it and removed --from and --to arguments instead.

I think everything is set for a 0.2.0 release then.

bikeshedder commented 4 years ago

The only think I don't like about that is the different syntax:

tusker diff --source database --target schema

vs

tusker check schema database

One last minute change... I think I'm going to change the syntax to use positional arguments instead:

tusker diff [--reverse] [source] [target]

Just specifying the target on its own is confusing as hell anyways so this probably makes a lot more sense as it forces the user to specify the source if using a different target.

When testing the code I used it wrong a few times so that's a good indicator that using ---arguments leads to a very confusing syntax especially as source and target are both names that don't make a lot of sense when talking about diffs.

Swizz commented 4 years ago

I am not against, as the linux diff command is using positional arguments. Even git diff has an identical behavior.

bikeshedder commented 4 years ago

After studying the diff manpage I found the terms from and to and made those two arguments into positional ones and also changed the metavar accordingly:

$ poetry run tusker diff --help
usage: tusker diff [-h] [--reverse] [from] [to]

This command calculates the difference between two database schemas.
The from- and to-parameter accept one of the following backends: migrations,
schema, database

positional arguments:
  from           from-backend for the diff operation. Default: migrations
  to             to-backend for the diff operation. Default: schema

optional arguments:
  -h, --help     show this help message and exit
  --reverse, -r  swaps the "from" and "to" arguments creating a reverse diff

The code itself still uses the variable names source and target as from is a python keyword and naming it from_ everywhere just seamed silly to me. Anyone working on Tusker should get that easily. For users of the command from and to just make a lot more sense.

bikeshedder commented 4 years ago

I just released 0.2.0 on PyPI. Thanks a lot for the input and PR. :tada:

https://pypi.org/project/tusker/0.2.0/