bikeshedder / tusker

PostgreSQL migration management tool
The Unlicense
208 stars 17 forks source link

add option to apply the diff #35

Closed mingfang closed 8 months ago

mingfang commented 8 months ago

Add an option to the diff command to apply the diff(sql) to the database, instead of just printing it out. e.g.

tusker --verbose diff database --apply
bikeshedder commented 8 months ago

I'm not a huge fan of this use-case as it feels wrong directly applying a diff directly to the database.

I'm not so much against having this feature in tusker but I'm highly against overloading the diff command that way. A diff should not apply the patch directly. Also what does tusker diff --apply mean if not specifying database as from argument. That flag overloads the diff command in a strange way.

tbh. I'd recommend just piping the output of tusker diff into psql:

tusker diff database | psql

The only downside is that you need to specify the connection parameters to the psql command.

How about adding a tusker run command that connects to the database and executes some SQL either passed in as filename or via stdin?

That way you could just do the following:

tusker diff database | tusker run

It's still a bit against the spirit of tusker, but I like that much better than adding a diff+apply command which is kind of against the whole idea of tusker.

I'd always recommend doing this instead:

tusker diff > tmp.sql
...now check if the generated SQL is okay to run...
tusker run < tmp.sql

As of now tusker never modifies existing databases. It only creates temporary databases to perform the diffing (using migra). That run command would be the first command to actually perform mutating queries on the target database.

mingfang commented 8 months ago

we could add a new apply command. e.g. tusker apply database When database is not specified then it appears to ignore the sql.

The apply term/concept comes from Terraform https://developer.hashicorp.com/terraform/cli/commands/apply We can call it something else; It's just natural for me since my infra is managed by Terraform.

Agree that this brings Tusker into a completely new territory(modifying the database), but I this solves a huge problem for me.
While Terraform solves the problem of infrastructure as code, this change can enable Tusker to solves "database as code". You simply declare how you want the schema to look like and let the automation make it happen.
It's a game changer that solves a multi-decade old problem.

bikeshedder commented 8 months ago

I'm currently in progress of rewriting Tusker in Rust and in the rewrite it will come with a built-in migration manager. In a future version of Tusker (no ETA, yet) you will be able to do the following:

tusker diff > migrations/0815_some_migration.sql
tusker migrate

That migration file is crucial however as it might contain not just schema migrations but data migrations as well. Just think of adding a NOT NULL column to an existing database. The automatically generated migrations just doesn't work and you need to modify it anyways. So how would you tackle this if you just called tusker apply in your terraform script?

By keeping this a two step process it's always the same procedure. Do this on your local machine:

tusker diff > migrations/0816_something_complicated.sql
...fix the just generated migration...
tusker migrate

And then call tusker migrate as part of the terraform script. The migration manager will keep track of the already applied migrations.

Agree that this brings Tusker into a completely new territory(modifying the database), but I this solves a huge problem for me.

Just applying the new schema without a properly verified migration is a game changer, but in the very wrong direction. It's a loaded foot gun with the trigger half-pulled.

You should never trust the output of tusker diff to do the right thing. There are several well documented migra limitations which can ruin your database. If you really don't care about your data simply recreate the database and call psql directly with the SQL files.

While Terraform solves the problem of infrastructure as code, this change can enable Tusker to solves "database as code". You simply declare how you want the schema to look like and let the automation make it happen.

"Database as code" only works if you're starting fresh or you work with schemas that don't have any constraints and all fields are nullable. Which is a terrible database design to begin with.

It's a game changer that solves a multi-decade old problem.

Migration managers are nothing new. There are plenty of implementations available, which solve this problem quite well.


Maybe I don't get what you're trying to do. Do you use tusker to bootstrap an empty database or do you really want to modify an existing database with precious existing data?

mingfang commented 8 months ago

I am trying to avoid using migration managers.

I do agree that for production use then we should use a two steps process, 1-generate sql, 2-run/apply it.
Terraform does the same.

I also agree that migra has issues and I plan to fork and fix some since it appears that project is stale and abandoned. So far I've seen problems with altering generated columns and creation of inherited tables out of order.

At this point I will close the issue and the PR since it's clear that we're going in different directions. Thanks for Tusker. It's been a huge time saver for me.

bikeshedder commented 8 months ago

@mingfang I'm curious. What's wrong with the idea of adding a run command so you can call

tusker diff database | tusker run

It follows the unix philosophy and it's clear what it does. As a neat side effect you can also use it to run arbitrary SQL code.

mingfang commented 8 months ago

Nothing wrong with it. I'll remove the --apply arg to diff and will create a run/apply command in my fork. But the goal is still to avoid migration managers.

dlight commented 8 months ago

@bikeshedder

I'm currently in progress of rewriting Tusker in Rust

Not sure if this thread is the right place to ask, but, did you check out postgres_migrator? It says in the readme:

Credits

  • migra for making it possible to diff schemas.
  • tusker was the inspiration for using temporary databases as diff targets. postgres_migrator adds the ability to generate and run versioned migrations and to perform compaction.
bikeshedder commented 8 months ago

@dlight I opened a new issue to discuss this: