matthieu-foucault / postgit

Integrate declarative PostgreSQL schema migrations with Git
GNU General Public License v3.0
8 stars 0 forks source link
declarative-migrations git migra migration-automation postgresql rust

PostGit

The goal of PostGit is to integrate PostgreSQL schema diffing tools (e.g. migra) with Git to provide a modern development experience for PostgreSQL schemas.

This is a proof-of-concept, which started as my Hackathon Onboarding Project with Commit. Contributors are welcome.

Concept

The goal of PostGit is to enable PostgreSQL schema developers to write clean, refactorable SQL code which does not rely on a list of ordered migration files and does not require developers to write idempotent scripts.

By leveraging a schema diffing tool, the postgit push command generates a migration script between two committed schemas and applies the migration to a target database.

Usage

Diff command

Prints the migration between two committed SQL files

postgit diff [OPTIONS] --from <FROM> --to <TO> <PATH>

Arguments: <PATH> Path to the schema file or directory, relative to the repo root

Options:

Push command

Applies the migration between two committed SQL files onto the target database

postgit push [OPTIONS] --from <FROM> --to <TO> <PATH>

Arguments: <PATH> Path to the schema file or directory, relative to the repo root

Options:

Watch command

Watches a directory and applies the migrations to the target database

Usage: postgit watch <PATH>

Arguments: <PATH> Path to the directory to watch

Configuration

The behaviour of PostGit can be configured through a combination of configuration file and command line arguments.

PostgreSQL config

PostGit relies on three databases:

A local postgit.toml file can be used to define the PostgreSQL connection parameters. The default configuration is equivalent to the following

[diff_engine]

[diff_engine.source]
dbname='postgit_diff_source'
host='localhost'
port=5432
user='postgres'

[diff_engine.target]
dbname='postgit_diff_target'
host='localhost'
port=5432
user='postgres'

[target]
dbname='postgres'
host='localhost'
port=5432
user='postgres'

PostGit supports the following libpq environment variables for all three databases (the postgit.toml file takes precedence over env variables): PGHOST, PGUSER, PGPORT.

The PGDATABASE env variable can be used to specify the target database name.

Diff engine

PostGit relies on customisable external CLI tools to perform the schema diffing.

The current default schema diffing tool is migra, which can be installed by running pip install migra psycopg2-binary.

The diff tool can be configured with the diff_engine.command configuration option. The custom command must use two postgresql connection strings for the source and target databases as the positional arguments $1 and $2, respectively.

For instance, to use migra, the config.toml would contain the following (the default behaviour is equivalent to this configuration):

[diff_engine]
command='migra --unsafe $1 $2'

Using the CLI version of pgAdmin4 can be done with

[diff_engine]
command='docker run --network=host supabase/pgadmin-schema-diff $1 $2'

SQL files management

As your database schema grows, you will most likely want to split your SQL code into multiple files. To allow you to load multiple files in the desired order, PostGit supports:

Lexicographic path order (e.g. using numbered prefixes)

You can rely on file naming to define the file loading order, e.g., given the following file hierarchy

The files will be imported in the "a,b,c,d,e" order.

--import syntax

schema/schema.sql

create schema my_app;

schema/user.sql

-- import schema/schema.sql

create table my_app.user (
  id int primary key generated always as identity,
  given_name text not null,
  family_name text,
  email text not null
);

Import file resolution: