stripe / pg-schema-diff

Go library for diffing Postgres schemas and generating SQL migrations
MIT License
381 stars 26 forks source link
golang postgres postgres-schema postgres-schema-diff

pg-schema-diff

run_tests Go Report Card Go Reference GitHub Release

Computes the diff(erences) between Postgres database schemas and generates the SQL required to get your database schema from point A to B with minimal downtime & locks. This enables you to take your database and migrate it to any desired schema defined in plain DDL.

The tooling attempts to use native postgres migration operations to perform online migrations and avoid locking wherever possible. Not all migrations will be lock-free and some might require downtime, but the hazards system will warn you ahead of time when that's the case. Stateful online migration techniques, like shadow tables, aren't yet supported.

Online index Replacement

Your project's diff:

$ git diff
diff --git a/schema/schema.sql b/schema/schema.sql
index cc3a14b..cf4b32d 100644
--- a/schema/schema.sql
+++ b/schema/schema.sql
@@ -2,5 +2,5 @@ CREATE TABLE foobar(
    created_at timestamp,
    message text
 );
-CREATE INDEX message_idx ON foobar(message);
+CREATE INDEX message_idx ON foobar(message, created_at);

The generated plan (queries using message_idx will always have an index backing them, even while the new index is being built):

$ pg-schema-diff plan --dsn "postgres://postgres:postgres@localhost:5432/postgres" --schema-dir ./schema
################################ Generated plan ################################
1. ALTER INDEX "message_idx" RENAME TO "pgschemadiff_tmpidx_message_idx_IiaKzkvPQtyA7ob9piVqiQ";
        -- Statement Timeout: 3s

2. CREATE INDEX CONCURRENTLY message_idx ON public.foobar USING btree (message, created_at);
        -- Statement Timeout: 20m0s
        -- Lock Timeout: 3s
        -- Hazard INDEX_BUILD: This might affect database performance. Concurrent index builds require a non-trivial amount of CPU, potentially affecting database performance. They also can take a while but do not lock out writes.

3. DROP INDEX CONCURRENTLY "pgschemadiff_tmpidx_message_idx_IiaKzkvPQtyA7ob9piVqiQ";
        -- Statement Timeout: 20m0s
        -- Lock Timeout: 3s
        -- Hazard INDEX_DROPPED: Dropping this index means queries that use this index might perform worse because they will no longer will be able to leverage it.

Online NOT NULL constraint creation

Your project's diff:

diff --git a/schema/schema.sql b/schema/schema.sql
index cc3a14b..5a1cec2 100644
--- a/schema/schema.sql
+++ b/schema/schema.sql
@@ -1,5 +1,5 @@
 CREATE TABLE foobar(
-   created_at timestamp,
+   created_at timestamp NOT NULL,
    message text
 );
 CREATE INDEX message_idx ON foobar(message);

The generated plan (leverages check constraints to eliminate the need for a long-lived access-exclusive lock on the table):

$ pg-schema-diff plan --dsn "postgres://postgres:postgres@localhost:5432/postgres" --schema-dir ./schema
################################ Generated plan ################################
1. ALTER TABLE "public"."foobar" ADD CONSTRAINT "pgschemadiff_tmpnn_BCOxMXqAQwaXlKPCRXoMMg" CHECK("created_at" IS NOT NULL) NOT VALID;
        -- Statement Timeout: 3s

2. ALTER TABLE "public"."foobar" VALIDATE CONSTRAINT "pgschemadiff_tmpnn_BCOxMXqAQwaXlKPCRXoMMg";
        -- Statement Timeout: 3s

3. ALTER TABLE "public"."foobar" ALTER COLUMN "created_at" SET NOT NULL;
        -- Statement Timeout: 3s

4. ALTER TABLE "public"."foobar" DROP CONSTRAINT "pgschemadiff_tmpnn_BCOxMXqAQwaXlKPCRXoMMg";
        -- Statement Timeout: 3s

Key features

Library

go get -u github.com/stripe/pg-schema-diff@latest

Using CLI

1. Apply schema to fresh database

Create a directory to hold your schema files. Then, generate sql files and place them into a schema dir.

mkdir schema
echo "CREATE TABLE foobar (id int);" > schema/foobar.sql
echo "CREATE TABLE bar (id varchar(255), message TEXT NOT NULL);" > schema/bar.sql

Apply the schema to a fresh database. The connection string spec can be found here. Setting the PGPASSWORD env var will override any password set in the connection string and is recommended.

pg-schema-diff apply --dsn "postgres://postgres:postgres@localhost:5432/postgres" --schema-dir schema 

2. Updating schema

Update the SQL file(s)

echo "CREATE INDEX message_idx ON bar(message)" >> schema/bar.sql

Apply the schema. Any hazards in the generated plan must be approved

pg-schema-diff apply --dsn "postgres://postgres:postgres@localhost:5432/postgres" --schema-dir schema --allow-hazards INDEX_BUILD

Using Library

Docs to use the library can be found here. Check out the CLI for an example implementation with the library

1. Generating plan

// The tempDbFactory is used in plan generation to extract the new schema and validate the plan
tempDbFactory, err := tempdb.NewOnInstanceFactory(ctx, func(ctx context.Context, dbName string) (*sql.DB, error) {
    copiedConfig := connConfig.Copy()
    copiedConfig.Database = dbName
    return openDbWithPgxConfig(copiedConfig)
})
if err != nil {
    panic("Generating the TempDbFactory failed")
}
defer tempDbFactory.Close()
// Generate the migration plan
plan, err := diff.Generate(ctx, connPool, diff.DDLSchemaSource(ddl),
    diff.WithTempDbFactory(tempDbFactory),
    diff.WithDataPackNewTables(),
)
if err != nil {
    panic("Generating the plan failed")
}   

2. Applying plan

We leave plan application up to the user. For example, you might want to take out a session-level advisory lock if you are concerned about concurrent migrations on your database. You might also want a second user to approve the plan before applying it.

Example apply:

for _, stmt := range plan.Statements {
    if _, err := conn.ExecContext(ctx, fmt.Sprintf("SET SESSION statement_timeout = %d", stmt.Timeout.Milliseconds())); err != nil {
        panic(fmt.Sprintf("setting statement timeout: %s", err))
    }
    if _, err := conn.ExecContext(ctx, fmt.Sprintf("SET SESSION lock_timeout = %d", stmt.LockTimeout.Milliseconds())); err != nil {
        panic(fmt.Sprintf("setting lock timeout: %s", err))
    }
    if _, err := conn.ExecContext(ctx, stmt.ToSQL()); err != nil {
        panic(fmt.Sprintf("executing migration statement. the database maybe be in a dirty state: %s: %s", stmt, err))
    }
}

Supported Postgres versions

Supported: 14, 15, 16
Unsupported: <= 13 are not supported. Use at your own risk.

Unsupported migrations

An abridged list of unsupported migrations:

Contributing

This project is in its early stages. We appreciate all the feature/bug requests we receive, but we have limited cycles to review direct code contributions at this time. See Contributing to learn more.