stripe / pg-schema-diff

Go library for diffing Postgres schemas and generating SQL migrations
MIT License
364 stars 26 forks source link

New columns add are not detected #154

Closed bheemvennapureddy closed 2 months ago

bheemvennapureddy commented 2 months ago
Screenshot 2024-08-22 at 7 59 14 AM

I was trying to add a new column and the apply/plan doesn't detect the changes

bheemvennapureddy commented 2 months ago

@bplunkett-stripe Can you share some insights on this ?

bplunkett-stripe commented 2 months ago

Hmmm that's curious. It might be possible the change isn't being picked up by pg-schema-diff because the filepath looks pretty interesting. Here are a couple things worth trying in order:

Sorry for this!

bheemvennapureddy commented 2 months ago

Will give that a try. out for this weekend will Provide feedback on Monday CST thanks

bheemvennapureddy commented 2 months ago

Seems to be a odd behaviour

One with dbo schema and one without dbo schema

Screenshot 2024-08-24 at 3 57 49 PM Screenshot 2024-08-24 at 3 59 00 PM Screenshot 2024-08-24 at 4 00 38 PM
bplunkett-stripe commented 2 months ago

Ahhh so I see the problem. It looks like you haven't declaratively defined your "dbo" schema, so it will totally drop it (and all objects contained within it). You need to have

CREATE SCHEMA dbo;

in your sql.

  1. When your desired scehma contains dbo objects, the schema can't be extracted. pg-schema-diff tried to create the target schema on a fresh database, i.e., execute(your sql), but your sql isn't valid because the dbo schema isn't created within it.
  2. When you only have the public.Channel, all existing dbo objects are dropped because they are not within your desired schema.

I'm a little confused what changed between the first example you gave (no diff found) and the next ones. My explanations only apply to your most recent reply and the first. Let me know if CREATE SCHEMA dbo; works for you, but I am still a bit curious about that No plant generated error seen initially.

bheemvennapureddy commented 2 months ago

@bplunkett-stripe Can you elaborate what you mean i haven't declaratively defined dbo schema ? only change i did was moved the scripts out of directory without periods. When the schema already exists, why does it need a declaration ?

Navbryce commented 2 months ago

If you don't declare your 'dbo' schema, then you are telling pg-schema-diff that it shouldn't exist. If it's NOT in your schema SQL AND it exists in your database, then you are implicitly stating it should be deleted.

This is analogous to having some schema 'foobar' that you want to delete. You would remove it from your schema SQL.

bheemvennapureddy commented 2 months ago

Seems annoying if we go that route - but if you look at already existing prod implementations, most of the time schemas are already created all we do is add change to existing schema. So you are saying there should be a script saying CREATE SCHEMA X IF NOT EXISTS ? What if someone is migrating to schema with an existing schema or someone have already created tables, and now they are switching to this ?

Navbryce commented 2 months ago

All you need to include is 'CREATE SCHEMA dbo' in your SQL.

This is required because how is pg-schema-diff supposed to differentiate between a schema you want to keep around and a schema you want to delete?

Moreover, it encourages best "declarative" practices in the sense that you can run your schema SQL on a fresh database and get the schema you want, i.e. for tests.

bheemvennapureddy commented 2 months ago

I was going to ask for a support for PRE and POST scripts implementation post this issue is resolved which will help for local setup and seed scripts etc.

Navbryce commented 2 months ago

Where do your concerns lie with just including "CREATE schema" in your SQL?

It ensures your schema is truly declarative AND that you can run it on clean databases.

bheemvennapureddy commented 2 months ago

Makes sense, nothing actually - will that not error out when the schema already exists ? And how can we ignore other schemas ?

Navbryce commented 2 months ago

It will not error out. Pg-schema-diff just runs it on a fresh Postgres database to extract the schema you are trying to reach.

As long as it can run on an empty database, your schema SQL is always valid. It doesn't need to be idempotent.

If you want to totally ignore schemas, there is an '--exclude-schema' flag (on mobile right now, so it might be slightly different). Sorry!

bheemvennapureddy commented 2 months ago

Gotcha. How do we handle seed scripts if we have to add seed data for tests should that be done outside of this ? I mean for test environments

Navbryce commented 2 months ago

Yep, seed data should be handled independently from the declared schema! If you insert data in your schema, pg-schema-diff will totally ignore it and the data won't actually be inserted into your database.

For SQL schemas, the lib basically spins up a new db, runs your schema sql from it, reads from the system tables to extract the target schema, and spins down the temp database. So your schema SQL only needs to be valid SQL on a fresh database and anything else you do within that SQL will effectively be ignored.

Navbryce commented 2 months ago

Let me know if updating the SQL schema with "CREATE SCHEMA" works, and I'll mark the issue as done.

I think one takeaway is we probably need to improve our documentation/examples around declarative schemas.

bheemvennapureddy commented 2 months ago

Sure I will try that. Also how about supporting folders with periods in it ?

bheemvennapureddy commented 2 months ago

Weirdly the path with periods work too, not sure what happened the first time.

Navbryce commented 2 months ago

Nice! Everything seems to be working?

bheemvennapureddy commented 2 months ago

@Navbryce not really look at the below logs for brand new database

➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗ psql -U postgres -h localhost -d webanalytics
Password for user postgres:
psql (14.12 (Homebrew), server 16.4 (Debian 16.4-1.pgdg120+1))
WARNING: psql major version 14, server major version 16.
         Some psql features might not work.
Type "help" for help.

webanalytics=# /d+
webanalytics-# /dn+
webanalytics-# \dn
      List of schemas
  Name  |       Owner
--------+-------------------
 public | pg_database_owner
(1 row)

webanalytics-# exit
Use \q to quit.
webanalytics-# \q
➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗ cat test.db/Scripts/Create/script000_dbo.sql
-- DROP SCHEMA IF EXISTS dbo CASCADE;
CREATE SCHEMA IF NOT EXISTS dbo;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";%                                                                                                                                                                        ➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗ cat test.db/Scripts/Create/script001_channel.sql
-- Table: dbo.channel

CREATE TABLE dbo."Channel"
(
    "CHANNEL_KEY" INTEGER NOT NULL,
    "NAME" VARCHAR(20) NOT NULL,
    CONSTRAINT pk_channel PRIMARY KEY ("CHANNEL_KEY")
);%
➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗ pg-schema-diff plan --dsn "postgresql://postgres:mysecretpassword@localhost:5432/webanalytics" --schema-dir test.db/ --include-schema dbo
Schema matches expected. No plan generated
➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗
bheemvennapureddy commented 2 months ago

looks like a folder problem - it couldn't get through all the child folders

➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗ pg-schema-diff plan --dsn "postgresql://postgres:mysecretpassword@localhost:5432/webanalytics" --schema-dir test.db/Scripts/Create --include-schema dbo

################################ Generated plan ################################
1. CREATE SCHEMA "dbo";
    -- Statement Timeout: 3s

2. CREATE TABLE "dbo"."Channel" (
    "CHANNEL_KEY" integer NOT NULL,
    "NAME" character varying(20) COLLATE "pg_catalog"."default" NOT NULL
);
    -- Statement Timeout: 3s

3. CREATE UNIQUE INDEX CONCURRENTLY pk_channel ON dbo."Channel" USING btree ("CHANNEL_KEY");
    -- Statement Timeout: 20m0s
    -- Lock Timeout: 3s

4. ALTER TABLE "dbo"."Channel" ADD CONSTRAINT "pk_channel" PRIMARY KEY USING INDEX "pk_channel";
    -- Statement Timeout: 3s
➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗ pg-schema-diff plan --dsn "postgresql://postgres:mysecretpassword@localhost:5432/webanalytics" --schema-dir test.db/Scripts/ --include-schema dbo
Schema matches expected. No plan generated
➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗
bheemvennapureddy commented 2 months ago

Where do the plan file go ? Where does the apply command grab the plan from ?

bplunkett-stripe commented 2 months ago

Interesting with the folder problem! Let me look into that! That's definitely a bug.

Where do the plan file go ? Where does the apply command grab the plan from ?

The apply plan just regenerates the plan. If you wanted, you could hypothetically persist the plan yourself in a file -- directly call into the lib -- and then build your own little CLI. Alternatively, I think it might be a cool upstream change to have the ability to manually persist plans.

bheemvennapureddy commented 2 months ago

@bplunkett-stripe Agree it would help lot of dev's to see hey what plan file looks like - mainly for us that's a habit because we generate a create script for our SQL server changes using dacpac file and create script is what we deploy to prod as well it will handy for local development rather than just look at the verbose logs

bheemvennapureddy commented 2 months ago

@bplunkett-stripe Saw one more issue , if you look at the outcome here it doesn't pickup the create extension

looks like a folder problem - it couldn't get through all the child folders

➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗ pg-schema-diff plan --dsn "postgresql://postgres:mysecretpassword@localhost:5432/webanalytics" --schema-dir test.db/Scripts/Create --include-schema dbo

################################ Generated plan ################################
1. CREATE SCHEMA "dbo";
  -- Statement Timeout: 3s

2. CREATE TABLE "dbo"."Channel" (
  "CHANNEL_KEY" integer NOT NULL,
  "NAME" character varying(20) COLLATE "pg_catalog"."default" NOT NULL
);
  -- Statement Timeout: 3s

3. CREATE UNIQUE INDEX CONCURRENTLY pk_channel ON dbo."Channel" USING btree ("CHANNEL_KEY");
  -- Statement Timeout: 20m0s
  -- Lock Timeout: 3s

4. ALTER TABLE "dbo"."Channel" ADD CONSTRAINT "pk_channel" PRIMARY KEY USING INDEX "pk_channel";
  -- Statement Timeout: 3s
➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗ pg-schema-diff plan --dsn "postgresql://postgres:mysecretpassword@localhost:5432/webanalytics" --schema-dir test.db/Scripts/ --include-schema dbo
Schema matches expected. No plan generated
➜  WebAnalytics.Database.Postgres git:(Add_postgres_db_migrations) ✗
bplunkett-stripe commented 2 months ago

Does your database already have the extension enabled? It wouldn't pick it up if the extension has already been created. Also your SQL doesn't need to include IF NOT EXISTS if you do not want to include it. pg-schema-diff always runs that SQL on a fresh, empty database.

bplunkett-stripe commented 2 months ago

I cut #155 for the directory problem

bheemvennapureddy commented 2 months ago

Does your database already have the extension enabled? It wouldn't pick it up if the extension has already been created. Also your SQL doesn't need to include IF NOT EXISTS if you do not want to include it. pg-schema-diff always runs that SQL on a fresh, empty database.

That was entirely a fresh db and extension doesn't exist

bplunkett-stripe commented 2 months ago

What does:

SELECT
    ext.oid,
    ext.extname::TEXT AS extension_name,
    ext.extversion AS extension_version,
    extension_namespace.nspname::TEXT AS schema_name
FROM pg_catalog.pg_namespace AS extension_namespace
INNER JOIN
    pg_catalog.pg_extension AS ext
    ON ext.extnamespace = extension_namespace.oid
WHERE
    extension_namespace.nspname NOT IN ('pg_catalog', 'information_schema')
    AND extension_namespace.nspname !~ '^pg_toast'
    AND extension_namespace.nspname !~ '^pg_temp';

return?

Also

\dx

in psql would be useful!

I'm hoping this is something trivial...like it is installed by default somewhere.

bheemvennapureddy commented 2 months ago

SELECT ext.oid, ext.extname::TEXT AS extension_name, ext.extversion AS extension_version, extension_namespace.nspname::TEXT AS schema_name FROM pg_catalog.pg_namespace AS extension_namespace INNER JOIN pg_catalog.pg_extension AS ext ON ext.extnamespace = extension_namespace.oid WHERE extension_namespace.nspname NOT IN ('pg_catalog', 'information_schema') AND extension_namespace.nspname !~ '^pg_toast' AND extension_namespace.nspname !~ '^pg_temp';

Screenshot 2024-08-26 at 1 01 05 PM
bheemvennapureddy commented 2 months ago

Complete log for details

    -- Statement Timeout: 3s
➜  WebAnalytics git:(Add_postgres_db_migrations) ✗ psql -U postgres -h localhost -d webanalytics
Password for user postgres:
psql (14.12 (Homebrew), server 16.4 (Debian 16.4-1.pgdg120+1))
WARNING: psql major version 14, server major version 16.
         Some psql features might not work.
Type "help" for help.

webanalytics=# \dx
                 List of installed extensions
  Name   | Version |   Schema   |         Description
---------+---------+------------+------------------------------
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
(1 row)

webanalytics=# SELECT
    ext.oid,
    ext.extname::TEXT AS extension_name,
    ext.extversion AS extension_version,
    extension_namespace.nspname::TEXT AS schema_name
FROM pg_catalog.pg_namespace AS extension_namespace
INNER JOIN
    pg_catalog.pg_extension AS ext
    ON ext.extnamespace = extension_namespace.oid
WHERE
    extension_namespace.nspname NOT IN ('pg_catalog', 'information_schema')
    AND extension_namespace.nspname !~ '^pg_toast'
    AND extension_namespace.nspname !~ '^pg_temp';
 oid | extension_name | extension_version | schema_name
-----+----------------+-------------------+-------------
(0 rows)

webanalytics=# \q
➜  WebAnalytics git:(Add_postgres_db_migrations) ✗ pg-schema-diff plan --dsn "postgresql://postgres:mysecretpassword@localhost:5432/webanalytics" --schema-dir WebAnalytics.Database.Postgres/test.db/Scripts/Create --include-schema dbo

################################ Generated plan ################################
1. CREATE SCHEMA "dbo";
    -- Statement Timeout: 3s

2. CREATE TABLE "dbo"."Channel" (
    "CHANNEL_KEY" integer NOT NULL,
    "NAME" character varying(20) COLLATE "pg_catalog"."default" NOT NULL
);
    -- Statement Timeout: 3s

3. CREATE UNIQUE INDEX CONCURRENTLY pk_channel ON dbo."Channel" USING btree ("CHANNEL_KEY");
    -- Statement Timeout: 20m0s
    -- Lock Timeout: 3s

4. ALTER TABLE "dbo"."Channel" ADD CONSTRAINT "pk_channel" PRIMARY KEY USING INDEX "pk_channel";
    -- Statement Timeout: 3s
➜  WebAnalytics git:(Add_postgres_db_migrations) ✗
bplunkett-stripe commented 2 months ago

AHhh I see the problem. So --include-schema says "only diff this schema". Your create extension doesn't specify a schema, so it defaults to 'public', which we don't bother diffing because your --include-schema implicitly says to ignore it.

You can either (any of the following):

  1. --include-schema dbo'--include-schema public'
  2. Install the extension on dbo
  3. OR, just remove the flag entirely such that all schemas are diffed
bheemvennapureddy commented 2 months ago
➜  WebAnalytics git:(Add_postgres_db_migrations) ✗ pg-schema-diff plan --dsn "postgresql://postgres:mysecretpassword@localhost:5432/webanalytics" --schema-dir WebAnalytics.Database.Postgres/test.db/Scripts/Create --include-schema dbo

################################ Generated plan ################################
1. CREATE SCHEMA "dbo";
    -- Statement Timeout: 3s

2. CREATE EXTENSION "uuid-ossp" WITH SCHEMA "dbo" VERSION "1.1";
    -- Statement Timeout: 3s

3. CREATE TABLE "dbo"."Channel" (
    "CHANNEL_KEY" integer NOT NULL,
    "NAME" character varying(20) COLLATE "pg_catalog"."default" NOT NULL
);
    -- Statement Timeout: 3s

4. CREATE UNIQUE INDEX CONCURRENTLY pk_channel ON dbo."Channel" USING btree ("CHANNEL_KEY");
    -- Statement Timeout: 20m0s
    -- Lock Timeout: 3s

5. ALTER TABLE "dbo"."Channel" ADD CONSTRAINT "pk_channel" PRIMARY KEY USING INDEX "pk_channel";
    -- Statement Timeout: 3s
➜  WebAnalytics git:(Add_postgres_db_migrations) ✗

Ah that makes sense thanks. Appreciate your feedback @bplunkett-stripe

bplunkett-stripe commented 2 months ago

No problem! I feel like this indicates some sort of documentation gap on our end, like a FAQ/troubleshooting doc 🤔