supabase / cli

Supabase CLI. Manage postgres migrations, run Supabase locally, deploy edge functions. Postgres backups. Generating types from your database schema.
https://supabase.com/docs/reference/cli/about
MIT License
1.02k stars 201 forks source link

URGENT: Supabase DB Dump Corrupting Data Due to UNIQ #2632

Closed tmountain closed 1 month ago

tmountain commented 1 month ago

Describe the bug

UPDATE: supabase db dump is running uniq on its output which removed any contingous lines that are identical. This is corrupting output data and needs an urgent fix. (see my comments below).

When I run supabase db dump I get back a different function definition than is present on our production host. When I run pg_dump on the same host, I get the correct definition. If I log into the host via the Supabase UI, I see the correct function def. If I query the function def via SQL, I get the correct result:

SELECT proname, prosrc  FROM pg_proc
WHERE proname = 'handle_lesson_completion';

To Reproduce

  1. Dump the schema via pg dump dump
supabase db dump --db-url 'postgresql://postgres.<instance_id>:[Password Removed]@aws-0-us-east-1.pooler.supabase.com:6543/postgres' > prod.sql
  1. Dump the same schema via pg_dump
pg_dump 'postgresql://postgres.<instance_id>:[Password Removed]@aws-0-us-east-1.pooler.supabase.com:6543/postgres' --schema-only -f
  1. Review the handle_lesson_completion function. It is different between the two dumps.

Expected behavior

The functions should be the same.

Screenshots

Excerpt of correct function def:

Screenshot 2024-08-23 at 3 49 09 PM

Excerpt of incorrect function def:

Screenshot 2024-08-23 at 3 49 49 PM

System information Rerun the failing command with --create-ticket flag. No additional output is provided.

Additional context I can provide my supabase instance id if needed.

tmountain commented 1 month ago

I figured out the cause myself. This is a serious bug. If you do a supabase db dump --dry-run, you will see that it runs a series of sed commands to post-process the data that it generates from pg_dump.

pg_dump \
    --schema-only \
    --quote-all-identifier \
    --exclude-schema "_analytics|_realtime|_supavisor|auth|extensions|pgbouncer|realtime|storage|supabase_functions|supabase_migrations|cron|dbdev|graphql|graphql_public|net|pgsodium|pgsodium_masks|pgtle|repack|tiger|tiger_data|timescaledb_*|_timescaledb_*|topology|vault|information_schema|pg_*" \
     \
| sed -E 's/^CREATE SCHEMA "/CREATE SCHEMA IF NOT EXISTS "/' \
| sed -E 's/^CREATE TABLE "/CREATE TABLE IF NOT EXISTS "/' \
| sed -E 's/^CREATE SEQUENCE "/CREATE SEQUENCE IF NOT EXISTS "/' \
| sed -E 's/^CREATE VIEW "/CREATE OR REPLACE VIEW "/' \
| sed -E 's/^CREATE FUNCTION "/CREATE OR REPLACE FUNCTION "/' \
| sed -E 's/^CREATE TRIGGER "/CREATE OR REPLACE TRIGGER "/' \
| sed -E 's/^CREATE PUBLICATION "supabase_realtime"/-- &/' \
| sed -E 's/^CREATE EVENT TRIGGER /-- &/' \
| sed -E 's/^         WHEN TAG IN /-- &/' \
| sed -E 's/^   EXECUTE FUNCTION /-- &/' \
| sed -E 's/^ALTER EVENT TRIGGER /-- &/' \
| sed -E 's/^ALTER FOREIGN DATA WRAPPER (.+) OWNER TO /-- &/' \
| sed -E 's/^ALTER DEFAULT PRIVILEGES FOR ROLE "supabase_admin"/-- &/' \
| sed -E "s/^GRANT (.+) ON (.+) \"(_analytics|_realtime|_supavisor|auth|extensions|pgbouncer|realtime|storage|supabase_functions|supabase_migrations|cron|dbdev|graphql|graphql_public|net|pgsodium|pgsodium_masks|pgtle|repack|tiger|tiger_data|timescaledb_*|_timescaledb_*|topology|vault|information_schema|pg_*)\"/-- &/" \
| sed -E "s/^REVOKE (.+) ON (.+) \"(_analytics|_realtime|_supavisor|auth|extensions|pgbouncer|realtime|storage|supabase_functions|supabase_migrations|cron|dbdev|graphql|graphql_public|net|pgsodium|pgsodium_masks|pgtle|repack|tiger|tiger_data|timescaledb_*|_timescaledb_*|topology|vault|information_schema|pg_*)\"/-- &/" \
| sed -E 's/^(CREATE EXTENSION IF NOT EXISTS "pg_tle").+/\1;/' \
| sed -E 's/^COMMENT ON EXTENSION (.+)/-- &/' \
| sed -E 's/^CREATE POLICY "cron_job_/-- &/' \
| sed -E 's/^ALTER TABLE "cron"/-- &/' \
| sed -E "/^--/d" \
| uniq

The last command in this sequence is a uniq. This means that supabase db dump will REMOVE any two lines in a dump that are identical. In the case of my function, I have two values that are generated using identical statements.

                timezone('UTC'::text, NOW()),
                timezone('UTC'::text, NOW()),

This will come up very frequently in many databases, so at this moment supabase db dump is corrupting data .

The offending line of code can be found here.

tmountain commented 1 month ago

PR submitted here:

https://github.com/supabase/cli/pull/2633