yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
9.06k stars 1.09k forks source link

[YSQL] How To Migrate From Yugabyte to Regular Postgres? #13407

Open pchang388 opened 2 years ago

pchang388 commented 2 years ago

Jira Link: DB-3036

Description

Platform: Kubernetes via Helm Version: yugabyte:2.12.7.0-b27

There is a bunch of docs on how to migrate from postgres to yugabyte it appears but nothing about migrating from yugabyte to postgres.

When I try to migrate a single yugabyte db to postgres instance, it always fails with errors like:

invalid command \
invalid command \
invalid command \
invalid command \
...
...
...
ERROR:  relation "public.REDACT_rules" does not exist
ERROR:  syntax error at or near "("
LINE 11:     CONSTRAINT "REDACT_pkey" PRIMARY KEY(("id") HASH...
...
...
...
ERROR:  syntax error at or near "index_kafka_REDACT_135e8f508552f4b_hekahlef"
LINE 1: index_kafka_REDACT_135e8f508552f4b_hekahlef 2022-04-01T11:40:...

I am using the ysql_dump tool and already tried with the below flags:

/home/yugabyte/postgres/bin/ysql_dump -d REDACT -h yb-tservers.yugabyte  > /tmp/dev_yugabyte_psql-ha_migrate_20220722.sql

/home/yugabyte/postgres/bin/ysql_dump --quote-all-identifiers -d REDACT-h yb-tservers.yugabyte > /tmp/dev_yugabyte_psql-ha_migrate_20220722_psql.sql

/home/yugabyte/postgres/bin/ysql_dump --inserts -d REDACT -h yb-tservers.yugabyte > /tmp/dev_yugabyte_psql-ha_migrate_20220722_psql.sql

If we wanted to migrate a database back to postgres, what is the expectation? Any help would be appreciated

pchang388 commented 2 years ago

I figured it out and I've learned that it will be different process depending on how you set up things in your tables/indexes/etc.

For us I had to the following:

  1. Generate dump for a specific database using command, no flags required at our current version of Postgres. If your Postgres version supports COPY commands then use that for sure, insert has a huge performance hit.

    /home/yugabyte/postgres/bin/ysql_dump -d REDACT-h yb-tservers.yugabyte  > /tmp/dev_yugabyte_psql-ha_migrate_20220722.sql
  2. Modify the file in place, files can be super large so recommended to use something low level (vim/sed/etc.) and not an IDE/Code editor:

  3. remove HASH keyword

  4. fix CONSTRAINT PRIMARY_KEY() - single parenthesis not double in CREATE table statements

  5. remove lsm index type - Postgres by default uses b-tree and doesn't have lsm. If not lsm, make sure it matches one of the types of Postgres index types: btree, hash, gist, spgist, gin, and brin. You can also just remove it like I did and psql will use btree by default

These will be in the statements that look like: CREATE INDEX REDACT ON public.REDACT USING lsm (task_id HASH);

either change it to your preferred index type or remove USING part (use default btree then) so it looks like: CREATE INDEX REDACT ON public.REDACT (task_id);

pchang388 commented 2 years ago

Question now is why is this not documented in yugabyte docs, can we add this?

ddorian commented 2 years ago

You do it in 2 stpes:

  1. Dump only the schema, and it's easier to edit the file manually
  2. Dump only the data and it should work out

Another way would be to try with pg_dump, the original binary from PostgreSQL, that should dump without our special keywords like lsm, hash etc.

pchang388 commented 2 years ago

You do it in 2 stpes:

  1. Dump only the schema, and it's easier to edit the file manually
  2. Dump only the data and it should work out

Another way would be to try with pg_dump, the original binary from PostgreSQL, that should dump without our special keywords like lsm, hash etc.

Hey @ddorian, thanks for the response. I tried with your advice on using pg_dump instead of ysql_dump. This did not work, for my use case, there is still HASH and lsm keywords being used:

Command to reproduce, export ysql database with pgdump remotely:

pg_dump -d druid_stg -h REDACT -p 5433 -U yugabyte -W > /tmp/stg_yugabyte_psql-ha_migrate_20220726_pgdump.sql

Import attempt/message:

...
...
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ERROR:  access method "lsm" does not exist
# sample CREATE that shows keywords still
CREATE INDEX idx_tasklocks_task_id ON public.tasklocks USING lsm (task_id HASH);
...
grep -i lsm stg_yugabyte_psql-ha_migrate_20220726_pgdump.sql | wc -l
7
grep HASH stg_yugabyte_psql-ha_migrate_20220726_pgdump.sql | wc -l
7
pchang388 commented 2 years ago

As stated this is not a blocker for me though since I was able to workaround this by find/replace (via sed) the keywords. But I do think this should be documented in yugabyte docs somewhere in a clean manner or even new scripts be added to accomplish the keyword deletions. I also hope this helps someone in the future if they stumble upon this and the docs are not up to date yet.

pablopla commented 1 year ago

Any update on this? We would like to experiment with Yugabyte but have the option to return to PostgreSQL in case we are facing issues.

ddorian commented 1 year ago

@pablopla can you do my suggestion in https://github.com/yugabyte/yugabyte-db/issues/13407#issuecomment-1193727643 ?

pablopla commented 1 year ago

@ddorian I didn't try dump/restore yet. I just want to verify that it's officially supported without manual steps and hacks.

ddorian commented 1 year ago

I didn't try dump/restore yet. I just want to verify that it's officially supported without manual steps and hacks.

@pablopla It's supported without hacks but with small manual steps.

pablopla commented 1 year ago

Can't we add a flag to ysql_dump to create a postgres compatible format that won't require manual steps?

ddorian commented 1 year ago

Can't we add a flag to ysql_dump to create a postgres compatible format that won't require manual steps?

It's not "just a flag", it needs coding,testing,reviewing,etc, including when we add new features. And there's not a lot of requests for it as of now.

FranckPachot commented 1 year ago

I think this should be implemented in PostgreSQL rather than YugabyteDB, to have pg_dump skip the non-standard access method (lsmtree here) and index options (hash here). You can have the same for any extension that implements alternative storage.