dimitri / pgloader

Migrate to PostgreSQL in a single command!
http://pgloader.io
Other
5.45k stars 548 forks source link

Test migration performance with MySQL 5.7 production db #1520

Open 23tux opened 1 year ago

23tux commented 1 year ago

I'm on an RDS instance with MySQL 5.7 and I did already a lot of testing restoring snapshots to other RDS instances.

One thing I noticed is, when I have a completely new and cold RDS instance that was just restored from a snapshot, the whole import takes about 2.5 hours. But when I re-run the import, it takes about 30 minutes. My guess is, that MySQL uses some caches that are filled during the first import and can be used during the second import.

Now, to have a better idea how long the actual migration from the production DB is going to be, I want to test an import in production while the app is still running. The final import will be done with some downtime to have a clean cut, but I worry about possible problems when running pgloader on a production DB while it is still in use:

  1. Is there any locking going on when pgloader does it's work, that could prevent writes or reads?
  2. Does anyone have an idea, how much performance hit the DB will take with the pgloader file I attached below? I will monitor burst balances and Read IOPS and abort if necessary
  3. Just so that I have asked, I can't imagine otherwise: pgloader only READS from the FROM table, and does no writing in it, correct? I'm still anxious that I corrupt data in my production db
  4. Are there any other side-effects that I'm unaware of?
# pgloader.load
LOAD DATABASE
  FROM      mysql://root:root@mysql-db/production
  INTO postgresql://postgres:changeme@postgres-db/production?sslmode=allow

  CAST type json to jsonb
  WITH prefetch rows = 100,
       preserve index names

  BEFORE LOAD DO
    $$ create schema if not exists production; $$;