Shopify / ghostferry

The swiss army knife of live data migrations
https://shopify.github.io/ghostferry
MIT License
694 stars 65 forks source link

Problems with migrating data with foreign key constraints. #289

Open Manan007224 opened 3 years ago

Manan007224 commented 3 years ago

The main reason behind migrating tables with foreign key constraints is that they can have referential actions for a foreign key such as ON DELETE CASCADE, ON UPDATE CASCADE.

Cascading deletes and updates in child tablees caused by foreign key constraints don't show up in binlogs because these referential actions are dealt internally by InnoDB. Check out this link - https://dba.stackexchange.com/questions/51873/replication-binary-log-parsingtableid-generation-on-delete-cascade-handling.

Let's take a look at this sql example on a source database :-

mysql> SET FOREIGN_KEY_CHECKS=0;
Query OK, 0 rows affected (0.00 sec)

mysql> use gf_test;
Database changed
mysql> CREATE TABLE IF NOT EXISTS t1 (id1 bigint(20), primary key(id1));
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> CREATE TABLE IF NOT EXISTS t2 (id2 bigint(20), primary key(id2), CONSTRAINT fkc2 foreign key(id2) REFERENCES t1(id1) ON DELETE CASCADE ON UPDATE CASCADE);
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> INSERT INTO t1 VALUES (1);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO t2 VALUES (1);
Query OK, 1 row affected (0.01 sec)

mysql> UPDATE t1 SET id1 = 2 WHERE id1 = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> SELECT * FROM t1;
+-----+
| id1 |
+-----+
|   2 |
+-----+
1 row in set (0.00 sec)

mysql> SELECT * FROM t2;
+-----+
| id2 |
+-----+
|   1 |
+-----+
1 row in set (0.00 sec)

We can clearly see here that since foreign key checks are disabled on the database updating the parent table does not change the foreign keys in child tables.

In terms of ghostferry migration when the rows are updated or deleted from the parent table in source database cascade won't update,delete rows in the child table if they're already present in the target database, eventually leading the database in an inconsistent state.