Shopify / ghostferry

The swiss army knife of live data migrations
https://shopify.github.io/ghostferry
MIT License
693 stars 65 forks source link

Support tables with foreign key constraints #161

Open kolbitsch-lastline opened 4 years ago

kolbitsch-lastline commented 4 years ago

Ghostferry currently documents that it cannot work with foreign-key constraints (FKCs).

Intuitively this makes sense for two reasons:

However: mysql allows disabling foreign-key constraint checks on a per-session basis, and it does not re-validate constraints when this is disabled. As a result, we may temporarily disable constraint enforcement until the database is back in a consistent state. The only issue that does arise is that tables must be created in an order that satisfies their inter-dependencies.

The golang sql mysql driver even allows disabling constraints on a DB connection using a simple configuration change, making support even easier.

I have a working version of the above table creation change and was curious if you guys think it's a useful addition to the ghostferry-copydb tool. I understand it's somewhat hack-ish, but it could be useful in many scenarios where ghostferry cannot be used today.

Also: am I overlooking something with my assumption that disabling FKCs during the copy process is a problem?

shuhaowu commented 4 years ago

See https://github.com/github/gh-ost/issues/507.

We have no plans to add this as we don't use FK in our tables. We would like to avoid changes to the core library to support FK. I'm also not very familiar with what the implications of FK is in terms of data consistency and do not have the spare time to review its implications. We do not want to have a ghostferry "mainline" application on master that performs something we don't know for sure to be safe.

That said, you could fork copydb into a separate application that handles FK. We could put it into a "staging" directory in the main ghostferry repo so people can try it at their own risk.

kolbitsch-lastline commented 4 years ago

We have no plans to add this as we don't use FK in our tables. We would like to avoid changes to the core library to support FK. I'm also not very familiar with what the implications of FK is in terms of data consistency and do not have the spare time to review its implications. We do not want to have a ghostferry "mainline" application on master that performs something we don't know for sure to be safe.

that makes sense, but please note that the change is entirely in the database/table creation by introducing an order. Thus, no changes in the main pipeline are really needed and any breakage would have very limited impact (would be detected right away)

That said, you could fork copydb into a separate application that handles FK. We could put it into a "staging" directory in the main ghostferry repo so people can try it at their own risk.

the change is specific to the table creation and thus core to the setup. We could have 2 Ferry.CreateDatabasesAndTables methods, but it would mostly be a duplication of the methods

anything that was extractable, I already moved to the utils helpers/file.

shuhaowu commented 4 years ago

So you're suggesting that to handle foreign key, Ghostferry will create tables in a safe order and then the user has to disable FK constraint checking for the entire database manually? Having code that refers to FK in the code base may mislead the user into thinking that Ghostferry can natively do FK, which I'm not comfortable with doing at this moment.

That said, we're still interested in resolving this problem in a semi-official manner in the mainline Ghostferry code base. Thus, would this work: Ghostferry-copydb can take a list of tables in its configuration, and it will create those tables with the order specified in the configuration in start up. This way, you can create the ordered list of tables in the script that invokes Ghostferry and thus we need very minimal modifications to Ghostferry. We can document this as a way to deal with FK constraints as an "at your own risk" kind of thing.

kolbitsch-lastline commented 4 years ago

So you're suggesting that to handle foreign key, Ghostferry will create tables in a safe order and then the user has to disable FK constraint checking for the entire database manually?

no, it's a connection/per-session setting.

Having code that refers to FK in the code base may mislead the user into thinking that Ghostferry can natively do FK, which I'm not comfortable with doing at this moment.

fair point

That said, we're still interested in resolving this problem in a semi-official manner in the mainline Ghostferry code base. Thus, would this work: Ghostferry-copydb can take a list of tables in its configuration, and it will create those tables with the order specified in the configuration in start up. This way, you can create the ordered list of tables in the script that invokes Ghostferry and thus we need very minimal modifications to Ghostferry. We can document this as a way to deal with FK constraints as an "at your own risk" kind of thing.

up to you. I can delete the logic that infers this ordering automatically. But note that you don't need to document that we support FKCs just because of this. Merging the patch doesn't do anything beyond reordering table creation (as you suggest - just does it automatically). Why this happens doesn't need to be a big deal, but it's there if we ever need it. But I get your point

shuhaowu commented 4 years ago

I would prefer if the table creation ordering is done externally and passed into copydb only. It should be a relatively straight forward PR that we can review and merge quickly.

shuhaowu commented 4 years ago

Can we close this due to the merge of https://github.com/Shopify/ghostferry/pull/166