Shopify / ghostferry

The swiss army knife of live data migrations
https://shopify.github.io/ghostferry
MIT License
747 stars 70 forks source link

Abort on DDL in binlog stream #320

Closed insom closed 1 year ago

insom commented 2 years ago

related to https://github.com/Shopify/ghostferry/issues/318

Co-authored-by: Pawan Dubey pawan.dubey@shopify.com

insom commented 2 years ago
Please report a bug if this causes problems.
Started with run options -n test_copy_data_with_alter_fails_part_way_through --seed 36007

TrivialIntegrationTests
  test_copy_data_with_alter_fails_part_way_through                PASS (2.02s)

Finished in 2.08223s
1 tests, 1 assertions, 0 failures, 0 errors, 0 skips

The ruby tests are failing CI due to timeouts, but I don't think it's because of changes we've introduced.

We've moved the integration test from the deprecated Go section to being written in Ruby. This causes the early termination of Ghostferry, which lets the test pass.

We've also rewritten things to use a smaller SQL parser (although it is still not a trivial dependency). Shiv and I have looked at the SQL parser landscape and a correct parser is simply going to be a large dependency (vs. doing something with regex which I strongly dislike).

In the future, now that we have a parser, I could see this being used to check if a DDL is happening to a table which is in the allow list / deny list, for example.

Alternatively, if we don't want to have this dependency at all, then I think action that's taken on QueryEvent could take a callback, and then other implementers who are using Ghostferry as a library could choose to do the parsing in their own code base?

shuhaowu commented 2 years ago

I'm okay with this dependency for now. I would make the DDL event a callback instead of erroring right here. As you mentioned, we can pass the event based on the filters passed to Ghostferry as well, but we can do that in a follow up PR if you would like.