github / gh-ost

GitHub's Online Schema-migration Tool for MySQL
MIT License
12.31k stars 1.25k forks source link

Fix binlog end log pos over lost data #1367

Closed shaohk closed 6 months ago

shaohk commented 8 months ago

Related issue: https://github.com/github/gh-ost/issues/1366

Description

This PR [Fix binlog end log pos over lost data]

In case this PR introduced Go code changes:

dikang123 commented 8 months ago

when will this PR be merged?

timvaillancourt commented 8 months ago

@shaohk thanks a lot for this PR, this sounds like a difficult problem to debug!

Can you think of any ways that this check can be tested in go unit tests? I wonder if a test could:

  1. Test that a normal *replication.RowsEvent event will be accepted
  2. Test that an oversized *replication.RowsEvent event it not accepted

Luckily, handleRowEvents doesn't need a real MySQL server as it pushes events to entriesChannel chan<- *BinlogEntry, which should make this relatively simple to test

shaohk commented 7 months ago

@shaohk thanks a lot for this PR, this sounds like a difficult problem to debug!

Can you think of any ways that this check can be tested in go unit tests? I wonder if a test could:

  1. Test that a normal *replication.RowsEvent event will be accepted
  2. Test that an oversized *replication.RowsEvent event it not accepted

Luckily, handleRowEvents doesn't need a real MySQL server as it pushes events to entriesChannel chan<- *BinlogEntry, which should make this relatively simple to test

@timvaillancourt Thank you for reviewing this PR. I have added unit tests as requested. Could you please take a look again? Thank you for your hard work.

meiji163 commented 7 months ago

@shaohk Can you take a look at the failing tests and linter?

shaohk commented 7 months ago

@shaohk Can you take a look at the failing tests and linter?

@meiji163 OK, Please take a look again, Thks.

meiji163 commented 6 months ago

LGTM, can you take another look @timvaillancourt?

timvaillancourt commented 6 months ago

LGTM, can you take another look @timvaillancourt?

@meiji163 / @shaohk: I think erroring here is the easiest next step, so LGTM 👍

But longer term, what do we "want" gh-ost to do here? Or to put it differently, what does the MySQL replica do in this scenario? I'm assuming it doesn't error right? 🤔

meiji163 commented 6 months ago

I'm not sure how MySQL handles it I found this bug report from last year https://bugs.mysql.com/bug.php?id=112189

shaohk commented 6 months ago

LGTM, can you take another look @timvaillancourt?

@meiji163 / @shaohk: I think erroring here is the easiest next step, so LGTM 👍

But longer term, what do we "want" gh-ost to do here? Or to put it differently, what does the MySQL replica do in this scenario? I'm assuming it doesn't error right? 🤔

https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#sysvar_max_binlog_cache_size.

When [gtid_mode](https://dev.mysql.com/doc/refman/8.0/en/replication-options-gtids.html#sysvar_gtid_mode) is not ON, the maximum recommended value is 4GB, due to the fact that, in this case, MySQL cannot work with binary log positions greater than 4GB; when gtid_mode is ON, this limitation does not apply, and the server can work with binary log positions of arbitrary size.

So I believe that MySQL master-slave replication is based on the GTID mode, so there is no problem with MySQL master-slave replication. The gh-ost tool for synchronizing binlogs uses a non-GTID mode, which is why this issue arises.

timvaillancourt commented 6 months ago

So I believe that MySQL master-slave replication is based on the GTID mode, so there is no problem with MySQL master-slave replication. The gh-ost tool for synchronizing binlogs uses a non-GTID mode, which is why this issue arises.

@shaohk great, thanks for clarifying! I made an attempt to add GTID support to gh-ost long ago and I may revive this effort for many reasons. This adds a new reason to the pile