go-mysql-org / go-mysql

a powerful mysql toolset with Go
MIT License
4.59k stars, 982 forks

Support for mysql MIXED binary log format #897

Open alarbada opened 3 months ago

alarbada commented 3 months ago

The default format in MySQL is ROW, but it would be nice if the library could also support the "MIXED" format. My use case is developing a Conduit connector for MySQL.

Is there a specific reason why only the "ROW" format is supported?

lance6716 commented 3 months ago

As a user of this library, I think ROW-based replication is safer, but I haven't spent enough time trying the MIXED format. Discussion is welcome; I may respond later.

The trade-offs between ROW and STATEMENT are described here: https://dev.mysql.com/doc/refman/8.4/en/replication-sbr-rbr.html

atercattus commented 3 months ago

Hello, @alarbada. Statement-based and mixed formats require you to parse and execute the SQL queries from the binlog yourself. As a library, we can provide the SQL query read from the binlog, but that's it. The rest of the work will be on you.

If this is what you need, then we can do it. If you are expecting an API like the one for row-based events, that will not work, because the library knows nothing about the data in the database at the time it parses a statement-based or mixed-format event.
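To illustrate the difference described above, here is a minimal sketch of what a consumer can learn from each kind of change without querying the database. `RowChange`, `StatementChange`, and `Describe` are hypothetical stand-ins invented for this example; they are not go-mysql types or functions.

```go
package main

import "fmt"

// RowChange is a hypothetical stand-in for a row-based change:
// the values before and after the change are carried in the event.
type RowChange struct {
	Table  string
	Before []any
	After  []any
}

// StatementChange is a hypothetical stand-in for a statement-based change:
// only the schema name and the SQL text are available.
type StatementChange struct {
	Schema string
	Query  string
}

// Describe reports what can be known from the change alone.
func Describe(change any) string {
	switch c := change.(type) {
	case RowChange:
		return fmt.Sprintf("row in %s: %v -> %v", c.Table, c.Before, c.After)
	case StatementChange:
		// The affected rows depend on the table contents at execution time,
		// which a binlog parser does not have.
		return fmt.Sprintf("statement in %s: %s (affected rows unknown)", c.Schema, c.Query)
	default:
		return "unsupported change"
	}
}

func main() {
	fmt.Println(Describe(RowChange{Table: "users", Before: []any{1, "ann"}, After: []any{1, "anna"}}))
	fmt.Println(Describe(StatementChange{Schema: "app", Query: "UPDATE users SET name = 'anna' WHERE id = 1"}))
}
```

With a statement, the consumer has to parse the SQL and run it against its own copy of the data to find out which rows changed.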

dveeden commented 3 months ago

From https://dev.mysql.com/doc/refman/8.4/en/replication-options-binary-log.html#sysvar_binlog_format:

"binlog_format is deprecated, and subject to removal in a future version of MySQL. This implies that support for logging formats other than row-based is also subject to removal in a future release. Thus, only row-based logging should be employed for any new MySQL Replication setups. "

This doesn't mean I'm against supporting it, but it does mean the focus should be the row-based format.

Note that for statement-based replication you need more than just the query, as many queries need context to produce the same result on the target, e.g. RAND() needs rand_seed1 and rand_seed2 to be set.
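As a rough sketch of the RAND() case, a consumer replaying statements would have to restore the seeds before running the query. `StatementContext` and `ReplayStatements` are hypothetical names made up for this example, not part of go-mysql, and the sketch assumes the seeds have already been read from the binlog.

```go
package main

import "fmt"

// StatementContext is a hypothetical container for session state that was
// logged alongside a statement. It is not a go-mysql type.
type StatementContext struct {
	HasRandSeed bool
	RandSeed1   uint64
	RandSeed2   uint64
}

// ReplayStatements returns the statements to run on the target, in order:
// first any context that must be restored, then the query itself.
func ReplayStatements(ctx StatementContext, query string) []string {
	var out []string
	if ctx.HasRandSeed {
		// rand_seed1 and rand_seed2 are session variables that seed RAND().
		out = append(out, fmt.Sprintf("SET @@RAND_SEED1=%d, @@RAND_SEED2=%d", ctx.RandSeed1, ctx.RandSeed2))
	}
	return append(out, query)
}

func main() {
	ctx := StatementContext{HasRandSeed: true, RandSeed1: 1, RandSeed2: 2}
	for _, stmt := range ReplayStatements(ctx, "INSERT INTO t (v) VALUES (RAND())") {
		fmt.Println(stmt)
	}
}
```

Other context (for example auto-increment values or user variables) would need the same treatment; this sketch only covers the seeds mentioned above.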

alarbada commented 3 months ago

Thanks a lot for the link @dveeden! I didn't know it was deprecated. Now I've got a good excuse not to have to support mixed or statement-based formats, at least initially.

@atercattus, I'm not sure now. After reading more about this, I feel like it yields very little value for the complexity it would add to my connector. This might be useful for people stuck with legacy mysql versions, which I might or might not have to support.

In any case, having the statement as a string in the canal.RowsEvent would be enough for me; I can do the parsing with something like this.

dveeden commented 3 months ago

This would probably be something like a QueryEvent instead of a RowsEvent.

While row-based binlogs weren't the default before 8.0, the format has been a configuration option since MySQL 5.1 (~2006).

Note that in MariaDB the MIXED format is still the default.
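Since the default differs between servers, a connector that only handles row-based events may want to check the format up front. A minimal sketch, assuming the value has already been read from the server (for example with `SELECT @@GLOBAL.binlog_format`); `CheckBinlogFormat` is a hypothetical helper, not a go-mysql function.

```go
package main

import (
	"fmt"
	"strings"
)

// CheckBinlogFormat returns an error unless the given binlog_format value
// is ROW. The comparison ignores case and surrounding whitespace.
func CheckBinlogFormat(value string) error {
	if strings.EqualFold(strings.TrimSpace(value), "ROW") {
		return nil
	}
	return fmt.Errorf("binlog_format is %q, but only ROW is supported", value)
}

func main() {
	for _, v := range []string{"ROW", "MIXED", "STATEMENT"} {
		if err := CheckBinlogFormat(v); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", v)
	}
}
```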