osheroff / mysql-binlog-connector-java

MySQL Binary Log connector
Introduce support for MySQL 8 specific metadata on GTID events #130

Closed methodmissing closed 7 months ago

methodmissing commented 8 months ago

Why?

The binlog event header cannot provide high-resolution (millisecond) event times: its timestamp field is 4 bytes wide and only has second resolution. This effectively cannot change without breaking binary compatibility, and with it replication and binlog tooling, because the 19-byte header offset is hardcoded in so many places.
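
To make the limitation concrete, here is a minimal Java sketch (not code from this PR) that decodes the 19-byte header of the MySQL 8.0.1 sample event shown further down. The class name `BinlogHeaderSketch` and its fields are hypothetical; the field widths are read off the `Timestamp / Type / Master ID / Size / Master Pos / Flags` columns of the `mysqlbinlog` dump.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

// Illustrative sketch only; class and field names are hypothetical.
final class BinlogHeaderSketch {
    static final int HEADER_LENGTH = 19;

    final long timestampSeconds; // 4 bytes: seconds since the Unix epoch
    final int eventType;         // 1 byte: 0x21 (33) in the GTID samples
    final long serverId;         // 4 bytes
    final long eventLength;      // 4 bytes: header + body + checksum
    final long nextPosition;     // 4 bytes: end_log_pos
    final int flags;             // 2 bytes

    BinlogHeaderSketch(byte[] header) {
        if (header.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("need 19 header bytes, got " + header.length);
        }
        // All header fields are little-endian.
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        timestampSeconds = Integer.toUnsignedLong(buf.getInt());
        eventType = Byte.toUnsignedInt(buf.get());
        serverId = Integer.toUnsignedLong(buf.getInt());
        eventLength = Integer.toUnsignedLong(buf.getInt());
        nextPosition = Integer.toUnsignedLong(buf.getInt());
        flags = Short.toUnsignedInt(buf.getShort());
    }

    // Parses space-separated hex bytes such as "cc 7d 66 65".
    static byte[] hex(String text) {
        String[] parts = text.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    public static void main(String[] args) {
        BinlogHeaderSketch h = new BinlogHeaderSketch(
            hex("cc 7d 66 65 21 01 00 00 00 48 00 00 00 75 04 00 00 00 00"));
        // Whole seconds only: the .713879 fraction is not in the header.
        System.out.println(Instant.ofEpochSecond(h.timestampSeconds));
        System.out.println("type=" + h.eventType + " length=" + h.eventLength
            + " end_log_pos=" + h.nextPosition);
    }
}
```

The decoded header timestamp is 1701215692, which is the commit timestamp 1701215692713879 from the same event with its microsecond part truncated.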

We're pushing Shopify's CDC stack (based on this client and the Debezium MySQL connector) toward lower latency to drive new business use cases, but we need better timestamp resolution for the hop from the writer to ingestion by our pipeline.

Related: binlog event header struct, Maxwell issue, higher level explanation of the MySQL 8 replication timestamps

How?

The only other obvious alternative for MySQL >= 8.0.1 is to extract additional metadata from the GTID event, specifically this set of metadata up to current MySQL 8 versions:

The implementation is aligned to stay very close to the actual control event implementation in libbinlogevents.
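
As a rough illustration of what extracting this metadata involves, the following Java sketch walks the GTID event body in the order the fields appear in the sample dumps below. It is not the code from this PR: `GtidEventBodySketch` and its members are hypothetical names, and the layout (flags, 16-byte source UUID, GNO, logical clock, 7-byte commit timestamps, packed transaction length, 4-byte server versions) is an assumption based on those dumps and on libbinlogevents. Optional trailing fields are detected by how many bytes remain before the checksum.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.UUID;

// Illustrative sketch only; names are hypothetical, not this library's API.
final class GtidEventBodySketch {
    int flags;
    UUID sourceId;
    long gno, lastCommitted, sequenceNumber;
    long immediateCommitTimestamp, originalCommitTimestamp; // microseconds
    long transactionLength;
    int immediateServerVersion, originalServerVersion;

    // body: event bytes after the 19-byte header; checksumLength: 4 for CRC32, else 0.
    static GtidEventBodySketch parse(byte[] body, int checksumLength) {
        GtidEventBodySketch e = new GtidEventBodySketch();
        ByteBuffer buf = ByteBuffer.wrap(body, 0, body.length - checksumLength);
        e.flags = buf.get() & 0xFF;
        e.sourceId = new UUID(buf.getLong(), buf.getLong()); // UUID bytes, big-endian
        buf.order(ByteOrder.LITTLE_ENDIAN);                  // remaining fields are little-endian
        e.gno = buf.getLong();
        if (!buf.hasRemaining() || buf.get() != 2) return e; // 2 = logical clock typecode
        e.lastCommitted = buf.getLong();
        e.sequenceNumber = buf.getLong();
        if (buf.remaining() < 7) return e;                   // no commit timestamps present
        long ts = readUnsigned(buf, 7);
        e.immediateCommitTimestamp = ts & ~(1L << 55);
        // Bit 55 set means a separate original timestamp follows.
        e.originalCommitTimestamp = (ts & (1L << 55)) != 0
            ? readUnsigned(buf, 7) : e.immediateCommitTimestamp;
        if (!buf.hasRemaining()) return e;                   // layout of the 8.0.1 sample
        e.transactionLength = readPacked(buf);
        if (buf.remaining() < 4) return e;                   // layout of the 8.0.2 sample
        int version = buf.getInt();
        e.immediateServerVersion = version & 0x7FFFFFFF;
        // High bit set means a separate original version follows.
        e.originalServerVersion = version < 0 ? buf.getInt() : e.immediateServerVersion;
        return e;
    }

    // Reads `width` bytes as a little-endian unsigned integer.
    static long readUnsigned(ByteBuffer buf, int width) {
        long value = 0;
        for (int i = 0; i < width; i++) {
            value |= (buf.get() & 0xFFL) << (8 * i);
        }
        return value;
    }

    // MySQL packed (length-encoded) integer: 0xfc, 0xfd, 0xfe prefix 2, 3, 8 bytes.
    static long readPacked(ByteBuffer buf) {
        int first = buf.get() & 0xFF;
        if (first < 0xfb) return first;
        if (first == 0xfc) return readUnsigned(buf, 2);
        if (first == 0xfd) return readUnsigned(buf, 3);
        if (first == 0xfe) return readUnsigned(buf, 8);
        throw new IllegalArgumentException("unexpected prefix 0x" + Integer.toHexString(first));
    }

    // Parses space-separated hex bytes such as "fc 34 01".
    static byte[] hex(String text) {
        String[] parts = text.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    public static void main(String[] args) {
        // Body of the MySQL 8.1.0 sample event, including its CRC32 trailer.
        GtidEventBodySketch e = parse(hex(
            "00 bd 97 94 e0 1d 65 11 ed a7 e7 0a db 30 5b 3a 12 09 00 00 00 00 00 00 00 "
            + "02 07 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 66 29 aa 69 55 09 06 "
            + "fc 3b 01 e4 38 01 00 ee 9a 45 aa"), 4);
        System.out.println(e.sourceId + ":" + e.gno
            + " immediate_commit_timestamp=" + e.immediateCommitTimestamp
            + " transaction_length=" + e.transactionLength
            + " server_version=" + e.immediateServerVersion);
    }
}
```

Returning early when the buffer runs out lets the same routine read all three sample layouts; fields that a given server version does not write simply stay at zero.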

Binlog events used in the tests

@Naros suggested including unit tests with payloads and assertions specific to these MySQL 8 versions: 8.0.1, 8.0.2, and > 8.0.14

MySQL 8.0.1

# at 1069
#231128 23:54:52 server id 1  end_log_pos 1141 CRC32 0xc836869e 
# Position  Timestamp   Type   Master ID        Size      Master Pos    Flags 
#      42d cc 7d 66 65   21   01 00 00 00   48 00 00 00   75 04 00 00   00 00
#      440 01 aa e5 7b 2f 8e 44 11  ee a3 d6 a0 36 bc da 1a |......D.....6...|
#      450 41 04 00 00 00 00 00 00  00 02 03 00 00 00 00 00 |A...............|
#      460 00 00 04 00 00 00 00 00  00 00 97 ef 0c 25 3f 0b |................|
#      470 06 9e 86 36 c8                                   |...6.|
#   GTID    last_committed=3    sequence_number=4   original_committed_timestamp=1701215692713879   immediate_commit_timestamp=1701215692713879
# original_commit_timestamp=1701215692713879 (2023-11-28 23:54:52.713879 WET)
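
The hex above can be cross-checked against the decoded `immediate_commit_timestamp` by hand. The sketch below (hypothetical helper names, not code from this PR) reads the seven bytes `97 ef 0c 25 3f 0b 06` that precede the CRC32 trailer as a little-endian integer and converts the microsecond value to a `java.time.Instant`. It assumes, following libbinlogevents, that bit 55 is a flag rather than part of the value.

```java
import java.time.Instant;

// Illustrative sketch only; helper names are hypothetical.
final class CommitTimestampSketch {
    // Reads 7 bytes at `offset` as a little-endian unsigned integer.
    static long readUInt56(byte[] data, int offset) {
        long value = 0;
        for (int i = 0; i < 7; i++) {
            value |= (data[offset + i] & 0xFFL) << (8 * i);
        }
        return value;
    }

    // Strips the flag bit (bit 55) and returns microseconds since the epoch.
    static long micros(long field) {
        return field & ~(1L << 55);
    }

    // Converts microseconds since the epoch to an Instant without losing precision.
    static Instant toInstant(long micros) {
        return Instant.ofEpochSecond(micros / 1_000_000L, (micros % 1_000_000L) * 1_000L);
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x97, (byte) 0xef, 0x0c, 0x25, 0x3f, 0x0b, 0x06};
        long field = readUInt56(raw, 0);
        System.out.println(micros(field));
        System.out.println(toInstant(micros(field)));
    }
}
```

For this payload the result is 1701215692713879, matching the `mysqlbinlog` annotation, and the flag bit is clear, which is consistent with the original and immediate timestamps being reported as equal.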

MySQL 8.0.2

# at 770
#231129 11:23:34 server id 1  end_log_pos 845 CRC32 0xd6203da2 
# Position  Timestamp   Type   Master ID        Size      Master Pos    Flags 
#      302 36 1f 67 65   21   01 00 00 00   4b 00 00 00   4d 03 00 00   00 00
#      315 00 99 4a b8 59 8e a8 11  ee a5 68 a0 36 bc da 1a |..J.Y.....h.6...|
#      325 41 03 00 00 00 00 00 00  00 02 02 00 00 00 00 00 |A...............|
#      335 00 00 03 00 00 00 00 00  00 00 40 55 04 c4 48 0b |...........U..H.|
#      345 06 fc 34 01 a2 3d 20 d6                          |..4.....|
#   GTID    last_committed=2    sequence_number=3   rbr_only=yes    original_committed_timestamp=1701257014433088   immediate_commit_timestamp=1701257014433088 transaction_length=308
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
# original_commit_timestamp=1701257014433088 (2023-11-29 11:23:34.433088 WET)
# immediate_commit_timestamp=1701257014433088 (2023-11-29 11:23:34.433088 WET)

MySQL 8.1.0

# at 2402
#231104 11:38:29 server id 1  end_log_pos 2481 CRC32 0xaa459aee
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags
# 00000962 75 65 46 65   21   01 00 00 00   4f 00 00 00   b1 09 00 00   00 00
# 00000975 00 bd 97 94 e0 1d 65 11  ed a7 e7 0a db 30 5b 3a |......e......0..|
# 00000985 12 09 00 00 00 00 00 00  00 02 07 00 00 00 00 00 |................|
# 00000995 00 00 08 00 00 00 00 00  00 00 66 29 aa 69 55 09 |..........f..iU.|
# 000009a5 06 fc 3b 01 e4 38 01 00  ee 9a 45 aa             |.....8....E.|
#   GTID    last_committed=7    sequence_number=8   rbr_only=yes    original_committed_timestamp=1699112309893478   immediate_commit_timestamp=1699112309893478 transaction_length=315
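
In this sample, four extra bytes (`e4 38 01 00`) sit between the packed transaction length (`fc 3b 01`) and the CRC32 trailer. Assuming they hold the server version in MySQL's usual `major * 10000 + minor * 100 + patch` numbering, with the high bit reserved as a flag as in libbinlogevents, a small sketch like this one (hypothetical names) turns them back into a readable version string.

```java
// Illustrative sketch only; helper names are hypothetical.
final class ServerVersionSketch {
    // Reads 4 bytes at `offset` as a little-endian integer.
    static int readInt32(byte[] data, int offset) {
        return (data[offset] & 0xFF)
            | (data[offset + 1] & 0xFF) << 8
            | (data[offset + 2] & 0xFF) << 16
            | (data[offset + 3] & 0xFF) << 24;
    }

    // Drops the flag bit and formats the number as major.minor.patch.
    static String describe(int encoded) {
        int v = encoded & 0x7FFFFFFF;
        return (v / 10000) + "." + (v / 100 % 100) + "." + (v % 100);
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xe4, 0x38, 0x01, 0x00};
        System.out.println(describe(readInt32(raw, 0)));
    }
}
```

The bytes decode to 80100, which corresponds to the 8.1.0 server that produced this event.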

Corresponding Debezium PR: https://github.com/debezium/debezium/pull/5036

methodmissing commented 8 months ago

:wave: @osheroff @Naros @gunnarmorling if and when there's a free moment 🙇‍♂️

Naros commented 7 months ago

@osheroff would you be able to merge and cut a release in the near future? We have an upstream PR waiting for this that I'd like to include in the Debezium 2.5 release, which goes final in about 2 weeks.

osheroff commented 7 months ago

done, 0.29.0