osheroff / mysql-binlog-connector-java

MySQL Binary Log connector

TABLE_MAP deserialization Error with conflicting charsets #134

Open jogrogan opened 7 months ago

jogrogan commented 7 months ago

Running into TABLE_MAP deserialization errors when a table contains enum columns with non-matching charsets.

Error:

com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException: Failed to deserialize data of EventHeaderV4{timestamp=1700660745000, eventType=TABLE_MAP, serverId=1698749684, headerLength=19, dataLength=314, nextPosition=932578372, flags=0}
        at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:335) ~[com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeTableMapEventData(EventDeserializer.java:307) ~[com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:231) ~[com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:949) [com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:599) [com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:854) [com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.io.EOFException: Failed to read remaining 2 of 2 bytes from position 141. Block length: 0. Initial block length: 4.
        at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.fill(ByteArrayInputStream.java:115) ~[com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:105) ~[com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventMetadataDeserializer.readBooleanList(TableMapEventMetadataDeserializer.java:114) ~[com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventMetadataDeserializer.deserialize(TableMapEventMetadataDeserializer.java:100) ~[com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventDataDeserializer.deserialize(TableMapEventDataDeserializer.java:47) ~[com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventDataDeserializer.deserialize(TableMapEventDataDeserializer.java:27) ~[com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:329) ~[com.zendesk.mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        ... 6 more

Steps to reproduce:

  1. Create a table with an enum column whose character set is utf8, which in recent MySQL versions is an alias for utf8mb3
    mysql> show create table enum_test\G
    *************************** 1. row ***************************
       Table: enum_test
    Create Table: CREATE TABLE `enum_test` (
    `int_column` int NOT NULL AUTO_INCREMENT,
    `smallint_column` int NOT NULL,
    `enum_column` enum('a','b','c') CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
    `datetime_column` datetime(6) DEFAULT CURRENT_TIMESTAMP(6),
    PRIMARY KEY (`int_column`)
    ) ENGINE=InnoDB AUTO_INCREMENT=864 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
    1 row in set (0.04 sec)

--- No errors when inserting/updating records ---

  2. Add a new enum column that uses utf8mb4, which is the table default
    alter table enum_test add column `userType` enum('INACTIVE','TERMINATED','ACTIVENONLICENSED','ACTIVE')  NOT NULL;

--- Above deserialization error is thrown when running an update ---

  3. Update the character set of the new enum column to match the old enum column
    alter table enum_test change `userType` `userType` enum('INACTIVE','TERMINATED','ACTIVENONLICENSED','ACTIVE') CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;

--- No error again when updating records, because both columns are now utf8mb3 ---

The same correct behavior was verified when enum_column was instead changed to utf8mb4 to match what userType was before this change.
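The trigger condition seen in these steps (enum columns in one table that do not all share a charset) can be written down as a small check. This is only a sketch of the pattern observed in this report, not a statement about the connector's internals. The class and method names are hypothetical, and the column-to-charset map is assumed to be collected by the caller, for example from `information_schema.COLUMNS`.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: flags tables whose enum columns use more than one charset. */
final class EnumCharsetCheck {

    /** Lower-cases the name and folds "utf8" into "utf8mb3", since MySQL treats utf8 as an alias. */
    static String normalize(String charset) {
        String c = charset.toLowerCase(Locale.ROOT);
        return c.equals("utf8") ? "utf8mb3" : c;
    }

    /**
     * @param enumColumnCharsets enum column name mapped to its charset name
     * @return true if at least two enum columns use different charsets
     */
    static boolean hasMixedCharsets(Map<String, String> enumColumnCharsets) {
        Set<String> distinct = new HashSet<>();
        for (String charset : enumColumnCharsets.values()) {
            distinct.add(normalize(charset));
        }
        return distinct.size() > 1;
    }
}
```

For the table above, `enum_column` (utf8) plus `userType` (utf8mb4) is flagged after step 2, and no longer flagged once both columns are utf8.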

I suspect the actual failure occurs where the column types are parsed in the TABLE_MAP event. The exception is a bit misleading, since the EOFException is only thrown later. This would likely happen for SET columns too, but that is unverified.
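To illustrate why an EOFException can surface later than the real mistake, here is a toy parser for a sequence of `[1-byte length][payload]` fields. It is a generic sketch and not the connector's TABLE_MAP code: the class name, the field layout, and the `lengthBias` parameter (which simulates miscounting the first field's size) are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Toy length-prefixed parser; NOT the connector's TABLE_MAP deserializer. */
final class MisalignedReadDemo {

    /**
     * Reads fieldCount fields laid out as [1-byte length][payload].
     * lengthBias is added to the first field's length only, simulating a
     * parser that miscounts how many bytes that field occupies.
     *
     * @return index of the field where the stream ran out, or -1 on success
     */
    static int firstFailingField(byte[] data, int fieldCount, int lengthBias) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        for (int i = 0; i < fieldCount; i++) {
            try {
                int length = in.readUnsignedByte();
                int toRead = (i == 0) ? length + lengthBias : length;
                in.readFully(new byte[toRead]); // EOFException if too few bytes remain
            } catch (EOFException e) {
                return i; // the read that fails, not necessarily the one that was wrong
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Three fields: {1,2}, {9}, {7,8}
        byte[] data = {2, 1, 2, 1, 9, 2, 7, 8};
        System.out.println(firstFailingField(data, 3, 0)); // prints -1
        System.out.println(firstFailingField(data, 3, 1)); // prints 1
    }
}
```

With a bias of 1, the first field swallows one byte too many, so the parser then treats the payload byte 9 as the next length and runs off the end of the buffer. The error is reported at field 1 even though the miscount happened at field 0, which is the same shape as an EOFException in a later read hiding an earlier parsing mismatch.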

sean-k1 commented 7 months ago

@jogrogan I opened a PR for this issue.

jogrogan commented 7 months ago

@sean-k1 @osheroff We are a bit behind on versions and are using 0.26.1, where we noticed this issue. If the fix gets merged, would it be possible to hotfix a new version off of that release as well, say 0.26.1.1?