dolthub / dolt

Dolt – Git for Data
Apache License 2.0
17.41k stars 488 forks source link

Inserting BINARY into VARCHAR(26) should result in error Incorrect string value #8040

Closed muescha closed 3 weeks ago

muescha commented 3 weeks ago

Running dolt sql-server instead of a mySQL server for a tutorial

Hibernate: 

    create table beer (
       id varchar(36) not null,
        beer_name varchar(50) not null,
        beer_style smallint not null,
        created_date datetime(6),
        price decimal(38,2) not null,
        quantity_on_hand integer,
        upc varchar(255) not null,
        updated_date datetime(6),
        version integer,
        primary key (id)
    ) engine=InnoDB

[...]
Hibernate: 
    insert 
    into
        beer
        (beer_name, beer_style, created_date, price, quantity_on_hand, upc, updated_date, version, id) 
    values
        (?, ?, ?, ?, ?, ?, ?, ?, ?)

loggin for org.hibernate.orm.jdbc.bind:

binding parameter [1] as [VARCHAR] - [Galaxy Cat]
binding parameter [2] as [SMALLINT] - [0]
binding parameter [3] as [TIMESTAMP] - [2024-06-20T19:40:29.625315]
binding parameter [4] as [NUMERIC] - [12.99]
binding parameter [5] as [INTEGER] - [122]
binding parameter [6] as [VARCHAR] - [12356]
binding parameter [7] as [TIMESTAMP] - [2024-06-20T19:40:29.625717]
binding parameter [8] as [INTEGER] - [0]
binding parameter [9] as [BINARY] - [321ef8d6-9267-4fb5-8046-bcbee37b86ce]

current status:

expected:

SQL Error: 1366, SQLState: HY000
Incorrect string value: '\xD6\xF0...' for column 'id' at row 1
muescha commented 3 weeks ago

code of the example which should generate the error:

@Getter
@Setter
@Builder
@Entity
@ToString
@AllArgsConstructor
@NoArgsConstructor
public class Beer {

    @Id
    @GeneratedValue(generator = "UUID")
    @GenericGenerator(name = "UUID", strategy = "org.hibernate.id.UUIDGenerator")
    @Column(length = 36,
            columnDefinition = "varchar(36)",
            updatable = false,
            nullable = false)
    private UUID id;

this change should fix the error:

    @Id
    @GeneratedValue(generator = "UUID")
    @GenericGenerator(name = "UUID", strategy = "org.hibernate.id.UUIDGenerator")
+    @JdbcTypeCode(SqlTypes.CHAR)
    @Column(length = 36,
            columnDefinition = "varchar(36)",
            updatable = false,
            nullable = false)
    private UUID id;

this should change the binding to:

binding parameter [9] as [CHAR] - [09e23620-87a7-4e5b-88f4-78612e01ddf9]

Source: https://github.com/springframeworkguru/spring-6-rest-mvc/blob/122-server-port-mapping/src/main/java/guru/springframework/spring6restmvc/entities/Beer.java

timsehn commented 3 weeks ago

We'll get this fixed today. Thanks for the report.

muescha commented 3 weeks ago

sorry - i don't have a mysql running to test if this "bug" is on a real mysql

muescha commented 3 weeks ago

Data with [BINARY]:

id beer_name beer_style created_date price quantity_on_hand upc updated_date version
=�U��D���L�j�� Crank 0 2024-06-20 18:40:10.205510 11.99 392 12356222 2024-06-20 18:40:10.205512 0
���GBȲ�Z���g5 Sunshine City 0 2024-06-20 18:40:10.205516 13.99 144 12345 2024-06-20 18:40:10.205519 0
L^�\~��Dd��;�!}�f Galaxy Cat 0 2024-06-20 18:40:10.205470 12.99 122 12356 2024-06-20 18:40:10.205495 0

Data with SqlTypes.CHAR:

id beer_name beer_style created_date price quantity_on_hand upc updated_date version
859f3bad-5d09-4b15-8260-971aa09459c2 Sunshine City 0 2024-06-20 18:23:01.964107 13.99 144 12345 2024-06-20 18:23:01.964109 0
d745d589-aa06-42cd-aef8-dce409b85bd0 Galaxy Cat 0 2024-06-20 18:23:01.964076 12.99 122 12356 2024-06-20 18:23:01.964090 0
f0e61b94-d54f-4a8f-8dcd-f04ff0e37fb9 Crank 0 2024-06-20 18:23:01.964101 11.99 392 12356222 2024-06-20 18:23:01.964103 0
muescha commented 3 weeks ago

FYI: I did run mySQL in Docker and was able to reproduce the "error should exists" on my machine:

binding parameter [9] as [BINARY] - [508dbda1-aeed-428f-b5e1-54a1138610bd]
o.h.engine.jdbc.spi.SqlExceptionHelper   : SQL Error: 1366, SQLState: HY000
o.h.engine.jdbc.spi.SqlExceptionHelper   : Incorrect string value: '\x8D\xBD\xA1\xAE\xEDB...' for column 'id' at row 1
jycor commented 3 weeks ago

Hey @muescha, thanks for reporting this issue. I've been digging into it, and I believe the cause is a collation / utf8 encoding check dolt doesn't perform that MySQL does.

MySQL:

mysql> create table t (v varchar(10));
Query OK, 0 rows affected (0.0112 sec)
mysql> insert into t values (0x61);
Query OK, 1 row affected (0.0028 sec)
mysql> insert into t values (0x98);
ERROR: 1366: Incorrect string value: '\x98' for column 'v' at row 1
mysql> select * from t;
+---+
| v |
+---+
| a |
+---+
1 row in set (0.0005 sec)

Dolt:

tmp/main> create table t (v varchar(10));
tmp/main*> insert into t values(0x61);
Query OK, 1 row affected (0.01 sec)
tmp/main*> insert into t values(0x98);
Query OK, 1 row affected (0.00 sec)
tmp/main*> select * from t;
+---+
| v |
+---+
| � |
| a |
+---+
2 rows in set (0.00 sec)

0x98 is not a valid unicode character encoding, so MySQL throws an "Incorrect string value: ..." error, while dolt doesn't.

I am currently working on a fix and will keep you posted.

muescha commented 3 weeks ago

ok - that explains why the incorrect string value starts at 8D BD A1 and not with the first chars of 50 :)

muescha commented 3 weeks ago

thx :)

jycor commented 3 weeks ago

Hey @muescha, fix for this is in dolt main. We'll have a release out later today for you. Please let us know if it fixes the issue!

muescha commented 3 weeks ago
dolt version
dolt version 1.40.3

Still no error:

2024-06-23T13:51:27.121+02:00  INFO 20774 --- [  restartedMain] SQL dialect                              : HHH000400: Using dialect: org.hibernate.dialect.MySQLDialect
Hibernate: 

    create table beer (
       id varchar(36) not null,
        beer_name varchar(50) not null,
        beer_style smallint not null,
        created_date datetime(6),
        price decimal(38,2) not null,
        quantity_on_hand integer,
        upc varchar(255) not null,
        updated_date datetime(6),
        version integer,
        primary key (id)
    ) engine=InnoDB

    Hibernate: 
    insert 
    into
        beer
        (beer_name, beer_style, created_date, price, quantity_on_hand, upc, updated_date, version, id) 
    values
        (?, ?, ?, ?, ?, ?, ?, ?, ?)
 [...]
2024-06-23T13:51:27.956+02:00 TRACE 20774 --- [  restartedMain] org.hibernate.orm.jdbc.bind              : binding parameter [9] as [BINARY] - [4537159c-b6e7-4097-8246-94cecf34fa7d]
select dolt_version()
1.40.3
id beer_name beer_style created_date price quantity_on_hand upc updated_date version
E7���@��F���4�} Galaxy Cat 0 2024-06-23 11:51:27.938271 12.99 122 12356 2024-06-23 11:51:27.938282 0
L;r���F��6$G��ش Crank 0 2024-06-23 11:51:27.938291 11.99 392 12356222 2024-06-23 11:51:27.938293 0
ŀE���EI���0 P�s Sunshine City 0 2024-06-23 11:51:27.938297 13.99 144 12345 2024-06-23 11:51:27.938299 0
dolt sql-server --user=root
Starting server with Config HP="localhost:3306"|T="28800000"|R="false"|L="info"|S="/tmp/mysql.sock"
INFO[0000] Server ready. Accepting connections.
WARN[0000] secure_file_priv is set to "", which is insecure.
WARN[0000] Any user with GRANT FILE privileges will be able to read any file which the sql-server process can read.
WARN[0000] Please consider restarting the server with secure_file_priv set to a safe (or non-existent) directory.
INFO[0014] NewConnection                                 DisableClientMultiStatements=false connectionID=1
INFO[0014] NewConnection                                 DisableClientMultiStatements=false connectionID=2
INFO[0014] NewConnection                                 DisableClientMultiStatements=false connectionID=3
INFO[0014] NewConnection                                 DisableClientMultiStatements=false connectionID=4
INFO[0014] NewConnection                                 DisableClientMultiStatements=false connectionID=5
INFO[0187] NewConnection                                 DisableClientMultiStatements=false connectionID=6
timsehn commented 3 weeks ago

@jycor will figure out what’s up here.

jycor commented 3 weeks ago

Hey @muescha, looks like I missed a case for string types; it was made apparent after testing out prepared statemnts.

We'll get a release out for you to test out later this week.