mysql-net / MySqlConnector

MySQL Connector for .NET
https://mysqlconnector.net
MIT License
1.39k stars 333 forks source link

New UUID DataType in the new MariaDB 10.7 series #1151

Closed ghost closed 2 years ago

ghost commented 2 years ago

MariaDB has just introduced a UUID datatype in the 10.7 series .

Will there eventually be an update adding this to the MySqlGuidFormat supported formats?

bgrainger commented 2 years ago

According to documentation:

Data retrieved by this data type is in the string representation defined in RFC4122.

I haven't tested yet, but it seems like GUID Format = Char36; would already handle this data type.

bgrainger commented 2 years ago

It appears that the UUID column is exposed as CHAR(36) so MySqlConnector just works with it out-of-the-box.

ghost commented 2 years ago

Haha, so it's a CHAR(36). Didn't think to click through the 10.7 preview feature: UUID Data Type link in the See Also section. That posts this:

$ mariadb --column-type-info --execute "SELECT UUID()"
Field   1:  `UUID()`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       STRING (type=uuid)
Collation:  utf8mb3_general_ci (33)
Length:     108
Max_length: 36
Decimals:   39
Flags:      

which is what you've found.

Glad it'll work out of the box, makes it easy for everyone!

nkelemen18 commented 1 year ago

Hello,

Sorry for reopening this issue, but I think it is worth to think about uuid data type again.

I think @ICanHazCode's observation is a little bit misleading. The data type is string, but uuid type string.

Haha, so it's a CHAR(36). Didn't think to click through the 10.7 preview feature: UUID Data Type link in the See Also section. That posts this:


$ mariadb --column-type-info --execute "SELECT UUID()"
Field   1:  `UUID()`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       STRING (type=uuid)
Collation:  utf8mb3_general_ci (33)

I could not find any official document about how new uuid type stores the data (probably 16 bytes see: https://github.com/MariaDB/server/blob/ffc088634101e73d6191133deffa86b1e2e96ced/include/my_sys.h#L1035), so I've set up a simple test environment.

So my test setup:

-- create two tables to compare them
-- table 1 uses uuid data type
create table uuid_test_1
(
    pk_uuid uuid not null
        primary key
);

-- table 2 uses char(36)
create table uuid_test_2
(
    pk_uuid char(36) charset ascii not null
        primary key
);

-- Raw table for generating new UUIDs
create table uuid_test_raw
(
    pk_uuid uuid default uuid() not null
);

Generate test data

-- Add the first record to uuid_test_raw table
INSERT INTO uuid_test_raw VALUES(UUID());

-- Duplicating the records every time this code executed
INSERT INTO uuid_test_raw
SELECT UUID()
FROM uuid_test_raw;

-- Note I executed it until the inserted rows' count was 16777216  (2^24) so the table had 33554432 records.
-- After that I copied generated data into uuid_test_1 and uuid_test_2 tables:

INSERT INTO uuid_test_1
SELECT * FROM uuid_test_raw;
-- Result: 33,554,432 rows affected in 2 m 1 s 601 ms

INSERT INTO uuid_test_2
SELECT * FROM uuid_test_raw;
-- Result: 33,554,432 rows affected in 6 m 8 s 148 ms

Validating the record count in the two test tables:

SELECT COUNT(*) as cnt, 'uuid_test_1' as tbl FROM uuid_test_1
UNION
SELECT COUNT(*) as cnt, 'uuid_test_2' as tbl FROM uuid_test_2;
cnt tbl
33554432 uuid_test_1
33554432 uuid_test_2

After that we can check the statistics and table sizes

-- update statistics
ANALYZE TABLE test_db.uuid_test_1;
ANALYZE TABLE test_db.uuid_test_2;

-- get table statistics
SHOW TABLE STATUS LIKE 'uuid_test_%';
Name Engine Version Row_format Rows Avg_row_length Data_length Max_data_length Index_length Data_free Auto_increment Check_time Collation Checksum Create_options Comment Max_index_length Temporary
uuid_test_1 InnoDB 10 Dynamic 33478720 37 1249902592 0 0 5242880 null null utf8mb4_general_ci null 0 N
uuid_test_2 InnoDB 10 Dynamic 33433369 59 1983905792 0 0 6291456 null null utf8mb4_general_ci null 0 N

See the difference between the sizes.

Checking index size differences

SELECT * FROM mysql.innodb_table_stats WHERE table_name LIKE 'uuid_test_%';
database_name table_name last_update n_rows clustered_index_size sum_of_other_index_sizes
test_db uuid_test_1 2023-08-15 01:50:18 33478720 76288 0
test_db uuid_test_2 2023-08-15 01:50:21 33433369 121088 0

Column type info

C:\Program Files\MariaDB 10.9\bin>mariadb.exe -u root -p --column-type-info --execute "SELECT pk_uuid FROM test_db.uuid_test_1 LIMIT 1"
Enter password: **********
Field   1:  `pk_uuid`
Org_field:  `pk_uuid`
Catalog:    `def`
Database:   `test_db`
Table:      `uuid_test_1`
Org_table:  `uuid_test_1`
Type:       STRING (type=uuid)
Collation:  utf8mb4_general_ci (45)
Length:     144
Max_length: 36
Decimals:   0
Flags:      NOT_NULL PRI_KEY UNSIGNED BINARY NO_DEFAULT_VALUE PART_KEY

+--------------------------------------+
| pk_uuid                              |
+--------------------------------------+
| be7960f5-3afa-11ee-b769-00155daa7299 |
+--------------------------------------+

C:\Program Files\MariaDB 10.9\bin>mariadb.exe -u root -p --column-type-info --execute "SELECT pk_uuid FROM test_db.uuid_test_2 LIMIT 1"
Enter password: **********
Field   1:  `pk_uuid`
Org_field:  `pk_uuid`
Catalog:    `def`
Database:   `test_db`
Table:      `uuid_test_2`
Org_table:  `uuid_test_2`
Type:       STRING
Collation:  utf8mb4_general_ci (45)
Length:     144
Max_length: 36
Decimals:   0
Flags:      NOT_NULL PRI_KEY NO_DEFAULT_VALUE PART_KEY

+--------------------------------------+
| pk_uuid                              |
+--------------------------------------+
| 00000018-3afb-11ee-b769-00155daa7299 |
+--------------------------------------+

My observation: the uuid data type could be a better alternative to store UUIDs than char(36).

Test environment: OS: Windows 11 (x64) CPU: Intel Core I7 8750H Memory: 32GB DDR4 MariaDB Server: 10.9.3-MariaDB installed on windows

bgrainger commented 1 year ago

Thanks for the detailed analysis, but is there anything for MySqlConnector to do here?

Based on what you've described, it sounds like MariaDB stores the uuid as BINARY(16) (for compactness/efficiency) but sends it over the wire as CHAR(36) (for compatibility). MySqlConnector will handle this just like a regular CHAR(36) GUID.

ghost commented 1 year ago

I think the question may be: If MariaDB supports sending a uuid in BINARY(16) over the wire and, if so, can MySqlConnector do requests for a uuid in BINARY(16)? This would be helpful for when the application using the database is doing a lot of requests and processing.

~ Just my 2 cents 😄

nkelemen18 commented 1 year ago

Thanks for the detailed analysis, but is there anything for MySqlConnector to do here?

My question: should MySqlGuidFormat have a new type, like NativeUuid or can I reuse one of the defined types from that?

Based on my observations I'm planning to reopen my previous issue in Pomelo.EntityFrameworkCore.MySql repo, this is why I'm asking this. My issue: https://github.com/PomeloFoundation/Pomelo.EntityFrameworkCore.MySql/issues/1711

bgrainger commented 1 year ago

If MariaDB supports sending a uuid in BINARY(16) over the wire

It doesn't. From UUID data type:

Data retrieved by this data type is in the string representation defined in RFC4122.

You could see if it supports CAST(id AS BINARY) and use GuidFormat.Binary16 to read it. However, you would need to verify that the byte order is correct.

bgrainger commented 1 year ago

My question: should MySqlGuidFormat have a new type, like NativeUuid or can I reuse one of the defined types from that?

AFAICT, Char36 is the correct type to use.

I'm planning to reopen my previous issue in Pomelo.EntityFrameworkCore.MySql

If that's about using UUID as the column type for Guid values, that seems like a reasonable request (but out of the scope of MySqlConnector).

jbdocuphase commented 4 months ago

If that's about using UUID as the column type for Guid values, that seems like a reasonable request (but out of the scope of MySqlConnector).

Any thoughts on reconsidering this? There aren't any other known viable alternatives to support this.

bgrainger commented 4 months ago

Any thoughts on reconsidering this?

There's nothing for MySqlConnector to do here.

If you want Pomelo to create Guid columns by using the UUID column type for MariaDB, then you need to add a :+1: reaction to https://github.com/PomeloFoundation/Pomelo.EntityFrameworkCore.MySql/issues/1711.