ydb-platform / ydb

YDB is an open source Distributed SQL Database that combines high availability and scalability with strong consistency and ACID transactions
https://ydb.tech
Apache License 2.0
4.02k stars 587 forks source link

UUID sorts is unexpected #10328

Open rekby opened 1 month ago

rekby commented 1 month ago

The UUID type in YDB has an unexpected order when sorting. It can be surprising for developers and could be lead to production incidents in the future.

Test with example: https://github.com/ydb-platform/ydb/blob/63805c5c7a7ac9bba1e2bc5187d3c14d432db4d2/ydb/library/yql/tests/sql/suites/expr/uuid_order.sql#L4

Comparison with Other DBs: YDB: telegram-cloud-photo-size-2-5204222477293707395-x

Postgres: telegram-cloud-photo-size-2-5204222477293707393-x

MS-SQL has different order (non alphabet and not YDB's): https://onecompiler.com/sqlserver/42usaqs45

image

Oracle nosql: I can't create simple example in oracle, found in documentation about UUID is subtype of string and have same behavior: https://docs.oracle.com/en/database/other-databases/nosql-database/23.3/sqlreferencefornosql/using-uuid-data-type.html

The underlying reason for the YDBs behavior is that: uuid stored (and sorted) with reordered bytes.

The function for serializing UUIDs for storage: UuidBytesToString - save bytes with same layout as in memory.

The function for convert UUID to a string (with reorder bytes opposito to memory layout): UuidToString - reorder some bytes. And shown byte order has differences from storage/sorting byte order.

When storage system has no special uuid type the simplest way for store uuid is text, with text behavior (json, csv, sqlite, ...).

The YDB's ordering is differs from order in software languages, other DBs and my expectations. With special order will be difficult to:

may be some other.

Now we doesn't allow use UUID as primary key into public and I suggest to fix sorting order before enable that for prevent problems in the future.

rekby commented 1 month ago

@azevaykin @dcherednik @spuchin