bk / Data-ULID

Perl implementation of ULID (Universally Unique Lexicographically Sortable Identifier)

Is the binary format also lexicographically sorted? #5

Open druud opened 2 years ago

druud commented 2 years ago

I haven't checked it yet, so am leaving it as a question here:

Is the binary format also lexicographically sorted?

I would rather store them as BINARY(16) than as CHAR(26), but I can only do that if the binary format also sorts lexicographically over time.

See also: https://en.wikipedia.org/wiki/Universally_unique_identifier#As_database_keys

bbrtj commented 2 years ago

Hello @druud,

On 64-bit perls, the binary format is encoded as follows: the first six bytes hold the timestamp. The first four of those bytes are a 32-bit big-endian number (the upper 32 bits of the timestamp), and the following two bytes are a 16-bit big-endian number (the lower 16 bits of the timestamp).

On 32-bit perls, the timestamp is converted into a bytestring with the help of the Math::BigInt to_bytes method.

If the timestamp part is shorter than 6 bytes, it is left-padded with zero bytes. The remaining 10 bytes of the bytestring are random.
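To make the layout concrete, here is a minimal Python sketch of the same byte arrangement. It is not the module's Perl code; the function name `ulid_binary` and its parameters are made up for illustration. It packs the timestamp as a big-endian 32-bit value followed by a big-endian 16-bit value, then appends 10 random bytes.

```python
import os
import struct
from typing import Optional


def ulid_binary(timestamp_ms: int, randomness: Optional[bytes] = None) -> bytes:
    """Hypothetical helper: 6-byte big-endian timestamp + 10 random bytes."""
    if not 0 <= timestamp_ms < (1 << 48):
        raise ValueError("timestamp must fit in 48 bits")
    if randomness is None:
        randomness = os.urandom(10)
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes")
    # Upper 32 bits as a big-endian uint32, lower 16 bits as a big-endian uint16.
    # Fixed-width packing yields the left zero-padding for small timestamps.
    head = struct.pack(">IH", timestamp_ms >> 16, timestamp_ms & 0xFFFF)
    return head + randomness


earlier = ulid_binary(1_700_000_000_000)
later = ulid_binary(1_700_000_000_001)
# Byte-wise comparison follows the timestamp, whatever the random parts are.
print(earlier < later)
```

Because the timestamp is fixed-width and big-endian, a byte-wise comparison of two values with different timestamps is decided within the first six bytes. Values that share the same millisecond are ordered only by their random bytes.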

Additionally, the textual ULID is just the binary ULID encoded into base32, 5 bits at a time from left to right, so it shares the same sorting characteristics.
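As a rough illustration of why the two forms sort the same way, the sketch below encodes 16 bytes into 26 characters using the Crockford base32 alphabet from the ULID specification. The function name `ulid_text` is hypothetical, and the assumption that the two spare bits are zero-padded at the front follows the ULID specification rather than anything stated in this thread.

```python
# Crockford base32 alphabet (no I, L, O, U); its characters are in ASCII order.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_text(binary: bytes) -> str:
    """Hypothetical helper: encode a 16-byte ULID as 26 base32 characters."""
    if len(binary) != 16:
        raise ValueError("a binary ULID is exactly 16 bytes")
    value = int.from_bytes(binary, "big")
    # 26 characters * 5 bits = 130 bits, so the top two bits are always zero.
    return "".join(CROCKFORD[(value >> shift) & 0x1F] for shift in range(125, -1, -5))


lo = bytes.fromhex("018bcfe56800") + b"\xff" * 10
hi = bytes.fromhex("018bcfe56801") + b"\x00" * 10
# The binary and textual forms compare the same way.
print(lo < hi, ulid_text(lo) < ulid_text(hi))
```

Since the alphabet is in ascending ASCII order and every value is encoded to the same length, comparing the strings character by character gives the same result as comparing the underlying bytes.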