ulid / spec

The canonical spec for ulid
GNU General Public License v3.0

[Question] Is there any function to convert between uuid and ulid #64

Closed 0bach0 closed 2 years ago

0bach0 commented 2 years ago

The spec states that ULID has 128-bit compatibility with UUID.

I have a use case that requires returning the ID in UUID format, but the default ULID output is Base32. Is there a function to convert back and forth between UUID and ULID?

eugene-borovov commented 2 years ago

Hi! symfony/uid can do that. I think you can find the answer to your question in the symfony/uid source code.

fabiolimace commented 2 years ago

Hi!

The script in this comment is an example of how to convert a ULID to and from UUID in Python.

#!/usr/bin/env python3
# @author: Fabio Lima
# @created: 2022-02-10

base16 = '0123456789abcdef'
base32 = '0123456789ABCDEFGHJKMNPQRSTVWXYZ'

def encode(number, alphabet):
    text = ''
    radix = len(alphabet)
    while number:
        text += alphabet[number % radix]
        number //= radix
    return text[::-1] or '0'

def decode(text, alphabet):
    number = 0
    radix = len(alphabet)
    for i in text:
        number = number * radix + alphabet.index(i)
    return number

def to_uuid(ulid):

    # decode from base-32 (upper-cased first, since ULIDs are case-insensitive)
    number = decode(ulid.upper(), base32)

    # encode the number to base-16
    text = encode(number, base16).rjust(32, '0')

    # add hyphens to the base-16 string to form the canonical string
    return text[0:8] + '-' + text[8:12] + '-' + text[12:16] + '-' + text[16:20] + '-' + text[20:32]

def to_ulid(uuid):

    # remove hyphens from the canonical string and lower-case it
    # to form the base-16 string
    uuid = uuid.replace('-', '').lower()

    # decode from base-16
    number = decode(uuid, base16)

    # encode the number to base-32
    return encode(number, base32).rjust(26, '0')

def main():

    ulid_in = '5JRB77HQYQ8P08BWPJRF5Q7F06'
    uuid_in = 'b2c2ce78-dfd7-4580-85f2-d2c3cb73bc06'

    uuid_out = to_uuid(ulid_in)
    ulid_out = to_ulid(uuid_in)

    print('ULID in: ', ulid_in)
    print('ULID out:', ulid_out)
    print('UUID in: ', uuid_in)
    print('UUID out:', uuid_out)

if __name__ == '__main__':
    main()

Output of the script:

ULID in:  5JRB77HQYQ8P08BWPJRF5Q7F06
ULID out: 5JRB77HQYQ8P08BWPJRF5Q7F06
UUID in:  b2c2ce78-dfd7-4580-85f2-d2c3cb73bc06
UUID out: b2c2ce78-dfd7-4580-85f2-d2c3cb73bc06

Notes:

  1. A standard UUID has 6 fixed bits: 4 in the version field and 2 in the variant field.
  2. You must validate the UUID or ULID before passing it to to_ulid() or to_uuid(), for example with a regular expression.
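
Note 2 could be implemented along these lines. This is a minimal sketch, not part of the script above: the pattern names and the `is_valid_ulid()` / `is_valid_uuid()` helpers are assumptions made for illustration. The patterns are deliberately strict about case (upper-case ULID, lower-case UUID) so that they match the `base32` and `base16` alphabets used by the script, and the first ULID character is limited to 0-7 so that the value fits in 128 bits.

```python
import re

# Crockford base-32 excludes I, L, O and U. The first character is limited
# to 0-7 because 26 unrestricted characters could encode 130 bits.
ULID_PATTERN = re.compile(r'[0-7][0-9A-HJKMNP-TV-Z]{25}')

# Canonical 8-4-4-4-12 hexadecimal layout. The version and variant bits
# are intentionally not checked here.
UUID_PATTERN = re.compile(
    r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
)

def is_valid_ulid(text):
    # fullmatch() rejects any leading or trailing characters
    return ULID_PATTERN.fullmatch(text) is not None

def is_valid_uuid(text):
    return UUID_PATTERN.fullmatch(text) is not None

if __name__ == '__main__':
    print(is_valid_ulid('5JRB77HQYQ8P08BWPJRF5Q7F06'))            # True
    print(is_valid_ulid('8' + 'Z' * 25))                          # False: exceeds 128 bits
    print(is_valid_uuid('b2c2ce78-dfd7-4580-85f2-d2c3cb73bc06'))  # True
```

Input in a different case can be normalized with `upper()` or `lower()` before validation if that suits the application.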

If you need a Java example, take a look at the source code of ulid-creator. It has a method that converts a ULID into a UUID. It also creates a ULID from a UUID.

0bach0 commented 2 years ago

@eugene-borovov @fabiolimace

Thank you for your answers. I will implement them in my program.

peterbourgon commented 2 years ago

Every ULID is a valid UUID, but not every UUID is a valid ULID, so unfortunately the above implementation of to_ulid is invalid.

fabiolimace commented 2 years ago

You are right @peterbourgon

A ULID has 128-bit compatibility with a UUID, but that doesn't mean it follows RFC 4122. I think the spec author used the word "compatibility" to express that a ULID can be stored in a UUID data type, for example in a database with a native uuid type, such as PostgreSQL.

Strict UUIDs reserve 6 bits for the version and variant numbers defined in RFC 4122. The ULID specification does not follow RFC 4122 because all of its 128 bits are used for timestamp and randomness.
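
To make the reserved bits concrete, the sketch below uses Python's standard `uuid` module to inspect the version and variant fields of two values: the UUID from the script above, and the 128-bit value obtained from the example ULID in the ULID spec, `01ARZ3NDEKTSV4RRFFQ69G5FAV`. The helper names `ulid_to_uuid()` and `describe()` are assumptions made for illustration, and the input is assumed to be an already validated, upper-case ULID.

```python
import uuid

BASE32 = '0123456789ABCDEFGHJKMNPQRSTVWXYZ'

def ulid_to_uuid(ulid):
    # Interpret the 26 Crockford base-32 characters as one 128-bit integer.
    number = 0
    for char in ulid:
        number = number * 32 + BASE32.index(char)
    return uuid.UUID(int=number)

def describe(value):
    # uuid.UUID.version is None unless the variant bits are the RFC 4122 ones.
    return value.variant == uuid.RFC_4122, value.version

strict = uuid.UUID('b2c2ce78-dfd7-4580-85f2-d2c3cb73bc06')
from_ulid = ulid_to_uuid('01ARZ3NDEKTSV4RRFFQ69G5FAV')

print(strict, describe(strict))        # b2c2ce78-... (True, 4)
print(from_ulid, describe(from_ulid))  # 01563e3a-b5d3-d676-4c61-efb99302bd5b (False, None)
```

This particular ULID does not carry the RFC 4122 variant bits. Other ULIDs may happen to carry them by chance, since those bit positions hold timestamp or random data.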

The to_uuid() function I showed above can be used to convert a ULID string into a value that can be stored efficiently in a PostgreSQL uuid field. The database doesn't check whether the 128 bits it receives are strict RFC 4122 or not; the value is not validated on insert.

The to_ulid() function can be used to convert the value of that uuid field back into a ULID string.
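
The same round trip can be sketched at the byte level, since a 128-bit uuid value occupies 16 bytes. The function names `ulid_to_bytes()` and `bytes_to_ulid()` are hypothetical, the input is assumed to be an already validated, upper-case ULID, and passing the bytes to a database driver is not shown.

```python
BASE32 = '0123456789ABCDEFGHJKMNPQRSTVWXYZ'

def ulid_to_bytes(ulid):
    number = 0
    for char in ulid:
        number = number * 32 + BASE32.index(char)
    # Big-endian, so the bytes read in the same order as the UUID hex string.
    # Raises OverflowError if the ULID encodes more than 128 bits.
    return number.to_bytes(16, 'big')

def bytes_to_ulid(raw):
    number = int.from_bytes(raw, 'big')
    chars = []
    # Always emit 26 characters, so leading zeros are preserved.
    for _ in range(26):
        chars.append(BASE32[number % 32])
        number //= 32
    return ''.join(reversed(chars))

ulid = '5JRB77HQYQ8P08BWPJRF5Q7F06'
raw = ulid_to_bytes(ulid)
print(raw.hex())           # b2c2ce78dfd7458085f2d2c3cb73bc06
print(bytes_to_ulid(raw))  # 5JRB77HQYQ8P08BWPJRF5Q7F06
```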