mdomke / python-ulid

ULID implementation for Python
https://python-ulid.rtfd.io
MIT License

base32.decode returns exactly the same bytes for two different strings. #31

Open E5presso opened 1 month ago

E5presso commented 1 month ago

Hello, I am currently using your ULID as the primary key (PK) for the database.

I am reaching out because I encountered a PK collision while parsing PK strings from the database with ULID.from_str. While debugging the issue, I discovered that base32.decode converts different strings into the same binary value.

I suspect this may be due to a malfunction in the decode_randomness function. Here is my test code.

from ulid import base32

def test_ulid_decode() -> None:
    first_sample = base32.decode("01J5PP33KAV586VA6296VXF0HU")
    second_sample = base32.decode("01J5PP33KAV586VA6296VXF0QZ")
    assert first_sample == second_sample  # this should fail, but it actually passes

migonzalvar commented 1 week ago

I can't reproduce this test case.

The ULID Spec in https://github.com/ulid/spec?tab=readme-ov-file#encoding says:

Crockford's Base32 is used as shown. This alphabet excludes the letters I, L, O, and U to avoid confusion and abuse.

0123456789ABCDEFGHJKMNPQRSTVWXYZ

@E5presso the first string, "01J5PP33KAV586VA6296VXF0HU", is not valid because it contains the letter "U".
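
To check stored PK strings against the alphabet quoted above before decoding them, a minimal sketch could look like the following. It uses only the standard library and does not call python-ulid; the names `CROCKFORD_ALPHABET`, `invalid_ulid_chars`, and `is_valid_ulid_string` are made up for this example, and it assumes canonical uppercase strings of 26 characters.

```python
# Hypothetical helper, not part of python-ulid: checks a string against the
# Crockford Base32 alphabet listed in the ULID spec (no I, L, O, U).
CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
ULID_LENGTH = 26  # canonical ULID string length per the spec


def invalid_ulid_chars(value: str) -> list[str]:
    """Return the characters of `value` that are outside the alphabet."""
    return [ch for ch in value if ch not in CROCKFORD_ALPHABET]


def is_valid_ulid_string(value: str) -> bool:
    """True if `value` has 26 characters, all from the alphabet.

    Assumption: strict uppercase only; lowercase input is rejected here.
    """
    return len(value) == ULID_LENGTH and not invalid_ulid_chars(value)


print(invalid_ulid_chars("01J5PP33KAV586VA6296VXF0HU"))    # ['U']
print(is_valid_ulid_string("01J5PP33KAV586VA6296VXF0HU"))  # False
print(is_valid_ulid_string("01J5PP33KAV586VA6296VXF0QZ"))  # True
```

Running a check like this over the stored keys should show which rows contain characters that the spec excludes.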