Closed somnam closed 5 months ago
Hi @somnam, I will look into this and see how we can improve the validation here. Thank you for raising this issue.
I added stricter validation when creating ULIDs from user input in release 2.6.0.
@mdomke Thank you for addressing the issue.
After updating the library to 2.6.0
, I noticed a problem:
from_uuid
method sometimes fail to be re-instantiated after serializing to string:
>>> ul = ULID.from_uuid(UUID('1084932d-2962-4e86-9126-7b714c690b1d')); ULID.from_str(str(ul))
ULID(0GGJ9JTAB29T3929KVE566J2RX)
ul = ULID.from_uuid(UUID('f3c9261d-a7b7-4df5-86cd-0bb65f57c252')); ULID.from_str(str(ul)) ValueError: year 10464 is out of range ValueError: ULID timestamp is out of range.
ul = ULID.from_uuid(UUID('948ddc46-31a4-4733-9ef0-fec8e1b4ef69')); ULID.from_str(str(ul)) ULID(4MHQE4CCD48WSSXW7YS3GV9VV9)
ul = ULID.from_uuid(UUID('ee6fcae3-baa8-4f34-96e7-6e486b0d7ddc')); ULID.from_str(str(ul)) ValueError: year 10277 is out of range ValueError: ULID timestamp is out of range.
from_hash
method sometimes doesn’t work:
>>> ULID.from_hex("51df4604bdc9663d3a74127482b76b18")
ULID(2HVX309FE9CRYKMX0JEJ1BETRR)
ULID.from_hex("eddf81fec79667690f22de49efacb34f") ValueError: year 10258 is out of range ValueError: ULID timestamp is out of range.
ULID.from_hex("446896f419c5fbc9d492a60fcf25d744") ULID(24D2BF86E5ZF4X94N61Z7JBNT4)
ULID.from_hex("f09152e4bc21063f71d7eaee5e7f24de") ValueError: year 10351 is out of range ValueError: ULID timestamp is out of range.
It seems that timestamp validation based on the range of Python's datetime
object does not always conform to the ULID spec. According to the ULID spec:
the largest valid ULID encoded in Base32 is 7ZZZZZZZZZZZZZZZZZZZZZZZZZ, which corresponds to an epoch time of 281474976710655 or 2 ^ 48 - 1
The datetime
validation appears to reduce the accepted timestamp portion range. Both UUIDs and 32-bit hex strings can exceed this reduced limit, leading to problems in generating or deserialising ULID instances.
Tweaking the timestamp validation to check for the upper limit mentioned in the spec file could solve these problems and ensure better compliance. I've created a PR with a proposed update to the validation. Could you please have a look?
To verify whether a received string conforms to a valid ULID, I use the
ULID.from_str
method:Occasionally, a string of 26 characters may not be a valid ULID but may match the ULID length. I initially assumed such values would trigger a
ValueError
. However, it seems that any string consisting of 26 ASCII chars is accepted byfrom_str
, resulting in ULID instances created from random data, i.e.:Is this the intended behavior?
Additionally in some cases, the year part of the timestamp extracted from such values surpasses the maximum allowed by the
datetime
class, leading to ayear is out of range
error when inspecting thedatetime
:Perhaps the decoded timestamp part could be validated whether it exceeds the maximum value accepted by
datetime.datetime
?