Open mv-i22 opened 6 years ago
Bullet point 5 of ULID spec reads, in part: "Uses Crockford's base32". Crockford Spec reads, in part: "When decoding, upper and lower case letters are accepted, and i and l will be treated as 1 and o will be treated as 0. When encoding, only upper case letters are used."
I think the spec should explicitly note that.
When decoding [...] i and l will be treated as 1 and o will be treated as 0
I'm also pretty sure very few implementations will respect this. And although the spec explicitly mentions Crockford's Base32 (which allows for hyphens (-) anywhere in the string), AFAIK most implementations don't allow for these. So either we're not (100%) using Crockford's Base32 but some 'derivative' OR these things should be called out more explicitly in the spec.
I have implemented both (allowing i, l, I, L, o and O and allowing hyphens) in my .Net implementation.
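To make the lenient decoding concrete, here is a minimal Python sketch of what "respecting" those Crockford rules could look like for a ULID. It is not taken from any existing implementation; the names `normalize_ulid` and `decode_ulid` are hypothetical, and it assumes the 26-symbol, 128-bit ULID layout.

```python
# Crockford's Base32 alphabet: digits plus upper-case letters, excluding I, L, O, U.
CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Decode-time aliases from the Crockford spec: i/l -> 1, o -> 0.
_ALIASES = str.maketrans({"I": "1", "L": "1", "O": "0"})


def normalize_ulid(text: str) -> str:
    """Return the upper-case form of a leniently written ULID.

    Hypothetical helper: accepts either case, hyphens anywhere, and the
    i/l/o aliases; raises ValueError for anything else.
    """
    candidate = text.replace("-", "").upper().translate(_ALIASES)
    if len(candidate) != 26:
        raise ValueError(f"expected 26 symbols, got {len(candidate)}")
    invalid = set(candidate) - set(CROCKFORD_ALPHABET)
    if invalid:
        raise ValueError(f"invalid symbols: {sorted(invalid)}")
    return candidate


def decode_ulid(text: str) -> int:
    """Decode a ULID string to its 128-bit integer value."""
    value = 0
    for symbol in normalize_ulid(text):
        value = value * 32 + CROCKFORD_ALPHABET.index(symbol)
    # 26 symbols carry 130 bits, so anything above 7ZZZ... overflows 128 bits.
    if value >= 1 << 128:
        raise ValueError("value exceeds 128 bits")
    return value
```

With this sketch, `01arz3ndek-tsv4rrffq69g5fav`, `O1ARZ3NDEKTSV4RRFFQ69G5FAV` and `01ARZ3NDEKTSV4RRFFQ69G5FAV` all decode to the same value, while a string containing `U` is rejected.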
Case-insensitivity applies to the codec only, not to every other use (such as a DB primary key or a Redis key). Please specify the case (or at least state which case is preferred); otherwise the DB may not find the "same" ULID.
BTW, I prefer lowercase: in a web environment most things are case-insensitive, so lowercase makes more sense, just as pg's gen_random_uuid outputs lowercase.
I don't see why the spec would have to define / enforce something as simple as upper/lowercase. If you have a specific usecase where you require either one, then call a .ToUpper() or strtolower() or whatever your language provides on it before inserting it or searching for a ULID. As you say, most usecases will be case-insensitive; for the cases where case matters, enforce it.
from https://datatracker.ietf.org/doc/html/rfc4122#section-3
The hexadecimal values "a" through "f" are output as lower case characters and are case insensitive on input.
UUID specified the output case here, the underlying codec is not ulid, the output is.
Without consistency on case, we can not just call gen_now_uuid, always used like to_lower(gen_now_uuid) or to_upper(gen_now_uuid())
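For comparison, the RFC 4122 behaviour quoted above (lower case on output, case-insensitive on input) can be seen in Python's standard `uuid` module. This is only an illustration of the UUID side; the sample value is arbitrary.

```python
import uuid

# Upper-case hex is accepted on input...
parsed = uuid.UUID("6BA7B810-9DAD-11D1-80B4-00C04FD430C8")

# ...but the string form is always lower case.
print(str(parsed))  # prints 6ba7b810-9dad-11d1-80b4-00c04fd430c8

# Both spellings parse to the same value.
assert parsed == uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")
```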
UUID specified the output case here
What they do is up to them, isn't it?
the underlying codec is not ulid, the output is.
I'm not sure I understand what you mean here. You mean the underlying encoding I guess? GUIDs are case-sensitive in most languages AFAIK too. To me, I don't see why we would enforce either lower or upper case; it's trivial in most cases where it matters to make the ULID upper- or lowercase. I can see that agreeing on a canonical notation would be beneficial, but the benefits are minor, next to none IMHO. So my reasoning then is to leave it up to whoever uses it and their usecase. There's no real technical reason to enforce either notation IMHO.
Without consistency on case, we can not just call gen_now_uuid, always used like to_lower(gen_now_uuid) or to_upper(gen_now_uuid())
If it really matters then why not create a wrapper/proxy/adapter/derived class that handles the upper- or lowercasing for your specific usecase? Shouldn't be more than a few lines of code in most languages.
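A minimal sketch of such a wrapper in Python, assuming nothing beyond the standard library. The class name `CanonicalUlid` and its `lowercase` flag are hypothetical, and the dict is only a stand-in for a case-sensitive store such as a DB primary-key index.

```python
class CanonicalUlid:
    """Hypothetical wrapper that fixes one case for storage and lookup."""

    __slots__ = ("_value",)

    def __init__(self, text: str, *, lowercase: bool = False) -> None:
        # The case is chosen per application here, not mandated by the spec.
        self._value = text.lower() if lowercase else text.upper()

    def __str__(self) -> str:
        return self._value

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, CanonicalUlid):
            return NotImplemented
        # Compare case-insensitively so differently configured wrappers agree.
        return self._value.upper() == other._value.upper()

    def __hash__(self) -> int:
        return hash(self._value.upper())


# A dict stands in for a case-sensitive store.
store = {str(CanonicalUlid("01arz3ndektsv4rrffq69g5fav")): "row 1"}
found = store.get(str(CanonicalUlid("01ARZ3NDEKTSV4RRFFQ69G5FAV")))
```

Both spellings end up as the same key, so the lookup succeeds regardless of which case the ULID generator emitted.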
I know, you could argue that it costs extra CPU cycles to uppercase an entire lowercase string or vice versa and is wasteful if you can just output the correct case directly. So then let's argue we choose lowercase as 'canonical form' and then still, from the cases where casing does matter, 50% will have to run it through uppercasing methods; and if we choose uppercase then the other 50% will have to do the same...
I just saw that there are implementations of ULID that provide uppercase only ULIDs, others (like the PHP implementation by @robinvdvleuten) provide lowercase ULIDs. The specification does not yet impose uppercase or lowercase but states that ULID is "case-insensitive". This is a great feature.
Nevertheless, I'd like to propose suggesting Uppercase ULID as "the right way". Mainly for two reasons:
I understand that this is debatable, as being flexible in your setup is a strength. But I also think that having either uppercase or lowercase as the proposed (or imposed) way to implement ULIDs will help the spec to spread, because there is less potential for conflicts.
What do you think of this?