ulid / spec

The canonical spec for ulid
GNU General Public License v3.0
9.74k stars 174 forks source link

Question: Has anyone managed to integrate ULID in PostgreSQL? #41

Open kikar opened 4 years ago

hjr3 commented 4 years ago

I created a domain:

CREATE DOMAIN ulid_d AS TEXT CHECK (...);

you can check for length, do a full regexp check, etc.

djbusby commented 4 years ago

There is this nasty C extension I made: https://github.com/edoceo/pg-ulid And this one that's in plpgsql - https://github.com/geckoboard/pgulid or https://github.com/schinckel/ulid-postgres

dharmaturtle commented 4 years ago

@hjr3 I'm confused why you made a domain/text for it. It's just a 16 byte number, and text increases (doubles?) that space, which increases your index size for no benefit. I just store my ULIDs in a UUID, which thankfully doesn't check to see if it conforms to RFC 4122:

CREATE TABLE mytable (
  id uuid NOT NULL,
  CONSTRAINT uuid_pkey PRIMARY KEY (id)
);
INSERT INTO public.mytable(id) VALUES ('00dead00-0000-0000-0000-0000beef0000');
hjr3 commented 4 years ago

@hjr3 I'm confused why you made a domain/text for it.

@dharmaturtle see https://github.com/ulid/spec/issues/25#issuecomment-457342085

dharmaturtle commented 4 years ago

@hjr3 I have issues with the linked article, and responded in that thread.

Since I'm invested in using the uuid type, I wrote a postgres function for generating ULIDs in a UUID format and made a PR for it here.

RPG-18 commented 3 years ago

You can try take it https://github.com/RPG-18/pg_ulid

asad-awadia commented 3 years ago

@dharmaturtle how are you generating ulids like that though? Both the jvm and golang ulid generators generate ulids that looks quite different than regular uuids

dharmaturtle commented 3 years ago

@asad-awadia ULIDs are 16 byte numbers, more or less. UUIDs and GUIDs are also 16 byte numbers. However, they're represented differently: UUIDs and GUIDs in hexadecimal with some octet ordering nonsense, and ULIDs as alphanumerics with lexical ordering.

I'm gonna emphasize this line from the spec:

128-bit compatibility with UUID

I can't speak for the jvm/golang generators, but in C# there's a ToGuid function. Of course you wanna make sure the UUID/GUID format maintains lexical ordering, but that depends on implementation details.

sergeyprokhorenko commented 3 years ago

See comparison of ULID generation functions for PostgreSQL Reviews are welcome

dvtkrlbs commented 2 years ago

Is Ulid represented as UUID still sortable ? I am debating between storing them as VARCHAR(26)and UUID.

peterbourgon commented 2 years ago

ULIDs and UUIDs are both 128 bits worth of data but they are not compatible with each other. ~More specifically: all ULIDs are UUIDs, but not all UUIDs are ULIDs. So you can't effectively store a ULID in a UUID column. If the spec says otherwise it's incorrect.~ edit: this comment is basically misleading, my apologies (but still don't do this 😉)

dvtkrlbs commented 2 years ago

Shouldn't storing ULIDs as UUID still work if you are making sure only ULIDs are inserted to that column. It is not ideal for sure but if all your apps are doing the same behavior that should work. If I do that does ordering them work the same?

dharmaturtle commented 2 years ago

Is Ulid represented as UUID still sortable ?

Can you ORDER BY? Technically, yes. My interpretation of that answer, given "UUIDs are compared lexically byte for byte using the binary values", is that ULIDs stored in a UUID column should "just work"... but I would caveat that with "you should test it to make sure it has the characteristics you're looking for."

Note that "sortable" depends on implementation details. For example, see this or this. UUIDs, unfortunately, have an unintuitive ordering of bytes, which may mess with the implementation of sortable. Note the comment at the end:

That said, the output of ToByteArray is awkward because the little-endian fields break field-oblivious binary portability.

So does postgres optimally sort ULIDs? I've no idea. Test it yourself and come back with answers :)


all ULIDs are UUIDs, but not all UUIDs are ULIDs. So you can't effectively store a ULID in a UUID column. If the spec says otherwise it's incorrect.

Can you elaborate @peterbourgon? My understanding is that the ULIDs are not UUIDs and UUIDs are not ULIDs for the simple reason that UUIDs have version/variant, and ULIDs do not. Their specs are incompatible. The new draft proposals for UUIDv6, v7, and v8 are compatible with UUIDs though.

peterbourgon commented 2 years ago

Can you elaborate @peterbourgon? My understanding is that the ULIDs are not UUIDs and UUIDs are not ULIDs for the simple reason that UUIDs have version/variant, and ULIDs do not.

Apologies, my prior comment was imprecise. (The issue I was referring to was that the canonical text encoding of a ULID is via Crockford base32, which produces strings of length 26, but that a Crockford base32 string of length 26 can have a value which is greater than the 128b/16B of a ULID, and therefore invalid.) In fact any arbitrary 128 bits of data will AFAIK always successfully parse as either a ULID or a UUID, so my comment was basically misleading.

UUIDs have version/variant, and ULIDs do not.

It doesn't really matter, I don't think: if you have 16 bytes of data, and those bytes successfully parse as a {ULID, UUID}, then they are a valid {ULID, UUID}. AFAIK the concepts of version/variant/timestamp/etc. exist "above" and independent of validity.

pksunkara commented 1 year ago

I have created a new extension called pgx_ulid for my personal use.

If you look at the README, it covers all the needed features for a proper extension and has a comparison table for all ulid extensions.