pypi / warehouse

The Python Package Index
https://pypi.org
Apache License 2.0

Switch Primary Keys to UUIDv7 #14676

Open dstufft opened 1 year ago

dstufft commented 1 year ago

We currently use UUIDv4 for our primary keys, which has a number of nice properties for us; however, it does have a number of downsides.

There's a new standard for UUIDs being worked on that provides UUIDv7, which is very interesting for use as a primary key.

UUIDv7 combines a time-based prefix with a randomized suffix, allowing us to have an identifier that is naturally ordered roughly by insertion, but which is still random, even when expressed as raw bytes (which is how PostgreSQL stores UUIDs).
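To make the "time-based prefix, randomized suffix" idea concrete, here is a minimal Python sketch. It assumes the commonly described UUIDv7 layout (48-bit Unix millisecond timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits); `uuid7_sketch` is a hypothetical helper for illustration, not something Warehouse or PostgreSQL provides.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-style value (hypothetical helper, assumed layout)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000  # current time in milliseconds
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    # Timestamp fills the top 48 bits, random data the low 80 bits.
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | rand
    # Overwrite the version nibble (bits 76-79) with 7.
    value = (value & ~(0xF << 76)) | (0x7 << 76)
    # Overwrite the variant bits (bits 62-63) with binary 10.
    value = (value & ~(0x3 << 62)) | (0x2 << 62)
    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_000)
later = uuid7_sketch(2_000)
# Raw bytes compare in timestamp order because the timestamp leads.
print(earlier.bytes < later.bytes)
```

Two values generated within the same millisecond are ordered only by their random bits, which is why the ordering is "roughly" by insertion rather than strict.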

It's probably too early to switch our primary keys at this point: UUIDv7 standardization isn't finalized and PostgreSQL doesn't natively support it yet (which only really matters for generating; you can store arbitrary bytes in a UUID type in PostgreSQL).

Without native support in PostgreSQL, we'd have to shift the generation of primary keys to Python code, which I don't believe is worth it. However, once PostgreSQL does natively support UUIDv7, we likely want to switch our primary keys to using it.

One open question is what we should do with old primary keys: whether we should attempt to convert them to UUIDv7 keys or just leave them as UUIDv4 keys.
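If old keys were left alone, UUIDv4 and UUIDv7 values would share the same columns. As a rough sketch of how the two could still be told apart, the standard library's `uuid.UUID.version` reads the version nibble from any stored value. The `key_version` helper and the sample UUIDv7 literal below are made up for illustration.

```python
import uuid


def key_version(key):
    """Return the UUID version of a key given as a string (hypothetical helper)."""
    return uuid.UUID(key).version


old_key = str(uuid.uuid4())  # what we generate today
new_key = "018f4e3a-7b2c-7abc-8def-0123456789ab"  # hand-written UUIDv7-shaped example

print(key_version(old_key), key_version(new_key))
```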

In any case, just filing this now as a reminder to do this in the future.

miketheman commented 1 year ago

Follow this CommitFest patch for when it might be added to Postgres. Then we need to wait until RDS adds the supported version, and we upgrade to that version.

dstufft commented 1 year ago

One note for old UUID keys: we do currently leak them into our API token caveats, so we'll probably need a solution to that if we want to convert them.

miketheman commented 1 year ago

Alternately, wait until a UUIDv7 extension is "stable" and also supported by RDS. A couple of candidates:

There might be some SQL-only implementations that could work, but I haven't found any yet. - https://gist.github.com/kjmph/5bd772b2c2df145aa645b837da7eca74