timonwong / cyksuid

Fast Python implementation of KSUID (K-Sortable Globally Unique IDs) using Cython
BSD 3-Clause "New" or "Revised" License

Support higher time precision #22

Closed jalaziz closed 2 years ago

jalaziz commented 2 years ago

The original blog post introducing ksuid suggested that a higher time precision could be offered by sacrificing payload bytes.

It would be great if this library could support millisecond precision as an alternative.

Looking at alternatives, it seems 48 bits are generally sufficient for storing the timestamp. The alternative svix-ksuid seems to use 5 bytes, presumably because of the modified epoch, but as a result it only offers 4 ms precision.
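To make the trade-off concrete, here is a rough sketch of how the timestamp width affects payload size and precision within a fixed 20-byte ID. The `layout` helper and the three layouts are illustrative assumptions, not cyksuid's or svix-ksuid's API; in particular, the 5-byte case assumes the extra byte stores 1/256ths of a second, which would explain the roughly 4 ms precision.

```python
# Illustrative only: `layout` is a hypothetical helper, not part of any library.
KSUID_TOTAL_BYTES = 20  # a KSUID is 20 bytes: timestamp followed by random payload


def layout(timestamp_bytes, ticks_per_second):
    """Return (payload_bytes, precision_ms) for a given timestamp layout."""
    payload_bytes = KSUID_TOTAL_BYTES - timestamp_bytes
    precision_ms = 1000 / ticks_per_second
    return payload_bytes, precision_ms


classic = layout(4, 1)       # 32-bit seconds, as in the original KSUID
svix_like = layout(5, 256)   # assumed: 32-bit seconds plus one byte of 1/256 s
wide = layout(6, 1000)       # 48-bit millisecond timestamp, as proposed here

for name, (payload, precision) in [
    ("4-byte", classic),
    ("5-byte", svix_like),
    ("6-byte", wide),
]:
    print(f"{name}: {payload} payload bytes, {precision} ms precision")
```

Each extra timestamp byte costs one byte of randomness, so the 48-bit variant would leave 14 random bytes instead of 16.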

jalaziz commented 2 years ago

I would be happy to try to contribute this, but it looks like the python3-only PR changes implementation details, so it will have to wait for that. Also, I'd be curious whether you have a preference on the overall approach (parameterizing the current API or introducing new classes and functions)?

timonwong commented 2 years ago

@jalaziz I have some experiments in my local dev environment to support that, and they are currently working.

However, I plan to release a new version and rewrite parts of the API in order to be compatible with svix's ksuid.

jalaziz commented 2 years ago

Amazing!

To be honest, the svix API isn't amazing, but I understand why it could be good to be compatible with it.

That being said, are you considering 40-bit timestamps for compatibility or would you also offer 48-bit timestamps?

timonwong commented 2 years ago

> That being said, are you considering 40-bit timestamps for compatibility or would you also offer 48-bit timestamps?

All of them: 32, 40 and 48.

See #24. It's still a work in progress since I'm in the middle of a big refactor.

timonwong commented 2 years ago

@jalaziz I can't find any reference doc or implementation for 48-bit timestamps. Do you have one?

jalaziz commented 2 years ago

@timonwong Not with ksuid. However, the ULID spec uses 48 bits, and it's fundamentally the same except that ksuid supports more random bytes:

If I'm not mistaken, using 48 bits with ksuid actually extends the lifetime, since you only need 42 bits to represent the current 32-bit seconds range at 1 ms precision. However, because we want to align to byte boundaries for sorting, you have to jump to 48 bits, which means the timestamp has more bits available to it (it can represent larger numbers) at the expense of random payload bits.
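The arithmetic above can be checked with a short sketch. Everything here is an assumption for illustration: `make_id_48` is a hypothetical helper, not cyksuid's API, and the layout (6 big-endian timestamp bytes followed by 14 payload bytes, counted from the KSUID epoch of 1400000000 seconds) is just one plausible design.

```python
import os

KSUID_EPOCH_S = 1_400_000_000      # epoch offset used by the original KSUID format
SECONDS_PER_YEAR = 365.25 * 86400

# Bits needed to cover the 32-bit seconds range at millisecond resolution.
max_ms = 2**32 * 1000 - 1
bits_for_ms = max_ms.bit_length()

# Approximate lifetimes: 32-bit seconds versus 48-bit milliseconds.
years_32 = 2**32 / SECONDS_PER_YEAR
years_48 = 2**48 / 1000 / SECONDS_PER_YEAR


def make_id_48(unix_ms, payload=None):
    """Hypothetical 20-byte ID: 48-bit ms timestamp plus 14 random bytes."""
    ts = unix_ms - KSUID_EPOCH_S * 1000
    if not 0 <= ts < 2**48:
        raise ValueError("timestamp out of range for 48 bits")
    if payload is None:
        payload = os.urandom(14)
    if len(payload) != 14:
        raise ValueError("payload must be exactly 14 bytes")
    # Big-endian encoding makes byte-wise comparison follow time order.
    return ts.to_bytes(6, "big") + payload


base_ms = KSUID_EPOCH_S * 1000
earlier = make_id_48(base_ms + 1, b"\xff" * 14)
later = make_id_48(base_ms + 2, b"\x00" * 14)

print(bits_for_ms, round(years_32), round(years_48))
print(earlier < later)
```

Because the timestamp occupies whole leading bytes in big-endian order, the IDs sort by time regardless of the payload, which is why the width is rounded up from 42 to 48 bits.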