paralleldrive / cuid2

Next generation guids. Secure, collision-resistant ids optimized for horizontal scaling and performance.
MIT License
2.6k stars, 53 forks

What's the right id length for 0.00000001% chance of collision? #77

Closed User124125 closed 1 month ago

User124125 commented 2 months ago

Hi there,

in the description it's stated that:

"by default, you'd need to generate roughly 4,000,000,000,000,000,000 ids ... to reach 50% chance of collision"

I feel like a 50% chance of collision is quite scary, so the big number 4,000,000,000,000,000,000 doesn't really persuade me.

I'm more interested in a close to 0% chance of collision...

How do I find out which length is right for me if I want a 0.00000001% chance of collision?

Or do I just set it to maxLength 98 and hope for the best?

Sorry if this is too nooby of a question. Maybe the docs could give a formula other than sqrt(36^(n-1)*26), so people can calculate how to be more collision-safe in their projects.

JoltCode commented 1 month ago

Just to be clear, you'll only reach this 50% chance of collision at 4,000,000,000,000,000,000 ids. The chance of collision increases as you decrease entropy or generate more ids. To put in perspective how ridiculously large this number is: the population of Earth in 2024 is roughly 8 billion, so every single human would need around 500 million ids each (8*10^9 * 500,000,000 = 4,000,000,000,000,000,000, or 4*10^18) to reach a 50% chance of collision. So you probably don't need more than this.

Still concerned? You can always increase the entropy as you suggested, but beware, that will come at a performance cost.

ericelliott commented 1 month ago

The chance of collision depends very much on how many IDs you want to generate, so instead of answering your question directly, I'll tell you how many IDs you can generate by default with less than 10^-10 chance of collision: ~57 trillion. Or roughly 7,000 IDs for every human in the world.
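
For reference, the ~57 trillion figure can be reproduced with the small-probability form of the birthday approximation, p ≈ n² / (2N), which gives n ≈ sqrt(2 · N · p). This is only a sketch: it assumes the id space is N = 26 × 36^(length − 1), as in the sqrt(36^(n-1)*26) expression quoted from the docs, and a default length of 24. The function names are made up for illustration and are not part of cuid2.

```python
import math

DEFAULT_LENGTH = 24  # assumption: cuid2's default id length


def id_space(length: int) -> int:
    """Possible ids, assuming 26 choices for the first char and 36 for each other char."""
    return 26 * 36 ** (length - 1)


def max_ids_for_probability(p: float, length: int = DEFAULT_LENGTH) -> float:
    """Approximate id count at which the collision probability reaches p.

    Uses p ~= n^2 / (2N), which is only accurate for small p.
    """
    return math.sqrt(2 * id_space(length) * p)


n = max_ids_for_probability(1e-10)
print(f"ids before a 1e-10 collision chance: {n:.2e}")
print(f"ids per person (8 billion people): {n / 8e9:.0f}")
```

Under these assumptions the result lands at roughly 5.7 × 10^13 ids, or about 7,000 per person.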

User124125 commented 1 month ago

First, thank you for the answers and the project!

Let's say I wanted to use the ids for something like Wikipedia. 7,000 ids for every human in the world seems reachable over a span of 40 years of uptime or so... articles upon articles will be added to the system, and at some point it will just silently fail...

That's why I was asking for a formula, so one could calculate for their own project, roughly determine the point of failure, and act beforehand.

You said your calculation uses the default length... would you be willing to share the formula so one could calculate with a variable id length?

ericelliott commented 1 month ago

First, even if you were the most popular site in the world, YouTube, you'd get roughly 262 million posts per year x 40 years = ~10.5 billion posts, x 2.5 average interactions per post = ~26 billion ids. Even after 40 years, there's still an infinitesimally small chance of a single collision using default values.

Here's the formula - this is the birthday paradox square root approximation adapted to the structure of CUID2 values:

[Image: IMG_1077]
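
The image is not reproduced in this text. As a rough stand-in, the general birthday approximation p ≈ 1 − exp(−n² / 2N) can be sketched as below, assuming N = 26 × 36^(length − 1) as in the expression quoted at the top of the thread. It may not match the pictured formula exactly. The function names and the `max_length` search cap are hypothetical and are not part of cuid2.

```python
import math
from typing import Optional


def id_space(length: int) -> int:
    """Possible ids, assuming 26 choices for the first char and 36 for each other char."""
    return 26 * 36 ** (length - 1)


def collision_probability(n_ids: float, length: int) -> float:
    """Birthday approximation: p ~= 1 - exp(-n^2 / (2N))."""
    exponent = -(n_ids ** 2) / (2 * id_space(length))
    # expm1 keeps precision when the probability is extremely small
    return -math.expm1(exponent)


def min_length(n_ids: float, p_max: float, max_length: int = 64) -> Optional[int]:
    """Smallest length whose collision probability stays at or below p_max.

    max_length is an arbitrary search cap for this sketch, not a cuid2 limit.
    Returns None if no length up to the cap is sufficient.
    """
    for length in range(2, max_length + 1):
        if collision_probability(n_ids, length) <= p_max:
            return length
    return None


# YouTube-style estimate from this thread: 262 million posts/year x 40 years x 2.5
n = 262e6 * 40 * 2.5
print(f"p at length 24: {collision_probability(n, 24):.1e}")
print(f"min length for p <= 1e-10: {min_length(n, 1e-10)}")
```

With these assumptions, the YouTube-style estimate (about 2.62 × 10^10 ids) gives a probability on the order of 10^-17 at length 24, and a length of 20 would already keep it under 10^-10.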