uuid6 / uuid6-ietf-draft

Next Generation UUID Formats
https://datatracker.ietf.org/doc/draft-peabody-dispatch-new-uuid-format/
187 stars 11 forks source link

Freeze the counter until it overflows #53

Closed sergeyprokhorenko closed 2 years ago

sergeyprokhorenko commented 2 years ago

Freeze the counter until it overflows. Fatal run-time errors are not allowed. Freezing the counter may only result in a slight decrease in database performance

broofa commented 2 years ago

This would complicate how we rationalize the probability of collisions, by introducing a new state for the generator. It would also be difficult to detect for users that may not be comfortable with this. Treating counter rollover as an error will alert users to the issue, and allow them to opt for a format with more counter bits.

sergeyprokhorenko commented 2 years ago

This would complicate how we rationalize the probability of collisions, by introducing a new state for the generator. It would also be difficult to detect for users that may not be comfortable with this. Treating counter rollover as an error will alert users to the issue, and allow them to opt for a format with more counter bits.

You shoud treat freeze of the counter (up to the next timestamp) as warning, but not fatal error that crashes the system at run time for no good reason. The freeze of the counter does not increase the probability of collisions, bacause random part is still changes for each new UUID.

broofa commented 2 years ago

The freeze of the counter does not increase the probability of collisions, because random part is still changes for each new UUID.

Even in the simple case where we only consider IDs generated by a single host process the odds of collision for ids generated in the same system-clock interval are different.

In more complex cases where you consider ids generated by many hosts, you lose the 100% uniqueness guarantee, but frozen counters still increase the colllision risk. In these cases, the timestamp + counter bits produce an effective clock resolution that is much greater than that of the system clock. If you allow the counter to freeze, you have to assume that effective resolution limited to what the system clock provides. For example, a UUID generated with 1-millisecod system clock + 12-bit counter has an effective clock interval of 244 nanoseconds (0.001/ 212). With the counter frozen, the effective interval is somewhere between that and 1 millisecond. It depends entirely on how long the counter is frozen for.

fabiolimace commented 2 years ago

I love the way our discussions evolve. Every little aspect is being scrutinized with great respect.

sergeyprokhorenko commented 2 years ago

@broofa, If you continue your thought, then you need to completely abandon the unnecessary random part, because the counter itself guarantees 100% uniqueness. You don't even need a timestamp because an increment is enough. It is both 100% unique and monotonous! Just a miracle! This is a joke.

Nobody does that. Because in fact, 100% uniqueness is not needed, but sufficient uniqueness is needed. For example, even 55 bits of the random part means only 2.7755576e-17 chance of a collision, according to your calculations. This is a negligible value. But in fact, your calculations are misleading, because the probability of a collision is much less. After all, even without a random part and without a counter, the probability of a collision in the same millisecond is negligible, regardless of the number of hosts.

I'm not against the counter. I am against bringing down the information system if the counter suddenly overflows. I, unlike you, at least try to prevent this and suggest lengthening the counter from 12 bits to 15 bits for millisecond accuracy. An even longer counter would not be necessary because all generated UUIDs cannot be inserted into the database table in the same millisecond.

Don't worry too much about the effective clock resolution of timestamp + counter. A slight break in key monotonicity due to a counter freeze when writing to a database table will only cause a slight slowdown when reading records from the database table. Modern DBMS can handle even UUID v 4.

broofa commented 2 years ago

I didn't say 100% uniqueness was needed. I said your assertion that freezing the counter does not change the probability was wrong. And then established that to be true by pointing out how the probability changes depending on whether or not you have an operational counter in all cases.

Being able to reason about collision probability is important. Maintaining guarantees about uniqueness and monotonicity are important. This specification will suffer to whatever degree we dismiss these key tenets of this work.

peterbourgon commented 2 years ago

Fatal run-time errors are not allowed.

Why not?

fabiolimace commented 2 years ago

IMHO it is preferable to make the generator wait a millisecond and guarantee some monotonicity rather than throwing an error and losing it.

If a system faces a bottleneck when generating 10 million UUIDs v1 per second, there may be a much bigger problem to solve, and the bottleneck in UUID generation is just the tip of the iceberg.

However, it is better to keep the two options that RFC-4122 already allows: freeze the generator or throw an error. Exceptions thrown in the system log are more effective for diagnosing issues before it gets worse, BTW.

peterbourgon commented 2 years ago

IMHO it is preferable to make the generator wait a millisecond and guarantee some monotonicity rather than throwing an error and losing it.

For some use cases, I agree. For all use cases? I don't think you can make such a strong claim.

Generating an identifier can fail. That failure should, by default, be reported to the caller so they can act upon it appropriately.

fabiolimace commented 2 years ago

Yes, I exaggerated the rhetoric. Sorry.

It is not applicable in all cases. And the standard should keep both options, at least for UUIDv1 and UUIDv6.

The new versions, v7 and v8, I'm not sure. It depends on whether there will be a counter (former clock sequence) or not.

kyzer-davis commented 2 years ago

Group, over on #58 I implemented text around Counters including some text around how to handle overflows if somebody decides to utilize a fixed bit length counter.

Please let me know what you think.