Open sergeyprokhorenko opened 9 months ago
(I had another comment here, but I think it was wrong, so I deleted it)
The most advanced official implementation available for PostgreSQL offers superior functionality.
See the article: Postgres UUIDv7 + Per-Backend Monotonicity
Hi, this has been sitting on our board for a bit. I wouldn't hate having this in the library, but I'm torn on how we'd want to approach it. I see that PostgreSQL has chosen Method 3 for its v7 generation. It's not clear to me whether that's too opinionated for this library, in which case we should offer a way to choose between Method 1 and Method 3. What are your thoughts on the tradeoffs?
Any opinion on that?
@cameracker
The PostgreSQL algorithm provides additional guarantees compared to RFC 9562. In fact, it combines Method 3 with Method 1. The entire timestamp can also function as a counter when more than about 4 identifiers per microsecond are generated. Therefore, the monotonicity of generated UUIDs is ensured within the same backend (a single process).
A mutex could have ensured monotonicity across all backends, but it was too slow, so it was abandoned.
Monotonically increasing identifiers are generated even if the system clock jumps backward, if access to the system clock is unavailable, or if UUIDs are generated at a very high frequency, due to the internal timestamp functioning as a counter to maintain order.
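To make the "timestamp as a counter" idea concrete, here is a minimal Go sketch of the per-process step described above. It is not PostgreSQL's code: the `monotonicClock` type and its `next` method are hypothetical names, and the 245 ns minimal step is an assumption derived from the "about 4 identifiers per microsecond" figure (one 12-bit sub-millisecond tick, rounded up).

```go
package main

import "fmt"

// minStepNs is an assumed minimal increment: one 12-bit sub-millisecond tick
// (1,000,000 ns / 4096, rounded up), which allows about 4 steps per microsecond.
const minStepNs = 1_000_000/4096 + 1

// monotonicClock is a hypothetical per-process clock. It is not safe for
// concurrent use; a real implementation would need a lock or atomics.
type monotonicClock struct {
	prevNs int64 // last timestamp handed out, in nanoseconds
}

// next returns nowNs if it is ahead of the previous value. Otherwise the
// previous value is bumped by minStepNs, so the timestamp acts as a counter
// when the clock stalls, repeats, or jumps backward.
func (c *monotonicClock) next(nowNs int64) int64 {
	if nowNs <= c.prevNs {
		nowNs = c.prevNs + minStepNs
	}
	c.prevNs = nowNs
	return nowNs
}

func main() {
	var c monotonicClock
	// Simulated clock readings: normal, repeated, then jumped backward.
	for _, now := range []int64{1_000_000, 1_000_000, 500_000} {
		fmt.Println(c.next(now))
	}
}
```

Every value returned by `next` is strictly greater than the one before it, regardless of what the wall clock reports.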
The uuidv7() function can accept an optional offset parameter of type interval which is added to the internal timestamp. If the offset parameter results in a timestamp overflow or a negative timestamp, an adjusted timestamp value is automatically used. The timestamp behaves like a ring buffer: when the maximum value is exceeded, it wraps around to the minimum value. Similarly, if the absolute value of the negative offset exceeds the time elapsed since 00:00:00 UTC on January 1, 1970, the timestamp wraps around to the maximum value.
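The ring-buffer behaviour can be illustrated with a small Go sketch. It assumes the wrap happens over the 48-bit millisecond timestamp field of UUIDv7; `shiftTimestampMs` is a hypothetical helper written for this comment, not PostgreSQL's actual adjustment logic.

```go
package main

import "fmt"

const (
	tsBits = 48                // width of the unix_ts_ms field in UUIDv7
	tsMask = (1 << tsBits) - 1 // largest value the field can hold
)

// shiftTimestampMs adds offsetMs to unixMs and wraps the result into the
// 48-bit range: overflow wraps to the minimum, negative results wrap to the
// maximum. Converting to uint64 and masking is arithmetic modulo 2^48.
func shiftTimestampMs(unixMs, offsetMs int64) uint64 {
	return uint64(unixMs+offsetMs) & tsMask
}

func main() {
	fmt.Println(shiftTimestampMs(1000, 500))   // ordinary shift
	fmt.Println(shiftTimestampMs(tsMask, 1))   // overflow wraps around to the minimum
	fmt.Println(shiftTimestampMs(1000, -1001)) // negative result wraps around to the maximum
}
```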
I like the idea of improving efficiency, but I'm not sure this fits into the API of this library.
> The uuidv7() function can accept an optional offset parameter ...
This is one thing that sticks out for me. There are no optional parameters in Go, so we'd either have to have 2 exported functions, one that takes an offset and one that doesn't, or force everyone to pass a "zero" offset explicitly. That doesn't feel like a net gain from an API standpoint.
We could potentially use the PostgreSQL algorithm internally, but that eliminates the advertised benefit of the optional offset.
> There are no optional parameters in Go
A nice way to achieve something like optional parameters is to use variadic arguments.
The PostgreSQL algorithm would have to be modified by replacing the interval type with a plain unit of time. I prefer milliseconds (int64 or uint64).
The ability to shift the timestamp value makes it possible to hide the true date of record creation, prevents lock contention when generating UUIDs in parallel across multiple processes, and ensures monotonicity when generating on remote clients.
Percona Server for MySQL also uses a timestamp offset (in milliseconds, positive or negative):
UUID | Version Argument | Description |
---|---|---|
UUID_V7() | Can have either no argument or a one integer argument: the argument is the number of milliseconds to adjust the timestamp forward or backward (negative values). | Generates a version 7 UUID based on a timestamp. If there is no argument, no timestamp shift occurs. Timestamp shift can hide the actual creation time of the record. |
Detailed Description
Although fault tolerance requires that each microservice writes to its own database tables, in practice this requirement is often violated.
The implementation of UUIDv7 for PostgreSQL had to switch from Method 1 to Method 3 (Increased Clock Precision with a 12-bit sub-millisecond timestamp fraction) to synchronize the UUIDv7s generated by different microservices for the same database table. This turned out to be simpler than the autoincrement-like analogue. See the C implementation of Method 3 in v27-0001-Implement-UUID-v7.patch for reference. The entire timestamp acts as a counter in the rare case when more than about 4 identifiers per microsecond are generated.
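For readers unfamiliar with Method 3, the following Go sketch shows how a nanosecond timestamp could be split into the 48-bit millisecond field and a 12-bit sub-millisecond fraction stored in `rand_a`, following the UUIDv7 layout in RFC 9562. The `encodeV7` function is a hypothetical name, and the caller supplies the random bytes; this is an illustration, not a port of the PostgreSQL patch.

```go
package main

import "fmt"

const nsPerMs = 1_000_000

// encodeV7 builds a UUIDv7 from a Unix timestamp in nanoseconds and 8 bytes
// of caller-supplied randomness (hypothetical helper for illustration).
func encodeV7(ns int64, random [8]byte) [16]byte {
	var u [16]byte
	ms := uint64(ns / nsPerMs)
	// Scale the sub-millisecond remainder (0..999,999 ns) to 12 bits (0..4095).
	frac := uint16((ns % nsPerMs) * 4096 / nsPerMs)

	// Bytes 0-5: 48-bit big-endian millisecond timestamp.
	for i := 0; i < 6; i++ {
		u[i] = byte(ms >> uint(40-8*i))
	}
	// Byte 6: version 7 in the high nibble, top 4 bits of the fraction below it.
	u[6] = 0x70 | byte(frac>>8)
	// Byte 7: low 8 bits of the fraction.
	u[7] = byte(frac)
	// Bytes 8-15: random data, with the variant bits set to binary 10.
	copy(u[8:], random[:])
	u[8] = u[8]&0x3F | 0x80
	return u
}

func main() {
	// Fixed "random" bytes keep the example deterministic.
	u := encodeV7(1_700_000_000_000*nsPerMs+500_000, [8]byte{1, 2, 3, 4, 5, 6, 7, 8})
	fmt.Printf("%x\n", u)
}
```

Because the fraction sits directly after the millisecond field, UUIDs generated within the same millisecond still sort by creation time, at a resolution of roughly a quarter of a microsecond.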
This implementation also added the ability to offset the timestamp by a specified interval to hide the record creation time for information security.
It would be nice to add such a special UUIDv7 function for microservices.