f4b6a3 / uuid-creator

UUID Creator is a Java library for generating Universally Unique Identifiers.
MIT License

UNID: as an alternative for ULID #9

Closed: nimo23 closed this issue 4 years ago

nimo23 commented 4 years ago

ULIDs are derived from the UUID (with timezone) creators and therefore also use a string-based approach. Please include an alternative to ULID that is a number instead of a string:

UNID-based Number: a number with 48 bits for Unix milliseconds and 80 bits for randomness. The 80-bit random suffix can come from either a secure or a fast random number generator. (Or: 13 digits for the milliseconds and 5 or 6 digits for the random suffix.)
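To make the proposed layout concrete, here is a minimal sketch of a 128-bit value with a 48-bit millisecond prefix and an 80-bit random suffix. The class and method names (UnidSketch, create, timestampOf) are hypothetical and are not part of uuid-creator or ulid-creator; SecureRandom stands in for the "secure" option mentioned above.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

// Hypothetical helper for illustration only; not an API of uuid-creator or ulid-creator.
final class UnidSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    private UnidSketch() {
    }

    // Builds a 128-bit number: 48-bit millisecond timestamp followed by 80 random bits.
    static BigInteger create(long epochMillis) {
        // Keep only the low 48 bits of the timestamp.
        BigInteger time = BigInteger.valueOf(epochMillis & 0xFFFFFFFFFFFFL);
        // BigInteger(numBits, rnd) returns a uniformly distributed value in [0, 2^80).
        BigInteger random = new BigInteger(80, RANDOM);
        return time.shiftLeft(80).or(random);
    }

    // Recovers the timestamp by dropping the 80 random bits.
    static long timestampOf(BigInteger unid) {
        return unid.shiftRight(80).longValueExact();
    }
}
```

Because the timestamp occupies the most significant bits, values created in later milliseconds always compare greater than earlier ones, for example UnidSketch.create(System.currentTimeMillis()).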

You can call it UNID (Universally Unique Number Sortable Identifier).

This UNID would be a good alternative for ULID because:

I have actually been using the approach from https://github.com/callicoder/java-snowflake (read more at https://www.callicoder.com/distributed-unique-id-sequence-number-generator/). But a UNID generator would be a very good addition to the uuid-creator and ulid-creator projects. Could you do that?

fabiolimace commented 4 years ago

A UUID is a 128-bit number. These methods generate "ULID-based" UUIDs:

UUID uuid = UuidCreator.getUlidBased();
UUID guid = UlidCreator.getGuid();

I'm afraid I didn't understand your question. Do you mean that these libraries should have a method that returns an instance of java.lang.Number, such as BigInteger? Maybe it's not necessary, since the UUID is a number.
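As an illustration of "the UUID is a number", the sketch below converts any java.util.UUID into an unsigned 128-bit BigInteger using only the standard library. The class name UuidNumbers and the method toBigInteger are hypothetical names chosen for this example.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper for illustration only; uses only JDK classes.
final class UuidNumbers {

    private UuidNumbers() {
    }

    // Interprets the 128 bits of a UUID as one unsigned integer.
    static BigInteger toBigInteger(UUID uuid) {
        byte[] bytes = ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
        // Signum 1 makes BigInteger treat the big-endian bytes as a non-negative magnitude.
        return new BigInteger(1, bytes);
    }
}
```

The same call should work on the result of UuidCreator.getUlidBased(), since that method returns a java.util.UUID.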

Databases like PostgreSQL have a data type for UUID. Do you need it for a specific database that doesn't support the UUID data type?

I don't have objections. I just want to understand why you need a number instead of a UUID and if that number is a BigInteger.

Could you give more details of what you need?

Btw, I like the name UNID.

nimo23 commented 4 years ago

I was thinking of returning the ID as a primitive long when calling UnidCreator.getUnid() or UnidCreator.getNext(). Currently, all returned IDs are string values, and for databases that don't support UUID, or where memory space matters, a primitive long instead of a string is a reasonable alternative. I don't know the implementation specs of UUID, but maybe you could take the (time-based) UUID, which is a number, and return its raw numeric representation as the UNID (millis with random). BigInteger would not have much benefit compared to String, so returning a primitive long for UNID would be best.

Since a long cannot hold a 128-bit UUID, it could be, for example, a combination of the timestamp (most significant bits, 41 bits) and a random part in the remaining space (least significant bits: nodeIdentifier 10 bits, clockSequence 12 bits). The result would be a primitive long called UNID.
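The 41 + 10 + 12 bit split described above can be sketched with plain shifts and masks. This is only an illustration of the bit layout; LongIdLayout and its methods are hypothetical names, and a real generator would also have to manage the clock and the sequence.

```java
// Hypothetical helper for illustration only: packs 41 + 10 + 12 = 63 bits into a long.
final class LongIdLayout {

    static final int TIME_BITS = 41;
    static final int NODE_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    private LongIdLayout() {
    }

    // Timestamp in the most significant bits, then node identifier, then sequence.
    static long compose(long millis, int nodeId, int sequence) {
        long time = millis & ((1L << TIME_BITS) - 1);
        long node = nodeId & ((1L << NODE_BITS) - 1);
        long seq = sequence & ((1L << SEQUENCE_BITS) - 1);
        return (time << (NODE_BITS + SEQUENCE_BITS)) | (node << SEQUENCE_BITS) | seq;
    }

    static long timestampOf(long id) {
        return id >>> (NODE_BITS + SEQUENCE_BITS);
    }

    static int nodeOf(long id) {
        return (int) ((id >>> SEQUENCE_BITS) & ((1L << NODE_BITS) - 1));
    }

    static int sequenceOf(long id) {
        return (int) (id & ((1L << SEQUENCE_BITS) - 1));
    }
}
```

Since only 63 bits are used, the sign bit stays clear and the resulting IDs sort by time as ordinary positive long values.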

fabiolimace commented 4 years ago

I started the repository tsid-creator.

It generates IDs with one of these two structures.

  1. Structure with node ID:
                                            adjustable
                                           <---------->
|------------------------------------------|----------|------------|
    timestamp: millisecs since 2020-01-01     nodeid      counter
                42 bits                       10 bits     12 bits

- timestamp: 42 bits = ~139 years (2^42 ms), or ~69 years (2^41 ms) before the ID turns negative as a signed long (with adjustable epoch)
- nodeid: 2^10 = 1,024 (user defined)
- counter: 2^12 = 4,096 (initially random)

Note:
The node ID is adjustable from 0 to 20 bits.
The node ID bit length affects the counter bit length.

  2. Structure WITHOUT node ID:
|------------------------------------------|----------------------|
    timestamp: millisecs since 2020-01-01           counter
                42 bits                             22 bits

- timestamp: 42 bits = ~139 years (2^42 ms), or ~69 years (2^41 ms) before the ID turns negative as a signed long (with adjustable epoch)
- counter: 2^22 = 4,194,304 (initially random)
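The two structures above differ only in how the low 22 bits are divided between node ID and counter. The following is a minimal sketch of that adjustable split, not the actual tsid-creator code; TsidLayoutSketch and its members are hypothetical names, and it assumes the timestamp passed in is not earlier than the 2020-01-01 epoch.

```java
import java.time.Instant;

// Hypothetical illustration of the bit layout; not the tsid-creator implementation.
final class TsidLayoutSketch {

    // Custom epoch: milliseconds are counted from 2020-01-01T00:00:00Z.
    static final long EPOCH_2020 = Instant.parse("2020-01-01T00:00:00Z").toEpochMilli();
    // Bits shared by node ID and counter (64 - 42).
    static final int LOW_BITS = 22;

    private final int nodeBits;
    private final int counterBits;

    // nodeBits = 0 gives the structure without node ID; 10 gives the 10/12 split.
    TsidLayoutSketch(int nodeBits) {
        if (nodeBits < 0 || nodeBits > 20) {
            throw new IllegalArgumentException("nodeBits must be between 0 and 20");
        }
        this.nodeBits = nodeBits;
        this.counterBits = LOW_BITS - nodeBits;
    }

    // Assumes unixMillis >= EPOCH_2020.
    long compose(long unixMillis, int node, int counter) {
        long time = (unixMillis - EPOCH_2020) << LOW_BITS;
        long nodePart = (node & ((1L << nodeBits) - 1)) << counterBits;
        long counterPart = counter & ((1L << counterBits) - 1);
        return time | nodePart | counterPart;
    }

    long unixMillisOf(long tsid) {
        return (tsid >>> LOW_BITS) + EPOCH_2020;
    }

    int nodeOf(long tsid) {
        return (int) ((tsid >>> counterBits) & ((1L << nodeBits) - 1));
    }

    int counterOf(long tsid) {
        return (int) (tsid & ((1L << counterBits) - 1));
    }
}
```

With new TsidLayoutSketch(10) the low bits split into 10 node bits and 12 counter bits; with new TsidLayoutSketch(0) the node argument is ignored and all 22 low bits hold the counter.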

The term TSID stands for (roughly) Time Sortable ID.

I couldn't call it UNID because it's not possible to guarantee universal uniqueness with 64 bits. But it can be considered unique within a cluster, using the node ID.

It's just an initial implementation. A lot of things can still change, for example the code copied from other repositories.

nimo23 commented 4 years ago

Very good start.

I like the name and the possibility to adjust the epoch and to set the nodeid to 0. It would be good to include this repo in uuid-creator to have all possible ID generators in one place.

I would also like to see TSID in the benchmark section of uuid-creator. (If there is a difference in performance, the benchmark could measure TSID with nodeId=0 and nodeId=10.)

fabiolimace commented 4 years ago

The time-sortable ID generator is in this project: https://github.com/f4b6a3/tsid-creator

It will also be included in this other project that generates ALL types of IDs: https://github.com/f4b6a3/id-creator

All benchmarks will be in this project, including the tsid-creator: https://github.com/f4b6a3/fabiolimace/id-creator-benchmark

Thanks again and sorry for taking so long.

nimo23 commented 4 years ago

Thanks. UNID is really a good alternative compared to UUID or ULID when it comes to speed and size.