ulid / spec

The canonical spec for ulid
GNU General Public License v3.0

Is this repository still active? #55

Open ultimaweapon opened 3 years ago

ultimaweapon commented 3 years ago

Just wondering, since there are a lot of open PRs.

amcgregor commented 3 years ago

I also question this, and am sad I missed the initial development of this idea. Most of the PRs appear to be language-specific implementations, or references to such. The developer has been otherwise active, which relieves me a bit.

In my toolbox, this has been a solved problem for years using coordination-free ObjectIDs (link to my slightly more functional clean-room reimplementation), which also offer additional room for taint/tracing/origin information. Plus all of the other goodness of replacing a creation time field, being range filterable and sortable, and so on. For a representation more compact than hexadecimal (which would be 24 characters), I use HHC, treating the ObjectID as a 96-bit integer.
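To make the "96-bit integer" point concrete, here is a minimal sketch of converting a 12-byte identifier to an integer and re-encoding it in a denser alphabet. It uses a plain base-62 alphabet as a stand-in. This is not HHC, and the `encode`/`decode` names are hypothetical.

```python
import string

# Stand-in alphabet (assumption): 62 symbols, NOT the HHC alphabet.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)


def encode(value: int) -> str:
    """Encode a non-negative integer using the stand-in alphabet."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return ALPHABET[0]
    digits = []
    while value:
        value, remainder = divmod(value, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode(text: str) -> int:
    """Reverse of encode()."""
    value = 0
    for char in text:
        value = value * BASE + ALPHABET.index(char)
    return value


# A 12-byte identifier treated as one 96-bit integer.
raw = bytes(range(12))
as_int = int.from_bytes(raw, "big")
compact = encode(as_int)
```

With 62 symbols, any 96-bit value fits in at most 17 characters, against 24 for hexadecimal.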

The timestamp only has one-second resolution. Uniqueness within a second is handled by a per-process counter with a random IV, but the coarse resolution does harm replacement of a creation time field if milliseconds are required. Two generations within the same second from the same process on the same machine will have unique counters.
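As a rough illustration of the layout described here, the following sketch builds a 12-byte identifier from a 4-byte seconds timestamp, a 5-byte identifier, and a 3-byte counter that starts at a random value. The byte widths follow the MongoDB ObjectId layout. The `generate` function and module-level names are hypothetical and are not taken from the linked implementation.

```python
import os
import struct
import threading
import time

_lock = threading.Lock()
# The counter starts at a random 24-bit value (the "random IV").
_counter = int.from_bytes(os.urandom(3), "big")
# Random 5-byte identifier chosen once per process (assumption: "random" mode).
_hwid = os.urandom(5)


def generate(now=None) -> bytes:
    """Return 12 bytes: 4-byte seconds | 5-byte identifier | 3-byte counter."""
    global _counter
    seconds = int(time.time() if now is None else now)
    with _lock:
        # Increment under a lock so threads in this process never share a value.
        _counter = (_counter + 1) & 0xFFFFFF
        count = _counter
    return struct.pack(">I", seconds) + _hwid + count.to_bytes(3, "big")


# Two identifiers generated within the same second differ only in the counter.
first = generate(now=1_700_000_000)
second = generate(now=1_700_000_000)
```

Because the timestamp occupies the leading bytes, identifiers from different seconds sort by creation time. Within one second the order follows the counter, which can wrap around.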

I'm… too often "that asshole", but I have to ask: why is? Kudos for formalizing a specification, though, independent of a specific use case.

ultimaweapon commented 3 years ago

Thanks for sharing. For me, it can be anything that meets my requirements. At first I considered using Twitter Snowflake, but it requires a machine identifier, which is not container friendly. I don't remember how I found ULID, but it meets all my requirements.

amcgregor commented 3 years ago

@ultimaweapon The clean-room ObjectID implementation I linked (already-solved problem) allows for all official forms of "machine identifier" generation from MongoDB and their modern "random identifier on startup" approach, and also implements hardware MAC hashing (last byte used to XOR all prior bytes of the MAC) and fully custom identifiers.

In virtual machine cases (i.e. containers) you have complete control over the MAC, hostname, and absolutely can specify the internal identifier explicitly. From the module docstring:

To determine which approach is used for generation, specify the hwid keyword argument to the ObjectID() constructor. Possibilities include:

  • The string legacy: use the host name MD5 substring value and process ID. Note if FIPS compliance is enabled, the md5 hash will literally be unavailable for use, resulting in the inability to utilize this choice.
  • The string fips: use the FIPS-compliant FNV hash of the host name, in combination with the current process ID. Requires the fnv package be installed.
  • The string mac: use the hardware MAC address of the default interface as the identifier. Because a MAC address is one byte too large for the field, the final byte is used to XOR the prior ones.
  • The string random: pure random bytes, the default, aliased as modern.
  • Any 5-byte bytes value: use the given HWID explicitly.

You are permitted to add additional entries to this mapping within your own application, if desired.
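The `mac` option above folds a 6-byte MAC address into the 5-byte field by XORing the final byte into each of the prior five. A minimal sketch of that folding follows. The `fold_mac` and `local_hwid` names are hypothetical, and `uuid.getnode()` is only an assumption about how one might obtain a MAC: it does not guarantee the default interface and may return a random value if no address is found.

```python
import uuid


def fold_mac(mac: bytes) -> bytes:
    """Fold a 6-byte MAC into 5 bytes by XORing the last byte into the rest."""
    if len(mac) != 6:
        raise ValueError("expected a 6-byte MAC address")
    last = mac[5]
    return bytes(byte ^ last for byte in mac[:5])


def local_hwid() -> bytes:
    """Derive a 5-byte identifier from whatever MAC uuid.getnode() reports."""
    return fold_mac(uuid.getnode().to_bytes(6, "big"))


# Example with a fixed, made-up MAC address 00:11:22:33:44:ff.
example = fold_mac(bytes.fromhex("0011223344ff"))
```

In a container, where the MAC is under your control, passing an explicit 5-byte value is the more predictable route.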

One potential use is client-side identifier generation: each user may be given a HWID for this purpose, permitting auditing of which records were populated by which users. The variants using an actual machine identifier of some kind are useful for auditing server-side behavior.

ultimaweapon commented 3 years ago

Thanks for the information. My application doesn't need extra information in the identifier, so ULID is sufficient.