WICG / uuid

UUID V4

feat: describe UUID v4 algorithm #6

Closed bcoe closed 3 years ago

bcoe commented 3 years ago

An initial pass at describing the UUID v4 algorithm.

Fixes #5


My goal here was to get us started. @broofa @ctavan, you know the space much better than I do, so please correct any strange decisions I've made.

CC: @domenic, if you have the time, your feedback on the spec text would be greatly appreciated.



bcoe commented 3 years ago

Found this implementation in Chromium:

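function generateUUID() {
  // 16 random bytes from the browser's CSPRNG.
  var array = new Uint8Array(16);
  window.crypto.getRandomValues(array);
  // Set the version to 4 (high nibble of byte 6).
  array[6] = 0x40 | (array[6] & 0x0f);
  // Set the variant to binary 10 (top two bits of byte 8).
  array[8] = 0x80 | (array[8] & 0x3f);

  var UUID = "";
  for (var i = 0; i < 16; i++) {
    var temp = array[i].toString(16);
    if (temp.length < 2)
      temp = "0" + temp;
    UUID += temp;
    // Hyphens separate the 8-4-4-4-12 hex groups.
    if (i == 3 || i == 5 || i == 7 || i == 9)
      UUID += "-";
  }
  return UUID;
}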

I wonder if generating bytes instead of words reads a bit cleaner.
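For comparison, here is a rough sketch of what a word-based variant might look like. It assumes "words" means four 32-bit integers, which may not match the draft spec text (not shown in this thread). The function name and the `words` parameter are hypothetical, and the default argument assumes an environment that exposes Web Crypto as `globalThis.crypto`.

```javascript
// Hypothetical word-based variant, for comparison with the byte-based code above.
// `words` is an illustrative parameter so the function can be run on fixed input.
function generateUUIDFromWords(
  words = globalThis.crypto.getRandomValues(new Uint32Array(4))
) {
  // Copy so the caller's array is not mutated.
  const w = Uint32Array.from(words);

  // Version 4: bits 12-15 of word 1 (the high nibble of time_hi_and_version).
  w[1] = (w[1] & 0xffff0fff) | 0x00004000;
  // Variant 10: the two most significant bits of word 2.
  w[2] = (w[2] & 0x3fffffff) | 0x80000000;

  // Each word becomes exactly 8 hex digits, 32 digits in total.
  const hex = Array.from(w, (x) => x.toString(16).padStart(8, "0")).join("");

  // Split into the 8-4-4-4-12 groups.
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}
```

The masks have to reach into the middle of a word here, which is arguably harder to follow than indexing bytes 6 and 8 directly.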

bcoe commented 3 years ago

I'm just afraid that we're speccing along the spirit of the RFC, but that real-world implementations will much more likely follow a simpler approach in the end. So why risk this divergence in the first place?

@ctavan, here's an approach that's taken in Chromium today, which we could potentially just expose:

https://chromium.googlesource.com/chromium/src/+/refs/heads/master/base/guid.cc#86

  1. generate 16 random bytes.
  2. clear the version and reserved bits, then set the version to 4 and the variant to binary 10.
  3. return the hyphenated representation.
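The three steps above can be sketched roughly as follows. This is not the Chromium code, just a minimal illustration; the function name and the optional `randomBytes` parameter are hypothetical (the parameter exists only so the function can be exercised with fixed input), and the default path assumes Web Crypto is available as `globalThis.crypto`.

```javascript
// Sketch of the three steps; `randomBytes` is a hypothetical parameter
// added only so the function can be run against fixed input.
function uuidV4(randomBytes) {
  // 1. Generate 16 random bytes (or copy the supplied ones).
  const bytes = randomBytes
    ? Uint8Array.from(randomBytes)
    : globalThis.crypto.getRandomValues(new Uint8Array(16));
  if (bytes.length !== 16) throw new RangeError("expected exactly 16 bytes");

  // 2. Clear the version and reserved bits, then set version 4 and variant 10.
  bytes[6] = (bytes[6] & 0x0f) | 0x40;
  bytes[8] = (bytes[8] & 0x3f) | 0x80;

  // 3. Return the hyphenated representation (8-4-4-4-12 hex digits).
  let out = "";
  for (let i = 0; i < 16; i++) {
    if (i === 4 || i === 6 || i === 8 || i === 10) out += "-";
    out += bytes[i].toString(16).padStart(2, "0");
  }
  return out;
}
```

Calling `uuidV4()` with no argument draws fresh random bytes; passing 16 fixed bytes makes the bit manipulation easy to check by hand.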

bcoe commented 3 years ago

After all, even the RFC describes a much simpler algorithm, which really boils down to ~3 lines of code; e.g., see the Chromium implementation you posted.

@domenic @ctavan, does it make sense to land this and then perhaps take another pass at the simplified algorithm that @ctavan suggests?

I think we'd be able to use a lot of the same language and just condense some of the steps together. Perhaps the final step would be:

Return the concatenation of « hexadecimal representation of array[0], hexadecimal representation of array[1], hexadecimal representation of array[2], hexadecimal representation of array[3], "-", hexadecimal representation of array[4], hexadecimal representation of array[5], "-" ...

👆 The one argument I'd make against this is that it makes for a very long step in the algorithm.
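As a rough illustration of how that long concatenation could be condensed, the final step might be expressed as "hex-encode every byte, then join fixed-size groups with hyphens". The sketch below covers the formatting step only and does not touch the version or variant bits; `HEX` and `formatUUID` are hypothetical names, not part of any spec text.

```javascript
// Precompute the two-digit hexadecimal representation of every byte value.
const HEX = Array.from({ length: 256 }, (_, i) =>
  i.toString(16).padStart(2, "0")
);

// Hypothetical helper: formats 16 bytes as 8-4-4-4-12 hex digits.
function formatUUID(bytes) {
  if (bytes.length !== 16) throw new RangeError("expected exactly 16 bytes");
  const hex = Array.from(bytes, (b) => HEX[b]);
  return [
    hex.slice(0, 4).join(""),   // array[0..3]
    hex.slice(4, 6).join(""),   // array[4..5]
    hex.slice(6, 8).join(""),   // array[6..7]
    hex.slice(8, 10).join(""),  // array[8..9]
    hex.slice(10, 16).join(""), // array[10..15]
  ].join("-");
}
```

Describing the step in terms of five byte ranges keeps it short, at the cost of being slightly less literal than listing every element.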