oculus42 / short-uuid

Translate standard UUIDs into shorter formats and back.
MIT License
450 stars 13 forks source link

Length of short.generate() is not consistent #39

Closed deepinder19 closed 3 years ago

deepinder19 commented 4 years ago

short.generate() usually returns uuid of 22 characters but sometimes it just returns 21 characters. There is no straightforward way to reproduce this but it happens sometimes if I keep on calling the function.

Is this some known behaviour or an issue that needs to be addressed?

jeremyblalock commented 4 years ago

A UUID is just a 128-bit number, under the hood. So my guess is it's not zero-padding, and so it's just being represented one character (or more) shorter if it's smaller.

papb commented 4 years ago

@jeremyblalock Should we call padStart on the result of short-uuid to make it always return 22 characters consistently? Can we pad with zeroes? Or this way it will no longer be unique?

jeremyblalock commented 4 years ago

From what I can tell, calling padStart (or some other type of leftpad) should be completely fine. Internally short-uuid is using any-base (https://github.com/HarasimowiczKamil/any-base), which doesn't provide any padding options. short-uuid used Flickr's implementation of base-58 which uses 1 as the zero bit (because base-58 doesn't use zero or capital o), so padding with 1's should be totally fine.

Here's info on the Flickr base-58 alphabet: https://en.wikipedia.org/wiki/Base58

oculus42 commented 4 years ago

The variable length is a known behavior. For some values and alphabets it will vary by more than one character.

We could add a padStart or equivalent option. It would need to calculate the max length of an Id in that alphabet to pad consistently., As @jeremyblalock suggests, it should use the alphabet to pad so translating the value back to a UUID works as expected, even with older (current) versions of short-uuid.

papb commented 4 years ago

@jeremyblalock Thanks for the explanation! Just to make sure I understand, this means that the current implementation never outputs a short-uuid starting with the character 1, right?

(assuming the default alphabet, which is Flickr Base58)

jeremyblalock commented 4 years ago

@oculus42 maybe you can confirm, but I believe that's the case.

oculus42 commented 4 years ago

I can confirm. This should work for any alphabet by using first character of the translator alphabet translator.alphabet[0]

You can validate this for yourself with the same code I used:

const validatePadStart = (translator, entries = 100) => {
    const zeroValue = translator.alphabet[0];
    const data = Array(entries).fill('').map(translator.generate);
    const maxLen = data.reduce((result, val) => val.length > result ? val.length : result, 0);
    const allMatch =  data.every(val => translator.toUUID(val) === translator.toUUID(val.padStart(maxLen, zeroValue)));
    return allMatch;
}
papb commented 4 years ago

Awesome. So maybe an option could be added to generate such as:

translator.generate({ consistentLength: true })

which would always return a string with the same length?

function generate(options) {
  if (options.consistentLength) {
    const length = figureOutWhatIsTheMaxLengthPossiblyGeneratedInAlphabet(alphabet);
    return generate().padStart(length, alphabet[0]);
  }
  // ...
}
thadeucity commented 3 years ago

Hi, i was having the same "issue", thought about it and sent a pull request with a simple feature to "solve" this.

Since the UUID is 128 bit long, rounding up the division between the Logarithm of ( 2 ^ 128 ) and ( alphabet length ) will give you the maximum length the Short Id will ever have.

With that and the first Char of the alphabet I changed the function to pad the result accordingly.

If my pull request have any inconsistency please let me know.

thadeucity commented 3 years ago

Inserted a better solution with options as suggested by @papb

oculus42 commented 3 years ago

With thanks to @thadeucity, v4.0.1 is out with consistent-length shortened values by default.