Closed by snstanton 2 years ago
Hmm, the issue is that ShortUUID is incompatible with a node.js library, but why is the issue filed here? What makes the node.js library "correct" and ShortUUID "incorrect"?
That is the symptom. The issue is that the ShortUUID library is changing the alphabet that it was passed in rather than using it as is. Rather than enforcing an ASCII sorting order on the alphabet, it should encode using the alphabet as presented, or at least provide that as an option.
Sure, it also removes duplicates. Why shouldn't it?
How about raising an exception for duplicates instead of mutating the data?
It could do that, but why? You pass in an alphabet, you get output in that alphabet. You encode, it works, you decode, it works. Why does the order of the alphabet matter to you?
Because not all short UUIDs are generated by this library. I would like to use the library to decode UUIDs generated by external sources, not just round trips from this library. The underlying code is perfectly capable of doing this. The only issue is the outermost ShortUUID class, which modifies the given alphabet rather than using it as given. The library would be more useful if it could handle both scenarios directly.
Right, but the first issue with that is that this library doesn't have interoperability with other libraries as a goal, and the second issue is, why is this library wrong? Why shouldn't the external sources just sort their alphabets?
I don't have any control over what other developers are doing with their code. If there is an encoded uuid in data I am trying to process, I need a way to decode it that works with the existing format. Just because this library's primary goal is to encode and decode uuids generated by one program does not mean that's the only thing it could be used for. I'm not saying what it is doing now is "wrong". I'm saying it would be useful to have the option to not sort the alphabet so I can use the library for an adjacent use case. The beauty of a good library is that it can be used for things the original author did not envision.
Certainly, but that would break backwards compatibility, so I have to weigh the usefulness of the new feature against the hassle of breaking everything for other users.
That's why I suggested making it an optional flag.
This is a very niche use case, and I'm not sure it's worth the extra interface/documentation/effort. Can you not just set _alphabet directly to the unsorted alphabet?
I've already worked around the issue with the following code using the lower level string_to_int function:
def decode_shortuuid(value):
    # We have to use the lower level interface because the Flickr alphabet isn't in ASCII order
    # and ShortUUID sorts the alphabet before using it.
    return uuid.UUID(int=string_to_int(value, FLICKR_BASE58))
I don't need the change. The only reason I filed the ticket is because someone else might need similar functionality. Clearly you disagree, so feel free to close the ticket.
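For anyone who finds this ticket later, here is a minimal standalone sketch of why the alphabet order matters. It does not use ShortUUID at all: `encode_int` and `decode_int` are hypothetical helpers written for this illustration, and they assume a most-significant-digit-first encoding, which may not match every short UUID implementation. The point is only that the same string decodes to a different integer once the alphabet is sorted.

```python
import uuid

# Flickr base 58 alphabet: digits, then lowercase, then uppercase (not ASCII order).
FLICKR_BASE58 = "123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"
# The same characters in ASCII order: digits, then uppercase, then lowercase.
SORTED_BASE58 = "".join(sorted(FLICKR_BASE58))


def encode_int(number, alphabet):
    """Hypothetical helper: encode a non-negative int, most significant digit first."""
    if number == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while number:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def decode_int(string, alphabet):
    """Hypothetical helper: inverse of encode_int for the same alphabet."""
    base = len(alphabet)
    number = 0
    for char in string:
        number = number * base + alphabet.index(char)
    return number


# "a" is digit 9 in the Flickr order but digit 33 once the alphabet is sorted.
flickr_value = decode_int("a", FLICKR_BASE58)
sorted_value = decode_int("a", SORTED_BASE58)

# Round trip with the alphabet used exactly as given.
original = uuid.UUID("12345678-1234-5678-1234-567812345678")
short = encode_int(original.int, FLICKR_BASE58)
decoded = uuid.UUID(int=decode_int(short, FLICKR_BASE58))
```

Decoding with the alphabet as given recovers the original UUID; decoding the same string with the sorted alphabet generally does not, because every letter maps to a different digit value.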
Sorting isn't necessary to prevent duplication, right?
It's not, but if you convert the list to a set to deduplicate easily you lose order guarantees, so you sort to go back to a repeatable order.
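As a rough sketch of the alternatives discussed here (not what the library does): in Python 3.7+ a dict preserves insertion order, so duplicates can be dropped without sorting, and a strict variant can raise instead of silently changing the input. Both function names below are made up for illustration.

```python
def dedupe_keep_order(alphabet):
    """Drop repeated characters while keeping the first occurrence of each, in order."""
    return "".join(dict.fromkeys(alphabet))


def require_unique(alphabet):
    """Return the alphabet unchanged, or raise ValueError if any character repeats."""
    seen = set()
    for char in alphabet:
        if char in seen:
            raise ValueError(f"duplicate character in alphabet: {char!r}")
        seen.add(char)
    return alphabet


# Set-then-sort loses the caller's order; dict.fromkeys keeps it.
sorted_result = "".join(sorted(set("cbaabc")))   # "abc"
ordered_result = dedupe_keep_order("cbaabc")     # "cba"
```

Either approach gives a repeatable alphabet without imposing ASCII order on it.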
It would be nice to have an official "Short UUID" specification so that the community could (potentially) guarantee interoperability between implementations across languages. No idea how to accomplish that, but it would be cool to see.
That would be a great idea, though there would be the problem of getting everyone to agree, and of obsoleting a ton of data when the libraries changed.
Closing this for now.
I am attempting to convert a shortuuid generated using node.js and the Flickr base 58 alphabet:
Unfortunately this doesn't work:
It turns out that ShortUUID is sorting the alphabet before using it. This is incorrect and leads to this failure. It is possible to correctly decode the id like this:
This is a backwards compatibility issue, so switching to an unsorted implementation will probably require a compatibility flag similar to legacy.