Hi, the algorithm used has two parts.
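First, the lexicographically smallest string rotation is found for the DNA sequence. I think there is a mathematical proof that the starting position of this rotation is unique unless the sequence consists of two or more copies of the same shorter substring. If it does, several starting positions qualify, but the rotations they produce are identical, so the result is the same string either way.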
https://en.wikipedia.org/wiki/Lexicographically_minimal_string_rotation
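A minimal sketch of this first step, assuming the sequence is a plain Python string. It uses a brute-force comparison of all rotations rather than one of the linear-time algorithms described on the Wikipedia page, and the function names are made up for illustration, not taken from the library.

```python
def smallest_rotation(seq):
    """Return the lexicographically smallest rotation of seq (brute force, O(n^2))."""
    if not seq:
        return seq
    n = len(seq)
    doubled = seq + seq  # every rotation is a length-n window of seq + seq
    return min(doubled[i:i + n] for i in range(n))


def rotation_starts(seq):
    """Return every start index whose rotation equals the smallest one."""
    n = len(seq)
    doubled = seq + seq
    best = smallest_rotation(seq)
    return [i for i in range(n) if doubled[i:i + n] == best]


print(smallest_rotation("TGCA"))  # ATGC
print(rotation_starts("TGCA"))    # [3]
print(rotation_starts("ACAC"))    # [0, 2]
```

For "ACAC", which is two copies of "AC", two start positions qualify, but both give the same string, so the value passed on to the hash is still well defined.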
The second part is simply a URL-safe SHA-1 hash of the smallest string rotation. As far as I know, there have been no accidental SHA-1 collisions; the only known ones were constructed deliberately. Git uses SHA-1, although it seems to be transitioning to SHA-256.
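A rough sketch of what a URL-safe SHA-1 hash can look like in Python. The exact encoding is an assumption here (ASCII input, URL-safe Base64 of the raw digest, padding stripped), and `urlsafe_sha1` is a hypothetical name, so the library's actual output may differ in detail.

```python
import base64
import hashlib


def urlsafe_sha1(text):
    """Hash text with SHA-1 and encode the digest with the URL-safe Base64 alphabet."""
    # Assumption: the sequence contains only ASCII characters (e.g. A, C, G, T).
    digest = hashlib.sha1(text.encode("ascii")).digest()  # 20 raw bytes
    encoded = base64.urlsafe_b64encode(digest).decode("ascii")
    # Assumption: the trailing "=" padding is dropped to keep the result short.
    return encoded.rstrip("=")


checksum = urlsafe_sha1("ATGC")
print(len(checksum))  # 27
```

Because a SHA-1 digest is 160 bits, two different sequences could in principle share a checksum, but with only that many possible values an accidental match is extremely unlikely rather than impossible.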
For my purposes, this seems to be accurate enough. I have never experienced problems. It would be very easy to implement an upgraded version if needed.
Hope this helps, Björn
The documentation says the following
I'm not sure, does this mean there is no collision at all?