azaghal / pydenticon

Pydenticon is a small utility library that can be used for deterministically generating identicons based on the hash of provided data.
BSD 3-Clause "New" or "Revised" License
66 stars 10 forks source link

Possibility of similar identicon collision #4

Closed pravj closed 9 years ago

pravj commented 9 years ago

Hi there,

My question is that, what's the maximum possible number of identicons pydenticon can produce?

As I can see that pydenticon uses 7 colors to fill in tiles and have one options to toggle the colors also. Methinks that this way, maximum possible identicons will be (7 * 2) * (2 ** 15), as it uses 15 tiles only because of the vertical axis symmetry.

This number is mere 458752 which is too less compared to total number of users on GitHub.

So this way it should be producing similar identicons and collisions are inevitable.

Correct me, if I'm wrong somewhere.

azaghal commented 9 years ago

Well, most of those parameters are really configurable. When you are initialising the generator, you can pick your own dimensions, digest algorithm (this can limit the size of your identicons), as well as possible foregrounds and background (background is always just one, so it does not affect the number of possible combinations).

I think that with 5x5 size, 7 foreground colours, and 1 background colour, the number will be more likely to be something like (2**15 - 1) * 7 + 1 (2**15 for the number of 1/0 combinations for 15 fields, minus the one that is all zeros, times 7 possible foreground colours, plus the one that is all zeros. If I am not mistaken. So that'd be 229370. I hope I haven't missed anything in the calculation here.

As a side-note, due to the algorithm, you can have maximum of 256 foreground colours (one byte of entropy for that).

azaghal commented 9 years ago

Also keep in mind that your collisions will probably happen much before you managed to exhaust all 229370 combinations (due to hashing algorithm chosen and data).

pravj commented 9 years ago

OK, got this and Yeah! according to your new explanation, your calculation is right. Thanks for the reply :)