Closed jamoes closed 10 years ago
Nice, thanks!
Ahhhh I see now... The fact that 2^31 is not divisible by 94 was the concern, ay? 2^31%94 = 68 2^31 = 2.1e9 Thus, 68 of the numbers would appear 1.0000000005 times as often as the remaining 26.
I was more distracted by the double modding: x % 100 % 94 This would make the values 0-5 appear twice as frequently as 6-93! Right?
Yep, exactly. Modding by 128 instead of 94 generates a uniform distribution because 128 is divisible by 2^31.
The double-mod did indeed double the odds of the first 6 characters occuring in a password (but only if special
was true, because otherwise those selections were skipped).
This still is a pretty minor reduction of entropy though, so it's not necessary to get users to change their passwords. (And now I have to do the math to ensure this is true! :anguished:)
Using the discrete random variable entropy calcuations found at https://en.wikipedia.org/wiki/Maximum_entropy_probability_distribution, you can calculate the enropy of passwords generated using the old and new algorithms:
When each character has exactly a 1/94 odds (new algorithm), the entropy is: 94 * 1/94 * -(ln(1/94) / ln(2)) = 6.55 bits of entropy per character
Wheras, if the first 6 characters have a 2% chance, and the other 88 characters have a 1% chance (old algorithm), then the entropy is: 6 * 2/100 * -(ln(2/100) / ln(2)) + 88 * 1/100 * -(ln(1/100) / ln(2)) = 6.52 bits of entropy per character
Or, a reduction of only 0.03 bits of entropy per character. A 20 character password generated with the old algorithm would still have 130.4 bits of entropy (it will have 131 bits using the new algorithm).
The current password generation method will ever-so-slightly favor certain characters since randomWords() returns a number between -2^31 and 2^31, but is being modded by 94. Modding it by 128 and throwing away invalid values solves this.