ryanjohns / passmaster

Host-Proof password storage using client-side AES-256 encryption.
https://passmaster.io
GNU General Public License v3.0
10 stars 4 forks source link

Generate passwords more uniformly #2

Closed jamoes closed 10 years ago

jamoes commented 10 years ago

The current password generation method will ever-so-slightly favor certain characters since randomWords() returns a number between -2^31 and 2^31, but is being modded by 94. Modding it by 128 and throwing away invalid values solves this.

ryanjohns commented 10 years ago

Nice, thanks!

scottsatkin commented 10 years ago

Ahhhh I see now... The fact that 2^31 is not divisible by 94 was the concern, ay? 2^31%94 = 68 2^31 = 2.1e9 Thus, 68 of the numbers would appear 1.0000000005 times as often as the remaining 26.

I was more distracted by the double modding: x % 100 % 94 This would make the values 0-5 appear twice as frequently as 6-93! Right?

jamoes commented 10 years ago

Yep, exactly. Modding by 128 instead of 94 generates a uniform distribution because 128 is divisible by 2^31.

The double-mod did indeed double the odds of the first 6 characters occuring in a password (but only if special was true, because otherwise those selections were skipped).

This still is a pretty minor reduction of entropy though, so it's not necessary to get users to change their passwords. (And now I have to do the math to ensure this is true! :anguished:)

Using the discrete random variable entropy calcuations found at https://en.wikipedia.org/wiki/Maximum_entropy_probability_distribution, you can calculate the enropy of passwords generated using the old and new algorithms:

When each character has exactly a 1/94 odds (new algorithm), the entropy is: 94 * 1/94 * -(ln(1/94) / ln(2)) = 6.55 bits of entropy per character

Wheras, if the first 6 characters have a 2% chance, and the other 88 characters have a 1% chance (old algorithm), then the entropy is: 6 * 2/100 * -(ln(2/100) / ln(2)) + 88 * 1/100 * -(ln(1/100) / ln(2)) = 6.52 bits of entropy per character

Or, a reduction of only 0.03 bits of entropy per character. A 20 character password generated with the old algorithm would still have 130.4 bits of entropy (it will have 131 bits using the new algorithm).