dropbox / zxcvbn

Low-Budget Password Strength Estimation
https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/wheeler
MIT License
15.14k stars 943 forks source link

Using a constant bruteforce cardinality is a step backward #135

Closed bgmort closed 8 years ago

bgmort commented 8 years ago

Maybe this is a duplicate of #122 and #117, but I'm going to add my own thoughts. I discovered this library a few days ago, and was really excited to use it until I noticed that all bruteforce patterns multiplied the total guesses by ten per character. I assumed it was a bug and came here searching for a solution, and saw that it was done on purpose as not to overestimate the strength of some passwords.

If you consider the end goal of the library, absolutes such as crack time estimates are not nearly as important the subjective measures such as the score and the verbal feedback that are produced. Crack times or bits of entropy may be over a user's head, but they can understand if their password is "so-so" or "strong". And if one password is stronger than another password.

Therefore, using a single constant to represent bruteforce cardinality is way off the mark. By doing so, you are suggesting that all non-dictionary, non-pattern passwords are created equal, and the only way to increase password strength is by adding length. Passwords like hurpadurpy or kardashiam are scored the same as vMjR0a28yd, #e(2!J%}'H, or ᔘ¥z9Z}èåw俗. And obviously I'd rather have a user choose easy2guess^~ than easy2guess97.

I trust your judgement to choose the best solution to this problem, but here are two ideas:

bgmort commented 8 years ago

Another indicator that the bruteforce cardinality is far too low to be useful is that many spatial sequences are categorized as bruteforce sequences because the spatial sequence has more guesses than the bruteforce sequence. xcdfvb, for example, has 10^6.244 guesses as a spatial sequence, but only 10^6 guesses as a bruteforce sequence. Except for finding matches from the password dictionary, essentially you're just spending a lot of cycles to evaluate the length of a password.

lowe commented 8 years ago

Hi @bgmort,

To add my own opinion on this:

Random passwords are uncommon in leaked password distributions. So for zxcvbn, regions matched as "bruteforce" are typically far from random. A missing dictionary word or uncaught pattern is a more common case. So, 10^n is going to be too high most of the time, even though it's too low in the random case.

To give an example, "Teiubesc" used to be ranked as 52^8 (~10^14) by the bruteforce matcher. It is now ranked as 10^8 by the bruteforce matcher, and additionally matched as 10^4 because it is a common password in the Rockyou '09 password distribution ("Te iubesc" means "I love you" in Romanian). So, the old scheme overestimated by 10 orders of magnitude (!) for a common case. The new brute force estimator is still off by 4 orders, not great, but an improvement for this very common situation. I picked an example that happened to be in the dictionary, but many other examples in other languages won't be.

Random passwords as generated by password managers are going to be long enough to achieve a "strong" score for a 10^n weighting, even though they'd be C^n in reality for whatever the alphabet size C is. This underestimation doesn't bother me. You mention 10^6.244 spatial vs 10^6 bruteforce -- I'm also not too bothered because these scores are so close. In terms of user feedback, it would be better if the pattern were recognized, but I don't see it as a big problem given the number of turns on a short sequence. Simpler spatial patterns absolutely will be matched.

Anyhow, happy to listen to any other proposals for better bruteforce detection and strength estimation. Something simple, fast to calculate, and based on observations of leaked password distributions is preferred.