dropbox / zxcvbn

Low-Budget Password Strength Estimation
https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/wheeler
MIT License

Inconsistent results when going from 2 to 3 digits #37

Closed: nylen closed this issue 9 years ago

nylen commented 10 years ago

The library says that word781 has less entropy than word78: 781 is identified as a digits pattern, while 78 falls back to bruteforce, and the three-digit digits match ends up with a lower entropy estimate than the two-character bruteforce match:

password:   word78
entropy:    20.928
crack time (seconds):   99.743
crack time (display):   3 minutes
score from 0 to 4:      0
calculation time (ms):  0

password:   word781
entropy:    18.677
crack time (seconds):   20.95
crack time (display):   instant
score from 0 to 4:      0
calculation time (ms):  1
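
For reference, this is straightforward to reproduce by calling the library directly. A minimal sketch, assuming the pre-4.x JavaScript API in which the result object exposes an entropy field (as the output above suggests; field names changed in later releases):

var zxcvbn = require('zxcvbn');

// Compare the two passwords from the report. Under the affected versions
// the longer password comes back with the lower entropy estimate.
['word78', 'word781'].forEach(function (password) {
  var result = zxcvbn(password);
  console.log(password, result.entropy);  // expected: ~20.9 for word78, ~18.7 for word781
});
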
jboning commented 10 years ago

The repeat matcher has a similar problem to the digits matcher here:

password:   xx
entropy:    9.401
crack time (seconds):   0.034
crack time (display):   instant
score from 0 to 4:  0
calculation time (ms):  9
match sequence:
'xx'
pattern:    bruteforce
entropy:    9.400879436282185
cardinality:    26

password:   xxx
entropy:    6.285
crack time (seconds):   0.004
crack time (display):   instant
score from 0 to 4:  0
calculation time (ms):  1
match sequence:
'xxx'
pattern:    repeat
entropy:    6.285402218862249
repeat-char:    'x'
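
For what it's worth, the two numbers above are consistent with the scoring formulas used at the time: a bruteforce match is charged length * log2(cardinality) bits, while a repeat match is charged only log2(cardinality * length) bits. A quick check (illustrative arithmetic, not the library's code):

// Why 'xxx' comes out below 'xx' under the old scoring (illustrative only).
function bruteforceEntropy(length, cardinality) {
  return length * Math.log2(cardinality);   // grows linearly with length
}
function repeatEntropy(length, cardinality) {
  return Math.log2(cardinality * length);   // grows only logarithmically
}
console.log(bruteforceEntropy(2, 26));  // 9.4009 -> 'xx' as lowercase bruteforce
console.log(repeatEntropy(3, 26));      // 6.2854 -> 'xxx' as a repeat of 'x'
// As soon as the repeat matcher is allowed to fire, its logarithmic estimate
// undercuts the 2-character bruteforce estimate, so adding a character
// lowers the reported entropy.
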
pyramids commented 10 years ago

To even answer the question of whether, and to what extent, there is a problem here, we'd have to define what we expect, in particular how to account for the entropy associated with the choice of model (e.g. are those random characters that require brute-forcing, or just the digits of a number?).

The original author addressed this issue: "It’s difficult to formulate a sound model for structural entropy; statistically, I don’t happen to know what structures people choose most, so I’d rather do the safe thing and underestimate." The logical consequence is that we must limit the number of models; otherwise we'd tend toward a zero entropy estimate, even reaching that extreme if we allowed infinitely many models, for example by saying "this password just follows the model of inserting zero random characters between the model-given prefix word and the model-given suffix 781." Since allowing one- or two-character patterns already comes disturbingly close to that extreme, as long as we continue not accounting for model entropy we may have to live with such non-monotonic estimates. Quite apart from that, the number of applicable models explodes as we model ever shorter runs of characters, which not only drives the entropy estimate towards zero but also makes the run time of exploring those models, at least under zxcvbn's design, explode accordingly.
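
(For context on why extra models can only push the estimate down: zxcvbn searches for the minimum-entropy way to cover the password with candidate matches, with bruteforce filling the gaps. A simplified sketch of that search, not the library's actual code:)

// Simplified sketch of a minimum-entropy cover search in the spirit of
// zxcvbn's scoring (not its actual code). Each candidate match is
// {i, j, entropy}, covering password[i..j]; unmatched characters fall
// back to bruteforce at log2(cardinality) bits each.
function minimumEntropy(password, matches, bruteforceCardinality) {
  var perChar = Math.log2(bruteforceCardinality);
  var best = [];  // best[k] = minimum entropy of password[0..k]
  for (var k = 0; k < password.length; k++) {
    best[k] = (k > 0 ? best[k - 1] : 0) + perChar;       // default: bruteforce this character
    matches.forEach(function (m) {
      if (m.j !== k) return;
      var prefix = m.i > 0 ? best[m.i - 1] : 0;
      best[k] = Math.min(best[k], prefix + m.entropy);   // a new match can only lower the total
    });
  }
  return best[password.length - 1];
}
// Every additional matcher contributes more candidate matches, and each one
// can only lower the minimum, which is why admitting ever-shorter patterns
// drags the estimate toward zero.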

In my opinion, a larger problem is that all of these estimates are way too high. The letter x is by far my most frequent answer to "choose an arbitrary letter," with at best 2 bits of entropy (and more likely just one). Following zxcvbn's rank-based underestimation of word entropy, it should arguably be assigned zero bits as the rank-1 letter for the only model that could conceivably make me pick "xxx." With the repetition adding about 2 bits of entropy, a fair estimate of the entropy in "xxx" would be between 2 and 4 bits, not the 6.3 given above. Obviously the estimate of 9.4 bits for "xx" is even worse. Speaking of which: "xxx" presumably occurs more often than plain "xx," and if the lower entropy estimate for the former weren't an accident, it could be read as correctly accounting for that disparity.

I'm having a similar problem with "word781." If I can remember a number like 781, it is not random but carries some significance, and IMHO the entropy estimate should be based on the assumption that an attacker can second-guess numbers that have a meaning to me, since such a number is almost certainly less well guarded than a well-handled password (e.g. a license plate number or various account numbers may be shown in cleartext, sometimes even publicly). So treating it as a uniformly drawn random choice, as zxcvbn does, vastly over-estimates its likely entropy. Not to mention that "word" is the kind of word that users discouraged from using "password" may pick first, at a frequency higher than its dictionary rank would suggest. Combine these two problems, and the entropy estimate I'd like to see for it is about half of what is shown above.

lowe commented 9 years ago

Thanks for reporting -- the jump from 2 to 3 digits is fixed as of version 3.3.0: zxcvbn used to skip 2-character substrings for several of its patterns, and that behavior has been removed.

Additionally, a new regex matcher now matches digit regions as well as other common regex patterns.
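
As a rough illustration of what matching digit regions buys here (a sketch under assumed scoring, not the shipped matcher): every maximal run of digits gets its own match whose entropy grows with the run length, so a 2-digit run is no longer left to bruteforce while a 3-digit run gets a pattern discount.

// Rough sketch of a digit-region matcher (illustrative only, not the shipped code).
// Each maximal run of digits becomes a match whose entropy grows with its length,
// so 2- and 3-digit runs are scored consistently.
function digitsMatches(password) {
  var re = /\d+/g, matches = [], m;
  while ((m = re.exec(password)) !== null) {
    matches.push({
      i: m.index,
      j: m.index + m[0].length - 1,
      token: m[0],
      entropy: m[0].length * Math.log2(10)  // assumed scoring: ~3.3 bits per digit
    });
  }
  return matches;
}
console.log(digitsMatches('word78'));   // one match for '78'  (~6.6 bits)
console.log(digitsMatches('word781'));  // one match for '781' (~10.0 bits)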

Try:
https://dl.dropboxusercontent.com/u/209/zxcvbn/test/index.html