goatslacker opened this issue 6 years ago

The Shannon and NIST scores are off the charts for this password, but the trigraph score is low, and thus the whole password gets marked as weak.
Excellent example. I really appreciate how succinct this is, yet how much information it conveys.
The trigraph score is based on word prediction. Because it's mostly based on words, symbols are collapsed into a small number of symbol categories, since similarly grouped symbols frequently repeat after each other, as in the password football2018. It doesn't do a good job of saying your password is stronger when you simply repeat a character. You've found a weakness with the trigraph scores.
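To illustrate the general idea, here is a simplified sketch of a Markov-style trigraph score. The frequency table and the floor probability below are made-up placeholders, not the actual data or implementation:

```js
// Simplified sketch only; not the actual implementation or data.
// trigraphFrequencies maps a two-character prefix to the relative
// frequency of each possible next character, built from a corpus.
const trigraphFrequencies = {
    fo: { o: 0.4, r: 0.3 },
    oo: { t: 0.5, d: 0.2 }
};

function trigraphEntropyBits(password) {
    let bits = 0;

    for (let i = 2; i < password.length; i += 1) {
        const prefix = password.slice(i - 2, i);
        const next = password.charAt(i);

        // Unknown sequences get a small floor probability instead of zero,
        // so unusual characters still add a fixed number of bits.
        const probability = (trigraphFrequencies[prefix] || {})[next] || 0.0001;

        bits += -Math.log2(probability);
    }

    return bits;
}

console.log(trigraphEntropyBits('football2018'));
```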
Have you encountered any research that provides insight into how to weigh symbols when they repeat after each other, over and over? Is !!!!!!!!!!!!!!!!!!!! significantly more secure than !!!!!!!!!!!!!!!!? If so, how much more secure? I have not seen much research in this area.
I suppose one option would be to add the Shannon entropy score for all of the symbols to the trigraph score. I've resisted this temptation earlier but would be willing to consider it and other alternatives. What suggestions do you have?
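To make that option concrete, here is a minimal sketch of the idea; the helper names below are just for illustration, not existing code:

```js
// Shannon bits for a string: length * sum(-p * log2(p)) over the
// character frequencies within the string itself.
function shannonBits(str) {
    const counts = {};
    let perChar = 0;

    for (const ch of str) {
        counts[ch] = (counts[ch] || 0) + 1;
    }

    Object.keys(counts).forEach((ch) => {
        const p = counts[ch] / str.length;
        perChar -= p * Math.log2(p);
    });

    return perChar * str.length;
}

// The option under discussion: score the non-alphanumeric symbols with
// Shannon entropy and add that to the trigraph score.
function trigraphPlusSymbolShannon(password, trigraphBits) {
    const symbols = password.replace(/[a-z0-9]/gi, '');

    return trigraphBits + shannonBits(symbols);
}
```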
> Have you encountered any research that provides insight into how to weigh symbols when they repeat after each other, over and over?

I haven't, but my inclination is that the Shannon and NIST scores are closer to what the score should be.
> I suppose one option would be to add the Shannon entropy score for all of the symbols to the trigraph score.

This was my initial thought, especially since you fall back to the Shannon score anyway if the trigraph score is unavailable.
> I've resisted this temptation earlier

How come?
> What suggestions do you have?

I think this is quite an edge case, so it's fine if no action is taken. If you wanted to scope things down, you could detect when there's a huge delta between the Shannon and trigraph scores and, if that's the case, combine the scores somehow (a rough sketch follows).

Edit: the Shannon score is high because of password length; does the trigraph score not take length into account? If not, it probably should.
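Something like this, where the 2x threshold and the averaging rule are arbitrary placeholders:

```js
// Sketch of the "combine the scores when they disagree wildly" idea.
function adjustedEntropyBits(shannonBits, trigraphBits) {
    // No trigraph score at all: fall back to Shannon, as already happens.
    if (!trigraphBits) {
        return shannonBits;
    }

    // Huge delta between the two: split the difference rather than
    // trusting the low trigraph score outright.
    if (shannonBits > 2 * trigraphBits || trigraphBits > 2 * shannonBits) {
        return (shannonBits + trigraphBits) / 2;
    }

    return trigraphBits;
}
```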
Trigraph entropy is based on the Markov frequency of 3 characters occurring next to each other in a particular data set. I am assuming that the trigraph frequency data set provided in this script is English-based.
This rests on a major assumption: people are creating passwords in English only.
Unless you know for sure what language people are using to create passwords, the trigraph entropy values may be biased.
Given that bias, if you have to choose an entropy value between Shannon and a language-dependent Markov trigraph, I would use the smaller of the two as the trusted entropy value for any given passphrase.
FWIW, I believe the Shannon value from this script would usually be smaller for phrases in any particular language, since the entropy is lower in a patterned construct like a human language. Shannon entropy is essentially based on how frequent each character is within a phrase.
Shannon: the longer the phrase and the less character repetition, the higher the entropy.
There is more character repetition in language; think of how often English vowels are used.
If Shannon is high and the English trigraph is low, you have a mixed-up phrase that still looks like it is part of the English language. That may not hold up well against a mutated English dictionary attack.
The safest bet is to assume that you do not know what language someone is using to write a password phrase, so only trust the smallest value of all these entropy methods to determine password "strength".
It might be a nice feature to have this as an option in how the "strength" is calculated.
```js
let saferEntropyCompareValue = !results.commonPassword
    ? Math.min(
        results.shannonEntropyBits,
        results.trigraphEntropyBits,
        results.nistEntropyBits)
    : 0;
```
Just my 2c.
You do presume correctly that the trigraphs here are from English sets, but I also mixed in common password breach data. You're completely correct that there is a bias towards English or English-like passwords.
The Shannon entropy score doesn't take into account the odds of letters appearing in passwords. For instance, the score of aaaaa is going to be fairly weak. Are LLLLL or 99999 any better? I would argue that both are marginally better simply because they don't use a commonly repeated letter and neither is in the default lowercase set. When I see leaked passwords, the numbers are typically two or four digits and often correspond to a year.
Shannon's score does not take into account the number of possibilities on the keyboard. Pretend I'm rolling a fair die to determine the next number to hit (from 1 to 6). I press 4. It's a 1/6 chance of happening and a Shannon score of 0. I roll again and get another 4. The odds are 1/36, but the Shannon score is still 0. That hardly seems fair when the real entropy comes from a generation mechanic that is a truly random die roll.
This library is not judging how much entropy is generated from your password, but how much work it would take to crack your password. If I were to brute force the die roll password above, it would take roughly 18 tries (half of the 36-entry keyspace). That's a non-zero effort, which is why the trigraph entropy bits are higher than 0.
There are several reasons why a 0 score could be reported for various measurements, so blindly using Math.min() to get the smallest is not an option, but I do get the point that we could use something like that - perhaps take the smallest non-zero score? However, looking at the dice example above, I could roll 44 and have 0 bits for Shannon and 4.8 bits for trigraph. Adding a digit shouldn't decrease the score, yet rolling a 3 to get a passphrase of 443 results in 2.75 bits for Shannon, which is low, and 8.16 bits for trigraph, which could be a fair assessment. The UI in this case would show 4.8 bits for 44 and then decrease to 2.75 bits for 443? That also doesn't seem fair because any time you make a password longer, you don't decrease entropy.
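For anyone double-checking those Shannon numbers, they follow from the per-character frequency formula multiplied by the length (as the 2.75 figure suggests):

```js
// 443: character frequencies are 4 -> 2/3 and 3 -> 1/3.
const perChar = -(2 / 3) * Math.log2(2 / 3) - (1 / 3) * Math.log2(1 / 3);

console.log(perChar * 3); // ~2.75 bits, matching the number above

// 44: a single distinct character, so -1 * log2(1) = 0 bits.
```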
While I do think there's merit to showing a lower score, I don't think it's accurate to simply assume the minimum is the correct value to use.
> The Shannon entropy score doesn't take into account the odds of letters appearing in passwords.

That is correct; it only considers the frequency of the characters within the phrase itself, not what other external possibilities there are.
> This library is not judging how much entropy is generated from your password, but how much work it would take to crack your password.

That is not so clear, because if you only wanted to judge how much work it would take to brute-force (every bit permutation), then all you have to worry about is how long your password is: the bit length.
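To put a number on that, the raw brute-force keyspace depends only on the length and the charset size; the 94-character printable-ASCII charset below is just an example:

```js
// Brute-force keyspace in bits: length * log2(charsetSize).
function bruteForceBits(length, charsetSize) {
    return length * Math.log2(charsetSize);
}

console.log(bruteForceBits(8, 94));  // ~52.4 bits
console.log(bruteForceBits(16, 94)); // ~104.9 bits
```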
The trigraph, Shannon, and NIST scores are all measuring something else, specifically a degree of disorder in the context of some particular order: in the case of the English trigraph, the disorder in the arrangement of characters compared to the ordered structure of frequent English words.
I think the importance of these entropy-estimation methods, correct me if I am wrong, is when you expect a password to be cracked not by brute force but by some other mechanism based on insight into an expected order, such as the structure of a human language. A dictionary attack is an attack on the diction of that language. It seems I am stating the obvious, but I think we can be lulled into a false sense of security by trusting a single entropy method alone.
Unfortunately, in its current state, this JavaScript library is doing exactly that. But don't get me wrong: I don't have a full solution. I am just trying to understand it better as well. It is good to talk it through.
> That also doesn't seem fair because any time you make a password longer, you don't decrease entropy.

I think that depends on the method used to determine what the disorder (a.k.a. entropy) should look like. Shannon considers a single character repeated any number of times to have a disorder of 0. This is because it looks only at the contents of the phrase itself; when the phrase contains nothing but the same character, there is no disorder (according to the Shannon formula).
> I don't think it's accurate to simply assume the minimum is the correct value to use.

Yes, I think that is clearer now. I think the important thing is to first look at the brute-force bit length, then look at the entropy of common human constructs as an attack vector. We are, after all, creatures of habit.
Looking purely at a single entropy derivation to determine "strength" is surely not the way.
On a side note, I know the EFF has come up with diceware to encourage people to create longer passwords: https://www.eff.org/dice
I am not so convinced of the diceware method for the future of computing: by using a public list of words to generate long passwords, you may indeed have increased the time to brute-force simply because of the larger bit length, but at the same time you have created a succinct attack dictionary for a quicker permutation search.
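Some rough numbers behind that trade-off, assuming the standard 7,776-word EFF list and, purely for comparison, a 30-character phrase over lowercase letters plus space:

```js
// An attacker who knows the word list only has to search word combinations.
const wordListSize = 7776;
const wordsInPhrase = 5;

console.log(wordsInPhrase * Math.log2(wordListSize)); // ~64.6 bits

// An attacker brute-forcing characters sees a much larger space, e.g. a
// 30-character phrase over lowercase letters plus the space character.
console.log(30 * Math.log2(27)); // ~142.6 bits
```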
It might be better to pick 5 long, uncommon words in multiple languages you are not related to or familiar with, and which have not been publicly designated as suggested password-construction words, so that they do not form a succinct attack dictionary.
It is an interesting topic, but it seems quite a bit of "voodoo"-level understanding is required to demystify it satisfactorily.
Hey guys, I didn't find any better place to put this (please tell me if there is a better place for discussion or feature requests).
It would be nice to include the whitespace characters (the \s regex class) in the charsets. Is that possible?
Feature requests can be submitted as issues.
It's not possible to change what \s matches in JavaScript. That's defined by the standard, not by us users. However, one could change the pattern to accommodate different whitespace characters. Why not start a new issue, and discussion can happen over there.
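If a new issue does get opened, here is a rough sketch of how an extra whitespace charset pattern could be detected; the group names and the pattern below are illustrative, not the library's existing code:

```js
// Sketch: an extra charset group for whitespace alongside the usual groups.
const charsetPatterns = {
    lowercase: /[a-z]/,
    uppercase: /[A-Z]/,
    number: /[0-9]/,
    whitespace: /[ \t\r\n\u00a0\u2000-\u200a\u2028\u2029\u3000]/
};

function detectCharsets(password) {
    const found = {};

    Object.keys(charsetPatterns).forEach((name) => {
        found[name] = charsetPatterns[name].test(password);
    });

    return found;
}

console.log(detectCharsets('correct horse battery staple'));
// { lowercase: true, uppercase: false, number: false, whitespace: true }
```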