Recommend Updating Password Quality Algorithm to Weight Length More as a Strength Factor

dragon-architect commented 1 year ago

Summary

This request is as it says on the tin. KeepassXC seems to weight its password strength computation heavily in favor of character variety, and gives minimal weight to password length. This can have the unintended effect of steering users toward shorter, harder-to-remember passwords that are easier to crack by brute force, and away from longer, easier-to-remember pass phrases that are significantly harder to crack.

Examples

Randomly generated pass phrase: "partner rarity dismantle commotion". Password Quality is rated as "Weak" with 51.70 bits of entropy. This password is 34 characters long. However, even when generating other random four-word pass phrases, even in Title Case to increase the size of the character set from 27 to 53, the calculated password entropy remains the same.

Randomly generated password excluding Extended ASCII: "hl9LHQ/XK%LC". Password Quality is rated as "Good" with 78.84 bits of entropy. This password is only 12 characters long (almost 1/3rd the length of the example pass phrase) and exponentially easier to crack by brute force compared to the pass phrase. This one was cherry picked out of a number of random trials, most being rated as "Weak".
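As a rough check on that figure, the 78.84 bits reported above is consistent with 12 characters drawn uniformly from a pool of 95 printable ASCII characters. The pool size and the function name below are assumptions made for illustration; this is a sketch of the arithmetic, not the code KeepassXC runs.

```python
import math

# Assumed pool (not confirmed by the thread): 26 lowercase + 26 uppercase
# + 10 digits + 33 other printable ASCII characters = 95.
ASSUMED_POOL_SIZE = 26 + 26 + 10 + 33


def uniform_random_entropy_bits(length: int, pool_size: int) -> float:
    """Entropy in bits of `length` symbols drawn uniformly and independently."""
    return length * math.log2(pool_size)


bits = uniform_random_entropy_bits(12, ASSUMED_POOL_SIZE)
print(f"{bits:.2f} bits")  # prints "78.84 bits"
```

Under this model every extra character adds about 6.57 bits, so the estimate depends only on the length and the pool size, not on which characters happen to come out.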

Both passwords were generated using KeepassXC.

And a contrived example: "1 12 123 1234 12345 123456 1234567 12345678 123456789". The password quality did not get rated as "Good" until I typed the first digit in the eighth block. This password is 53 characters long and has only 87.89 bits of entropy, according to the password quality algorithm that KeepassXC uses. This is obviously not a password that anyone should use, given its absurd simplicity and limited character set, but it is also absurdly long and absurdly easy to memorize.

Context

Currently, when I generate a pass phrase, the password strength rating tells me that it is "weak" and has extremely low entropy. This is in spite of its absurd length, even with only four words, compared with a jumble of random characters less than half as long. While I am able to ignore this myself and confidently forge onwards with pass phrases, I feel that users who are more inclined to blindly trust that the software knows what is or is not a "strong" password are going to be led astray. They will instead fall back onto much shorter jumbles of random characters that are significantly more difficult for anyone to remember, but also significantly easier to crack by brute force.

In the face of new password guidelines published by NIST in 2020, and in reference to that famous xkcd comic, I wish to recommend that the maintainers of this project revisit the password strength algorithm that KeepassXC uses, and update it to give stronger weight to password length, especially since KeepassXC already has a feature to generate pass phrases. However, KeepassXC paradoxically ranks pass phrases as much lower quality.

Pass phrases using words are much easier for users to type in the event that the user must manually type the password, as I often have to do on my mobile devices or through ssh connections. Anyone with a "smart" device or a gaming console and no attached keyboard knows the pain that is password entry. And so, the user should be encouraged by UI cues as subtle as the password quality rating to choose a type of password that is going to be easier for the user to work with.

In the examples above, the latter randomly generated password is given a stronger quality rating, more in line with the now-deprecated 2003 password guidelines published by NIST. Imagine having to manually type that on a soft keyboard, like on a smartphone or "smart" TV. Meanwhile, the former example password is more aligned with the updated 2020 password guidelines, is much easier to type when manual entry is required, and therefore would be considered the more robust password by NIST. I believe that KeepassXC's password strength algorithm should reflect this in order to improve the user experience of the application.

droidmonkey commented 1 year ago

No, we use zxcvbn for scoring passWORD entropy. Length is one of many factors considered. When you generate a passphrase, that is a significantly different calculation for entropy.

dragon-architect commented 1 year ago

I would argue that very few users of the software are going to make that kind of technical distinction, and may not even realize that such a distinction even exists. As far as any password-protected service is concerned, the whole string of characters--including repeated delineating characters between blocks as would exist in a phrase--is a singular password, and so most users are going to think of the whole phrase as a singular password, and again, be confused as to why their much longer passphrase is scoring lower. This is an incongruency in the User Experience between the password manager and the services for which users are managing their passwords.

To that end, I would like to amend my recommendation to:

Update the documentation for the software to more clearly define that this distinction exists, that the password entropy calculation is specifically for words and not for phrases, that it favors complexity over length, and that it should be disregarded if one chooses to use pass phrases. I would also recommend adding an item to the FAQ that succinctly answers the question of why pass phrases get a lower entropy score than pass words, with a pointer to the relevant section of the documentation for further detail. This would apply at least for as long as zxcvbn favors complexity over length.

The solution of creating additional documentation to clarify this technical distinction is, perhaps, the easiest to implement given that KeepassXC relies on a third party repo for its entropy calculation, but it is still dependent on users thinking to check the documentation in the first place, which may not always happen.

Meanwhile, I'll submit a modified form of my original recommendation to the zxcvbn github.

EDIT: Could someone point me to which zxcvbn repo KeepassXC uses? There is apparently an original one maintained by Dropbox that appears to be lying fallow, and a fork specifically written in TypeScript, and I'm not certain which one KeepassXC is using.

droidmonkey commented 1 year ago

We don't use zxcvbn for pass phrase entropy calculation, so don't bother.

dragon-architect commented 1 year ago

Ah, alright, that wasn't clear in the first response. To that end, would it still be possible to update the algorithm used for pass phrase entropy calculation? My stance on the user experience issue that this raises is firm, and I would not be making these recommendations if I did not feel strongly that users should not be discouraged from using more secure pass words or phrases, especially given that NIST amended their own recommendations roughly three years ago now.

droidmonkey commented 1 year ago

The calculation of entropy for pass phrases is 100% determined by the length of the pass phrase choice list. That is why it is significantly lower.
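For anyone following along, the word-list calculation described above can be sketched as follows. This is a minimal sketch, not KeepassXC's actual code: the function and constant names are hypothetical, and it assumes each word is drawn uniformly and independently from the 7,776-word EFF large word list.

```python
import math

# The EFF large word list has 6**5 = 7776 entries (one per roll of five dice).
EFF_LARGE_WORDLIST_SIZE = 6 ** 5


def passphrase_entropy_bits(word_count: int,
                            wordlist_size: int = EFF_LARGE_WORDLIST_SIZE) -> float:
    """Entropy in bits when every word is a uniform, independent pick from the list."""
    return word_count * math.log2(wordlist_size)


four_words = passphrase_entropy_bits(4)
print(f"{four_words:.2f} bits")  # prints "51.70 bits"
```

This reproduces the 51.70 bits reported for the four-word example in the original post. It also suggests why Title Case did not change the number: a capitalisation rule applied to every word involves no extra random choice, so it adds no bits under this model. Each additional word adds about 12.92 bits.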

ghost commented 1 year ago

Why is this the case? 100% the determining factor would surely undermine the point of 'entropy' or am I missing something?

Calculations for entropy of passphrases cannot be solely determined by the length of the passphrase choice list. The entropy of a passphrase is a measure of the amount of uncertainty or randomness in the passphrase. While the length of the passphrase choice list can contribute to the entropy of the passphrase, other factors such as character diversity, word frequency, and pattern recognition also play a crucial role in determining the passphrase's overall entropy.

For example, a passphrase that consists of a long list of consecutive numbers or a series of easily recognisable patterns may have a long character count but low entropy since it can be easily guessed or cracked by an attacker. On the other hand, a shorter passphrase that includes random combinations of uppercase and lowercase letters, numbers, and special characters may have higher entropy even though it has a shorter character count.

Therefore, when calculating the entropy of a passphrase, it is important to consider not only the length of the passphrase choice list but also the other factors that can impact its overall randomness and security.

Would be interesting to get some feedback on this decision and hopefully, clarify a possible misunderstanding of the concepts.

droidmonkey commented 1 year ago

You are missing two things:

1. We always show the lowest entropy, which means it takes into account attack vectors and known patterns (zxcvbn) or word list size (pass phrase).
2. Calculating pass phrase entropy makes the assumption that the attacker knows you used a passphrase and which word list is used. Thus we show you the lowest possible entropy given that data. This is your worst possible case.
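To illustrate the worst-case reasoning in the two points above, here is a small sketch comparing a naive character-level estimate with the word-list estimate for the pass phrase from the original post. The function names are hypothetical, the 27-symbol character set is the figure quoted in the original post, and the 7,776-word list size assumes the EFF large word list; none of this is KeepassXC's actual code.

```python
import math


def charset_model_bits(length: int, charset_size: int) -> float:
    """Attacker guesses character by character from a known character set."""
    return length * math.log2(charset_size)


def wordlist_model_bits(word_count: int, wordlist_size: int) -> float:
    """Attacker knows a pass phrase was used and which word list it came from."""
    return word_count * math.log2(wordlist_size)


phrase = "partner rarity dismantle commotion"

naive = charset_model_bits(len(phrase), 27)                 # 34 chars, a-z plus space
informed = wordlist_model_bits(len(phrase.split()), 7776)   # 4 words

# Report the lowest estimate: the strongest attacker model that applies.
worst_case = min(naive, informed)
print(f"naive: {naive:.2f}  informed: {informed:.2f}  shown: {worst_case:.2f}")
```

The character-level figure is far higher (roughly 162 bits), but it only holds against an attacker who does not guess whole words. Taking the minimum is what makes the displayed value a conservative bound.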

"On the other hand, a shorter passphrase that includes random combinations of uppercase and lowercase letters, numbers, and special characters may have higher entropy even though it has a shorter character count."

This is a passWORD and is no longer a pass phrase.

ghost commented 1 year ago

Right. So it boils down to interpretation of password and passphrase, and the assumptions of the attacker knowing a passphrase has been used, and also which word list has been used. Due to the open source nature, the attacker would then be able to make assumptions based on your use of the eff_large.wordlist.

Your interpretation is, using a mix of uppercase, lowercase, numbers, and symbols may move away from the traditional concept of a passphrase, which typically involves combining multiple words or phrases to create a longer password. In that sense, using a mix of characters would make it more similar to a password.

Appreciate the quick response and the added clarification.

keepassxreboot / keepassxc