keepassxreboot / keepassxc

KeePassXC is a cross-platform community-driven port of the Windows application “Keepass Password Safe”.
https://keepassxc.org/

Password generator overestimates entropy of extended ASCII passwords #8227

Open bkrl opened 2 years ago

bkrl commented 2 years ago

Overview

The password generator often displays entropy values much larger than the theoretical maximum when the extended ASCII option is enabled. For example, it displays an entropy value of 79.73 bits for ÍéÐ¥õÂ, even though a six-character extended ASCII password has an entropy of at most 6 * 8 = 48 bits. This is an extreme example, but when I repeatedly generate passwords like this, I get entropy values greater than 48 bits most of the time.
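The arithmetic behind that bound can be sketched in a few lines. This is only an illustration: the helper name `max_entropy_bits` is hypothetical, and the alphabet size of 256 is the loose 8-bits-per-character bound used in this report, not the generator's actual character pool (which is presumably smaller).

```python
import math


def max_entropy_bits(length: int, alphabet_size: int = 256) -> float:
    """Upper bound, in bits, for `length` symbols drawn uniformly from
    an alphabet of `alphabet_size` symbols (hypothetical helper)."""
    return length * math.log2(alphabet_size)


displayed_bits = 79.73       # value shown by the generator in this report
bound = max_entropy_bits(6)  # six characters, at most 8 bits each

print(f"upper bound: {bound} bits")
print(f"displayed value exceeds bound: {displayed_bits > bound}")
```

With a smaller, more realistic pool the bound only gets tighter, so the displayed value would exceed it by an even wider margin.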

Steps to Reproduce

  1. Open the password generator.
  2. Enable the extended ASCII option.
  3. Generate some passwords and compare the displayed entropy value to eight times the number of characters.

Expected Behavior

The displayed entropy value should never be greater than eight times the number of characters, since each character has at most eight bits of entropy. I know that the displayed entropy is supposed to be an estimate of guessability rather than the Shannon entropy, but currently it's giving users the impression that a six-character password could be secure.

Actual Behavior

The displayed entropy value is often much greater than the maximum possible entropy.

Context

I'm using the package from the Fedora 36 repositories.

KeePassXC - Version 2.7.1 Revision: 5916a8f

Qt 5.15.3 Debugging mode is disabled.

Operating system: Fedora Linux 36 (Workstation Edition) CPU architecture: x86_64 Kernel: linux 5.18.6-200.fc36.x86_64

Enabled extensions:

Cryptographic libraries:

Operating System: Fedora 36 Desktop Env: Gnome Windowing System: Wayland

I was also able to reproduce this with the latest official AppImage (same revision) on a new Fedora 36 virtual machine.

droidmonkey commented 2 years ago

Your maximum entropy calculation is not correct. https://generatepasswords.org/how-to-calculate-entropy/

This bug should be levied onto zxcvbn

bkrl commented 2 years ago

> Your maximum entropy calculation is not correct. https://generatepasswords.org/how-to-calculate-entropy/

I used 8 bits per character as a quick upper bound. I didn't mean to say that it was the exact maximum but I see how it could be interpreted that way.

> This bug should be levied onto zxcvbn

I will file a bug report there, but the README states that the code "is for character sets which use single byte characters primarily in the code range 0x20 to 0x7E" so maybe we shouldn't be using it for extended ASCII passwords. The original version at https://lowe.github.io/tryzxcvbn/ gives a reasonable value though.

droidmonkey commented 2 years ago

I'd consider handling extended ASCII in a special way.

bkrl commented 2 years ago

We can take the minimum of the Shannon entropy and the entropy estimated by zxcvbn. If the user edits the password, it would be difficult to get an accurate estimate, so it might be best to disable the strength estimation when there are non-ASCII characters.
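A minimal sketch of that proposal, not KeePassXC's actual code: the function names, the `pool_size` parameter, and the idea of passing in the zxcvbn result as a plain number (`estimator_bits`) are all assumptions made for illustration.

```python
import math
from typing import Optional


def capped_entropy_bits(length: int, pool_size: int, estimator_bits: float) -> float:
    """Return the smaller of the uniform-random upper bound and the
    estimator's value (hypothetical helper)."""
    upper_bound = length * math.log2(pool_size)
    return min(upper_bound, estimator_bits)


def strength_for_edited_password(password: str, estimator_bits: float) -> Optional[float]:
    """For user-edited passwords, skip the estimate entirely when any
    non-ASCII character is present; None stands for 'estimation disabled'."""
    if not password.isascii():
        return None
    return estimator_bits


# Generated six-character password, loose pool size of 256 symbols.
print(capped_entropy_bits(6, 256, 79.73))
# User-edited password containing a non-ASCII character (e with acute accent).
print(strength_for_edited_password("caf\u00e9", 20.0))
```

The cap only makes sense for generated passwords, where the length and pool size are known; for edited passwords the second function simply declines to answer.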

sjvudp commented 1 year ago

Well, I wanted to file an issue for the password generator's entropy estimate, but I think I can piggyback here:

Example: When the password generator generates a three-digit sequence like "123", the entropy would be estimated as 10^3 states (almost 10 bits), silently claiming that "000" is just as likely as "738" (for example). In contrast, when the user enters "123", the estimate would be 3^3 states (not quite 5 bits), because we cannot know whether the user picked from the set (1,2,3) or from the full set of digits. It's also known that user input is probably not uniformly distributed, meaning that a user picking "AAAAAAAA" (as eight letters) is more likely than a uniform random generator picking it (where it would be worth almost 46 bits). My pessimistic estimator would assign zero bits of entropy to that password, because "the user picked exactly one character for a sequence known to be eight characters long" (again, we assume the user just picked one character instead of "any letter").
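The two views in that example can be put side by side. This is a rough sketch of the reasoning in the comment, not an existing estimator: both function names are hypothetical, and the 52-letter alphabet for "eight letters" is an assumption (upper and lower case) chosen because it matches the "almost 46 bits" figure.

```python
import math


def uniform_bits(length: int, alphabet_size: int) -> float:
    """Generator view: every symbol is drawn uniformly from a known alphabet."""
    return length * math.log2(alphabet_size)


def pessimistic_bits(password: str) -> float:
    """User-input view: assume the alphabet is only the set of distinct
    characters that actually appear in the password."""
    if not password:
        return 0.0
    return len(password) * math.log2(len(set(password)))


print(round(uniform_bits(3, 10), 2))          # "123" as three random digits
print(round(pessimistic_bits("123"), 2))      # "123" as typed by a user
print(round(uniform_bits(8, 52), 2))          # eight random letters (assumed 52)
print(round(pessimistic_bits("AAAAAAAA"), 2)) # one distinct character
```

The gap between the two numbers for the same string is the point of the comment: the estimate depends on what is assumed about how the password was produced.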

To make a long story short: The entropy estimate should be fixed IMHO.

bkrl commented 1 year ago

> To make a long story short: The entropy estimate should be fixed IMHO.

I agree with you, but from past issues it appears that the maintainers of this project strongly insist on using the current approach.

Iiridayn commented 1 year ago

Yeah. I've a peer-reviewed paper in the final stages of publication which I'll post in that issue, but I doubt it'll change their minds. I spoke with Dr Ruoti about his research (actively researching password managers) when he visited our campus a couple of months ago, and came away with the impression that the most secure password managers tend to be responsive to security researcher input (though I might be extrapolating incorrectly beyond what was actually said).

sjvudp commented 1 year ago

> Your maximum entropy calculation is not correct. https://generatepasswords.org/how-to-calculate-entropy/

Why not turn the claim that it is not correct into an explanation of why it is not? I had been assuming that those characters could be encoded with some 8-bit charset, while KeePassXC probably assumed some Unicode encoding, I guess.
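The encoding guess is easy to illustrate for the example password from the report. The snippet only shows how many bytes the same six characters occupy in an 8-bit charset versus UTF-8; whether KeePassXC or zxcvbn actually counts bytes rather than characters is not confirmed anywhere in this thread.

```python
# The example password from the report, written with escapes so the
# snippet does not depend on the file's own encoding.
password = "\u00cd\u00e9\u00d0\u00a5\u00f5\u00c2"

latin1_len = len(password.encode("latin-1"))  # one byte per character
utf8_len = len(password.encode("utf-8"))      # two bytes per character here

print(len(password), latin1_len, utf8_len)    # prints: 6 6 12
```

An estimator that treats each UTF-8 byte as a separate symbol would see a twelve-symbol input, which would be one possible way to end up above the 48-bit bound.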