Open yamikuronue opened 6 years ago
If you mean using the API, I will note that using it is not all positive for security. It's a tradeoff. On one hand, it prevents common or publicly leaked passwords from being used; on the other, it exposes 20 bits (about 3-4 characters) worth of password entropy to a third party. If one assumes the third party is compromised or untrusted, this potentially makes a password that is not found to be vulnerable about 1048576 (2^20) times faster to brute-force, provided the account it corresponds to is also known or easily guessable from account creation timing. It may be that this tradeoff could be considered worthwhile... but... 20 bits is a lot of lost entropy if one assumes the worst case of a breach of the third party. Personally I think the tradeoff is most likely a win for most users, but disclosing 20 bits of password entropy to a third party is a big enough deal that I tend to think it may be ethically questionable to use it without giving notice to users. Accurately conveying such notice in a way that fairly represents the tradeoff is... not exactly user friendly... though maybe it's good enough to have a tiny "Passwords checked with Pwned Passwords" text with a link to the site for more information.
The other way to use it is by storing a copy of the database locally. In this case, there is (almost) only upside to the security, no downside.... except that the database is huge, taking quite a chunk of disk space wherever it's installed.
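To make concrete what actually leaves the server in the API case, here is a minimal Python sketch of a range lookup. It assumes the public range endpoint at `https://api.pwnedpasswords.com/range/<prefix>` returning `SUFFIX:COUNT` lines; the names `pwned_count`, `http_fetch`, and the injectable `fetch` parameter are made up for illustration and are not part of any library.

```python
import hashlib
from typing import Callable
from urllib.request import urlopen

# Assumed public range endpoint; check the current API docs before relying on it.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def http_fetch(prefix: str) -> str:
    """Fetch the raw 'SUFFIX:COUNT' listing for a 5-character hash prefix."""
    with urlopen(RANGE_URL + prefix, timeout=10) as resp:
        return resp.read().decode("utf-8")


def pwned_count(password: str, fetch: Callable[[str], str] = http_fetch) -> int:
    """Return how often the password appears in the corpus, or 0 if not listed.

    Only the first 5 hex characters (20 bits) of the SHA-1 digest are passed
    to `fetch`; the remaining 35 characters are compared locally.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = digest[:5], digest[5:]
    for line in fetch(prefix).splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0
```

Passing `fetch` as a parameter is just a convenience so the lookup can be exercised without network access; the 5-character prefix is the 20 bits being discussed here.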
@Bluenaxela
Do the first five characters of the SHA1 hash really leak that much information?
https://haveibeenpwned.com/API/v2#SearchingPwnedPasswordsByRange
As I understand it, it should be theoretically "impossible" to get from the first five characters of the SHA1 hash to the password that was hashed, and if a third party knew the first five characters they would at best be able to eliminate 600ish of the brazilliand and seven different passwords that would hash to that prefix.
Or am I misunderstanding where the entropy leak is coming from?
@AccaliaDeElementia
5 characters of the (hex encoded) SHA1 hash is by definition 20 bits of entropy and is what I'm referring to.
Getting the password back directly from the first five characters of the SHA1 hash is not feasible, no, but your estimate of how it impacts brute force guessing is incorrect. It doesn't eliminate "600ish of the brazilliand and seven". It eliminates literally ~99.9999% of possible password guesses, or to put it another way, eliminates in the ballpark of 999999 out of every 1000000 passwords. A brute forcing system could check the SHA1 of possible guesses in no time flat, before using the comparatively very, very few remaining guesses to attempt an actual login to the site.
If a password was weak enough that Eve or Mallory could guess it in an average of ten million attempted logins, well now they could guess it in an average of just 10 logins if they had those five characters from the hash.
Basically, a potential factor of one million reduction in brute forcing time is nothing to sneeze at, and it's particularly a danger in the case of users with weak passwords that barely squeak by filters that try to avoid letting a user set weak passwords.
A truly good password should have enough margin in its entropy that it'd likely remain decently strong in the face of such leaked information, but one can't always assume users will have such truly good passwords in spite of one's best efforts to nudge them in the right direction. Human creativity, the way most normal non-technical end users come up with passwords, tends to strongly optimize for the easiest to remember thing that passes whatever filters prevent them from setting a password, and the easiest to remember things also have the least entropy margin keeping them safe. There's often simply not much entropy to spare before things start getting dicey.
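The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the offline filtering step under the assumption that the attacker holds the leaked 5-character prefix; the function names are invented for the example.

```python
import hashlib

PREFIX_HEX_CHARS = 5
BITS_LEAKED = PREFIX_HEX_CHARS * 4   # each hex character encodes 4 bits
REDUCTION = 2 ** BITS_LEAKED         # 16**5 == 1048576 possible prefixes


def sha1_prefix(password: str, n: int = PREFIX_HEX_CHARS) -> str:
    """First n hex characters of the password's SHA-1 digest, uppercased."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()[:n].upper()


def surviving_guesses(candidates, leaked_prefix):
    """Offline step: keep only the guesses whose hash matches the leaked prefix."""
    return [c for c in candidates if sha1_prefix(c) == leaked_prefix]


def expected_online_attempts(offline_guesses: float) -> float:
    """Rough number of real login attempts left after the offline filter."""
    return offline_guesses / REDUCTION
```

With these numbers, `expected_online_attempts(10_000_000)` comes out at roughly 9.5, which is where the "ten million logins becomes about 10" figure comes from.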
hmm....
I can see where you are coming from, but I feel the 20 bits of leaked entropy are worth it to remove passwords that are already breached. It's not a decision to be made lightly, but if properly implemented it should be difficult to associate those leaked 20 bits (should they become leaked, which I think is not likely) with an account in a manner that could be used to facilitate an attack.
How's about this for a passphrase policy?
correct horse battery staple would be more secure without being harder to remember
I mean, I'm whipping that out of my tailhole, so they'll probably need tweaking. Please let me know your feelings on that. I'm NOT a security background sort of person, so this feedback is valuable.
@AccaliaDeElementia I agree that in the balance, it is likely worth it, but I do think the (admittedly small) possibility of a kinda significant downside (which could have impact outside the context of the site, due to bad habits of re-use) makes it ethically important to indicate it in the UI.
It may also be a good idea to give an option to a server admin to use a local copy of the pwned passwords database.
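As a rough idea of what the local-copy option could look like, here is a small Python sketch. It assumes the downloaded corpus is a text file of `SHA1HASH:COUNT` lines; the helper names are hypothetical, and the in-memory sorted list only stands in for whatever on-disk index or database a real deployment would need for a file of that size.

```python
import bisect
import hashlib


def load_hashes(lines):
    """Parse 'HASH:COUNT' lines into a sorted list of uppercase SHA-1 digests."""
    return sorted(
        line.split(":", 1)[0].strip().upper() for line in lines if line.strip()
    )


def is_pwned_locally(password: str, sorted_hashes) -> bool:
    """Binary-search the local list; nothing about the password leaves the server."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    i = bisect.bisect_left(sorted_hashes, digest)
    return i < len(sorted_hashes) and sorted_hashes[i] == digest
```

An admin-facing setting could then choose between this path and the remote range lookup.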
Regarding other aspects of your suggested policy:
The "correct horse battery staple" thing is flawed advice for coming up with passwords. Using a few words would be fine if the source of the words were true randomness with no human intervention, but the way it is usually interpreted is people just picking words on a whim. Humans are not good sources of randomness. People rarely appreciate just how non-random their "random" picks are. The words that humans pick are heavily biased, statistically speaking, and multiple words picked in sequence also have a strong statistical interdependence. Even a human choice to reroll a properly random phrase generator can introduce more bias than is ideal.
@AccaliaDeElementia Oh, and also... regarding the lockout for incorrect attempts, I think that's generally a good idea, but it may be worth giving thought to reducing the risk of it becoming a denial of service against a user. One option is to have a staged soft lockout. Consider the following concept:
The above rules may not be perfect and may need further refinement... but they are deliberately designed with progressively more aggressive measures, to stop someone before it has to get so aggressive the legitimate user gets locked out or is otherwise inconvenienced.
Stage 1 blocks non-captcha-bypassing bots, stage 2 discourages (lazy) malicious humans, stage 3 steps things up with an auto-IP-whitelist that tries to avoid locking out the real user, and stage 4 is the last-resort stage.
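A very rough Python sketch of how such staging might be wired up follows. Everything specific in it is an assumption made for illustration: the failure-count thresholds, the use of a delay as the stage 2 measure, a full lock as stage 4, and all of the names. Only the overall four-stage escalation comes from the description above.

```python
from enum import IntEnum


class Stage(IntEnum):
    NONE = 0
    CAPTCHA = 1         # stage 1: stops bots that cannot solve a captcha
    DELAY = 2           # stage 2: assumed measure, slows down manual guessing
    KNOWN_IPS_ONLY = 3  # stage 3: only previously successful IPs may try
    LOCKED = 4          # stage 4: last resort (assumed to be a full lock)


# Hypothetical thresholds on consecutive failed logins, highest first.
THRESHOLDS = [
    (20, Stage.LOCKED),
    (10, Stage.KNOWN_IPS_ONLY),
    (6, Stage.DELAY),
    (3, Stage.CAPTCHA),
]


def stage_for(failed_attempts: int) -> Stage:
    """Map a failure count to the most aggressive stage it has reached."""
    for minimum, stage in THRESHOLDS:
        if failed_attempts >= minimum:
            return stage
    return Stage.NONE


def may_attempt_login(failed_attempts: int, ip: str, known_good_ips) -> bool:
    """Decide whether a login attempt from `ip` is allowed to proceed at all."""
    stage = stage_for(failed_attempts)
    if stage == Stage.LOCKED:
        return False
    if stage == Stage.KNOWN_IPS_ONLY:
        return ip in known_good_ips
    return True  # captcha and delay are enforced elsewhere in the login flow
```

The point of the stage 3 check is that an attacker hammering an account from unfamiliar addresses gets shut out while the real user, coming from an address that has logged in before, can still get in.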
Integrate with Pwnd Passwords.