use hibp to prevent compromised passwords

nikhiljha commented 4 years ago

A lot of the "security" RT tickets involve compromised passwords. These almost certainly come from credential stuffing attacks, which can be partly mitigated by disallowing known-compromised passwords.

There's a PAM module here, but it makes an HTTP request to the haveibeenpwned API.

If we'd rather not make an API call every time someone enters their password, there are prebuilt Bloom filters (~2GB) with low false positive rates.

Downsides to using a bloom filter include: it needs to be regenerated from the latest API data every so often, it has some false positives (1 in a million).

As yet another alternative, we can just host the entire hibp dataset ourselves for internal use (and also recommend that it gets applied to all the WordPress sites running on OCF infra). Someone made a tool to do this here: https://github.com/ralscha/selfhost-hibp-passwords.

emmatyping commented 4 years ago

I don't think I have a problem with self-hosting the hibp data. My only concern is whether searching it takes a long time.

nikhiljha commented 4 years ago

No, it's all stored in a hash-tree-like thing. Search times should be ~a few ms.
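
For intuition on why lookups can stay in the millisecond range, here is a minimal sketch, assuming the hashes are kept as sorted, fixed-width raw SHA-1 digests. This is an illustration only and not necessarily how selfhost-hibp-passwords stores its data; `build_index` and `contains` are hypothetical names, and the in-memory blob stands in for an on-disk file.

```python
import hashlib

RECORD = 20  # size in bytes of one raw SHA-1 digest


def build_index(passwords):
    """Return sorted, de-duplicated SHA-1 digests concatenated into one blob.

    Assumption: this blob is a stand-in for a sorted file on disk.
    """
    digests = sorted({hashlib.sha1(p.encode("utf-8")).digest() for p in passwords})
    return b"".join(digests)


def contains(blob, password):
    """Binary-search the fixed-width records for the password's digest."""
    target = hashlib.sha1(password.encode("utf-8")).digest()
    lo, hi = 0, len(blob) // RECORD
    while lo < hi:
        mid = (lo + hi) // 2
        record = blob[mid * RECORD:(mid + 1) * RECORD]
        if record == target:
            return True
        if record < target:
            lo = mid + 1
        else:
            hi = mid
    return False


blob = build_index(["password", "123456", "qwerty"])
print(contains(blob, "password"))
print(contains(blob, "correct horse battery staple"))
```

Each probe halves the search range, so even with several hundred million records a lookup needs only around 30 reads.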

emmatyping commented 4 years ago

Then that seems optimal, I definitely don't want to rely on hibp being up for account creation.

jvperrin commented 4 years ago

We could also do something like try and query the API but have a fallback to just allow the password if it's down.

nikhiljha commented 4 years ago

Ideally the password would get checked every time it's used, not just for account creation. It's possible that it got compromised between account creation and usage.

At the very least we need to make sure that everyone's current password gets checked at least once during login, since there are a lot of existing accounts with potentially questionable passwords.

nikhiljha commented 4 years ago

Also, the full database is like 10GB compressed/slightly larger uncompressed. Hosting it and throwing data at it is probably (TM) not that big a deal.

nikhiljha commented 4 years ago

Ok it should be just...

1. Compile this Debian package: https://github.com/skx/pam_pwnd
2. Add `auth required pam_pwnd.so try_first_pass` to the relevant PAM config.

...but I don't have any debian systems atm so I can't test it.

cg505 commented 4 years ago

We need to make sure that when it fails, we alert the user to change their password instead of just failing to log in.

Also, Debian uses `pam-auth-update`, so we shouldn't edit the PAM config directly. See https://github.com/ocf/puppet/blob/master/modules/ocf/manifests/auth.pp#L81-L116 in puppet or check out /usr/share/pam-configs on an OCF host.

cg505 commented 4 years ago

~We also must make sure things do not break when a user authenticates with a kerberos ticket instead of a password.~ nvm this is an ssh thing, pam is not involved

kpengboy commented 4 years ago

Somewhat off topic, but I wonder if we have also considered alternatives to cracklib like zxcvbn.

aaronjanse commented 3 years ago

I'll work on this issue this weekend

aaronjanse commented 3 years ago

> Downsides to using a bloom filter include: it needs to be regenerated from the latest API data every so often, it has some false positives (1 in a million).

I think we could expand the filter to 3 GB instead of 2 GB to make the false positive rate 1-in-a-billion (1e-9 = (1e-6)^(3/2)). I assume that rate would be acceptable?
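
The arithmetic can be checked with a short sketch, assuming a standard Bloom filter that uses the optimal number of hash functions, where the false positive rate is p = exp(-(m/n) * (ln 2)^2) for m bits and n items. The function names are hypothetical, and the item count is derived from the 2 GB and 1-in-a-million figures rather than taken from any published number.

```python
import math

LN2_SQUARED = math.log(2) ** 2


def bits_per_item(p):
    """Bits needed per stored item for false positive rate p (optimal hash count)."""
    return -math.log(p) / LN2_SQUARED


def false_positive_rate(bits):
    """False positive rate for a given number of bits per item (optimal hash count)."""
    return math.exp(-bits * LN2_SQUARED)


base = bits_per_item(1e-6)                    # bits per item behind the 2 GB filter
scaled = false_positive_rate(1.5 * base)      # same items, 3 GB instead of 2 GB
implied_items = (2e9 * 8) / base              # item count implied by 2 GB at 1e-6

print(f"bits per item at 1e-6: {base:.2f}")
print(f"false positive rate at 1.5x the size: {scaled:.3e}")
print(f"implied item count: {implied_items:.3e}")
```

Because the rate is exponential in the bits per item, multiplying the size by 3/2 raises the rate to the power 3/2, which is where (1e-6)^(3/2) = 1e-9 comes from.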

emmatyping commented 3 years ago

> I think we could expand the filter to 3 GB instead of 2 GB to make the false positive rate 1-in-a-billion (1e-9 = (1e-6)^(3/2)). I assume that rate would be acceptable?

I should hope so!

nikhiljha commented 3 years ago

The bloom filter is cool but why spend compute regenerating it when we can spend storage instead 😁

or better yet, fail open + hibp api k anonymity
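
A minimal sketch of what "fail open + k-anonymity" could look like, assuming the public Pwned Passwords range endpoint (`https://api.pwnedpasswords.com/range/<first 5 hex chars of the SHA-1>`), which returns lines of `SUFFIX:COUNT`. The function names, the timeout, and the User-Agent string are hypothetical choices, and this is not the pam_pwnd implementation.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def fetch_range(prefix, timeout=2.0):
    """Fetch all hash suffixes sharing the 5-character prefix (real network call)."""
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "ocf-hibp-sketch"},  # hypothetical identifier
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("ascii")


def is_compromised(password, fetch=fetch_range):
    """Return True only if the API confirms the password appears in a breach.

    Only the first 5 hex characters of the SHA-1 leave the machine.
    Any lookup failure returns False, i.e. the check fails open.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = digest[:5], digest[5:]
    try:
        body = fetch(prefix)
    except Exception:
        return False  # fail open: an outage must not block logins
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return count.strip() != "0"  # a zero count is treated as not found
    return False
```

Passing `fetch` as a parameter keeps the network call swappable, so the same check could point at a self-hosted copy of the data or at a stub in tests.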

ocf / projects

use hibp to prevent compromised passwords #39