whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/

Proposal: `strong` attribute for <input type=password> #5421

Open timruffles opened 4 years ago

timruffles commented 4 years ago

A strong attribute for HTML password inputs would improve the lives of web users by providing an alternative to password format validation:

<input type=password strong>

It aims to address the motivations/constraints that typically lead developers towards password format validation. It's designed to enable password generation, and smart validation from browsers to protect users by nudging them away from reused or guessable passwords (e.g $firstname$lastname$dob).

Motivation

Password strength validation is fiddly, and many implementations get it wrong:

Bad validation rules disallow strong passwords

Rule-based attempts to enforce passwords mostly generate irritation without being sophisticated enough to enforce strong passwords. e.g they'll tell you snappy aux fish jupiter and 31702abe175d9ca401a38d2a0b819265 are weak passwords, and then literally accept Password1.

JS implementations are often broken, and hamstring password managers

Password rule validation is quite easy to get wrong. For example, some JS-based validation refuses to acknowledge complex passwords if you fill them from a password manager or paste them. I’ve tested several sign-up forms from major companies (e.g. telcos, banks) that pass validation only if you type each character: they’re literally counting characters one by one as they're typed.

Since password rules are hard to implement with the pattern attribute, imperative implementations are often picked instead.

Validation messaging is often poor

Many developers do a poor job of error messages, e.g. telling the user "Please use a strong password" without explaining the rules they are required to follow (or which ones they've broken).

Even best in class JS solutions can't know enough to protect users

zxcvbn and other heuristics implemented in JS cannot have the knowledge about the user necessary to protect them from picking attackable passwords, e.g:

  1. reusing passwords
  2. passwords based on public information, e.g first name + last name + DOB
  3. using already compromised passwords

Browser generation of passwords alone can't solve the problems of users and developers

Autogeneration alone does not address the motivations and constraints that lead to password format restrictions:

  1. it doesn't address the need for symmetrical client and server validation (client-side provides no security, and symmetry is required to avoid poor UX)
  2. autogeneration cannot be mandatory, i.e. it's not a validator. Users will continue to want the option to create their own passwords (via dicewords, $obscureOSSManager, imagination etc, or because they are on a shared computer where a hard-to-remember generated password is untenable), thus mandatory generation is user-hostile
  3. many browsers exist, not all of which will implement autogeneration (or they will diverge). Thus developers cannot rely on autogeneration (in contrast, strong allows for server-side implementation of the baseline validation algorithm, which guarantees validation regardless of browser)

Proposed Solution

Adding a strong constraint validator for password inputs:

<input type=password strong>

This would impose two levels of validation: a simple to implement entropy baseline, and (optionally) the implementor's own algorithm which can further subset allowed passwords (i.e the passwords allowed by the baseline are a superset of those allowed by the implementor's algorithm).

The importance of server-side

The entropy baseline, and its superset relation to the optional algorithm, is necessary to ensure that all passwords considered valid on the client are also considered valid on the server. Specifying the baseline allows it to be implemented in any server-side language, which provides the guarantee that client-side validation cannot: that a strong password has actually been submitted.

It would be poor UX if the browser generated a password that passed client-side validation, but was then rejected by the server's validation rule. Server-side validation is necessary as no client validation provides a guarantee.

Implementor supplied algorithm and UX

The implementor should also provide a user-experience that aids the user in generating a strong password, e.g:

  1. suggesting OS or third party password managers
  2. suggesting use of a pass-phrase
  3. suggesting additions to increase entropy, e.g "try adding a symbol"

Examples of the implementor's additional rules could be disallowing common passwords, disallowing or warning against password reuse, and preventing the user using public information in their passwords (name, DOB, etc).

Strength specification

strong without an attribute value will indicate 40 bits of entropy.

Alternatively it can be supplied explicitly:

<input type=password strong="50-bits">
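A minimal sketch of how the two forms shown above might be read by a validator. The proposal does not specify parsing rules, so the helper name `parseStrongAttribute`, the accepted `N-bits` syntax, and the fallback for unrecognized values are all assumptions made for illustration.

```javascript
// Hypothetical helper: the proposal only shows a bare `strong` attribute
// and `strong="50-bits"`; everything else here is an assumption.
const DEFAULT_BITS = 40; // the proposal's default for a bare `strong`

function parseStrongAttribute(value) {
  // getAttribute() returns null when the attribute is absent: no constraint.
  if (value === null || value === undefined) return null;
  // A bare boolean attribute is exposed to script as the empty string.
  if (value === '') return DEFAULT_BITS;
  const match = /^(\d+)-bits$/.exec(value.trim());
  // Assumption: unrecognized values fall back to the default.
  return match ? Number(match[1]) : DEFAULT_BITS;
}

console.log(parseStrongAttribute(''));        // 40
console.log(parseStrongAttribute('50-bits')); // 50
```

In a browser this could be fed from `input.getAttribute('strong')`; on a server the same threshold would have to be configured separately, since the markup is not available there.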

Why not rule based attributes?

strong is a nudge away from rule-based authentication towards real entropy based models. Entropy based models by their very nature avoid ruling out strong passwords, and work more naturally with password managers.


j9t commented 4 years ago

What is the definition here for a “poor” or “strong” password, especially if there are “no knobs to turn”? What guarantees are to be given either way—and if there are no guarantees, given that these are hard to give, what’s the point of the proposal?

(Contrary to how this may appear, I’m asking to help make the proposal stronger.)

timruffles commented 4 years ago

Edit: I've updated the proposal and the below is a bit out of date.


@j9t thanks for asking! I think there's good existing work on password strength algorithms which estimate real difficulty vs common attacks. The zxcvbn password library and paper present one excellent model.

Do you think it's worth the proposal digging into exactly which algorithm/approach to take? I'd imagine it would be something that evolved as attacks evolved too, which is another advantage of leaving the precise details up to implementations.

If knobs were desired, again because of the way attackers evolve, I think they should also be declarative, e.g:

<input type=password strong="offline-slow-hash">

would specify the user should generate a password strong enough to resist offline slow hashing attacks. That'd be very different now and in 15 years time.

<input type=password strong="web-service">

would specify the password should be strong enough to resist online password guessing attacks.

othermaciej commented 4 years ago

This is a cool idea, but I'm not sure it's feasible.

For starters, how is the UA supposed to calculate entropy of a password?

In the case of a UA-provided strong password generator, we can do this. We can review the RNG and algorithm, calculate how many equally likely possibilities[*] it can produce, and take the log base 2 of that. Out comes entropy! We have in fact done this calculation for Safari's built-in password generator; it is part of how we chose the format and length.
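As a rough illustration of that calculation, here is a sketch for a generator that draws each character independently and uniformly. The function name and the example format are hypothetical; this is not Safari's actual format.

```javascript
// Entropy, in bits, of a generator that picks `length` symbols independently
// and uniformly at random from an alphabet of `alphabetSize` symbols.
function generatorEntropyBits(alphabetSize, length) {
  // There are alphabetSize ** length equally likely outputs. log2 of that
  // count equals length * log2(alphabetSize), which avoids overflow.
  return length * Math.log2(alphabetSize);
}

// Hypothetical format: 20 characters drawn from a-z, A-Z and 0-9.
console.log(generatorEntropyBits(62, 20).toFixed(1)); // 119.1
```

The result describes the generator, not any single password it emits, which is the distinction the following paragraphs rely on.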

But I don't know how to do this calculation for user-entered passwords because we don't know the user's selection process (and definitely can't assume they used an unbiased RNG for any part of it).

In fact, I believe information theoretic entropy can only be defined for a random variable, not for a single value in isolation. See https://en.wikipedia.org/wiki/Entropy_(information_theory) for an overview. (Information theory experts are welcome to correct me).


It might be possible to build something like zxcvbn into the client, but that doesn't use entropy, it searches for a variety of guessable sequences using data tables and heuristics, and it's not clear if what they do is suitable to be interoperably specified (among other things, it likely needs to be updated regularly, and it doesn't address non-English-speaking/non-US locales).


Alternately, we could provide a hook into an unspecified black box password strength checker. But I suspect it would cause real problems if different browsers give different answers for whether the same password is strong. Also, I don't think it would be wise for a website to use an unspecified and unknown checker instead of zxcvbn itself as a JS library (which I strongly endorse).

[*] If some values were more likely than others, they would have different information contents, or, informally, "different entropy", but that is technically not correct per above.
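To make the footnote concrete, a small sketch of the standard definitions: the information content of one outcome is -log2(p), and Shannon entropy is the expected information content over the whole distribution. The two example distributions are invented for illustration.

```javascript
// Information content ("surprisal"), in bits, of one outcome with probability p.
function surprisalBits(p) {
  return -Math.log2(p);
}

// Shannon entropy of a distribution: the probability-weighted mean surprisal.
function shannonEntropyBits(probabilities) {
  return probabilities
    .filter((p) => p > 0) // 0 * log2(0) is taken as 0 by convention
    .reduce((sum, p) => sum + p * surprisalBits(p), 0);
}

// Hypothetical selection processes over four possible passwords.
const uniform = [0.25, 0.25, 0.25, 0.25];
const skewed = [0.7, 0.1, 0.1, 0.1];

console.log(shannonEntropyBits(uniform)); // 2
console.log(shannonEntropyBits(skewed) < shannonEntropyBits(uniform)); // true
```

In the skewed case the likely outcome carries about half a bit and each unlikely one carries over three, so a single number only exists for the process as a whole.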

victornpb commented 4 years ago

There's not a single true solution to this problem, so this is my 2c:

timruffles commented 4 years ago

@othermaciej thanks, great points, you've definitely convinced me we couldn't determine entropy in any mathematically rigorous sense.

That said, I don't think the proposal depends on that. The baseline algorithm is only to ensure strong can be adopted by development teams - it's a fallback. In practice passwords would actually be constrained by the browser's more realistic, stricter, evolving algorithms.

On the baseline algorithm side, I feel that some good-enough heuristic can be derived. Remember: it just has to be good enough to allow dev teams to adopt it rather than the problematic rule-based approach they may otherwise be forced to adopt ('browser vendors say it's compliant' is a strong argument vs checklist wielders). Here's a bad first draft for the baseline algorithm to prompt someone with more cryptographic/mathematical chops to jump in with a better one:

  1. calculate best case entropy - as if it was derived from a cryptographic-quality source of randomness - bits = log2(characterSet(input)**input.length)
  2. disallow any input where bits < 40
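The two steps above could be sketched as follows. The draft does not define `characterSet()`, so the pool sizes here are assumptions: 26 lowercase, 26 uppercase, 10 digits, and an assumed 10 for "symbols", with whitespace counted toward length but not toward the pool. With those choices the sketch gives roughly 18.8 bits for "pass" and 53.6 for "Password1".

```javascript
// Assumed character pools; the draft does not specify them.
const POOLS = [
  { pattern: /[a-z]/, size: 26 },
  { pattern: /[A-Z]/, size: 26 },
  { pattern: /[0-9]/, size: 10 },
  { pattern: /[^a-zA-Z0-9\s]/, size: 10 }, // assumed size for "symbols"
];

// Sum the sizes of every pool the input draws at least one character from.
function characterSetSize(input) {
  return POOLS.reduce(
    (total, pool) => (pool.pattern.test(input) ? total + pool.size : total),
    0
  );
}

// Step 1: best-case entropy, as if every character were chosen at random.
function baselineBits(input) {
  const size = characterSetSize(input);
  if (size === 0) return 0; // empty or whitespace-only input
  // log2(size ** length) rewritten as length * log2(size) to avoid overflow.
  return input.length * Math.log2(size);
}

// Step 2: disallow any input below the threshold (40 bits by default).
function passesBaseline(input, minBits = 40) {
  return baselineBits(input) >= minBits;
}

console.log(baselineBits('pass').toFixed(1)); // 18.8
console.log(passesBaseline('password'));      // false
console.log(passesBaseline('Password1'));     // true
```

Note that `input.length` counts UTF-16 code units, and that the check accepts "Password1": it is a floor on length and variety, not a guessability estimate.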

The baseline constrains real password strength as well or better than the problematic rule-based solutions, without being password manager unfriendly. Both rule-based and the baseline will be fooled by common passwords etc disallowed by zxcvbn. But zxcvbn can't go as far as the posited browser-implemented algorithm, given that browsers know things about the user zxcvbn doesn't:

| input | entropy baseline (bits) | 1 upper, 1 symbol | zxcvbn > 1 day for online | browser with context |
| --- | --- | --- | --- | --- |
| pass | 18.8 |
| password | 37.6 |
| Password1 | 53.6 |
| Password1$ | 61.7 |
| correct horse battery staple | 131.6 |
| c24e54722013bda978347a453c282959 | 165.4 |
| TimRuffles1985 | 83.4 |
| ReusedStrongPassword88ca1d523c4 | 184.6 |

✅ = allowed by validator.

The last two rows are where the additional knowledge the browser has can identify attackable passwords zxcvbn can't: those derived from public information known about the user, and reused passwords.

NiciusB commented 4 years ago

I might be wrong here, but I think the most common way accounts get hacked is due to passwords being reused. A low entropy password can be cracked more easily, but unless you are a high value target, you are not going to be a target for this kind of attack.

In my opinion, the only way to improve password security would be something like allow="autogenerated" where it doesn't prompt the user to enter a password, but instead uses the user's password manager of choice to generate and save a random string.

Edit: I quoted @getify as if he suggested the same thing here https://twitter.com/getify/status/1246917493588975620, but it's not exactly the same idea. He clarified that in this issue below

getify commented 4 years ago

For the record, I think sites should start generating the passwords for users, since the sites know what their own requirements are better than users or tools. And I think it's OK and proper for different sites to have different rules... my password on my bank account should be a lot stronger than my password on a blog post comment form. The blog post comment form password could be that same strength, but that's overkill so I don't think it should have to be that same strength.

Having sites generate passwords would then strongly encourage more users to use tools like browsers or password managers to save the auto-generated passwords (rather than memorizing them). That's what my various tweets recently are about.


To me it doesn't make any sense to require users to brute-force "generate" passwords up to some arbitrary complexity, which is the status quo on most of the web. Here human, please reverse engineer this regex and find a matching pattern. In response, most humans do the dumbest thing, like just adding a number and symbol and capital letter onto the end of their normal re-used password. That's bonkers. SMH.

It also doesn't make sense to "externalize" the requirements for the passwords, because then you make it slightly easier to attack. This information should be kept private; only the site should know (and care) what rules it applies. It's also a lot of over-engineering IMO to try and encode the requirements (entropy or patterns) just so password tools can adhere. Why go to that trouble?

All this points back to letting the site generate, and making it easy for tools to grab and store what was generated.

I think the attribute we need is autogenerated, on a div or span (or input[type=password]), which the password tools can detect, and then prompt the user to store this autogenerated password.

victornpb commented 4 years ago

We already have website autogenerated passwords, those are called API tokens. Since the solution is gravitating toward something less user oriented, maybe we should look at how things are done in other contexts beyond the good old user/password combo.

othermaciej commented 4 years ago

I think the best solution for passwords is for the UA to generate passwords for the user by default, like Safari does. This guarantees high entropy without requiring the user to guess-and-check whether their password passes a password checker. It's also better than websites doing it individually because it's comprehensive. It also works with no markup changes or back-end changes whatsoever on websites (though there's complications for sites with unusual password format restrictions). There's no need for allow="autogenerated", Safari just does this for all password fields that we can detect as being part of account creation or password change flows.

Thus, I think there is no need for any changes to password markup along these lines to support stronger passwords. Rather, browsers can and should do what Safari does.

NiciusB commented 4 years ago

What I envisioned for allow="autogenerated" was to disallow manual input entry, enforcing an autogenerated password. The naming is confusing, tho.

timruffles commented 4 years ago

Thanks for replying @othermaciej, however I think I can't have communicated the proposal's motivation clearly:

> though there's complications for sites with unusual password format restrictions

Password format restrictions are precisely the motivation for the proposal. Unfortunately, many developers are currently - and will continue to be - asked to implement password format restrictions. That's trivially observable: just take a look at a few online banking forms (we certainly do at @plaid).

Autogeneration alone does not address the motivations and constraints that lead to password format restrictions:

  1. it doesn't address the need for symmetrical client and server validation (client-side provides no security, and symmetry is required to avoid terrible UX)
  2. the autogeneration is not enforced i.e it's not a validator (which @NiciusB addressed above), it thus cannot provide assurances for security
  3. users will continue to want the option to generate their own passwords (by dicewords, $obscureOSSManager, imagination etc) (which is why enforcing autogeneration isn't tenable - it's user-hostile)
  4. many browsers exist, not all of which will implement autogeneration (or they will diverge)

So any solution that doesn't address these motivations/constraints will not reduce the number of password format restrictions in the wild.

These motivations and constraints are what led me to the solution proposed above. Here's how a development team could reply to a request to implement password format restrictions if strong existed:

> <input type=password strong> is at least as secure as your format restrictions, and we can validate against the baseline algorithm on the server to ensure it's followed by every user and browser. Browsers like Safari and Chrome will keep evolving the UX that helps users painlessly submit strong passwords in a way that would be prohibitively expensive for us. It's becoming a UI standard that users expect and enjoy using.
>
> To conclude <input type=password strong /> is easier (and therefore cheaper) than your proposed password format restrictions, and as a bonus it will have a way lower bounce rate! It makes clear business and security sense.

othermaciej commented 4 years ago

> Thanks for replying @othermaciej, however I think I can't have communicated the proposal's motivation clearly:
>
> > though there's complications for sites with unusual password format restrictions
>
> Password format restrictions are precisely the motivation for the proposal. Unfortunately, many developers are currently - and will continue to be - asked to implement password format restrictions. That's trivially observable: just take a look at a few online banking forms (we certainly do at @plaid).

Fortunately, there's now a project that collects password rules across many sites (in absence of the passwordrules proposed attribute): https://github.com/apple/password-manager-resources/tree/main/quirks

This essentially solves the problem. Safari's generation works across a broad range of sites, and external password managers are using this info source too.

> Autogeneration alone does not address the motivations and constraints that lead to password format restrictions:
>
> 1. it doesn't address the need for symmetrical client and server validation (client-side provides no security, and symmetry is required to avoid terrible UX)
>
> 2. the autogeneration is not enforced i.e it's not a validator (which @NiciusB addressed above), it thus cannot provide assurances for security

If sites want assurance that a password meets their choice of strength rules, and they want the same check on client and server side, it kind of seems like they need to use a library? A built-in strength check in the browser can't be run server side (and will not exist in older browser versions).

> 3. users will continue to want the option to generate their own passwords (by dicewords, $obscureOSSManager, imagination etc) (which is why enforcing autogeneration isn't tenable - it's user-hostile)

Yes, it would be best to remove the ability to manually create a password entirely, but some users foolishly want to create their own password. Very strongly discouraging manual password generation seems doable though. In Safari, it takes a couple of steps to override the generated password.

> 4. many browsers exist, not all of which will implement autogeneration (or they will diverge)

"Not all browsers will implement it" is true of all features. Browsers not implementing one feature is not a very good argument for proposing a different feature, unless there's reason to believe browsers are more likely to implement it.

> So any solution that doesn't address these motivations/constraints will not reduce the number of password format restrictions in the wild.

I don't think any solution will, because these constraints often exist on sites that don't update very quickly or that have legacy technology choices on their back end. Better to route around it.

> These motivations and constraints are what led me to the solution proposed above. Here's how a development team could reply to a request to implement password format restrictions if strong existed:
>
> > <input type=password strong> is at least as secure as your format restrictions, and we can validate against the baseline algorithm on the server to ensure it's followed by every user and browser. Browsers like Safari and Chrome will keep evolving the UX that helps users painlessly submit strong passwords in a way that would be prohibitively expensive for us. It's becoming a UI standard that users expect and enjoy using.
> >
> > To conclude <input type=password strong /> is easier (and therefore cheaper) than your proposed password format restrictions, and as a bonus it will have a way lower bounce rate! It makes clear business and security sense.

The currently proposed validation rule in this issue is not good enough to ensure strong passwords, given that it thinks "Password1" is fine.

timruffles commented 4 years ago

I applaud the password-manager-resources idea; it'll certainly reduce the number of cases where formats affect people. However, given that you come up with a password only once per site, and most people will use a long-tail service or 10 (your local gym, school, etc), it will not radically reduce each user's encounters with bad password format validation.

> A built-in strength check in the browser can't be run server side (and will not exist in older browser versions).

That's precisely the motivation for the baseline algorithm in the proposal. I should have spelled it out explicitly that it was specified to allow for symmetrical execution on the server-side (and perhaps too that, of course, all user-input must be validated server-side, passwords being an especially important example).
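A sketch of what that symmetry could look like in practice: one dependency-free check, loaded by both the page and the server. The names `validatePassword`, `attachToInput` and `handleSignup`, the pool sizes, the message text and the status codes are all hypothetical; only `setCustomValidity` is an existing browser API.

```javascript
// Shared check. Pool sizes are assumptions; the proposal does not fix them.
function baselineBits(password) {
  let size = 0;
  if (/[a-z]/.test(password)) size += 26;
  if (/[A-Z]/.test(password)) size += 26;
  if (/[0-9]/.test(password)) size += 10;
  if (/[^a-zA-Z0-9\s]/.test(password)) size += 10;
  return size === 0 ? 0 : password.length * Math.log2(size);
}

function validatePassword(password, minBits = 40) {
  return baselineBits(password) >= minBits
    ? { ok: true, message: '' }
    : { ok: false, message: 'Password is below the required strength.' };
}

// Client side: feedback only, since anything here can be bypassed.
function attachToInput(input) {
  const update = () =>
    input.setCustomValidity(validatePassword(input.value).message);
  input.addEventListener('input', update);
  update();
}

// Server side: the authoritative check, using the very same function.
function handleSignup(body) {
  const result = validatePassword(String(body.password ?? ''));
  return result.ok
    ? { status: 200 }
    : { status: 422, error: result.message };
}
```

Because both sides call `validatePassword`, a password the form accepts is not later rejected by the server for strength reasons, which is the UX guarantee being argued for.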

> "Not all browsers will implement it" is true of all features.

You're missing the context in which I stated this: this is from the point of view of the people who are deciding how to validate password strength. Any browser feature implemented by a subset of browsers is not solving their problem. That's another design constraint that motivated a baseline algorithm that can be used symmetrically server and client-side.

> The currently proposed validation rule in this issue is not good enough to ensure strong passwords, given that it thinks "Password1" is fine.

In the end, all approaches to validating the strength of a password you did not generate are heuristic. As you said above:

> But I don't know how to do this calculation for user-entered passwords because we don't know the user's selection process (and definitely can't assume they used an unbiased RNG for any part of it).

NiciusB commented 4 years ago

I have just learned about setCustomValidity, and I believe it would cover this case perfectly. It's not as simple as adding an attribute, and it requires JavaScript (and bundling a password entropy calculator in JS).

I know it drifts away from the original proposal, but seeing that there's no consensus it might be useful to consider how it's being done out there right now. How do you feel about implementing the password requirements in JS vs HTML? Could a standard password validation feature be added from JS instead?

I have created an example using password entropy: https://stackblitz.com/edit/custom-validity-password?file=index.js Here's the important part:

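// Assumed setup, not shown in this excerpt: calculatePasswordEntropy comes from
// the linked example; the selector and message below are placeholders.
const passwordInput = document.querySelector('input[type=password]')
const errorMessage = 'Please choose a stronger password'

// Mark the field invalid unless the entropy score exceeds 50.
function updatePasswordValidity() {
  const score = calculatePasswordEntropy(passwordInput.value)
  const isValid = score > 50

  // An empty string clears the custom error; any other string marks the field invalid.
  passwordInput.setCustomValidity(isValid ? '' : errorMessage)
}

// Re-validate on every change, and once up front for the initial value.
passwordInput.addEventListener('input', updatePasswordValidity)
updatePasswordValidity()

timruffles commented 4 years ago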

Thanks @NiciusB, but the problem isn't that it is impossible to implement correctly in JS (using setCustomValidity, or any of the JS libraries), it's that many teams implement it badly. If you have a look at the proposal you'll see that's its motivation.

timruffles commented 4 years ago

@othermaciej I know you're incredibly busy, I'll make one more attempt to explain my reasoning here.

The proposal's goal is that developers feel able to allow browsers and password managers to define the UX of password creation. If it succeeds, very few passwords will be shaped by the symmetrical heuristic backstop: most will be generated by the browser or password manager and have the highest (PII + existing password aware) security possible.

The proposal is shaped by the constraints that currently force developers to take control of the UX instead. Those constraints are why autogeneration alone will not reduce the number of developers who feel forced to control the client-side UX. The apple/password-manager-resources project is laudable, but, given most people interact with many long-tail services, I have a hunch it will not result in a big net reduction in the number of bad password experiences[1] each individual will have.

Here's a chart that maps solutions to the outcomes and the constraints I mentioned:

| Solution | Client UX outcome | Password manager outcome | Relevance to developers' constraints | Reduction of bad password UX |
| --- | --- | --- | --- | --- |
| Roll own on client & server | 😭 - often implemented badly (broken), or with user-hostile UX | 😭 - JS validation incompatible with password managers, server-side constraints invisible to them | 👍👍 - rolling your own allows perfect match with constraints | 😭 - rolling own client-side UX is the precise problem we're trying to solve |
| Browser autogeneration | 👍 - doesn't help with incompatible password managers or users of dice-words/other offline techniques | 👍 - good, but many users prefer using third-party password managers/generators (e.g 1pass's pronounceable auto-generation) | 😭 - generation might not match mandated format | 😭 - because doesn't address developers' constraints |
| strong on client, symmetrical heuristic 'back stop' on server-side | 👍👍 - great, browser/password manager can define UX and use its knowledge of the user to ensure highest security | 👍👍 - password manager can completely define how strong UX works, and use its knowledge of the user to ensure highest security | 👍 - having the symmetrical heuristic implemented on the server side gives a guarantee, and they know that UX + security in implementing browsers will be better than they could provide (given PII required) | 👍 - addresses constraints, allows browsers + pswd managers to define many more password experiences |

[1] This may feel like a violation of 80/20, but remember: generating a password is a one-time or infrequent experience per service. This means that although as a population people spend most of their time on the big services/sites, from each individual's perspective their sample of "generating a password UX" experiences is not dominated by the big services in the same way. Most of the logins will be local and niche - their university login, local school, hospital, custom booking software for the small businesses they use (hairdressers, physios etc).