Closed pdurbin closed 7 years ago
@lwo was recently looking at this part of the code in his pull request #3095 and because no good deed should go unpunished (kidding!) I pinged him to see if he'd like to take a swing at this issue. He agreed so I assigned it to him but I'll be keeping an eye on this issue as well. @lwo if you'd like me to start a requirements doc for this issue, please just say the word.
To be clear, implementing two factor authentication should be considered out of scope for this issue. To avoid that requirement, all we have to do is make sure passwords are 10 characters in length or greater, when the installation is flipped to "secure mode" or "sensitive data" mode. (Let's not call it "level 3 mode", which is way too Harvard-specific.)
Today @lwo met over Google Hangouts to discuss this issue. We took a few notes at https://docs.google.com/document/d/1Cq6ud-yydSPxRkTAlp_IS_5iCbBDNNk4goe7Yxp3T5A/edit?usp=sharing
I hadn't really considered that "No common names or dictionary words" necessitates having access to a dictionary! The plan is to ship one with Dataverse, to bundle it with the app, rather than relying on some operating system-specific file such as /usr/share/dict/words
on Linux. @lwo has ideas on where to find one.
It would be nice to have some more clarity at http://policy.security.harvard.edu/sa2-complex-passwords or elsewhere about the upcoming changes having to do with 20 characters. We're still using this for guidance: "The upcoming changes to the password policy will only include a relaxation of complexity rules for passwords of 20 characters or more. Everything else will remain the same." Anyway, the plan is to have a flag/setting (one of many stored in the setting
table) to turn this on or off.
@lwo will let me know if he has any more questions. Thanks for working on this @lwo!
Hello @djbrooke, I better ask you, as @pdurbin should be on his break. For this I like to use a well documented third party library http://www.passay.org/ which is dual licensed under both the LGPL and Apache 2. Can you confirm I can use this maven dependency to tackle this issue? Without response, I will work under the assumption that this is ok.
Heh. @djbrooke and I are both back now. Passay looks fine to me, from a quick glance. The website says it's built on vt-password and you may have noticed that in code (PasswordValidator.java) I had already suggested switching to this. Yes, the license looks fine and you are welcome to allow Maven to resolve dependencies. Thanks @lwo!
Hmm... those password lists... I am still unsure about the technical translation from the "No common names or dictionary words" requirement. For example: "Password" and "password123" are commonly used passwords, and we could take a dictionary list from: http://weakpass.com/ to block them. But then, those examples any the large majority of those dictionary words would not pass the validator anyway ( no caps, digits, etc ).
Now "P@$$w0rd" is not on that list and considered valid... but it is a weak password because the character substitution is an open door cipher. It makes more sense to me if we block that. If this is considered the problem, then I suppose we could add to #3227 a stripped down version of the rule based algorithm described here: https://labs.mwrinfosecurity.com/blog/a-practical-guide-to-cracking-password-hashes and use it to invalidate common names or dictionary words plus those variations of them that are the function of a simple character substitution cipher. But we should investigate how expensive that is.
Per @djbrooke the Sensitive Data project is being deferred until Dataverse 5.0 so I'm unassigning myself since I like having https://github.com/IQSS/dataverse/issues/assigned/pdurbin reflect what I'm actually working on. I'll also add the 5.0 milestone.
@lwo as we've discussed, next week we'll at least touch base about the excellent work you've done to address this issue in pull request #3227 but you don't seem to be in any hurry to see this issue addressed for own installation of Dataverse. I appreciate your flexibility on the timing!
Looking at the code, are we making users change passwords regularly? If so, we shouldn't be. Also with regards to dictionary words, The easy answer is to take the lowercase version of the password, turn symbols to their likely letter versions (@->a, 3->e), and check substrings against a dictionary with 1 and 2 letter words removed.
Sensitive data work isn't happening until Dataverse 5.0. We'll revisit this then. Closing. Thanks for your input, @oscardssmith . :smile:
It would be nice to have some more clarity at http://policy.security.harvard.edu/sa2-complex-passwords or elsewhere about the upcoming changes having to do with 20 characters.
The policy has been updated and now includes lot of language about 20 characters. Here's how it looks as of today (2017-08-09):
SA2: Servers and applications that manage passwords must force the setting of a complex password. This must meet the following requirements where technically feasible (consult the Security office if the following requirements are not technically feasible):
Use HarvardKey for authentication
OR:
Passwords of more than 20 characters in length
OR:
Passwords 20 characters or fewer in length with the following requirements: No common names or dictionary words No sequences of more than 4 digits in a row Include at least one character from at least 3 of these categories: Uppercase letter Lowercase letter Digits Special character Password reset/expiration period as follows: 10-20 characters = no periodic reset/expiration required 8-9 characters plus a second authentication factor = no periodic reset/expiration required 8-9 characters only = annual password reset/expiration required
Thankfully, we had a heads up on this change and let @lwo know about it before he opened pull request #3227 which I think more or less meets all these requirements.
Current account sign up screen on Harvard Dataverse (ignore the icons inside the boxes, that's from my password manager):
I think the main UI implications here are just messaging - the explanatory text above the Password box and the error messages when you input a password that doesn't match the requirements. Once we come to a final conclusion on exactly which requirements to implement, I can start writing this messaging.
Example of what the new text above the Password bar could look like: "Passwords must be 10-20 characters long and must include at least one uppercase letter, one lower case letter, and one digit or special character."
I think that is mostly done (with user configurable rules). the current rule is 8-9 char: 3 of Upper, lower, number, symbol. not in dictionary. Must be changed every year 10-19 char: 3 of Upper, lower, number, symbol. not in dictionary. 20-255 char: no restrictions.
Moving this back to not started for now. Thanks @oscardssmith for the work on this before you left!
Just talked with @djbrooke and @pdurbin about this issue. Our plan is: Phil is going to determine some good default out-of-the-box password requirements we can set up.
For Harvard Dataverse to support sensitive data, we're going to set it up to use the following parameters:
I quite like this approach. Not using 8 or 9 chat passwords will simplify the implementation a lot
Been giving some thought to this issue. Here's a mockup I made showing new messaging implemented with the minimum possible change to the UI: The text, for ease of copying + pasting:
Password must be at least 10 characters long, must contain no dictionary words, must not contain a sequence of more than 4 numbers in a row, and must include characters from 3 the following categories: uppercase letter / lowercase letter / number / symbol. Passwords over 20 characters in length are exempt from all requirements.
I think that this is not really the most intuitive way to present this info. It'd be easier for a user to quickly grok these various requirements if they were presented the form of a list, but that might require more substantial changes to the page than we want to make. Can that "i" string of info text support a bulleted list? Would that look OK, or would it be a mess? I'm interested in @TaniaSchlatter and @mheppler 's thoughts on this.
@matthew-a-dunlap helped me find a very instructive example of someone else doing this same thing - HarvardKey has implemented very similar password requirements, and their messaging is fantastic. Here's what their "change password" page looks like:
Note that they present the info in the form of a bulleted list, with dynamically changing bullets indicating whether the requirements have been reached. The info is presented in a little box that separates positive requirements from negative requirements. I wonder if we can model something similar to this.
P.S. ignore the little symbols in the password boxes in these screenshots, those are from my password manager.
Thanks for all the input so far. If anyone wants to hack on this issue, please use the 3150-complex-passwords
branch at https://github.com/IQSS/dataverse/tree/3150-complex-passwords
In 4b0b17b we added some static help text about password rules that looks great (screenshot below) but needs to be made dynamic. In the error message below it, you can see the reasons why they password "qwertyuio" was rejected based on the out-of-the-box rules as of that commit:
From a scope and product point of view, especially now that I added some content to the Installation Guide in eba5e87, is whether or not we should ship the :PVExpirationDays
and :PVValidatorExpirationMaxLength
settings and related code. Here are the descriptions of those settings:
The reason I'm thinking we could perhaps jettison those settings and related code is that @dlmurphy doesn't think we should use them in Harvard Dataverse and I don't know if other installations of Dataverse require them or not. If we ship these settings and code, we need to consider the UI/UX impact of passwords expiring. Do we email users periodically as the date approaches that their password expires? More generally, what is the user experience for one's password expiring?
The other thing that's on my mind isn't a question but an assertion. Yesterday @djbrooke said I'm free to make a decision about what the out-of-the-box password complexity settings should be. I played around with new CharacterRule(EnglishCharacterData.Digit, 1)
and related code and I think with a small change I can preserve the existing password complexity rules we've used since Dataverse 4.0, expressed in the UI as a "minimum six characters long and uses at least one letter and number". In the User Guide I would document recommended settings that would be in compliance with https://policy.security.harvard.edu/sa2-complex-passwords and would have a password length long enough that changing the password every year wouldn't be required (to make @dlmurphy happy π ).
Once this code is deployed, it's my understanding that some people on the team want to force all users to change their passwords. I don't think this is necessary if users' passwords are already complex enough to meet the requirements. I haven't tested any of the "force password change" logic yet but I will and will know more after I do.
Other thoughts as they come to me. I think by writing but we're going to meet about this soon. π
Messaging for Harvard Dataverse, above the password box on the account creation screen:
Your password must contain: -At least 10 characters (passwords of at least 20 characters are exempt from all other requirements) -At least 3 of the following: uppercase, lowercase, numeric, or special characters
It may not include: -Number sequences of 4 or more numbers in a row -Dictionary words or common acronyms of 5 or more letters
Great meeting with @djbrooke @scolapasta @dlmurphy and @bsilverstein and here's the todo list:
:PVExpirationDays
and :PVValidatorExpirationMaxLength
) because there's UI/UX impact and we don't need it for Harvard Dataverse because we'll simply force passwords to be 10 characters or more to be in compliance with Harvard's security policy. (If anyone out there really wants this feature, please create a separate issue.) Done in e72f382.CharacterCharacteristicsRule
code so that when installations upgrade, the rules are the same (6 characters, one letter, one number). This will probably be a new config option as this is currently hard coded. The config string may look something like Alphabetical:1,Digit:2
. Progress at 9b48745. [Still need to default disable repeated chars]weak_passwords.txt
works as intended. [@matthew-a-dunlap on this]@dlmurphy @mheppler I made the messaging dynamic again. Here's how the out-of-the-box messaging looks as of c34fdad if I try using a period as my password. I'm keeping the out-of-the-box configuration the same that we've had since Dataverse 4.0, which is requiring one letter and one number:
@mheppler I need to move the English test to the bundle but I figured we'd nail down the UI/UX first. I'm creating the ul
list on the backend.
The approach in the example above (noting what's missing/wrong) works if you also tell the user what criteria the password met.
If we can't do that, I suggest a simpler solution β telling the user that their PW does not meet the criteria, and asking them to review the criteria and try again.
From chatting with Tania about this, the easiest way to implement this might be to make the bullet points on the password criteria dynamically reflect whether each criteria has been met, similar to how HarvardKey does it in this example:
@dlmurphy do you mean easiest for the user or easiest for the developer? π Don't answer that. I can guess what you mean. π
@bsilverstein and I just talked about what's left for this issue and here's the current todo list:
:PVNumberOfRepeatingCharactersAllowed
as something like :PVNumberOfConsecutiveDigitsAllowed
. Done as of b668926.getPasswordRequirements
(the code that adds the help text bullets) to only show "may not include" items like repeated digits or dictionary words if those features are enabled.DatasetPublishPopupCustomText
). Use passwdReset.newPasswd.details
if unset. Done in 2958a06.UpperCase:1,LowerCase:1,Digit:1,Special:1
rather than treating that string like a magic string to enable Harvard's CharacterRule
list. Done in 1d8f3ae and 4fb9249.For what it's worth, the repeating characters -> repeating digits change comes from an oversight by myself and @pdurbin where we misconstrued part of the Harvard complex password requirements which reads:
No sequences of more than 4 digits in a row
I sort of rushed into developing what I thought the rule should be, which is a configurable rule limiting the amount of repeating characters in a password. Fortunately this was in the passay.org library and wasn't too tough to implement, but it doesn't stop sequences of four or more digits in a row e.g.: wronghorse8080808 would pass.
Currently I'm trying some of the other rules from http://www.passay.org/reference/ that are easy to access but there's none that look like a bullseye for this requirement. Not really looking to introduce any new libraries nor reinvent the wheel, so any ideas for this that rope in passay rules would be helpful!
From discussing(/reminiscing) with folks @ Harvard Key, they only enforce a dictionary on words 5 letters or longer. We don't want to be blocking folks from using the word "a" or "an". (EDIT: I see now that's already how its designed)
or common acronyms of 5 or more letters
@dlmurphy this isn't in https://policy.security.harvard.edu/sa2-complex-passwords which says "No common names or dictionary words" so I'm changing it to what the policy says. It's all going to depend on what dictionary file you choose. For now, I'm testing with the the standard /usr/share/dict/words
file on Linux.
I promised @TaniaSchlatter and @dlmurphy screenshots and I'm going to show below the sign up page as of 96c1c43 with two configurations:
As I mentioned to @dlmurphy next week, I'm planning on keeping two servers up to date with the code as I work on it so that we can all play around with the password rules and feel at a gut level what they feel like. Here are the two servers I'm using:
Here's the current todo list:
getRequiredCharacters
in PasswordValidatorUtil
dynamic, a follow on to making :PVCharacterRules
dynamic in 1d8f3ae. (blocked by library version, see later comment):PVCharacterRules
configurable on the fly with no Glassfish restart required, like the other rules, ensuring that the BuiltinUsersIT
tests still pass. (in investigation by @matthew-a-dunlap)scripts/api/setup-optional-harvard.sh
recommended options for Harvard Dataverse. We can't use fewer than 10 characters because we don't have two factor or password expiration.The dictionary provided to dataverse will need to be stripped of words less than 4 characters in length, otherwise it with error on matches of shorter words. This command generates a compliant dictionary from the linux words file:
cat /usr/share/dict/words | grep -x '.\{4,20\}' > ~/wordsGt3.txt
Attached is the password reset page ( passwordreset.xhtml ) with the red/green feedback @TaniaSchlatter @dlmurphy
From the checklist, "Make getRequiredCharacters in PasswordValidatorUtil dynamic" requires upgrading to a later version of the library, as otherwise one cannot get the names of the character rules stored.
I have tested the upgrade and it changes the text/number of errors thrown by the library, so it may not be worth the effort.
Note from sprint planning:
I think I have the fix in place for making pvcharacter rules configurable on the fly, but a number of the tests in testDifferentPasswordsAndSettings are now blowing up. At least some of those tests look wrong themselves. Will investigate more tomorrow.
See pull request https://github.com/IQSS/dataverse/pull/4128
@matthew-a-dunlap and I have been discussing a couple bugs as of 92af18a, best illustrated by a screenshot:
Todo list as of 34d3beb
Invalid value for PwMaxLength: null
Invalid value for PwMinLength: null
Invalid value for PVNumberOfCharacteristics: null
Invalid value for :PVNumberOfConsecutiveDigitsAllowed: null
PwDictionaries not set and no default password file found: weak_passwords.txt
Dictionary was set, but none was read in.
Invalid value for PVNumberOfCharacteristics: null
Invalid value for PVGoodStrength: null
Invalid value for PwMinLength: null
I just poked around at a working demo of this, and from a design standpoint I think it's good to go. It's a massive improvement over what we started with.
While talking with @matthew-a-dunlap about this issue, we thought of something that would improve the design even further, but we're opting not to implement it right now in order to avoid scope creep on an issue that has already been more complex than expected. But I want to note it here for posterity, as it might be worth pursuing in the future.
When one of the password requirements is "No dictionary words" and a user's password fails on this rule, it would be useful for us to tell the user what word it failed on.
For example:
If we told the user that the password failed because it included the word "Bose" then they'd more easily be able to adjust their password to make it work.
HarvardKey does this; here's a screenshot Matthew snagged:
Again, this is a "nice to have" and not essential, so let's move on for now and consider coming back to this some day in the future.
I'm not sure I love the idea of part of my password being shown to me in the the clear. There's a reason password input fields show asterisks when you type.
Some testing notes: Parts to test:
Other notes:
Issues found:
Tested all rules, 3 pages, config settings, script.
I reduced logging in e9fb520. I also merged the latest from develop (just the addition of ISSUE_TEMPLATE.md
) and re-enabled a related API test having to do with password complexity.
Tania asked for screenshots of the final implementation. There are 3 pages: 1. sign up, 2. forced password change, 3. account info change password, and 1 view of sign up page where some fields were entered correctly and some incorrectly.
Enforcing complex passwords is a requirement for storing level 3 data as defined by http://policy.security.harvard.edu/level-3 and the following text comes from http://policy.security.harvard.edu/sa2-complex-passwords as of 2016-06-01:
On 2016-05-25, we were advised that "The upcoming changes to the password policy will only include a relaxation of complexity rules for passwords of 20 characters or more. Everything else will remain the same." We expect these changes to be published at the URL above by 2016-07-01. The spirit of this change is in line with XKCD cartoon from https://xkcd.com/936/ in which the password is "correct horse battery staple":
Again, for passwords of 19 characters or fewer, the requirements above must be applied.
All of these requirements are optional if a particular Dataverse installation has not been put into some sort of "secure" mode such that we plan to put https://dataverse.harvard.edu in before we permit level 3 data to be uploaded. That is to say, the existing password complexity rules should remain (minus the bug reported in pull request #3095) out of the box. Ideally there would be a switch called "level 3" or something to enforce all of the rules above but it would be nice if each Dataverse installation had control over each of the rules separately.
We have control over password complexity for local accounts within Dataverse (stored using bcrypt per #1034 in the
builtinuser
table) but for Shibboleth accounts we must trust that each Identity Provider enforces decent passwords. Membership with a federation such as InCommon should help in this regard and in Harvard's case http://iam.harvard.edu/resources/incommon-bronze details how passwords are handed in Harvard Identity Provider (IdP):The page for Harvard's Identity Provider (IdP) above links to https://www.incommon.org/assurance/ and from the Dataverse perspective it's our understanding that all InCommon federation members have sufficient password complexity rules to meet level 3 requirements. In short, there's nothing to do code-wise within Dataverse with regard to passwords that are not stored in the database used by a Dataverse installation. These passwords are stored in each institution's Identity Provider (IdP).