Review possible policy enhancements

ryannewington commented 5 years ago

Do you envision enhancing the following two? "Points-based complexity policy definition" --> assigning points to the number of banned words used in a password (basically like the MSFT algorithm)

"Regular expression policies" --> for example I would like to be able to create a regular expressions similar to ^$BANNED([0-9]|[a-z]|[A-Z]){1,8}$ and define that in the "passwords must not match a specified regular expression" where $BANNED is one of the banned words from the banned words list

Originally posted by @zjorz in https://github.com/lithnet/ad-password-protection/issues/22#issuecomment-517799627

ryannewington commented 5 years ago

I'm not sure what the MSFT algorithm is, can you elaborate? What would you like it to do?

ryannewington commented 5 years ago

"Regular expression policies" --> for example I would like to be able to create a regular expressions similar to ^$BANNED([0-9]|[a-z]|[A-Z]){1,8}$ and define that in the "passwords must not match a specified regular expression" where $BANNED is one of the banned words from the banned words list

As banned words are hashed before being added to the banned word store, it's not possible for us to retrieve the plaintext later. Even if we could, as we have no limit on the size of the banned word store, each password set operation could be prohibitively expensive in cases where there are a lot of words in the store. For example, we have the English dictionary in our store. That's about 32,000 banned words that would need to be processed.

We need to run policies in a way that is computationally appropriate, while adding the highest amount of value. We can't capture every possible situation where someone might use part of a banned word. We just need to raise the bar for attackers trying to brute-force and spray passwords, by removing the low-hanging fruit and 'breaking' the character substitution dictionaries they know people use to reduce the number of permutations they need to try.

What's the scenario you are trying to protect against in your regex example? There might be a more efficient way.

zjorz commented 5 years ago

I'm not sure what the MSFT algorithm is, can you elaborate? What would you like it to do?

REMARK: Remember "Banned Password List = MSFT List (content is unknown) + Custom List (max 1000 words)"!

1. Normalize the password:
   a. Convert all letters in the password to lower case (e.g. A->a, B->b, etc.) and compare with the banned password list. If true, deny!
   b. Replace special characters with normal letters (e.g. @->a, 1->I, etc.) and compare with the banned password list. If true, deny!
2. Fuzzy matching --> Check the password against the banned password list, taking an edit distance of 1 into account (e.g. "abcdef" vs. "abcdeg"). If true, deny!
3. Check if the password contains:
   a. First name. If true, deny!
   b. Last name. If true, deny!
   c. Tenant name. If true, deny!
4. If still not denied, calculate the score of the password:
   a. The minimum allowed score for passing is 5 points.
   b. Check which words on the BPL are in the password and assign 1 point for each (the words on the BPL in this example are spring, 2018, and asdf).
      i. Found substrings ("Spring2018asdfj236") = 1 point each --> 3 points in this example
   c. For every remaining individual character, assign 1 point.
      i. Found individual characters ("Spring2018asdfj236") = 1 point each --> 4 points in this example
   d. Total = 7 points --> PASS
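
The scoring step described above can be sketched in Python. This is only an illustration of the steps as listed in this comment, not Microsoft's actual implementation. The substitution map, the greedy longest-match scan, and the names `normalize`, `score`, and `is_allowed` are all assumptions made for the example.

```python
# Hypothetical substitution map; the real substitution list is not given in this thread.
SUBSTITUTIONS = str.maketrans({"@": "a", "1": "i", "0": "o", "$": "s"})
MIN_SCORE = 5  # minimum passing score quoted in the comment above


def normalize(text: str) -> str:
    """Lower-case the text and undo common character substitutions."""
    return text.lower().translate(SUBSTITUTIONS)


def score(password: str, banned_words: list[str]) -> int:
    """Award 1 point per banned-word occurrence and 1 point per remaining character."""
    pw = normalize(password)
    # Normalize banned words the same way, and try longer words first.
    banned = sorted({normalize(w) for w in banned_words if w}, key=len, reverse=True)
    points = 0
    i = 0
    while i < len(pw):
        # Greedy scan (an assumption): take the longest banned word starting here.
        match = next((w for w in banned if pw.startswith(w, i)), None)
        i += len(match) if match else 1
        points += 1
    return points


def is_allowed(password: str, banned_words: list[str]) -> bool:
    """A password passes when its score reaches the minimum."""
    return score(password, banned_words) >= MIN_SCORE
```

With spring, 2018, and asdf as the banned words, `score("Spring2018asdfj236", ...)` gives 3 points for the banned substrings plus 4 for the remaining characters, matching the 7 points in the example above.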

zjorz commented 5 years ago

"Regular expression policies" --> for example I would like to be able to create a regular expressions similar to ^$BANNED([0-9]|[a-z]|[A-Z]){1,8}$ and define that in the "passwords must not match a specified regular expression" where $BANNED is one of the banned words from the banned words list

As banned words are hashed before being added to the banned word store, it's not possible for us to retrieve the plaintext later. Even if we could, as we have no limit on the size of the banned word store, each password set operation could be prohibitively expensive in cases where there are a lot of words in the store. For example, we have the English dictionary in our store. That's about 32,000 banned words that would need to be processed.

We need to run policies in a way that is computationally appropriate, while adding the highest amount of value. We can't capture every possible situation where someone might use part of a banned word. We just need to raise the bar for attackers trying to brute-force and spray passwords, by removing the low-hanging fruit and 'breaking' the character substitution dictionaries they know people use to reduce the number of permutations they need to try.

What's the scenario you are trying to protect against in your regex example? There might be a more efficient way.

Even if we did, as we have no limit on the size of the banned word store, each password set operation could be prohibitively expensive in cases where there are a lot of words in the store

That's probably why MSFT has a limit of 1000.

So what am I trying to achieve? I'm not against people using words in passwords, but where do you draw the line?

Let's say lithnet, cool, and software are in the banned word list. According to the MSFT algorithm, L1thNet1q would not be allowed (3 points < 5), but L1thNet1qAx would be allowed. I still think that is too weak: you have a password with a known factor, lithnet, followed by 5 random chars, so your password is basically only 5 chars strong, as part of it is already known. What you already know about a password does not really count anymore in determining its strength.

I know that your normalization rule removes numbers and symbols before and after the word (e.g. L1thNet1234 --> L1thNet --> lithnet), but with L1thNet1qAx that algorithm does not work. For example, L1thNet1qAvu7t0906gj would be OK. Hence my regular expression ^$BANNED([0-9]|[a-z]|[A-Z]){1,8}$, which does not allow a banned word in a password when it is followed by 8 chars or fewer. A banned word followed by 9 chars or more is OK.

I've been trying to solve that using the policies you have in LPP, but have not succeeded. I was really tired yesterday, so when I look at it again I might be able to solve it. But that's for later today. I would love to hear your thoughts on this.
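
The proposed rule can be sketched in Python as follows. This is a sketch under a big assumption: that a plaintext list of banned words is available, which the maintainer says is not the case for the hashed store. The substitution map and the names `build_pattern` and `is_rejected` are hypothetical.

```python
import re

# Hypothetical substitution map used to undo common character replacements.
SUBSTITUTIONS = str.maketrans({"@": "a", "1": "i", "0": "o", "$": "s"})


def build_pattern(banned_word: str) -> re.Pattern:
    """Expand $BANNED in ^$BANNED([0-9]|[a-z]|[A-Z]){1,8}$ for one banned word."""
    # re.escape guards against banned words that contain regex metacharacters.
    return re.compile(rf"^{re.escape(banned_word)}[0-9a-zA-Z]{{1,8}}$", re.IGNORECASE)


def is_rejected(password: str, banned_words: list[str]) -> bool:
    """Reject a banned word that is followed by only 1 to 8 alphanumeric characters."""
    normalized = password.lower().translate(SUBSTITUTIONS)
    # One regex evaluation per banned word: the cost grows with the list size.
    return any(build_pattern(word).match(normalized) for word in banned_words)
```

With lithnet in the list, `is_rejected("L1thNet1qAx", ...)` is true (4 trailing characters), while `is_rejected("L1thNet1qAvu7t0906gj", ...)` is false (13 trailing characters). The loop over every banned word is the per-password cost raised earlier in the thread.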

ryannewington commented 5 years ago

Understood, and agree.

I took a bit of a different approach when coming up with the algorithm we have in place today. I took a billion or so passwords and applied different variants of possible normalization rules to them. After each computation, I was left with a number of unique passwords smaller than the original list. I reached a point where tweaking the algorithm further produced only minuscule improvements in the number of unique passwords.
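
A minimal sketch of that kind of measurement, assuming each normalization rule is a function from string to string. The rules and the tiny sample list below are made up for illustration and are not the rules or data used for the real analysis.

```python
import re
from typing import Callable, Iterable


def unique_after(passwords: Iterable[str], rule: Callable[[str], str]) -> int:
    """Count how many distinct values remain after applying a normalization rule."""
    return len({rule(p) for p in passwords})


def lower_only(password: str) -> str:
    return password.lower()


def lower_and_strip_edges(password: str) -> str:
    # Lower-case, then drop leading and trailing characters that are not letters.
    return re.sub(r"^[^a-z]+|[^a-z]+$", "", password.lower())


# Illustrative sample only; a real run would stream a large password corpus.
SAMPLE = ["Password1", "password", "PASSWORD123!", "password2019", "letmein", "LetMeIn!"]

for rule in (lower_only, lower_and_strip_edges):
    print(rule.__name__, unique_after(SAMPLE, rule))
```

A rule that collapses more of the list into the same values covers more human variations per stored entry. When a further tweak barely changes the count, it adds little.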

The goal was to prevent users setting weak passwords, where entropy was reduced by using known, common, human patterns of adding memorizable complexity to common words. Why? To reduce the success of both password spraying attacks, and to a lesser extent, offline brute-force attacks against the NTLM hashes.

So the mechanism today prevents use of a word as the base of a password, by interfering with the most common ways humans try to 'add complexity' to meet their password requirements. It doesn't attempt to prevent the use of that word altogether. Using a banned word with unpredictable data on either side of it still provides high resistance to password spray and brute-force attacks. This mechanism is about increasing entropy and reducing predictability.

It makes sense why Microsoft have placed a limit on the number of banned words. The password must be evaluated against every banned word. We don't have any limits, as we normalize the password once and then see if that value is in the store.
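
The difference in cost can be sketched like this: normalize once, hash once, then do a single set lookup, regardless of how many words are stored. This is a sketch, not the product's implementation. SHA-256, the substitution map, and the names `normalize`, `digest`, and `BannedWordStore` are assumptions.

```python
import hashlib
import re

# Strip digits and symbols from both ends first, then undo substitutions,
# following the L1thNet1234 --> L1thNet --> lithnet example in this thread.
_EDGE_NON_LETTERS = re.compile(r"^[^A-Za-z]+|[^A-Za-z]+$")
_SUBSTITUTIONS = str.maketrans({"@": "a", "1": "i", "0": "o", "$": "s"})  # hypothetical map


def normalize(password: str) -> str:
    stripped = _EDGE_NON_LETTERS.sub("", password)
    return stripped.lower().translate(_SUBSTITUTIONS)


def digest(text: str) -> str:
    # SHA-256 is an assumption; the store's real hash algorithm is not stated here.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class BannedWordStore:
    """Keeps only hashes, so the plaintext words cannot be read back."""

    def __init__(self, words: list[str]) -> None:
        self._hashes = {digest(normalize(word)) for word in words}

    def is_banned(self, password: str) -> bool:
        # One normalization and one hash lookup per password set operation.
        return digest(normalize(password)) in self._hashes
```

With lithnet in the store, `is_banned("L1thNet1234")` is true, but `is_banned("L1thNet1qAx")` is false, because the trailing letters survive normalization. That is the case zjorz raises above.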

We can look at having a similar feature, but it would have to be a different policy to the banned word feature, with limits on the number of these 'super banned' words you can have. However, we need to articulate the threat we are protecting against here. I don't think it will do much for password spray attacks, but it is probably more relevant for offline brute-force attacks against the AD database.

That in itself is a more complicated topic, with length, of course, being the key to protecting against those sorts of attacks.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.

lithnet / ad-password-protection

Review possible policy enhancements #25