simplesamlphp / simplesamlphp

SimpleSAMLphp is an application written in native PHP that deals with authentication.
https://simplesamlphp.org
GNU Lesser General Public License v2.1
1.07k stars 680 forks source link

Proposal: prevent password guessing with rate limiting #1259

Open pradtke opened 4 years ago

pradtke commented 4 years ago

Problem

Certain compliance situations require us to prevent brute force password guessing for logins. UserPassBase and implementing classes do not have a mechanism to prevent brute force password guessing.

Proposed Solution

We would like SSP to have built in support (which we are proposing to write) for rate limiting authentication attempts.

Proposed features:

Non proposed features:

Storing rate limiting data

We would use SimpleSAML\Store for storing the data, since it must be persistent across servers and not be tried to a specific user session.. The Store interface does not support atomic incrementation of a counter, thus in a situation where multiple request are simultaneously editing the same tracking information for a user not all of those attempts will be tracked. However, at least one will be tracked. This will allow us to perform rate limiting, even if the actual rate an attacker can do is higher than the limit configured. While not desirable, it is better than no protection and we can document the caveats of the implementation.

Rate limiting algorithm

The two most common algorithms are either a leaky bucket approach or a fixed/sliding window.

In a leaky bucket approach, a given user or ip (or etc) would be given C authentication attempts and every minute could earn N more per minute (up to a maximum of C). This allows for bursty traffic patterns.

A window approach would allow N logins per time window.

I don't have a preference for algorithm. If SSP project does have a preference, please let me know. A fixed window is likely an easy algorithm

User Messaging

I've seen systems that alert the user to being locked out and ones that just always return "Invalid username/password" for locked out users. The former is more user friendly but would require template changes to display the new message (and require translation).

Initial proposal is just to say "Invalid username/password" while logging that the user/ip is locked out in the logs. If SSP project has a preference for better messaging, please let me know.

DoS protection with device cookies

The downside of any system that locks users or ips out after failed attempts is that they can be used to denial of service a user or IP. Device cookies help protect against this. When a user successfully authenticates a long lived device cookie is stored on the browser. The cookie indicates to the system that user XXX has successfully. If a request is made for a locked out user, or from a locked out IP and that request user matches the user associated with the device cookie then the request is allowed through. After N failed attempts with the device cookie then the cookie is also locked out

Describe alternatives you've considered

Finding an apache module to perform rate limiting.

Next Steps

I'm interested in opinions/suggestions/etc prior to us starting to work on this feature.

tvdijen commented 4 years ago

What's holding you back from creating an EnhancedUserPassBase?

A few thoughts on what I've just read:

I think, we should be able to come up with some of these enhancements, perhaps as traits, that can be easily be reused in custom authsources..

pradtke commented 4 years ago

What's holding you back from creating an EnhancedUserPassBase

Nothing is holding us back from just creating our own. The goal was to make it available to classes that already extend UserPassBase. e.g. users of some subclass could now add some additional config options to the authsource to enable rate limiting.

There are classes inheriting from UserPassBase that would not be able to implement some of the proposed solutions, like the LDAP AuthSource class.. Configurable hashing alg+salt for example, is not applicable here.

I must not have explained it very well. Subclasses wouldn't have to make any changes to take advantage of rate limiting. The hash alg and salt is just for configuring rate limiting based on password. The underlying password store doesn't need to know anything about it. It also doesn't need to be enabled for those that, understandably, don't want to rate limit based on password.

IP based limits are easy too easy to bypass.. It's not gonna do us any good.

It will do us good :) The attacks we see are generally unsophisticated and just use a few IP addresses. Limiting on IP address is useful to us. For an attacker with a botnet with lots of IPs we'll rely on IP reputation and threat lists that get fed into our WAF. Again it optional. If you don't want to rate limit on IP don't enable it.

The downside of all these 'solutions' is that they all require a Store.. I think that always should be optional..

I agree on optional. I also agree that they all require someplace to store the data. I don't know of way to perform rate limiting without storing data.

I think, we should be able to come up with some of these enhancements, perhaps as traits, that can be easily be reused in custom authsources

That's an interesting idea.

hparadiz commented 4 years ago

Just some things to consider.

pradtke commented 4 years ago

Typically in a corporate network you have thousands of users sharing a single IP address so rate limiting by IP would be a huge problem there.

Yup, that's why turning on IP rate limiting is optional, or you can white list your corporate IP.

Rate limiting by username defeats the purpose because you have to decrypt the assertion before you know what username is logging in and since decryption is so CPU heavy you've already spent most of the computing resources by the time you know you should block them.

The rate limiting is for the "enter username and password" form on the IdP side, so there is no assertion to decrypt. I agree an attacker that wants to overload your CPU has other vectors they can use.

Stores are slow.

That is a good point that I hadn't consider. The more things (user, ip, password, etc) you track to rate limit, the slower it will be. However for users logging in as themselves from a browser they have previously logged in with will have a device cookie. If the device cookie is valid (which is one check) and the user enters the correct password then they will get through with one read from the store. IMHO, the vast majority of legitimate traffic will fit that pattern.

FWIW, all our memcached calls are instrumented to time them and the vast majority take less than 10 ms.

jaimeperez commented 4 years ago

I think this is a very good idea Patrick!

I agree that many users could benefit from this out of the box, if we implement it in the base class and you just need to add some configuration directives to it I see some difficult bits though:

We need to decide what do we want to protect from, since this could be used both to protect against (D)DoS or against brute forcing. The former needs mechanisms external to the software, in the servers or even the network. The latter is something we can tackle in the software. For that case, I see two approaches (not necessarily excluding each other) that could be interesting:

Does that sound interesting?

pradtke commented 4 years ago

Thanks @jaimeperez, @hparadiz and @tvdijen for you feedback. This has been helpful to me.

One is that evaluating if this is is a brute force attack is only relevant when authentication succeeds.

That is interesting idea to do it at that point, but as you pointed out there is the worry about side channel attacks with the timing or behavior. I suppose if we're going to do the checks regardless of authentication failure or success, then it wouldn't matter if it was before or after. In that case I think doing it prior to the authentication check could have some benefits, such as not passing known "bad" requests to the underlying password store where it could trigger additional problems (like AD lockout policies designed for internal connections).

The other approach would be to make this a bit more general, like a risk-based system to decide whether you authenticate the user or not, and how. I can imagine a system to plug evaluators that inspect the context and evaluate the risk for the current authentication, and callbacks that make decisions and take actions based on the evaluated risk

I like that idea a lot. It reminds me of the PAM stack for unix logins. I think I understand what an evaluator would look like, but I'm having trouble picturing the callbacks, where they are defined and the parameters.

For example, if a deployer of SSP wanted to show a captcha after 5 failed logins, and block for 10 minutes after 15 failed logins, what are you thinking pseudo code wise? Is there a TooManyLoginAttemptsForUser evaluator that calls a callback after 5 attempts? And the callback gets arguments about the number of attempts so it can decide what to do?

Or is there a chain of evaluators that create/define some structure that is passed to the callback?

Where would a deployer define their callback? Can that be done in authsources.php (I thought I remember an issue defining functions there)? Or are there some common callback behaviors that we include, and the deployer can reference in authsources.php?

Or are you thinking something else?

pradtke commented 4 years ago

I've pushed an alpha version of rate limiting to https://github.com/cirrusidentity/simplesamlphp-module-ratelimit.

@tvdijen Since it is a standalone module, I followed your suggestion of using a different class. It can wrap another UserPassBase so installers can add rate limiting to the ldap, admin, etc authsources through configuration

@hparadiz I added a log line that logs the overhead time of running rate limiting for each authentication attempt, along with the time taken by the underlying authsource. I can report real wold timing results once we finish our internal testing and run this in production for awhile. For IP based restrictions, they are disabled by default, and if enabled it also supports white listings IPs and subnets

@jaimeperez I used your evaluator idea, though I called it Limiter. Installers can enable different limiters as they see fit. They can also write their own. I included 4 of them in the module, 2 of which are enabled by default.

Anyhow, we'll test and gather data, and then we can discuss next steps.

tvdijen commented 4 years ago

This is pretty cool @pradtke ! I have one remark on the 'delegate' setting though. I think it would be better to refer to the delegate authsource by name, instead of having all of the delegate authsource's config in your module's config. We do this in a similar manner for the negotiate-module, so we can keep authsource-config in one place.

Also, I see you're introducing an InMemoryStore for testing purposes.. Perhaps it's better to reuse Symfony's Array Cache adapter instead..

pradtke commented 4 years ago

@tvdijen Using \SimpleSAML\Auth\Source::getById($authId); is actually what I did in my first pass through the code. Then I realized that the authsource I wanted to protect could still be called directly by an attacker via the path module.php/core/authenticate.php?as=ldap, thus bypassing the rate limiting. As a result I did the (ugly) method of using reflection and embedding the other authsources config in mine , to ensure it can't be called directly. I'm certainly open to other ideas (since it is ugly)

For the Cache adapter I can take a look, though I don't understand the intent. Is your idea that I create an SSP Store to PSR Cache bridge? I wonder about the 'key' syntax compatibility - the PSR cache has restrictions on characters in the key, which the SSP Store does not.

tvdijen commented 4 years ago

For the Cache adapter I can take a look, though I don't understand the intent. Is your idea that I create an SSP Store to PSR Cache bridge? I wonder about the 'key' syntax compatibility - the PSR cache has restrictions on characters in the key, which the SSP Store does not.

My idea was to use the PSR array cache directly from your tests, so you don't need your own class.. Must admit I missed the string typehints there, so it may not be a good idea..

We'll discuss the other this week during our Madrid meetup

thijskh commented 4 years ago

I think it’s great to see there’s a module already. Simce it’s now in a separate module I would propose to remove this feature as a todolist item for 2.0. That doesn’t preclude working on it, but it does also not make it a blocker for 2.0.