spring-attic / spring-security-oauth

Support for adding OAuth1(a) and OAuth2 features (consumer and provider) for Spring web applications.
http://github.com/spring-projects/spring-security-oauth
Apache License 2.0
4.7k stars 4.05k forks source link

Concurrency problems refreshing OAuth2 tokens #834

Open miguelfgar opened 7 years ago

miguelfgar commented 7 years ago

Hi,

I'm using @EnableOAuth2Sso to implement the pattern that allows an "API / Security Gateway" (Zuul in our case) to handle OAuth2 tokens (acting as client, authorization_code grant type). The idea is to have a single OAuth2 token per user session stored in the session itself (Spring achieves this by creating a DefaultOAuth2ClientContext bean with "Session" scope by default - this makes the OAuth2ClientContext containing the OAuth2 token to be "linked" to the user session). The duration of the user session is not linked to the OAuth2 access token duration (in my opinion that would be a bad practice) and so the oauth2 token needs to be refreshed several times along the user session. However, following this pattern, the idea is to keep a single OAuth2 access token (and refresh token in our case) in the user session at a time.

We have detected that in high concurrency scenarios when Zuul (acting in our case as Security Gateway) receives several requests that are associated to the same user session (same session id) in a very narrow time frame they try to refresh the same access token (with the same refresh token). In the rest of cases (when the refresh is not necessary) everything works fine even with high concurrency.

What happens is the following (Thread1 and Thread2 are triggered by http requests linked to the same user session):

In this scenario (if the refresh token does not change on refreshing access tokens) the application works because in this race condition always a new access token persists in the OAuth2ClientContext (and so in the user session in our case) and subsequent requests can use it with sucess. However, unnecessary access tokens are created and stored in the CTS (Authorization Server) which ideally should not happen.

If the AS is configured to provide a new refresh token, every time an access token is refresh the result is worst as only the first request to refresh the access token will work (for the second the refresh token will not be valid anymore). The second call to refresh the token would result in an error from the AS and Spring removes the access token from the OAuth2ClientContext (it sets it to NULL) and the OAuth2 access token is lost for the subsequent requests.

I provide a log with DEBUG enabled that shows the described scenario (in this case with the refresh token not changing after refreshing an access token). We are using Spring Boot 1.3.7, Spring Cloud Brixton-SR5, spring-oauth2 2.10.0

Please note that the two requests are received by Zuul with 10ms difference more or less..

2016-08-26 11:14:46.671 DEBUG 13248 --- [nio-8765-exec-6] o.s.c.s.o.proxy.OAuth2TokenRelayFilter   : restTemplate.getResource().getClientId() value: poc-spa-client
2016-08-26 11:14:46.671 DEBUG 13248 --- [nio-8765-exec-6] o.s.c.s.o.proxy.OAuth2TokenRelayFilter   : rauth.getOAuth2Request().getClientId() value: poc-spa-client
2016-08-26 11:14:46.671 DEBUG 13248 --- [nio-8765-exec-6] o.s.s.oauth2.client.OAuth2RestTemplate   : THREAD NAME: http-nio-8765-exec-6 THREAD ID: 80 ACCESS TOKEN FROM OAuth2ClientContext: c55b7bc7-d1d9-4138-abc8-522fae28bd60 REFRESH TOKEN: b3659bd9-f7ba-45e1-adbe-97f926b4901b
2016-08-26 11:14:46.671 DEBUG 13248 --- [nio-8765-exec-6] o.s.b.f.s.DefaultListableBeanFactory     : Creating instance of bean 'scopedTarget.accessTokenRequest'
2016-08-26 11:14:46.672 DEBUG 13248 --- [nio-8765-exec-6] o.s.b.f.s.DefaultListableBeanFactory     : Returning cached instance of singleton bean 'org.springframework.security.oauth2.config.annotation.web.configuration.OAuth2ClientConfiguration'
2016-08-26 11:14:46.673 DEBUG 13248 --- [nio-8765-exec-6] o.s.b.f.s.DefaultListableBeanFactory     : Finished creating instance of bean 'scopedTarget.accessTokenRequest'
2016-08-26 11:14:46.674 DEBUG 13248 --- [nio-8765-exec-6] o.s.s.oauth2.client.OAuth2RestTemplate   : THREAD NAME: http-nio-8765-exec-6 THREAD ID: 80 EXISTING TOKEN FROM OAuth2ClientContext: c55b7bc7-d1d9-4138-abc8-522fae28bd60 REFRESH TOKEN: b3659bd9-f7ba-45e1-adbe-97f926b4901b
2016-08-26 11:14:46.674 DEBUG 13248 --- [nio-8765-exec-6] g.c.AuthorizationCodeAccessTokenProvider : Retrieving token from http://openam.example.com:8080/openam/oauth2/access_token?realm=bank_employees
2016-08-26 11:14:46.675 DEBUG 13248 --- [nio-8765-exec-6] o.s.web.client.RestTemplate              : Created POST request for "http://openam.example.com:8080/openam/oauth2/access_token?realm=bank_employees"
2016-08-26 11:14:46.675 DEBUG 13248 --- [nio-8765-exec-6] g.c.AuthorizationCodeAccessTokenProvider : Encoding and sending form: {grant_type=[refresh_token], refresh_token=[b3659bd9-f7ba-45e1-adbe-97f926b4901b]}
.......................
2016-08-26 11:14:46.681 DEBUG 13248 --- [nio-8765-exec-2] o.s.c.s.o.proxy.OAuth2TokenRelayFilter   : restTemplate.getResource().getClientId() value: poc-spa-client
2016-08-26 11:14:46.681 DEBUG 13248 --- [nio-8765-exec-2] o.s.c.s.o.proxy.OAuth2TokenRelayFilter   : rauth.getOAuth2Request().getClientId() value: poc-spa-client
2016-08-26 11:14:46.681 DEBUG 13248 --- [nio-8765-exec-2] o.s.s.oauth2.client.OAuth2RestTemplate   : THREAD NAME: http-nio-8765-exec-2 THREAD ID: 68 ACCESS TOKEN FROM OAuth2ClientContext: c55b7bc7-d1d9-4138-abc8-522fae28bd60 REFRESH TOKEN: b3659bd9-f7ba-45e1-adbe-97f926b4901b
2016-08-26 11:14:46.681 DEBUG 13248 --- [nio-8765-exec-2] o.s.b.f.s.DefaultListableBeanFactory     : Creating instance of bean 'scopedTarget.accessTokenRequest'
2016-08-26 11:14:46.681 DEBUG 13248 --- [nio-8765-exec-2] o.s.b.f.s.DefaultListableBeanFactory     : Returning cached instance of singleton bean 'org.springframework.security.oauth2.config.annotation.web.configuration.OAuth2ClientConfiguration'
2016-08-26 11:14:46.681 DEBUG 13248 --- [nio-8765-exec-2] o.s.b.f.s.DefaultListableBeanFactory     : Finished creating instance of bean 'scopedTarget.accessTokenRequest'
2016-08-26 11:14:46.681 DEBUG 13248 --- [nio-8765-exec-2] o.s.s.oauth2.client.OAuth2RestTemplate   : THREAD NAME: http-nio-8765-exec-2 THREAD ID: 68 EXISTING TOKEN FROM OAuth2ClientContext: c55b7bc7-d1d9-4138-abc8-522fae28bd60 REFRESH TOKEN: b3659bd9-f7ba-45e1-adbe-97f926b4901b
2016-08-26 11:14:46.682 DEBUG 13248 --- [nio-8765-exec-2] g.c.AuthorizationCodeAccessTokenProvider : Retrieving token from http://openam.example.com:8080/openam/oauth2/access_token?realm=bank_employees
2016-08-26 11:14:46.682 DEBUG 13248 --- [nio-8765-exec-2] o.s.web.client.RestTemplate              : Created POST request for "http://openam.example.com:8080/openam/oauth2/access_token?realm=bank_employees"
2016-08-26 11:14:46.682 DEBUG 13248 --- [nio-8765-exec-2] g.c.AuthorizationCodeAccessTokenProvider : Encoding and sending form: {grant_type=[refresh_token], refresh_token=[b3659bd9-f7ba-45e1-adbe-97f926b4901b]}
2016-08-26 11:14:46.709 DEBUG 13248 --- [nio-8765-exec-2] o.s.web.client.RestTemplate              : POST request for "http://openam.example.com:8080/openam/oauth2/access_token?realm=bank_employees" resulted in 200 (OK)
2016-08-26 11:14:46.709 DEBUG 13248 --- [nio-8765-exec-2] o.s.w.c.HttpMessageConverterExtractor    : Reading [interface org.springframework.security.oauth2.common.OAuth2AccessToken] as "application/json" using [org.springframework.http.converter.json.MappingJackson2HttpMessageConverter@196da]
2016-08-26 11:14:46.710 DEBUG 13248 --- [nio-8765-exec-2] o.s.s.oauth2.client.OAuth2RestTemplate   : THREAD NAME: http-nio-8765-exec-2 THREAD ID: 68 BEFORE WRITING NEW TOKEN, CHECKING THAT THE TOKEN IN THE CONTEXT HAS NOT CHANGED... EXISTING TOKEN FROM OAuth2ClientContext: c55b7bc7-d1d9-4138-abc8-522fae28bd60 REFRESH TOKEN: b3659bd9-f7ba-45e1-adbe-97f926b4901b
2016-08-26 11:14:46.710 DEBUG 13248 --- [nio-8765-exec-2] o.s.s.oauth2.client.OAuth2RestTemplate   : THREAD NAME: http-nio-8765-exec-2 THREAD ID: 68 UPDATED CONTEXT WITH NEW TOKEN, WRITTEN IN OAuth2ClientContext: 472ce917-1ed0-48f5-b098-1104ad832320 REFRESH TOKEN: b3659bd9-f7ba-45e1-adbe-97f926b4901b
2016-08-26 11:14:46.712 DEBUG 13248 --- [nio-8765-exec-2] o.s.b.f.s.DefaultListableBeanFactory     : Returning cached instance of singleton bean 'ribbonRestClient'
2016-08-26 11:14:46.722 DEBUG 13248 --- [nio-8765-exec-6] o.s.web.client.RestTemplate              : POST request for "http://openam.example.com:8080/openam/oauth2/access_token?realm=bank_employees" resulted in 200 (OK)
2016-08-26 11:14:46.722 DEBUG 13248 --- [nio-8765-exec-6] o.s.w.c.HttpMessageConverterExtractor    : Reading [interface org.springframework.security.oauth2.common.OAuth2AccessToken] as "application/json" using [org.springframework.http.converter.json.MappingJackson2HttpMessageConverter@196da]
2016-08-26 11:14:46.723 DEBUG 13248 --- [nio-8765-exec-6] o.s.s.oauth2.client.OAuth2RestTemplate   : THREAD NAME: http-nio-8765-exec-6 THREAD ID: 80 BEFORE WRITING NEW TOKEN, CHECKING THAT THE TOKEN IN THE CONTEXT HAS NOT CHANGED... EXISTING TOKEN FROM OAuth2ClientContext: c55b7bc7-d1d9-4138-abc8-522fae28bd60 REFRESH TOKEN: b3659bd9-f7ba-45e1-adbe-97f926b4901b
2016-08-26 11:14:46.723 DEBUG 13248 --- [nio-8765-exec-6] o.s.s.oauth2.client.OAuth2RestTemplate   : THREAD NAME: http-nio-8765-exec-6 THREAD ID: 80 UPDATED CONTEXT WITH NEW TOKEN, WRITTEN IN OAuth2ClientContext: 746bc205-fa9a-4929-865f-a9426204b84b REFRESH TOKEN: b3659bd9-f7ba-45e1-adbe-97f926b4901b

I guess this is something that someone might have come across even though is not easy to detect if the AS is configured not to change the refresh token when you refresh an access token (because your application will still work), you don't have high concurrency requirements or your access tokens long last (as it is more likely to refresh short lived access tokens and so is more likely to come across this problem)

Can someone help with this or can @dsyer can come up with something handy like most of the times?

I thought this could be somehow controlled using a flag, sort of semaphore to control that the refresh of the token is already in progress (for the other threads to be aware of). However i would like to hear from the community and specially from @dsyer or the rest of the Spring team if they can help with this.

Thanks so much for your help!

miguelfgar commented 7 years ago

Hi,

I'm still struggling with this concurrency issue and I guess I have a more precise idea about what is exactly the problem.

Using @EnableOAuth2SSO annotation we get auto-configuration so that Spring takes care of handling OAuth2 tokens for us per user session (this means, the idea is to have 1 oauth2 token at a time per user session stored in the HttpSession itself (inside OAuth2ClientContext object) - the access token can change as it needs to be refreshed.

The concurrency problem arises when several requests to access OAuth2 protected resources from HttpRequest belonging to the same user session arrive to the server at the same time. There is a possibility that 2 or more threads read the access token (and refresh token) from the OAuth2ClientContext (Session-scoped in our case). Only the first one will be able to refresh the access token using the refresh token, the rest will fail...

Two things I found out about this:

1. The possibility of a token refresh failing because of concurrency (another thread has used the same refresh token) is not controlled -> Using authorization_code grant, for instance (AuthorizationCodeAccessTokenProvider) it redirects to user authorization (as the first time you need to obtain an access token). From my point of view this is correct if the refresh token has also expired (as there is no way to refresh the access token) but not in the case of getting the error because of a race condition. In this case (depending on the AS config to grant refresh tokens) the result is that N access tokens are generated in the AS. I guess the solution would be to try to detect the failure because a refresh token has already been used (because of concurrency) and then check if the context has been upgraded (by another thread) with a valid access token (not expired) and return that one.

2. OAuth2ClientContext (both session-scoped, which is my case, or request-scoped) is proxied (proxyMode = ScopedProxyMode.INTERFACES - One thread will not realize OAuth2ClientContext upgrades performed by other threads. Trying to implement the solution explained in 1) I realized that another problem is that if 2 threads read the current access token (and refresh) from the OAuth2ClientContext, thread 1 refreshes the OAuth2ClientContext (with the new token) in session, then thread 2 has a "stale" version of OAuth2ClientContext, it will not realize the changes that thread 1 performed in this object. I guess this might be because OAuth2ClientContext is proxied and it reads the value from the HttpSession just once per thread but then is not realizing changes. Which makes complicated to implement 1)

I'm still working on it but I would appreciate if someone could point me in the right direction with this. Specially @dsyer or another Spring member, if they could help. It would be great at least to confirm that I have explained myself properly and someone understand the concurrency issue and what I explained in 2) really happens. Some advice or direction about how to workaround this and implement a solution this would be great and also to know if the approach described in 1) (when refreshing a token fails see if there is up-to-date access token in the context and return that one) is correct.

Thanks in advance.

Regards

dsyer commented 7 years ago

Seems like a good analysis to me. If I was trying to fix it, I would be looking at trying to add a lock in the client to protect the code where it refreshes the access token.

miguelfgar commented 7 years ago

Thanks @dsyer for your answer and help. Since you wrote I have been studying this case thoroughly, I performed further tests and I have new findings (I'll try to be brief explaining it):

1. Adding a "lock" to the point where an access token is obtained (returning existing one or refreshing it) from context (OAuth2ClientContext) is necessary but this will not be enough depending on the HttpSession implementation

In order to test this I have designed a test in our project that forces high concurrency requests per user session to ZuulProxy (we follow "@EnableOAuth2SSO pattern") to OAuth2 protected resources. We do this so we can easily reproduce the concurrency issue or, conversely, when we try to fix it confirm that the issue is not happening.

Solving the problem with multi-threading (adding the lock - I explain where exactly in the next section) I have found out:

The reason is that even we add the proper "lock" (or synchronized block) to the part of code in charge for the refresh the writes to Redis performed by Spring Session seem to be asynchonous (this mean, once the thread writing the new access and refresh token writes it to the context it continues the execution but this is not persisted in Redis at the moment, this is happening in paral.lel. The conclusion is that controlling thread-safety in this case is not enough.

I'm still doing research in this scenario (Redis backed session) tunning "redisFlushMode" (changing to RedisFlushMode.IMMEDIATE) but I have not been sucessful yet.

2. A good place to add the lock and guarantee thread-safety is OAuth2RestTemplate Because it allow to apply the fix (concurrency control) in a single place. Zuul delegates on OAuth2RestTemplate for obtaining an access token (this implies obtaining it for the first time, returning an existing one if not expired or refreshing a new one an returning it if it is a flow that supports refresh). OAuth2RestTemplate delegates in the AccessTokenProviders that are configured to obtain / refresh the tokens. I would say OAuth2RestTemplate is the place to add the fix.

I manage to make it work (using default HttpSession - in memory) making this method in OAuth2RestTemplate syncrhonized:

    protected **synchronized** OAuth2AccessToken acquireAccessToken(OAuth2ClientContext oauth2Context)
            throws UserRedirectRequiredException {

        AccessTokenRequest accessTokenRequest = oauth2Context.getAccessTokenRequest();
        if (accessTokenRequest == null) {
            throw new AccessTokenRequiredException(
                    "No OAuth 2 security context has been established. Unable to access resource '"
                            + this.resource.getId() + "'.", resource);
        }

        // Transfer the preserved state from the (longer lived) context to the current request.
        String stateKey = accessTokenRequest.getStateKey();
        if (stateKey != null) {
            accessTokenRequest.setPreservedState(oauth2Context.removePreservedState(stateKey));
        }

        OAuth2AccessToken existingToken = oauth2Context.getAccessToken();
        if (existingToken != null) {
            accessTokenRequest.setExistingToken(existingToken);
        }

        OAuth2AccessToken accessToken = null;
        accessToken = accessTokenProvider.obtainAccessToken(resource, accessTokenRequest);
        if (accessToken == null || accessToken.getValue() == null) {
            throw new IllegalStateException(
                    "Access token provider returned a null access token, which is illegal according to the contract.");
        }
        oauth2Context.setAccessToken(accessToken);
        return accessToken;
    }

With this modification our concurrency tests passed (with default HttpSession, with spring-session Redis backed session it will not pass for the aforementioned reasons). I'm not saying this is the best solution to the thread safety, in fact instead of synchronizing the full method an implementation with a Java ReadWriteLock might be optimum for performance but I tried and I didn't succeed to make it work. At least is a confirmation that synchronizing this code fixes the problem (for default HttpSession). In terms of performance we don't want to slow down all threads retrieving the access token from the context in the case when the token has not expired, this would have serious implications to the throughput. However, even synchronizing the full "acquireAccessToken" method in OAuth2Template doesn't seem to be so bad as this is method is called after checking that a refresh is necessary (otherwise the current - not expired - token is returned):

    /**
     * Acquire or renew an access token for the current context if necessary. This method will be called automatically
     * when a request is executed (and the result is cached), but can also be called as a standalone method to
     * pre-populate the token.
     * 
     * @return an access token
     */
    public OAuth2AccessToken getAccessToken() throws UserRedirectRequiredException {

        OAuth2AccessToken accessToken = context.getAccessToken();

        **if (accessToken == null || accessToken.isExpired()) {**
            try {
                accessToken = **acquireAccessToken**(context);
            }
            catch (UserRedirectRequiredException e) {
                context.setAccessToken(null); // No point hanging onto it now
                accessToken = null;
                String stateKey = e.getStateKey();
                if (stateKey != null) {
                    Object stateToPreserve = e.getStateToPreserve();
                    if (stateToPreserve == null) {
                        stateToPreserve = "NONE";
                    }
                    context.setPreservedState(stateKey, stateToPreserve);
                }
                throw e;
            }
        }
        return accessToken;
    }

So even though I'm not sure that is the best solution, this seems to work to ensure thread-safety in a single point of the code without having implications in performance for all http request to OAuth2 protected resources (only for those where the token is found to be expired).

Summary:

@dsyer, what is your oppinion about the best implementation to ensure thread-safety and do you have any idea about how is it possible to provide OAuth2RestTemplate consistency in its behaviour regardless the HttpSession implementation used? (maybe assuming that in some cases is not going to be possible to guarantee thread safety to read / write the underlaying session implementation and controlling the failure to refresh a token that has been already refreshed and performing a retry?).

Thanks so much for all your help.

dsyer commented 7 years ago

The problem should only be encountered when you have concurrent requests from the same unauthenticated user, and I'm not sure I understand how that applies when you are refreshing an existing token (per the original post). I think I'd like to get a proper test case to analyse before we commit to doing anything. Can you make something that shows the problem without any additional dependencies (or maybe just Spring Boot) and put it in github somewhere?

miguelfgar commented 7 years ago

Thanks @dsyer for you answer.

I understand it might be difficult to understand the scenario without having a look at code. I'm willing to provide something but it will take me a while "simplify" to something I can provide as a test case. Basically I'm using ZuulProxy to implement an Security / API Gateway to provide access to OAuth2 protected resources (Resource Servers) based on a microservices architecture using the pattern you described in https://spring.io/blog/2015/02/03/sso-with-oauth2-angular-js-and-spring-security-part-v for a Single Page Application client (HTML5, AngularJS, etc). So this implies we have all Spring Cloud Netflix OSS stack: config server, Eureka, several microservices (acting as Resource Servers), ZuulProxy (@EnableOAuth2SSO powered)... plus a external OAuth2 Authorization Server. We run all this based on containers (Dockers). Of course I can provide a minimum test case based on this just to reproduce the issue but I guess at least I will need at least the following components: Zuul Proxy with the proper setup, an OAuth2 server (we could implement it with Spring-OAuth2 as we use OpenAM in our project and we start it in a Docker container) and another project to run the web integration tests simulating the Single Page Application client (generating the high concurrency scenario per user session to reproduce the issue). Would be a test case with the aforementioned components OK for you? For sure it will take us a long to provide you with this test case, we will need to invest time but I completely understand you ask for it. Unfortunately we can't share our full project code and it will be difficult to run it anyway (it is necessary to start several Docker containers, etc).

In the meantime I just try to explain again from a high-level point of view so you see the full scenario.

In this scenario the issue we are finding is that when Zuul detects the OAuth2 access token stored in the user session (OAuth2ClientContext) is expired it will need to refresh it. Once the token is expired high concurrency can happen (several requests for the same user at the same time - this is same incoming session ID). In this case Zuul delegates to OAuth2RestTemplate and several threads can be trying to refresh the same OAuth2 token (calling the Authorization server endpoint). This is what it is fixed with the "synchronized" code in OAuth2RestTemplate i mentioned earlier for the default HttpSession implementation (memory). However for other implementations of HttpSession (spring-session Redis backed session, for instance) the scenario could be even more complicated (because the main thread keeps on executing but spring-session is handling writing to Redis in paral.lel (changes in flush mode can affect)).

Sorry for my long answer and thanks for your effort to understand the scenario and help. Did my explanation provide a better understanding of the full scenario we are implementing and where we found the issue? Could you please confirm a test case with: Zuul, an AS (implmented with Spring-OAuth2, Eureka, 1 microservice (Resource Service) and another project to run integration tests (simulating the client) would be OK for you? If so, we will invest effort in providing this.

Thanks so much!

miguelfgar commented 7 years ago

Hi @dsyer, I start working in a Test Case for this and I'll provide it as soon as possible.

Just as a short summary (with less detail that the previous one - sorry about that) my guess is that the problem is that the access to the mutable object OAuth2ClientContext from OAuth2RestTemplate is not synchronized. Even though access to HttpSession should be ensured by Servlet spec, the application needs to guarantee that the access to a mutable object (OAuth2ClientContext) that you set to session with a "setAttribute" is thread-safe also. I would say in this case it isn't and that's why the "synchronize" I introduced in OAuth2RestTemplate fixes the problem. I guess is just a matter of confirming this and see the best implementation (Synchronized block / method, Java Lock, Atomic Reference). I have seen in this case spring has no "synchronizeOnSession" enabled (at least for RequestMappingHandlerAdapter) which I guess is normal for performance reasons and I understand is a feature that can be used in some cases to enable this sort of concurrency with mutable objects saved to session. I guess the solution goes more for synchronizing this specifically in OAuth2RestTemplate. @dsyer do you think this makes sense (apart from confirming with the test case)?

The other case (using spring-session to implement a Redis backed HttpSession) is different as the problem here is not solved with thread synchronization. The problem here seems to be that after saving in Redis control is returned to the thread but Redis is still writing (asynchronous?). I tried RedisFlushMode.INMEDIATE but I didn't see much difference. For this I might ask spring-session team once I have the Test Case. I can provide a variant for them with Redis enabled.

Thank you

miguelfgar commented 7 years ago

Hi @dsyer, guys,

I have developed a simple test case as agreeded: https://github.com/miguelfgar/tests (see the readme, please!) I tried to keep it as simple as possible and in the end it was possible to reproduce the issue with just 3 components:

It is a Maven multimodule project with Spring Boot projects.

I have added 3 branches in the repository:

The "readme" of the project provides detailed information. I'm looking forward to hear what are your thoughts once you have tried this out and you have seen the problem. I'm looking forward also to contribute to the final solution and the discussion if there is a way I could help. If you need something related to the test in the meantime just let me know, I will reply quickly.

In paral.lel I'm going to spring-session team for the problem with sprin-session + Redis - that is additional to the concurrency issue - to see if this is something that could be solved by applying configuration. This combination seems very logical for me in this scenario to scale ZuulProxy easily in a container independent way. Anyway, if you had any idea also in this scenario (spring-session) that could help I would be happy to hear it.

Specially @dsyer, thanks for your help!

Thanks and regards,

Miguel

dsyer commented 7 years ago

Wow, that's a lot of code, thanks! I'm not sure quite what to make of it though, since there isn't yet a failing test case. Even in the REFRESH_ISSUE_WITHOUT_FIX branch, tests are green and the responses are all 200 OK. What would be the sign from a client or user's point of view, that the concurrency problem we are worried about here has been encountered? Can we not test that?

Did you need 3 projects BTW? (I thought probably the test case could have gone in the gateway, client app project.) I didn't really see what Zuul was needed for either, since I thought all we are really testing is that a client with a refresh token can get a new access token when it expires. I'm not really sure about that last statement until I have a test that fails though.

miguelfgar commented 7 years ago

Thanks @dsyer. Yes, it's a lot of code. It took me quite a while to polish and provide something minimum. You are right, the @IntegrationTest could have been placed in the own api-gateway project. However, I though it was cleaner to provide the two necessary pieces (zuul - api gateway and AS) as clean as possible. As you have seen here the client-for-tests project has only the function of raising multiple requests to "/me" at the same time.

Regarding the result of the test you are right, is not controlled with the test (that passes in green is not an indication that everything worked well). *The important thing here is to have a look at the logs in the api-gateway: REFRESH_ISSUE_WITHOUT_FIX shows in the log the values of the different access tokens obtained (and the posts to the AS to refresh) while REFRESH_ISSUE_WITHOUT_FIX shows in the log that the same access token is returned (and just 1 post to refresh to the AS). I copied an example of the logs in the readme. If you see this example of the logs you will see that the result is quite obvious: https://github.com/miguelfgar/tests#tests-results-comparison-branches-refresh_issue_without_fix-and-refresh_issue_with_fix Could you please, tell me what your thoughts?**

However, if it was necessary, I guess it could be possible to detect if the test passed well or not for this particular case inspecting the output in the log... at least this is a way to do it, but to be honest I guess when you see the log example you will not needed as it is self-explanatory. Otherwise I could just try to implement this.

ZuulProxy is necessary because is in charge to handle the OAuth2 tokens (@EnableOAuth2SSO) and so it is the component that delegates on OAuth2RestTemplate which in my humble opinion is the class where this concurrency issue could / should be controlled. Not sure how to implement this without ZuulProxy @EnableOAuth2SSO-annotated so all the other pieces that are injected by means auto-configuration are in the place (fortunatelly, spring-boot and spring-oauth2 auto-configuration does a lot "under the hood" :-) ). For me it made sense also to have Zuul because is a common / real scenario to instruct Zuul to handle OAuth2 tokens. If you have a look, a "minimalistic" project like this one contains little code in the api-security-gateway. I see it also necessary for the additional issue when using spring-session + redis backed httpSession implementation which I would say must be something common to scale Zuul in this scenario.

Thanks for your help

dsyer commented 7 years ago

I guess I'm missing something still. If the tests are green then all requests were successful, the client is happy, and every request was authenticated. Why do I care about the log output in the server?

miguelfgar commented 7 years ago

I completely understand what you say, this is right. However, the result depends on how the AS is configured.

1. If the AS is configured so the refresh token does NOT change (only the access token changes): When the refresh token does not change (and this depends on the AS) then the result is what you said: the client can get different access tokens for every request and will be happy. However, in the AS there will be N access token that have been created and need to be stored unnecessarily.

(Specially using the pattern: client passes sessionId in the requests to the server side (Zuul as entrypoint), Zuul handles the oauth2 access token for this client (1 at a time!) keeping it in the OAuth2clientContext (stored in HttpSession - session bean) it is not desirable to get N valid access tokens at a time, even though it would work).

This is the case of the test case I provided.

2. If the AS is configured so the refresh token changes (both refresh and access token changes): In this case, several request will trigger the refresh of the token in the AS but only the first one will succed. The rest will fail (as the refresh token has changed and can only be used one). Then you end up with client requests failing because of this reason. I have experienced this with a third party AS (OpenAM) which allows both setups ("issue a new refresh token on refreshing a token" or not).

If I managed to explain myself properly I'm sure you will see this clear if you had a look at the example logs that I provide in the readme.

Anyway, I can have a look and see how this could be detected in the @IntegrationTest.

Thanks again!

miguelfgar commented 7 years ago

Hi, just an update about this:

1. Changes in the test case @dsyer and myself did changes in the test case. Thanks @dsyer for the great refactoring and for putting it all together in a sincle project!

Branch REFRESH_ISSUE_WITHOUT_FIX: The test now is failing as expected (several threads try to refresh the same access token, they don't succeed and then there is an attempt to obtain a brand NEW access token using the authorization code).

Branch REFRESH_ISSUE_WITH_FIX: The test is passing as expected. Contains exactly the same code as the aforementioned branch but with the fix in OAuth2RestTemplate (basically a "synchronized" in a method). It includes an "assert" to check that the token refresh endpoint in the AS was called just once.

2. Answer by spring-session Team about the scenario using spring-session + Redis backed HttpSession implementation Reply here: https://github.com/spring-projects/spring-session/issues/635 I understand the point that the write to the HttpSession (in this case implemented by spring-session + Redis) is not triggered until the request is commited. However this is working with the default HttpSession implementation (in-memory) in the REFRESH_ISSUE_WITH_FIX branch (even we should consider that we have not confirmed that this is a good or the best solution yet), so I don't see much the different besides understanding that operations to Redis (specially writes) might be slower than in memory. I need to diggest the response for the spring-session + Redis scenario better but I hope there is a way to make it work for this scenarios (scaling Zuul) at least for the threads accessing Redis in the same node. An additional issue is to guarantee consistent refresh that could be triggered by different nodes (zuul instances at the same time) but this might be controlled maybe with RedisLockRegistry (I have never used it - http://docs.spring.io/spring-integration/api/org/springframework/integration/redis/util/RedisLockRegistry.html) or maybe even in case a refresh token fails with a retry... Just ideas for now..

What would suprise me is that nobody tried this combination earlier: Zuul (@EnableSSO) + spring-session + Redis backed HttpSession as in scenarios where scaling Zuul is required seems the solution to go (more that relying on container dependant solutions for session synchronization). I think it is interesting to see the way to make all this together as at the end of the day this becomes a very useful pattern (in microservices architectures, for instance). Once this works, if this is possible and we manage to do it It could be interesting even provide autoconfiguration / special annotations for this particular setup / pattern. This would be added value.. :-)

dsyer commented 7 years ago

OK, I finally got to understand all of this. In my opinion to fix this you have to look at the auth server, because the problem is really that it is invalidating access tokens when they are refreshed, whereas, in fact there's nothing invalid (necessarily) about such a token. And there's nothing to stop two instances of OAuth2RestTemplate from redeeming the same refresh token at the same time (even if you synchronize the method), especially considering that they might not even be in the same process, so fixing anything on the client is not going to work in general.

Here's a simple sequence diagram that shows the problem:

image

You could work around it today by using a TokenStore that doesn't remove access tokens when they are refreshed (there's even a method for that TokenStore.removeAccessTokenUsingRefreshToken(OAuth2RefreshToken)). So, for example, anyone using JWT tokens isn't affected because that method is a no-op by default.

miguelfgar commented 7 years ago

Hi,

Thanks so much @dsyer. I see your point and I agree that handling this race condition from the client it is complicated. I guess it is possible per single instance of you client (in our case the "security-gateway" Zuul @EnableOAuth2SSO-annotated). However, I agree with you and with @rwinch - who has helped also in the scenario with spring-session + Redis backed session as HttpSession implementation (thanks so much!) - that this is impossible to handle when you have several instances of you client (in my case several instances of our api gateway - that's why I have spring-session + redis to have a distributed session). The reason is that even though you could avoid concurrency problems refreshing the token per node in the end you always can ran into the concurrency problem between instances. I guess it makes more sense as you suggest (and more reliable) to find a setup in the Authorization Server that in the end simply makes our use case / scenario work.

The solution you provide, changing the Authorization Server setup (TokenStore.removeAccessTokenUsingRefreshToken(OAuth2RefreshToken) is not valid in my case because this only works with Authorization Servers implemented with spring-outh2. In my case it's a 3rd party AS: OpenAM. As I explained in previous posts (sorry, we have written a lot, specially me!) depending on the setup we apply in the AS (OpenAM in my case) I'm able to make it work. However, what I wanted to unveil opening this issue is that even working this was resulting in more access tokens in the AS than necessary. In this case the problem for us is not that when the concurrency situation arises two refreshes are performed and the second one is invalidating the first access token (as you draw). In our case OpenAM problably is not revoking the previous access token after refreshing it (or if it supports this, must be not configured like that currently). My concern was more that several access tokens were created (and alive!) in the AS related to the same user session unnecessarily (which has implications in performance when you have many users -> more tokens to store + security -> more tokens alive that potencially can be used (if stolen, for instance, even though this should not happen).

Just to close this, and I guess is a bit the recap that @rwinch made, in this case probably the best thing is to look for the solution that works in the AS even if we have to accept that this solution has a small implication in loss of security (if we can consider a minimal loss of security that we have more access tokens per user session live in the AS that strictly necessary). The concurrency issue - as you both pointed out - seems not possible to handle from the client side, specially if you have several instances of you client. However, after all the time we both have invested in this issue, I would like to show you exactly the scenarios we are experiencing depending on our AS setup as they differ a bit from what you draw (as our AS seems not to be revoking an access token when refreshed - so we are not experiencing this problem).

(Sorry, I didn't managed to draw the lifetime bars in the diagram sequence properly, so they are omitted, i guess is possible to follow the flow anyway..)

  1. AS (OpenAM) configured to provide a new REFRESH token every time you use the refresh token authserverchangingrefreshtoken

This setup does NOT work as the seconde refresh will not succeed (the refresh token has been already used) and besides the access (and refresh) token in the OAuth2ClientContext end up being NULL, so subsequent request will not find an access token.

  1. AS (OpenAM) configured not to change the refresh token (same refresh token for all refresh requests) authservernotchangingrefreshtoken

This case works (all refresh request keep using the same refresh token, never changes). However, in this example 2 access tokens will be created and will be alive in the AS until they expire. The reality is that 1 access token would have been enough. In this case are 2 because I only draw 2 threads (corresponding to concurrent request) but they might be more if more requests are received in high concurrency. As I say earlier, I guess this is the trade-off (more access tokens than necessary in the AS) as controlling concurrency from the client with several instances seems to be impossible.

Thanks so much for all your help!

ghost commented 6 years ago

@miguelfgar use case: Get new access token using refresh token when access token gets expires. I am using OpenAM as an Authorization Server, I have reached at the point of making user authenticated, get an access token, if access token expires how to refresh access token?. Question is: do I need to extends AuthorizationServerConfigurerAdapter class to configure authentication server, as I do have OpenAM as Authentication Server. I have created one class with DefaultTokenServices as a parent class, bt @override createAccessToken() method not getting called.

Please give me your pseudo source code to refresh access token, which OpenAM endpoint should I use?

Thank You, Vipin

loesak commented 5 years ago

Just to add to the conversation. We're seeing the same issue. Multiple requests coming in to obtain a new access token using a refresh token and one request "wins", causing the used refresh token to be deleted, thus causing the other requests to "lose".

I like @dsyer idea that refresh tokens probably shouldn't be deleted as long as they are still valid. I'm not sure if this is still the preferred solution but I'm going to give it a try here in a bit. This should be easier to implement than finding a way to hook into the OAuth server code and/or the OAuth Zuul code as those are not very hook-able.

I still think it would be an optimization if both the authorization server were to have a distributable lock around creating an access token. Additionally it would be an optimization if the SSO Proxy also had a distrbutable lock around obtaining a new access token for cases when the client doesn't have control over the AS behavior.

loesak commented 5 years ago

Thinking about it more, i do not think that simply not deleting refresh tokens is the right way to go. What happens when multiple parallel requests to get a new access token from an existing refresh token succeed? Will there be multiple new access tokens and refresh tokens (assuming you do not reuse refresh tokens) in the system when there should only be one? I'll be figuring out this soon when doing my testing. However, I feel like a distributable lock is still necessary.

has anyone solved this yet?

loesak commented 5 years ago

I have verified that when fully using the JWT token store with reuse refresh tokens off, it does create multiple new access tokens and refresh tokens. This does allow the system to work as desired but it is not ideal because the the last token that gets generated is what is remembered by the SSO Proxy (in this case) and the other generated tokens are simply forgotten but are still completely valid. Would be nice to have some sort of revocation available.

rishabhpoddar commented 5 years ago

Check out this blog post that talks about race conditions when dealing with changing refresh tokens. Are the race conditions there something similar to what you are talking about? Also, the implementation of session management that the blog links to does change refresh tokens each time a new access token is obtained (using the old refresh token) - and it does so in a way that is space efficient - stores only one refresh token and access token per user per device - at any time! It's currently only implemented in NodeJS, but we can implement one for Spring framework too - if anyone is interested.

skolisetty commented 4 years ago

I suspect this might be affecting me. I am saving tokens in the db.

Current logic from spring oauth2 library for refreshing the token is 

When token is expired, it does delete at Line 105 and after getting new refreshed token, it saves at line 131

It could be possible that first thread deleted and while it is in the process of refreshing and second thread made a call at Line 99 to get the token from db (which is deleted in by thread 1) and tries to acquire it using Authorization Code which will not be found in the context (since user isn't involved in providing that code) and results in throwing UserRedirectRequiredException.

I am handling this exception and making my other references as if account is unlinked.

My first thread will acquire it and goes through normally. Since second thread messed up the references in other tables, my db is in weird state where one table says account is linked and token table says that it is active.

psnindia commented 4 years ago

@miguelfgar I have gone through the whole post and really appreciated your analysis and explanation. We are facing the exact same problem in our application, I am not in favor of changing AS (Ping in our case) to reuse refresh tokens. However, we are using Spring Redis, and we cannot go back to HttpSession as we needed the distributed session for many other scenarios. I see your below comment about Redis way of handling this, do you have any luck with this? We tried and didn't work for us. Appreciate if you can share the updates if any.

"I'm still doing research in this scenario (Redis backed session) tunning "redisFlushMode" (changing to RedisFlushMode.IMMEDIATE) but I have not been sucessful yet."

We really want to make the combination (Zuul (@EnableSSO) + spring-session + Redis backed HttpSession) working. Thanks a lot in advance.

psnindia commented 4 years ago

I'm still doing research in this scenario (Redis backed session) tunning "redisFlushMode" (changing to RedisFlushMode.IMMEDIATE) but I have not been sucessful yet.

@miguelfgar I have gone through the whole post and really appreciated your analysis and explanation. We are facing the exact same problem in our application, I am not in favor of changing AS (Ping in our case) to reuse refresh tokens. However, we are using Spring Redis, and we cannot go back to HttpSession as we needed the distributed session for many other scenarios. I see your below comment about Redis way of handling this, do you have any luck with this? We tried and didn't work for us. Appreciate if you can share the updates if any.

"I'm still doing research in this scenario (Redis backed session) tunning "redisFlushMode" (changing to RedisFlushMode.IMMEDIATE) but I have not been sucessful yet."

We really want to make the combination (Zuul (@EnableSSO) + spring-session + Redis backed HttpSession) working. Thanks a lot in advance.

psnindia commented 4 years ago

I'm still doing research in this scenario (Redis backed session) tunning "redisFlushMode" (changing to RedisFlushMode.IMMEDIATE) but I have not been sucessful yet.

@miguelfgar I have gone through the whole post and really appreciated your analysis and explanation. We are facing the exact same problem in our application, I am not in favor of changing AS (Ping in our case) to reuse refresh tokens. However, we are using Spring Redis, and we cannot go back to HttpSession as we needed the distributed session for many other scenarios. I see your below comment about Redis way of handling this, do you have any luck with this? We tried and didn't work for us. Appreciate if you can share the updates if any.

"I'm still doing research in this scenario (Redis backed session) tunning "redisFlushMode" (changing to RedisFlushMode.IMMEDIATE) but I have not been sucessful yet."

We really want to make the combination (Zuul (@EnableSSO) + spring-session + Redis backed HttpSession) working. Thanks a lot in advance.

Hello Everyone, did anyone have luck solving this using Spring Boot + Redis + Zuul combination?Appreciate your response on this!!