Highly concurrent requests with `client_credentials` cause duplicate access token requests

Expected Behavior OAuth tokens could be reused. Even if multiple requests happen concurrently.

Current Behavior If a lot of token requests happen concurrently each request retrieves its own access token.

Context As described in https://stackoverflow.com/questions/78635726/what-is-the-expected-number-of-oauth2-client-logins-with-webflux-and-spring-secu we're issuing a lot of WebClient concurrently. In the meantime we've discoverd that the initial request has been cached. So the sample would be more correct if you'd something along

Flux.range(1, 1000)
    .flatMap { request() }

Issues like https://github.com/spring-projects/spring-security/issues/11461 show that it's a known limitation of the current implementation.

Having a look at the ClientCredentialsReactiveOAuth2AuthorizedClientProvider default implementation I drafted a version that caches the token requests using Caffeine like

class CachingClientCredentialsReactiveOAuth2AuthorizedClientProvider : ReactiveOAuth2AuthorizedClientProvider {

    private val accessTokenClient: ReactiveOAuth2AccessTokenResponseClient<OAuth2ClientCredentialsGrantRequest> =
        WebClientReactiveClientCredentialsTokenResponseClient()

    private val cache: AsyncLoadingCache<ClientRegistration, OAuth2AccessToken> = Caffeine.newBuilder()
        .expireAfter(OAuth2AccessTokenExpiry())
        .buildAsync { clientRegistration, _ -> getAccessToken(clientRegistration) }

    override fun authorize(context: OAuth2AuthorizationContext): Mono<OAuth2AuthorizedClient> {
        val clientRegistration: ClientRegistration = context.clientRegistration

        if (clientRegistration.authorizationGrantType != AuthorizationGrantType.CLIENT_CREDENTIALS) {
            return Mono.empty()
        }

        return Mono.fromFuture { cache.get(clientRegistration) }
            .map { OAuth2AuthorizedClient(clientRegistration, context.principal.name, it) }
    }

    private fun getAccessToken(clientRegistration: ClientRegistration): CompletableFuture<OAuth2AccessToken> =
        OAuth2ClientCredentialsGrantRequest(clientRegistration).toMono()
            .flatMap { accessTokenClient.getTokenResponse(it) }
            .onErrorMap(OAuth2AuthorizationException::class.java) {
                ClientAuthorizationException(it.error, clientRegistration.registrationId, it)
            }
            .map { it.accessToken }
            .toFuture()
}

class OAuth2AccessTokenExpiry : Expiry<ClientRegistration, OAuth2AccessToken> {

    override fun expireAfterCreate(
        clientRegistration: ClientRegistration,
        token: OAuth2AccessToken,
        currentTimeInNanos: Long,
    ): Long = token.cacheDurationInNanos()

    override fun expireAfterUpdate(
        clientRegistration: ClientRegistration,
        token: OAuth2AccessToken,
        currentTimeInNanos: Long,
        currentDurationInNanos: Long,
    ): Long = token.cacheDurationInNanos()

    override fun expireAfterRead(
        clientRegistration: ClientRegistration,
        token: OAuth2AccessToken,
        currentTimeInNanos: Long,
        currentDurationInNanos: Long,
    ): Long = token.cacheDurationInNanos()
}

private fun OAuth2AccessToken.cacheDurationInNanos(): Long =
    Duration.between(Instant.now(), expiresAt)
        .minus(Duration.ofMinutes(1))
        .toNanos()

The following chart shows our token requests with the default implementation and then switches to my version around 11:00: grafik

I'm well aware that spring-security has no dependency on caffeine (and likely won't introduce one) but maybe it's possible to include such an implementation somehow? I'm open for suggestions and willing to provide a PR.

Thanks for reaching out @kschlesselmann! I'm sorry to hear about your concurrency issues with multiple access token requests, but I'm glad that you have been able to develop a solution you feel good about.

As described in https://stackoverflow.com/questions/78635726/what-is-the-expected-number-of-oauth2-client-logins-with-webflux-and-spring-secu we're issuing a lot of WebClient concurrently.

If you intend to submit an enhancement request, please consider including all of the details for your enhancement in this issue and not require others to view the stackoverflow post, which can be deleted or edited outside of GitHub and lose context for this issue. If you would like to simply ask a question, there is no need to cross-post here as the team regularly reviews Stack Overflow, as does our community.

Issues like #11461 show that it's a known limitation of the current implementation.

It seems likely that your situation is slightly different from the reporter of that issue. See below.

If a lot of token requests happen concurrently each request retrieves its own access token.

From your stackoverflow question, it seems you are configuring AuthorizedClientServiceReactiveOAuth2AuthorizedClientManager but you are using this as part of processing HTTP requests from users. If so, this is not the intended use of this class and you should instead be using DefaultReactiveOAuth2AuthorizedClientManager.

Further, it appears you have not specified a principal name (e.g. anonymous) to "scope" your access tokens globally to the application. This would normally be done with ReactiveSecurityContextHolder (which uses reactor Context attributes), but I don't see an example in the documentation for how to do this so it's understandable if you missed it. Having said that, if access tokens were scoped to the application instead of each user session, this may not have been an issue for you at all, or at least much less significant.

I will open a separate issue to add this information to the documentation to help users find an easier solution.

I'm well aware that spring-security has no dependency on caffeine (and likely won't introduce one) but maybe it's possible to include such an implementation somehow?

This is not really something we'd want to pursue for the framework. However, as noted in gh-11461, there are extreme situations where highly concurrent requests can still cause occasional race conditions resulting in more requests to the token endpoint than necessary. (As I mentioned, I don't believe this is the case you're dealing with necessarily, as a change in your application may resolve it.)

I strongly feel that the framework shouldn't provide a direct solution to this. However, I will open another issue to track this and see how many community members chime in. If it is a common issue for folks (even after scoping the access token to the application), we can then decide what could be provided in the framework for assisting with such challenges.

What do you think of that as a path forward, does it work for you?

Hi @sjohnr, thanks for your insights!

I'm sorry for mixing up the sources (it was a pretty unstructured debugging path on my side :P).

In general you're correct that the issue is not exactly my issue but in this case I don't think it matters. You mean that ServerOAuth2AuthorizedClientRepository::loadAuthorizedClient will scope the OAuth2AuthorizedClientto the Authentication::name, don't you?

I provided a sample to reproduce the issue at https://github.com/kschlesselmann/spring-security-15295 . Here I make one client request that immediatly issues 100 WebClient requests. If I add a breakpoint to ServerOAuth2AuthorizedClientRepository::loadAuthorizedClient I can confirm, that the Authentication::name is the same for all 100 client requests. So I don't think setting the name using ReactiveSecurityContextHolder will change anything here, will it?

I hope my sample clarifies our issue.

I strongly feel that the framework shouldn't provide a direct solution to this. However, I will open another issue to track this and see how many community members chime in. If it is a common issue for folks (even after scoping the access token to the application), we can then decide what could be provided in the framework for assisting with such challenges.

What do you think of that as a path forward, does it work for you?

Why do you think that the framework should not handle concurrent token requests in a more efficient manner? But yes, feel free to open an issue to get more feedback from the community. That works for me.

@kschlesselmann thanks for the sample. It's hard to talk about things concretely without that and I should have asked for one before going very far, but thanks for providing it quickly.

You mean that ServerOAuth2AuthorizedClientRepository::loadAuthorizedClient will scope the OAuth2AuthorizedClientto the Authentication::name, don't you?

Yes, that's correct.

I provided a sample to reproduce the issue at https://github.com/kschlesselmann/spring-security-15295 . Here I make one client request that immediatly issues 100 WebClient requests.

This is one of the keys to the issue. To clarify, when I mentioned:

if access tokens were scoped to the application instead of each user session, this may not have been an issue for you at all, or at least much less significant.

the assumption I was making is that you have an OAuth2 client with a user in session. If many users are making requests, you will have different principal names. Based on your sample I see that you are using client_credentials to call a resource server with a single client, which then calls another resource server. Assuming your sample represents your actual use case, you may or may not need to scope your access tokens, depending on how many clients will be calling this resource server. (You would need to if more than one client.)

Regardless, when you fire 100 requests immediately you simulate very high load at a very specific moment in time. I don't imagine that it is common to have 100's or 1000's of requests in such close proximity to one another when there is no available access token (requiring a call to the token endpoint at that exact moment). In the most extreme cases, this would only happen once per expiry interval. If you truly have this situation (not just a simulation), then you likely fall into the category I mentioned above:

as noted in gh-11461, there are extreme situations where highly concurrent requests can still cause occasional race conditions resulting in more requests to the token endpoint than necessary.

Regarding your question,

Why do you think that the framework should not handle concurrent token requests in a more efficient manner?

I don't see this scenario as being common. The OAuth2 Client features of Spring Security focus primarily on the authorization_code grant and managing users in session with separate authorized client instances. I consider this race condition to be an edge case, and don't see an easy way the framework can handle it without impacting the majority use case negatively. Handling scalability issues is also not the domain of a security framework, IMHO. For that reason particularly, I feel the workaround you applied for your use case with caching seems a perfectly fine solution and am actually excited to see that it works well! I feel strongly it belongs at the application level, not in the framework.

Having said that, I don't want to argue a point based on opinion and that's why I think opening an issue to track community feedback and need is a good compromise. Based on the misunderstanding on my part (clarified by seeing a sample), I think we can actually use this issue for that. I will adjust the title to make it clearer what we're talking about here, which is a way to handle highly concurrent requests for the client_credentials grant type. Sound good?

If you truly have this situation (not just a simulation)

We do. The plot I provided in the original issue shows login requests of our production system.

I will adjust the title to make it clearer what we're talking about here, which is a way to handle highly concurrent requests for the client_credentials grant type. Sound good?

Sounds awesome! Let's see what will come of this. 👍🏻

The plot I provided in the original issue shows login requests of our production system.

I see, thanks. Can you clarify what you mean by "login requests"? I think that's what tripped me up in the original discussion because I'm imagining you're doing OIDC login based on the title of your stackoverflow question. You seem to be doing something else?

Can you clarify what you mean by "login requests"?

Sorry, maybe my wording is not really precise here. I meant requests for new access tokens of my client_credentials client.

spring-projects / spring-security

Highly concurrent requests with `client_credentials` cause duplicate access token requests #15295