spring-projects / spring-security

Spring Security
http://spring.io/projects/spring-security
Apache License 2.0
8.55k stars 5.79k forks source link

oauth2-client: Parallel refresh requests are very slow (WebFlux) #15145

Closed JJRdec closed 3 weeks ago

JJRdec commented 1 month ago

Expected Behavior

If there are parallel requests that require a refreshed token, this should only need to be done once - not by each concurrent request.

Current Behavior

Right now, if there are several concurrent requests that require a refresh they all make requests to the provider for a new token. This is not efficient.

Moreover, if the refresh token is reissued each time its used - concurrent refresh requests will fail, as the old refresh token is invalidated.

Context

    @Bean
    public ReactiveOAuth2AuthorizedClientManager authorizedClientManager(
            ReactiveClientRegistrationRepository clientRegistrationRepository,
            ServerOAuth2AuthorizedClientRepository authorizedClientRepository) {

        ReactiveOAuth2AuthorizedClientProvider authorizedClientProvider =
                ReactiveOAuth2AuthorizedClientProviderBuilder.builder()
                        .authorizationCode()
                        .refreshToken()
                        .build();

        DefaultReactiveOAuth2AuthorizedClientManager authorizedClientManager =
                new DefaultReactiveOAuth2AuthorizedClientManager(
                        clientRegistrationRepository, authorizedClientRepository);
        authorizedClientManager.setAuthorizedClientProvider(authorizedClientProvider);

        return authorizedClientManager;

    }
  1. The DefaultReactiveOAuth2AuthorizedClientManager handles all the oauth2 stuff nicely, loads (loadAuthorizedClient) the OAuth2AuthorizedClient from ReactiveClientRegistrationRepository
  2. Then DelegatingReactiveOAuth2AuthorizedClientProvider runs an authorize method for each provider (in our case: AuthorizationCodeReactiveOAuth2AuthorizedClientProvider, RefreshTokenReactiveOAuth2AuthorizedClientProvider)
  3. RefreshTokenReactiveOAuth2AuthorizedClientProvider always runs this logic on the context passed to the method to check if a refresh is required, but ignores the fact that OAuth2AuthorizedClient might have been updated (and that refresh has already happened).
    OAuth2AuthorizedClient authorizedClient = context.getAuthorizedClient();
        if (authorizedClient == null || authorizedClient.getRefreshToken() == null
                || !hasTokenExpired(authorizedClient.getAccessToken())) {
            return Mono.empty();
        }

Because there is no synchronization around the refreshing block and that OAuth2AuthorizedClient is loaded at DefaultReactiveOAuth2AuthorizedClientManager and never reloaded in `RefreshTokenReactiveOAuth2AuthorizedClientProvider, the refresh is done for all concurrent requests.

I'm wondering can this be optimism so that when a refresh is required it will block other requests from that user session, refresh the token (once) and then other requests will be able to use that refreshed token.

Example of slow requests we've seen at refresh

image

JJRdec commented 3 weeks ago

Any ideas on what is happening here?

sjohnr commented 3 weeks ago

Thanks for reacyhing out @JJRdec!

There seem to be two things you're looking to solve in this issue:

  1. Highly concurrent requests resulting in multiple token requests to the authorization server.
  2. Slow throughput of concurrent requests to the authorization server

Regarding (1), this has been discussed before (for example, see here and here). From my perspective, this is not really something the framework can solve generally. See comments above for more details and possible ideas for workarounds.

Regarding (2), I would highly encourage you to look into the authorization server to determine why concurrent requests (I don't know how many based on the information provided) are slow. There is not a good reason that several concurrent requests (even as many as 1k requests) would take that long, and the issue is likely on the authorization server side.

Also, I don't see enough information to go further with this issue, but please note that based on the information provided I don't believe there would likely be a change in the framework. For these reasons I'm going to close this issue. If you feel I have misunderstood anything (which is entirely possible), please provide a minimal sample to reproduce the issue you're seeing and I can take a deeper look.