spring-attic / spring-security-oauth

Support for adding OAuth1(a) and OAuth2 features (consumer and provider) for Spring web applications.
http://github.com/spring-projects/spring-security-oauth
Apache License 2.0

RedisTokenStore uname_to_access and client_id_to_access memory leak #1657

Open LukasVyhlidka opened 5 years ago

LukasVyhlidka commented 5 years ago

Summary

When an access token is stored, expirations are set on several Redis keys related to it.

Unfortunately, two Redis keys are not expired correctly: those with the prefixes client_id_to_access: and uname_to_access:. In the following example I describe only the client_id_to_access: keys.

Actual Behavior

Let's imagine there is just one client - web - that is used by all the users who log in to the application. Every time a new access token is generated, it is stored into the list (or set, in the newer RedisTokenStore implementation) client_id_to_access:web. The expiration is set on the whole key (not just on the newly added value).

If the access token expiration is 5 minutes and your app is visited regularly (there is never a 5-minute window in which no new access token is generated), this list (or set) keeps growing, containing all the tokens that were stored previously and have already expired from all the other keys (e.g. access: and so on).
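The growth pattern can be shown with a toy Java model (purely illustrative names, not the actual RedisTokenStore code): each store resets the TTL of the whole key, so under steady traffic the key never expires and its members accumulate.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the leak: the TTL applies to the whole key, not to
// individual members, so steady traffic keeps the collection alive forever.
class LeakModel {
    final List<String> clientIdToAccess = new ArrayList<>(); // "client_id_to_access:web"
    long keyExpiresAt = Long.MIN_VALUE;                      // key-level TTL

    // Mirrors the store path for the client_id_to_access key.
    void storeToken(String token, long now, long ttlMillis) {
        clientIdToAccess.add(token);    // rPush / sAdd
        keyExpiresAt = now + ttlMillis; // EXPIRE resets the TTL of the whole key
    }

    // Number of members still held in memory at the given time.
    int liveSize(long now) {
        return now < keyExpiresAt ? clientIdToAccess.size() : 0;
    }
}
```

With one token stored per minute and a 5-minute TTL, every store pushes the key's expiry further into the future, so the key is never evicted and already-expired tokens are never removed.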

Version

We're using version 2.0.16, which is somewhat old, but the same issue exists on the master branch. The main difference between 2.0.16 and master is the data structure used: 2.0.16 uses a list (rPush operations) and master uses a set (sAdd), which has O(1) complexity for storing and removing. But the expiration handling is exactly the same.

LukasVyhlidka commented 5 years ago

I am going to create a PR with a fix. I wonder why this was never noticed before.

hpq86zllw commented 5 years ago

I have recently hit this problem too; one of our client_id_to_access lists has 4 million entries... it occupies 3 GB of memory in Redis.

LukasVyhlidka commented 5 years ago

You can take the RedisTokenStore implementation from my PR and drop it into your codebase. I am going to do that today.

One caveat: when you load my implementation, access tokens already stored in those two lists will not be used. So if your access token expiration is, say, 5 minutes, then for those five minutes the lists won't contain all the access tokens they should.

The data could be migrated manually, e.g. using redis-cli, but because it is just 5 minutes and because it affects Redis keys that are not normally read, I am not going to handle this in our product... it will sort itself out within those 5 minutes.

LukasVyhlidka commented 5 years ago

@hpq86zllw:

So, I have deployed the version from my pull request (just copied the class into our project so it is overridden) into our production. In order for the token store to be cleared properly, you have to call the doMaintenance() method from time to time. I have created a @Scheduled method that calls it every 6 hours. It could be done even less frequently.

Example initialization (we're using Spring Boot):

@Configuration
@EnableScheduling
public class AuthorizationServerConfiguration {

    private static final Logger logger = LoggerFactory.getLogger(AuthorizationServerConfiguration.class);

    @Autowired
    private RedisConnectionFactory redisConnectionFactory;

    @Bean
    public RedisTokenStore redisTokenStore() {
        RedisTokenStore redisTokenStore = new RedisTokenStore(redisConnectionFactory);
        // any custom settings you need
        return redisTokenStore;
    }

    @Scheduled(fixedDelay = 6 * 60 * 60 * 1000) // 6 hours
    public void doRedisStoreMaintenance() {
        logger.info("Going to do RedisTokenStore maintenance.");
        // this actually gets the already created bean (thanks to the Spring configuration proxy)
        RedisTokenStore redisTokenStore = redisTokenStore();
        redisTokenStore.doMaintenance();
        logger.info("RedisTokenStore maintenance done.");
    }
}

lucasoares commented 4 years ago

Hello guys.

I encountered a problem with my security implementation and want to know whether it has anything to do with this issue.

Somehow my uname_to_access key in Redis ended up holding multiple values for the same token (I use 1-year-expiration tokens by default) with the same expiration time. Is this really necessary?

Checking directly in code, those objects contain the same token: [screenshot]

I also checked all fields in the debugger (not just the token value used in the equals method).

I am asking here because my Redis key holds many, many copies of the same token with exactly the same fields (even the expiration time). Apparently some of them have a different expiration time, though; what is the reason for storing multiple values for the same token with different expirations in this database?

Thank you.

LukasVyhlidka commented 4 years ago

@lucasoares I think it is not related to this issue; this one is about expired tokens.

If you have multiple equal tokens stored, you have to find which component stores them into the token store more than once. The equals method on OAuth2AccessToken only compares the token value (a UUID). But when a token is stored into the RedisTokenStore, it goes into a Redis set, where membership is determined by the whole serialized data structure: if anything differs (e.g. the validity), it is treated as a different item and stored again.
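That membership difference can be seen in miniature with a hypothetical Token class in plain Java (not the real OAuth2AccessToken): equals() compares only the token value, but a store keyed on the full serialized form treats any field difference as a new member.

```java
import java.util.Objects;

// Hypothetical stand-in for OAuth2AccessToken: equals() looks only at the value.
class Token {
    final String value;   // the token UUID
    final long expiresAt; // differs between apparent "duplicates"

    Token(String value, long expiresAt) {
        this.value = value;
        this.expiresAt = expiresAt;
    }

    @Override public boolean equals(Object o) {
        return o instanceof Token && ((Token) o).value.equals(value);
    }

    @Override public int hashCode() {
        return Objects.hash(value);
    }

    // Redis set membership compares serialized bytes, not Java equals().
    String serialized() {
        return value + "|" + expiresAt;
    }
}
```

In a Java HashSet of Token the two entries with the same value collapse into one, but a Redis set keyed on the serialized form keeps both, which is how "duplicate" tokens with different expirations pile up.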

I would suspect the access token renewal (refresh token) mechanism. When the original access token is still valid, it is left in the token store as it is. And if your refresh functionality returns the same existing access token as the refreshed one, it might be stored again. But that is only a hypothesis of mine...

lucasoares commented 4 years ago

The RedisTokenStore doesn't treat this collection as a set, which is why the token is stored again every time.

Every time the same token is generated using the password grant type, a new value is added to this list.

Will your PR also deal with the problem of multiple values for the same token stored in Redis (both uname_to_access and client_id_to_access)? I saw you changed every Redis operation from list operations to set operations.

I honestly think Spring should store only the token value in this collection instead of the whole object. It would make all operations simpler and faster; as it is, it doesn't follow the single-source-of-truth principle. But there are probably reasons for that.

LukasVyhlidka commented 4 years ago

@lucasoares I changed it from a set to a sorted set; it was already a set before (the list data structure was used in older versions). The sorted set is used because it allows fast deletion of expired tokens (the score is the expiration time).
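The sorted-set idea can be sketched with a plain-Java simulation (a TreeMap standing in for the Redis sorted set, not the PR code): with the expiration timestamp as the score, all expired members can be dropped with one range deletion, the same shape as ZADD plus ZREMRANGEBYSCORE.

```java
import java.util.TreeMap;

// Simulates a Redis sorted set where the score is the expiration timestamp.
// (One token per timestamp here, a simplification of the real structure.)
class ExpiringTokenIndex {
    private final TreeMap<Long, String> byExpiry = new TreeMap<>();

    // Like ZADD key <expiresAt> <token>.
    void add(String token, long expiresAt) {
        byExpiry.put(expiresAt, token);
    }

    // Like ZREMRANGEBYSCORE key -inf <now>: drop every already-expired
    // member in one range operation instead of scanning the collection.
    int removeExpired(long now) {
        var expired = byExpiry.headMap(now, true);
        int removed = expired.size();
        expired.clear();
        return removed;
    }

    int size() {
        return byExpiry.size();
    }
}
```

Because the map is ordered by score, the cleanup touches only the expired prefix, which is what makes a periodic maintenance pass cheap.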

In my case the problem was a huge number of expired tokens still stored in Redis. And because Redis keeps everything in memory, our server was slowly running out of RAM. This is what the PR solves: correct cleanup of expired tokens.

Your problem looks like a different situation: a still-valid token is stored many times into the token store. I took a quick look at the code, and this case should be handled outside the token store: when a new token is generated, the old one should be deleted (if it exists). Open the DefaultTokenServices class and try to find whether the problem is there.

And maybe open another issue and link it here so we can move this conversation there :)

xiaocc2012 commented 4 years ago

Thanks, I have this problem too. The value of the key client_id_to_access:xxx is about 200 MB, so when a token expires, removeAccessToken calls Redis LREM, which is O(n): it scans the list from beginning to end to find the expired token and remove it. This drives the CPU to 100%, so sad.
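The O(n) behaviour of LREM can be illustrated with a toy Java model that counts inspected elements (illustrative only, not the Redis implementation):

```java
import java.util.List;

class LremCost {
    // Mimics LREM key 1 <value>: scan from the head until the value is
    // found, then remove it. Returns how many elements were inspected,
    // which is the O(n) cost of the removal.
    static int lremLike(List<String> list, String value) {
        int inspected = 0;
        for (int i = 0; i < list.size(); i++) {
            inspected++;
            if (list.get(i).equals(value)) {
                list.remove(i);
                return inspected;
            }
        }
        return inspected; // value not found: the whole list was scanned
    }
}
```

With millions of entries in the key, every single token removal pays this full scan, which matches the CPU spikes reported here; a sorted set keyed by expiration time avoids the scan entirely.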

prateekgoel1 commented 3 years ago

Yeah, I am facing the same; it looks like this issue is still not fixed on master.

LukasVyhlidka commented 2 years ago

Sorry for the late reply. I was not able to get my PR for this issue merged: https://github.com/spring-projects/spring-security-oauth/pull/1660/files

In our production we're using the method I introduced in that PR (doMaintenance), and it solves the issue. It is called from time to time, and because it uses the SCAN functionality it does not block Redis or consume too many resources...

You can take inspiration from it and either make the same override in your app, or better, submit a PR that would be accepted by the community.