cristipufu / aspnetcore-redis-rate-limiting

Set up a Redis backplane for ASP.NET Core multi-node deployments, using the built-in Rate Limiting support that's part of .NET 7 and .NET 8.
MIT License
221 stars 22 forks source link

RedisTokenBucketRateLimiter on AWS Serverless Redis Cache #154

Closed altso closed 8 months ago

altso commented 8 months ago

RedisTokenBucketRateLimiter throws an error on Amazon ElastiCache Serverless for Redis. Every other limiter type works just fine.

StackExchange.Redis.RedisServerException: ERR This Redis command is not allowed from script script: 53aa7d296ba9ded783301cc275161b6e344ad383, on @user_script:13.
      at StackExchange.Redis.RedisDatabase.ScriptEvaluateAsync(String script, RedisKey[] keys, RedisValue[] values, CommandFlags flags) in /_/src/StackExchange.Redis/RedisDatabase.cs:line 1551
      at RedisRateLimiting.Concurrency.RedisTokenBucketManager.TryAcquireLeaseAsync()
      at RedisRateLimiting.RedisTokenBucketRateLimiter`1.AcquireAsyncCoreInternal()
cristipufu commented 8 months ago

Maybe related to this? https://github.com/cristipufu/aspnetcore-redis-rate-limiting/pull/47

redis.replicate_commands()

fyi @Namoshek

Namoshek commented 8 months ago

Could also be the local time = redis.call('TIME') call on line 13. If anyone finds a documentation of AWS regarding supported features and differences, I'm willing to look into this more.

However, it remains the question which standard we should support and prioritize? The official Redis implementation, AWS and/or more?

altso commented 8 months ago

TIME seems to be supported. Here is the full documentation https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/SupportedCommands.html

Namoshek commented 8 months ago

Alright, then maybe we should look into replication. In the changelog of 6.x they mention that the lua-replicate-commands option was removed. That sounds a lot like the command suspected by @cristipufu. If you have a test environment, maybe you can confirm by using a different Redis version?

If redis.replicate_commands() is not needed in ElastiCache, maybe we can run it conditionally. Whether it's required is not really clear from the documentation though (at least I couldn't find anything). However, really nice would be detecting the runtime environment from within the script, but I guess that's not going to be very easy.

altso commented 8 months ago

Amazon ElastiCache Serverless uses Redis 7.1 and there is no way to downgrade it as far as I know. I have a test environment I can access with redis-cli. I can test things for you, but you'll have to guide me - I am fairly new to Redis.

Namoshek commented 8 months ago

I don't really have much time to look into this right now, but there are parameter groups on AWS that can be used and changed to use an older Redis version. This may not work for serverless deployments, but it should for the custom deployments.

If this issue is in fact related to the redis.replicate_commands() call, there are multiple possible solutions:

  1. Detect the engine and/or version where the Lua scripts are running in. If I read this correctly, since Redis 7.0 the verbatim script replication is not supported anymore and effect replication is the default now. So this command may not be necessary for newer Redis versions at all. And maybe this issue is not limited to ElastiCache either.
  2. Replace the redis.call('TIME') call with a time parameter that is passed to the script. The issue with this approach is (or can be) that different application servers may have slightly different clocks and the rate limiting may have a slight drift, but in practice, this should be negligible.

As I've said I don't have much time to test this right now, but I would probably try to implement a condition for redis.replicate_commands() using redis.REDIS_VERSION_NUM. Coincidentally, this constant was introduced with Redis 7.0 and it yields nil on older versions, so we simply can use

if not redis.REDIS_VERSION_NUM then
  redis.replicate_commands()
end

to potentially fix the issue. I would leave testing this to someone with more time and better access to ElastiCache though.

altso commented 8 months ago

Thanks @Namoshek. I will test it and get back to you tomorrow.

altso commented 8 months ago

@Namoshek I cloned the repo, updated the lua script in RedisTokenBucketManager with your suggested fix and ran the tests - same error.

I was able to make it not to crash with an error by removing call to TIME like this:

...
local time = {1640995200, 0} --redis.call('TIME')`
...

I tested further using redis-cli and it looks like TIME is not supported in lua scripts.

altso:~$ redis-cli -h aws-redis-serverless --tls
aws-redis-serverless:6379> TIME
1) "1708045878"
2) "987504"
aws-redis-serverless:6379> EVAL "return redis.call('TIME')" 1 test test
(error) ERR This Redis command is not allowed from script script: 104ae65debf45156ded1e8e613705534c8e62c17, on @user_script:1.
aws-redis-serverless:6379>

I could not find any AWS documentation on that, but this answer (https://stackoverflow.com/a/51081809/94803):

The current time is non-deterministic i.e. it returns different values on repeated calls. This hurts replication. For this reason, the current time should be passed into your LUA script as a parameter.

I can submit a PR with a fix if you are ok with passing time as a parameter. Let me know.