vutran1710 / PyrateLimiter

⚔️Python Rate-Limiter using Leaky-Bucket Algorithm Family
https://pyratelimiter.readthedocs.io
MIT License

Linearly Increasing Memory consumption when pyrate_limiter is enabled with a redis backend #156

Closed aaditya-srivathsan closed 6 months ago

aaditya-srivathsan commented 6 months ago

Hello! I have been facing a strange issue in my Python gRPC webserver (with multi-processing enabled, where essentially we have two servers running as forked processes), optimized for I/O calls, with rate limiting enabled. For some reason, we have been seeing linearly increasing memory consumption in our application. Upon further investigation, I noticed that disabling the rate limiter call (try_acquire) magically fixes the linearly increasing memory issue. We have a custom Redis bucket factory implementation, where the get function is structured something like:

```python
bucket = RedisBucket([rate], connection-string)
self.schedule_leak(bucket, self.base_clock)
```

Any ideas as to why this behavior might be occurring? I am using pyrate-limiter version 3.1.0. For more context, try_acquire is also invoked from an async function. While looking into the implementation of schedule_leak(), I noticed that it creates a thread that sleeps for the rate interval we specify, for every unique bucket. Given that my application can process many unique requests, leading to the creation of many unique buckets, could that be a potential issue?
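The per-bucket leak thread described above can be sketched with plain stdlib threading. This is an illustrative sketch of the suspected growth pattern, not pyrate_limiter's actual internals; `leak_loop` and the bucket names are hypothetical:

```python
import threading

def leak_loop(bucket_name: str, interval: float, stop: threading.Event) -> None:
    # Periodically wake up to "leak" (drop expired items from) one bucket.
    while not stop.wait(interval):
        pass  # real code would leak the bucket here

# One long-lived thread per unique bucket: if every unique request key
# creates a new bucket, thread count (and memory) grows without bound.
stop = threading.Event()
threads = [
    threading.Thread(target=leak_loop, args=(f"bucket-{i}", 1.0, stop), daemon=True)
    for i in range(3)
]
for t in threads:
    t.start()

print(len(threads))  # 3 buckets -> 3 threads; N unique buckets -> N threads
stop.set()
```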

Thanks

vutran1710 commented 6 months ago

In v3.1 there is no limit on the number of threads/processes in the pool that runs leaks in the background, so that might be a problem when you have a lot of separate buckets. Can you try bumping to v3.2.1 to see if the issue still persists?

aaditya-srivathsan commented 6 months ago

Thanks for the quick response. I can try adding a thread pool and re-running my experiment to check the memory consumption after the change. A general question: how does one go about setting the number of workers, and should it have any correlation to the number of requests your application sees or to the rate limits?

vutran1710 commented 6 months ago

In the constructor of Limiter there is a keyword argument called "thread_pool". You can manually create your own thread pool and then pass it to the limiter:

```python
class Limiter:
    def __init__(
        self,
        argument: Union[BucketFactory, AbstractBucket, Rate, List[Rate]],
        clock: AbstractClock = TimeClock(),
        raise_when_fail: bool = True,
        max_delay: Optional[Union[int, Duration]] = None,
        thread_pool: Optional[ThreadPool] = None,
    ): ...
```
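Passing a bounded pool looks roughly like the sketch below. The pool itself is stdlib (`multiprocessing.pool.ThreadPool`); the `Limiter(rates, thread_pool=pool)` call is shown only as a comment, matching the signature above:

```python
from multiprocessing.pool import ThreadPool

# A small, explicitly sized pool; e.g. 2 workers for a 2-core container.
pool = ThreadPool(2)

# The pool runs tasks with a fixed worker count, no matter how many
# buckets submit leak jobs to it.
squares = pool.map(lambda n: n * n, range(5))

# With pyrate_limiter >= 3.2 the same pool would be handed to the Limiter:
#   limiter = Limiter(rates, thread_pool=pool)

pool.close()
pool.join()
print(squares)  # [0, 1, 4, 9, 16]
```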

aaditya-srivathsan commented 6 months ago

Yeah, I got that, but is there a recommendation on how large the thread pool should be? I see that the default implementation, if we don't specify a thread pool, creates one with 10 workers, so I was curious to know how that number was decided.

vutran1710 commented 6 months ago

It is an approximate number based on my machine's hardware (macOS, M2).

I guess if you are using a container instance (Docker, Kubernetes, etc.), the number might need to be smaller, e.g. 2. It depends on the number of cores on the machine, I suppose.
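A core-based heuristic for the pool size could be sketched like this. The clamping bounds (2 and 10) are assumptions taken from this thread, not a documented recommendation, and note that `os.cpu_count()` reports the host's cores, which may exceed a container's CPU quota:

```python
import os

# Heuristic: size the leak thread pool from the available cores,
# clamped between 2 (small container) and 10 (the library default).
cores = os.cpu_count() or 1
pool_size = max(2, min(cores, 10))

print(pool_size)
```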

aaditya-srivathsan commented 6 months ago

Sounds good, thanks again! I just wanted to confirm whether this depends in any way on the rate you set.

aaditya-srivathsan commented 6 months ago

Thanks for the help! Closing this issue.