nikolaposa / rate-limit

🚔 General purpose rate limiter implementation.
MIT License
269 stars 47 forks source link

LimitExceeded exception now contains an instance of Status instance #21

Closed cmizzi closed 4 years ago

cmizzi commented 4 years ago

When using the exception, it was impossible to know how long to wait before a new request could be executed. Now, the exception has an instance of `Status' which can be used to wait for the right time. The use of silent limitation is not suitable in all cases and forces the developer to create a new exception to get this information.

Use cases

<?php

use Illuminate\Http\Client\RequestException;
use Illuminate\Support\Facades\Http;
use RateLimit\Rate;
use RateLimit\RedisRateLimiter;
use Redis;

$rateLimiter = new RedisRateLimiter(new Redis());

while (true) {
    $token  = 'my-identifier';
    $status = $rateLimiter->limitSilently($token, Rate::perSecond(1));

    // If the limit is reached, wait until the token is available again.
    if ($status->limitExceeded()) {
        sleep($status->getResetAt()->getTimestamp() - now()->timestamp);
    }

    try {
        $data = Http::withToken($token)->get("http://acme/api/posts");
        // ...
    }

    catch (RequestException $e) {
        // ...
    }
}
<?php

use Illuminate\Http\Client\RequestException;
use Illuminate\Support\Facades\Http;
use RateLimit\Exception\LimitExceeded;
use RateLimit\Rate;
use RateLimit\RedisRateLimiter;
use Redis;

$rateLimiter = new RedisRateLimiter(new Redis());

while (true) {
    try {
        $token  = 'my-identifier';
        $status = $rateLimiter->limit($token, Rate::perSecond(1));

        $data = Http::withToken($token)->get("http://acme/api/posts");
        // ...
    }

    catch (RequestException $e) {
        // ...
    }

    catch (LimitExceeded $e) {
        sleep($e->status->getResetAt()->getTimestamp() - now()->timestamp);
    }
}
cmizzi commented 4 years ago

The exception is clean and only stops the execution of the rate limiter. If one wishes to wait until the token is available again using the exception, it is impossible to use the exception and a new one must be defined containing this information. We are just talking about syntax here:

I don't really understand why the exception should lack information about the silent execution.

Personally, I'm a little upset that I have to use the silent method when I want to get an exception, with clear information as to why the token is not usable.

vs

nikolaposa commented 4 years ago

I don't really understand why the exception should lack information about the silent execution.

Like I've said, the decision to segregate these interfaces was not inspired by syntax but context in which they are intended to be used in, so a consumer can choose which contract (loud or silent) suits him better. Also, RateLimiter typically works much faster, it does not need to fetch ttl information for example.

If you need status information, just use SilentRateLimiter, don't rely on RateLimiter. Fact that RedisRateLimiter implements both is just an implementation detail.

cmizzi commented 4 years ago

So, to raise an exception and get the status, it is absolutely necessary to create a custom exception in order to return the status and be able to wait until the token is free again... For example, if a service is declared to execute a request with a given token, it is impossible for the worker to know what happens if an exception is not thrown.

Maybe the implementation is not good, but I still believe that the status must be available in both cases, otherwise additional classes must be declared to be able to forward the exception to the executor.

nikolaposa commented 4 years ago

So, to raise an exception and get the status

Why would you ever do both? What's the problem in always using limitSilently() in that given context?

it is impossible for the worker to know what happens if an exception is not thrown.

What is "worker"?

cmizzi commented 4 years ago

Imagine a script run by hand. This script retrieves a token associated with an external API. The token is an entity that can make a request. It is much cleaner in this case than if the entity of the token can not perform the request (rate limit exceeded), the main thread (the main loop of the script) is aware that the request could not be executed (either to display debug, or to switch to another token until it is available again). In order for the information to pass through here, the best way remains the exception. To illustrate my point, here is an example:

<?php

$waiting = [];

// Huge list of API endpoints
$queue = new \SplQueue;

while (!$queue->isEmpty()) 
    $endpoint = $queue->dequeue();
    $host     = parse_url($token)['host'];

    // Requeue the endpoint if the domain is temporary not allowed because of
    // rate limit.
    if (isset($waiting[$host]) && $waiting[$host] < time()) {
        $queue->enqueue($token);
        continue;
    }

    try {
        // The request() method call internaly the rate limiter because this method
                // is shared between multiple services.
        $data = Token::findByHost($host)->request($endpoint);
    }

    catch (LimitExceeded $e) {
        // How to know when the token will be available ?
        //
        // $waiting[$host] = ...;
        // $waiting[$host] = $e->getStatus()->getResetAt()->getTimestamp();
    }
}

As long as the logic is not placed in the executor (here the loop), it is impossible to retrieve the information without an exception. If I use silent execution, the expected return value is not a status, but the result of the request and I don't want the loop to be blocked because of a rate limit on a given endpoint.

Do you understand where I'm going with this?

nikolaposa commented 4 years ago

Sorry for not responding earlier, I didn't have much time for open source.

I understand your point now. And while the example you gave clearly shows the shortcomings of current RateLimiter interface, I think it is not that common. The main reason I decided to segregate interfaces into RateLimiter and SilentRateLimiter was that I wanted to make default one fast, atomic, with as little communication as possible with backend (i.e. Redis). Putting any extra logic that goes beyond simple counter would mean performance degradation and loosing the original point that covers 90% of rate limit guard clause scenarios without the need for status information.