Partitioned (Design Generic) Rate Limiting APIs

rafikiassumani-msft commented 2 years ago

Need to work with the runtime team for Generic APIs approval
Need to understand the patterns for scoping rate limiting (Bad IP address, identifiers, Pattern for identifiers (Limit based on client id, IP address)

BrennanConroy commented 2 years ago

The proposed API for generic rate limiters will follow the API for non-generic rate limiters, because it keeps the rate limiting APIs aligned and currently we don't see any use cases that should cause the API to differ yet.

public abstract class GenericRateLimiter<TResource> : IAsyncDisposable, IDisposable
{
    public abstract int GetAvailablePermits(TResource resourceID);

    public RateLimitLease Acquire(TResource resourceID, int permitCount = 1);

    protected abstract RateLimitLease AcquireCore(TResource resourceID, int permitCount);

    public ValueTask<RateLimitLease> WaitAsync(TResource resourceID, int permitCount = 1, CancellationToken cancellationToken = default);

    protected abstract ValueTask<RateLimitLease> WaitAsyncCore(TResource resourceID, int permitCount, CancellationToken cancellationToken);

    protected virtual void Dispose(bool disposing) { }

    public void Dispose()
    {
        // Do not change this code. Put cleanup code in 'Dispose(bool disposing)' method
        Dispose(disposing: true);
        GC.SuppressFinalize(this);
    }

    protected virtual ValueTask DisposeAsyncCore()
    {
        return default;
    }

    public async ValueTask DisposeAsync()
    {
        // Perform async cleanup.
        await DisposeAsyncCore().ConfigureAwait(false);

        // Dispose of unmanaged resources.
        Dispose(false);

        // Suppress finalization.
        GC.SuppressFinalize(this);
    }
}

What is more interesting IMO is the potential implementations of an GenericRateLimiter and if we can make it easier for users to create one. A quick implementation would likely involve a dictionary with some sort of identifier (differs per resource type) for groups of resources and a different limiter for each group. For example, if I want to group a resource like HttpRequestMessage by request paths I might write the following helper method to get a rate limiter that will be used by an GenericRateLimiter implementation:

private readonly ConcurrentDictionary<string, RateLimiter> _limiters = new();
private readonly RateLimiter _defaultLimiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions(1, QueueProcessingOrder.OldestFirst, 1, TimeSpan.FromSeconds(1), 1, true));
private RateLimiter GetRateLimiter(HttpRequestMessage resource)
{
    if (!_limiters.TryGetValue(resource.RequestUri.AbsolutePath, out var limiter))
    {
        if (resource.RequestUri.AbsolutePath.StartsWith("/problem", StringComparison.OrdinalIgnoreCase))
        {
            limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions(1, QueueProcessingOrder.NewestFirst, 1));
        }
        else
        {
            limiter = _defaultLimiter;
        }
        limiter = _limiters.GetOrAdd(resource.RequestUri.AbsolutePath, limiter);
    }

    return limiter;
}

The above starts showing some of the complexities of implementing an GenericRateLimiter.

Concurrent rate limiter creation, handled by TryGetValue and GetOrAdd on ConcurrentDictionary.
Having a default rate limiter.
Maintaining a large if else or switch statement for all the groupings.

And there are additional non-obvious concerns:

Each "grouping" of resources should have its own collection of limiters, otherwise if there is a collision of group names (e.g. "Post" HTTP method and "Post" path) then the order of requests would add a different rate limiter to the cache and not work in the expected way.
For the TokenBucket limiter (or any limiter that may use a timer for refreshing tokens) you should create a single timer instance and call refresh on all the TokenBucket limiters for efficiency.
There should be some sort of heuristic to retire limiters (if the limiter has all permits it can be removed from the dictionary).

To make GenericRateLimiter's easier to create we are proposing an API to be able to build an GenericRateLimiter and manage many of the complexities of implementing a custom GenericRateLimiter.

public class GenericRateLimitBuilder<TResource>
{
    public GenericRateLimitBuilder<TResource> WithPolicy<TKey>(Func<TResource, TKey?> keyFactory, Func<TKey, RateLimiter> limiterFactory) where TKey : notnull;

    public GenericRateLimitBuilder<TResource> WithConcurrencyPolicy<TKey>(Func<TResource, TKey?> keyFactory, ConcurrencyLimiterOptions options) where TKey : notnull;

    // Assuming we have a ReplenishingRateLimiter limiter abstract class
    // public GenericRateLimitBuilder<TResource> WithReplenishingPolicy(Func<TResource, TKey?> keyFactory, Func<TKey, ReplenishingRateLimiter> replenishingRateLimiter) where TKey : notnull;

    public GenericRateLimitBuilder<TResource> WithTokenBucketPolicy<TKey>(Func<TResource, TKey?> keyFactory, TokenBucketRateLimiterOptions options) where TKey : notnull;

    // might want this to be a factory if the builder is re-usable
    public GenericRateLimitBuilder<TResource> WithDefaultRateLimiter(RateLimiter defaultRateLimiter);

    public GenericRateLimiter<TResource> Build();
}

Details:

keyFactory is called to get a grouping identifier that the resource is part of, or null if the resource doesn't apply to the factory.
The factories are called in order until one of them returns a non-null identifier and then the limiterFactory is called to get the RateLimiter to apply to the resource (cached in a dictionary for the next time that identifier is used).

Questions:

Should the Func<TKey, RateLimiter> parameters accept TKey?
Should we provide Func<TKey, ValueTask<RateLimiter>> overloads? Would create sync-over-async when calling RateLimiter.Acquire().

One scenario that isn't handled by the builder proposed above is the ability to combine rate limiters. Imagine you want a global limiter of 100 concurrent requests to a service and also to have a per IP limit of 1 per second. The builder pattern only supports running a single rate limiter so there needs to be some other way to "chain" rate limiters. We believe this can be accomplished by providing a static method that accepts any number of GenericRateLimiters and combines them to create a single GenericRateLimiter that will run them in order when acquiring a lease.

+ static GenericRateLimiter<TResource> CreateChainedRateLimiter<TResource>(IEnumerable<GenericRateLimiter<TResource>> limiters);`

Additionally, we would like to add an interface for rate limiters that refresh tokens to make it easier to handle replenishing tokens from a single timer in generic code

+ public interface IReplenishingRateLimiter
+ {
+     public abstract bool TryReplenish();
+     // public TimeSpan ReplenishRate { get; }
+ }

public sealed class TokenBucketRateLimiter
 : RateLimiter
+ , IReplenishingRateLimiter

Alternatively we could use a new abstract class public abstract class ReplenishingRateLimiter : RateLimiter that the TokenBucketRateLimiter implements. Adding a class would add TryReplenish to the public API that a consumer might see (if they accepted ReplenishingRateLimiter instead of RateLimiter).

And finally, we would like to add an API for checking if a rate limiter is idle. This would be used to see which rate limiters are broadcasting that they aren't being used and we can potentially remove them from our GenericRateLimiter cache to reduce memory.

public abstract class RateLimiter : IAsyncDisposable, IDisposable
{
+    public abstract DateTime? IdleSince { get; }
// alternatives
//    bool IsInactive { get; }
//    bool IsIdle { get; }
}

Alternatively we could also add an interface, IIdleRateLimiter, that limiters can choose to implement, but we think a first class property is more appropriate in this scenario because you should be forced to implement the property to allow for book-keeping in the AggregrateRateLimiter.

Example usage:

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHttpClient("RateLimited", o => o.BaseAddress = new Uri("http://localhost:5000"))
    .AddHttpMessageHandler(() =>
        new RateLimitedHandler(
            new GenericRateLimitBuilder<HttpRequestMessage>()
            // TokenBucketRateLimiter if the request is a POST
            .WithTokenBucketPolicy(request => request.Method.Equals(HttpMethod.Post) ? HttpMethod.Post : null,
                new TokenBucketRateLimiterOptions(1, QueueProcessingOrder.OldestFirst, 1, TimeSpan.FromSeconds(1), 1, true))
            // ConcurrencyLimiter if above limiter returns null and has a "cookie" header
            .WithPolicy(request => request.Headers.TryGetValues("cookie", out _) ? "cookie" : null,
                _ => new ConcurrencyLimiter(new ConcurrencyLimiterOptions(1, QueueProcessingOrder.NewestFirst, 1)))
            // ConcurrencyLimiter per unique URI
            .WithConcurrencyPolicy(request => request.RequestUri,
                new ConcurrencyLimiterOptions(2, QueueProcessingOrder.OldestFirst, 2))
            .Build()));

// ...

var factory = app.Services.GetRequiredService<IHttpClientFactory>();
var client = factory.CreateClient("RateLimited");
var resp = await client.GetAsync("/problem");

Kahbazi commented 2 years ago

Having a NoPolicy would be also useful. This could be used when we don't want to have any limitation if certain condition is true.

public class AggregateRateLimitBuilder<TResource>
{
    public AggregateRateLimitBuilder<TResource> WithNoPolicy<TKey>(Func<TResource, bool> condition);
}

halter73 commented 2 years ago

Here's the issue in the runtime repo tracking this work: https://github.com/dotnet/runtime/issues/65400

dotnet / aspnetcore

Partitioned (Design Generic) Rate Limiting APIs #37383