Rate limit APIs - Githubissues

This is a summary of the Design doc.

Background and Motivation

Outages caused when system activities exceed the system’s capacity are a leading concern in system design. The ability to handle system activity efficiently, and to gracefully limit the execution of activities before the system is under stress, is fundamental to system resiliency. .NET does not have a standardized means for expressing and managing the rate limiting logic needed to produce a resilient system. This adds complexity to designing and developing resilient software in .NET by introducing an easy vector for competing rate limiting logic and anti-patterns. A standardized interface in .NET for limiting activities will make it easier for developers to build resilient systems for all scales of deployment and workload.

Users will interact with the proposed APIs in order to ensure rate and/or concurrency limits are enforced. This abstraction requires explicit release semantics to accommodate non self-replenishing (i.e. concurrency) limits, similar to how Semaphores operate. The abstraction also accounts for self-replenishing (i.e. rate) limits, where no explicit release semantics are needed because the permits are replenished automatically over time. This component encompasses the Acquire/WaitAsync mechanics (i.e. check vs wait behaviours), and default implementations will be provided for select accounting methods (fixed window, sliding window, token bucket, simple concurrency). The return type is a RateLimitLease, which indicates whether acquisition was successful and manages the lifecycle of the acquired permits.

Proposed API - Abstractions

namespace System.Threading.RateLimiting
{
  public abstract class RateLimiter
  {
    // An estimated count of available permits. Potential uses include diagnostics.
    public abstract int GetAvailablePermits();

    // Fast synchronous attempt to acquire permits
    // Set permitCount to 0 to get whether permits are exhausted
    public RateLimitLease Acquire(int permitCount = 1);

    // Implementation
    protected abstract RateLimitLease AcquireCore(int permitCount);

    // Wait until the requested permits are available or permits can no longer be acquired
    // Set permitCount to 0 to wait until permits are replenished
    public ValueTask<RateLimitLease> WaitAsync(int permitCount = 1, CancellationToken cancellationToken = default);

    // Implementation
    protected abstract ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
  }

  public abstract class RateLimitLease : IDisposable
  {
    // This represents whether lease acquisition was successful
    public abstract bool IsAcquired { get; }

    // Method to extract any general metadata. This is implemented by subclasses
    // to return the metadata they support.
    public abstract bool TryGetMetadata(string metadataName, out object? metadata);

    // This casts the metadata returned by the general method above to known types of values.
    public bool TryGetMetadata<T>(MetadataName<T> metadataName, [MaybeNullWhen(false)] out T metadata);

    // Used to get a list of metadata that is available on the lease which can be dictionary keys or static list of strings.
    // Useful for debugging purposes but TryGetMetadata should be used instead in product code.
    public abstract IEnumerable<string> MetadataNames { get; }

    // Virtual method that extracts all the metadata using the list of metadata names and TryGetMetadata().
    public virtual IEnumerable<KeyValuePair<string, object?>> GetAllMetadata();

    // Follow the general .NET pattern for dispose
    public void Dispose() { Dispose(true); GC.SuppressFinalize(this); }
    protected virtual void Dispose(bool disposing);
  }

  // Curated set of known MetadataName<T>
  public static class MetadataName
  {
    public static MetadataName<TimeSpan> RetryAfter { get; } = Create<TimeSpan>("RETRY_AFTER");
    public static MetadataName<string> ReasonPhrase { get; } = Create<string>("REASON_PHRASE");

    public static MetadataName<T> Create<T>(string name) => new MetadataName<T>(name);
  }

  // Wrapper of string and a type parameter signifying the type of the metadata value
  public sealed class MetadataName<T> : IEquatable<MetadataName<T>>
  {
    public MetadataName(string name);
    public string Name { get; }
  }
}

The Acquire call represents a fast synchronous check that immediately reports whether enough permits are available to continue with the operation and, if so, atomically acquires them. It returns a RateLimitLease whose IsAcquired property indicates whether the acquisition succeeded; on success, the lease itself represents the acquired permits. The user can pass in a permitCount of 0 to check whether the permit limit has been reached without acquiring any permits.

WaitAsync, on the other hand, represents an awaitable request for permits. If permits are available, it obtains them and returns immediately with a RateLimitLease representing the acquired permits. If the permits are not available, the caller is willing to pause the operation and wait until the necessary permits become available. The user can also pass in a permitCount of 0, which indicates that the user wants to wait until more permits become available.

GetAvailablePermits() is envisioned as a flexible and simple way for the limiter to communicate the status of the limiter to the user. This count is similar in essence to SemaphoreSlim.CurrentCount. This count can also be used in diagnostics to track the usage of the rate limiter.

The abstract class RateLimitLease is used to facilitate the release semantics of rate limiters. That is, for non self-replenishing limiters, the permits obtained via Acquire/WaitAsync are returned by disposing the RateLimitLease. This ensures that the user can't release more permits than were obtained.
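The release-by-dispose idea can be illustrated with a minimal sketch. The types below (SketchConcurrencyLimiter, SketchLease) are hypothetical stand-ins written for this illustration, not the proposed implementation; they omit WaitAsync, queueing and metadata, and only show how a lease can return exactly the permits it holds, at most once.

```csharp
#nullable enable
using System;
using System.Threading;

// Hypothetical, simplified stand-in for a concurrency limiter (illustration only).
public sealed class SketchConcurrencyLimiter
{
    private int _available;

    public SketchConcurrencyLimiter(int permitLimit) => _available = permitLimit;

    public int GetAvailablePermits() => Volatile.Read(ref _available);

    // All-or-nothing attempt. A permitCount of 0 only reports whether permits are exhausted.
    public SketchLease Acquire(int permitCount = 1)
    {
        while (true)
        {
            int current = Volatile.Read(ref _available);
            if (current < permitCount || (permitCount == 0 && current == 0))
            {
                return new SketchLease(null, 0); // failed lease, nothing to release
            }
            if (Interlocked.CompareExchange(ref _available, current - permitCount, current) == current)
            {
                return new SketchLease(this, permitCount);
            }
            // Another thread changed the count; retry with the new value.
        }
    }

    internal void Return(int count) => Interlocked.Add(ref _available, count);
}

// Hypothetical lease: disposing it is the only way to give permits back.
public sealed class SketchLease : IDisposable
{
    private SketchConcurrencyLimiter? _owner;
    private readonly int _count;

    internal SketchLease(SketchConcurrencyLimiter? owner, int count)
    {
        _owner = owner;
        _count = count;
        IsAcquired = owner is not null;
    }

    public bool IsAcquired { get; }

    // Returns exactly the permits held by this lease; repeated calls are no-ops.
    public void Dispose()
    {
        SketchConcurrencyLimiter? owner = Interlocked.Exchange(ref _owner, null);
        owner?.Return(_count);
    }
}
```

Because the permit count lives inside the lease, a caller has no way to hand back more permits than it acquired, and disposing twice does not inflate the available count.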

The RateLimitLease.IsAcquired property is used to express whether the acquisition request was successful. TryGetMetadata() is implemented by subclasses to allow for returning additional metadata as part of the rate limit decision. A curated list of well-known names for commonly used metadata is provided via MetadataName, which keeps a list of MetadataName<T> instances; each is a wrapper of a string and a type parameter indicating the value type. To optimize performance, implementations will need to pool RateLimitLease.
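As a rough sketch of how a typed metadata name can sit on top of a string-keyed lookup, consider the following. All type names here (SketchMetadataName<T>, SketchMetadataNames, SketchFailedLease) are hypothetical and exist only for this illustration; the dictionary-backed storage is an assumption, not something the proposal prescribes.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

// Hypothetical stand-in for MetadataName<T>: a string name tagged with the value type.
public sealed class SketchMetadataName<T>
{
    public SketchMetadataName(string name) => Name = name;
    public string Name { get; }
}

// Hypothetical curated list of well-known names.
public static class SketchMetadataNames
{
    public static SketchMetadataName<TimeSpan> RetryAfter { get; } =
        new SketchMetadataName<TimeSpan>("RETRY_AFTER");
}

// Hypothetical lease for a failed acquisition that carries a retry hint.
public sealed class SketchFailedLease
{
    private readonly Dictionary<string, object?> _metadata;

    public SketchFailedLease(TimeSpan retryAfter) =>
        _metadata = new Dictionary<string, object?> { ["RETRY_AFTER"] = retryAfter };

    public bool IsAcquired => false;

    // Untyped lookup, the part a subclass would implement.
    public bool TryGetMetadata(string metadataName, out object? metadata) =>
        _metadata.TryGetValue(metadataName, out metadata);

    // Typed lookup: forwards to the untyped method and checks the value type.
    public bool TryGetMetadata<T>(SketchMetadataName<T> metadataName, [MaybeNullWhen(false)] out T metadata)
    {
        if (TryGetMetadata(metadataName.Name, out object? value) && value is T typed)
        {
            metadata = typed;
            return true;
        }
        metadata = default;
        return false;
    }
}
```

With this shape, a consumer can write lease.TryGetMetadata(SketchMetadataNames.RetryAfter, out TimeSpan retryAfter) and receive a TimeSpan without casting, while a lookup with a mismatched type simply returns false.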

Usage Examples

For components enforcing limits, the standard usage pattern will be:

RateLimiter limiter = new SomeRateLimiter(options => ...);
// Synchronous checks
endpoints.MapGet("/acquire", async context =>
{
    // Check limiter using `Acquire` that should complete immediately
    using var lease = limiter.Acquire();
    // RateLimitLease was successfully obtained, the using block ensures
    // that the lease is released upon processing completion.
    if (lease.IsAcquired)
    {
        await context.Response.WriteAsync("Hello World!");
    }
    else
    {
        // Rate limit check failed, send 429 response
        context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        return;
    }
});
// Async checks
endpoints.MapGet("/waitAsync", async context =>
{
    // Check limiter using `WaitAsync` which may complete immediately
    // or wait until permits are available. Using block ensures that
    // the lease is released upon processing completion.
    using var lease = await limiter.WaitAsync();
    if (lease.IsAcquired)
    {
        await context.Response.WriteAsync("Hello World!");
    }
    else
    {
        // Rate limit check failed, send 429 response
        context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        return;
    }
});

Proposed API - Concrete Implementations

namespace System.Threading.RateLimiting
{
    // This specifies the behaviour of `WaitAsync` when PermitLimit has been reached
    public enum QueueProcessingOrder
    {
        OldestFirst,
        NewestFirst
    }

    public sealed class ConcurrencyLimiterOptions
    {
        public ConcurrencyLimiterOptions(
            int permitLimit, 
            QueueProcessingOrder queueProcessingOrder, 
            int queueLimit);

        // Specifies the maximum number of permits for the limiter
        public int PermitLimit { get; }
        // Permits exhausted mode, configures `WaitAsync` behaviour
        public QueueProcessingOrder QueueProcessingOrder { get; }
        // Queue limit when queuing is enabled
        public int QueueLimit { get; }
    }

    public sealed class TokenBucketRateLimiterOptions
    {
        public TokenBucketRateLimiterOptions(
            int tokenLimit, 
            QueueProcessingOrder queueProcessingOrder, 
            int queueLimit,
            TimeSpan replenishmentPeriod,
            int tokensPerPeriod,
            bool autoReplenishment = true);

        // Specifies the maximum number of tokens for the limiter
        public int TokenLimit { get; }
        // Permits exhausted mode, configures `WaitAsync` behaviour
        public QueueProcessingOrder QueueProcessingOrder { get; }
        // Queue limit when queuing is enabled
        public int QueueLimit { get; }
        // Specifies the period between replenishments
        public TimeSpan ReplenishmentPeriod { get; }
        // Specifies how many tokens to restore each replenishment
        public int TokensPerPeriod { get; }
        // Whether to create a timer to trigger replenishment automatically
        public bool AutoReplenishment { get; }
    }

    // Window based rate limiter options
    public sealed class FixedWindowRateLimiterOptions
    {
        public FixedWindowRateLimiterOptions(
            int permitLimit, 
            QueueProcessingOrder queueProcessingOrder, 
            int queueLimit,
            TimeSpan window,
            bool autoRefresh = true);

        // Specifies the maximum number of permits for the limiter
        public int PermitLimit { get; }
        // Permits exhausted mode, configures `WaitAsync` behaviour
        public QueueProcessingOrder QueueProcessingOrder { get; }
        // Queue limit when queuing is enabled
        public int QueueLimit { get; }
        // Specifies the duration of the window where the rate limit is applied
        public TimeSpan Window { get; }
        // Whether to create a timer to trigger replenishment automatically
        public bool AutoRefresh { get; }
    }

    public sealed class SlidingWindowRateLimiterOptions
    {
        public SlidingWindowRateLimiterOptions(
            int permitLimit, 
            QueueProcessingOrder queueProcessingOrder, 
            int queueLimit,
            TimeSpan window,
            int segmentsPerWindow,
            bool autoRefresh = true);

        // Specifies the maximum number of permits for the limiter
        public int PermitLimit { get; }
        // Permits exhausted mode, configures `WaitAsync` behaviour
        public QueueProcessingOrder QueueProcessingOrder { get; }
        // Queue limit when queuing is enabled
        public int QueueLimit { get; }
        // Specifies the duration of the window where the rate limit is applied
        public TimeSpan Window { get; }
        // Specifies the number of segments the Window should be divided into
        public int SegmentsPerWindow { get; set; }
        // Whether to create a timer to trigger replenishment automatically
        public bool AutoRefresh { get; }
    }

    // Limiter implementations
    public sealed class ConcurrencyLimiter : RateLimiter 
    { 
        public ConcurrencyLimiter(ConcurrencyLimiterOptions options);
        public override int GetAvailablePermits();
        protected override RateLimitLease AcquireCore(int permitCount);
        protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
    }
    public sealed class TokenBucketRateLimiter : RateLimiter 
    { 
        public TokenBucketRateLimiter(TokenBucketRateLimiterOptions options);
        public bool TryReplenish();
        public override int GetAvailablePermits();
        protected override RateLimitLease AcquireCore(int permitCount);
        protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
    }
    public sealed class FixedWindowRateLimiter : RateLimiter 
    { 
        public FixedWindowRateLimiter(FixedWindowRateLimiterOptions options);
        public bool TryRefresh();
        public override int GetAvailablePermits();
        protected override RateLimitLease AcquireCore(int permitCount);
        protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
    }
    public sealed class SlidingWindowRateLimiter : RateLimiter 
    { 
        public SlidingWindowRateLimiter(SlidingWindowRateLimiterOptions options);
        public bool TryRefresh();
        public override int GetAvailablePermits();
        protected override RateLimitLease AcquireCore(int permitCount);
        protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
    }
}

For more details on how these options work, see the Design Doc.

Adoption samples

This API will be used in implementing ASP.NET Core middleware in .NET 6.0 and can be useful in implementing limits for various BCL types in the future including:

Channels
Pipelines
Streams
HttpClient

Sample implementation in Channels; note that this uses a slightly outdated API.

For more theoretical samples of RateLimiter implementations, see the Proof of Concepts in the Design Doc.

We also expect adoption for enforcing limits in YARP, as well as conversion of existing implementations in ATS and ACR.

Alternative Designs

Token bucket rate limiter external replenishment

The default implementation will allocate a new System.Threading.Timer to trigger permit replenishment. This can be expensive when many limiters are in use, and a better pattern is to trigger the replenishment via a single Timer. The current proposal has two APIs to support this: a public bool TryReplenish() on the limiter and a public bool AutoReplenishment { get; } on the options class.
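A minimal sketch of the single-Timer pattern follows. SketchTokenBucket and SharedReplenisher are hypothetical types written for this illustration; the bucket behaves as if AutoReplenishment were false, and the meaning of the TryReplenish() return value (true when at least one token was added) is an assumption, since the proposal does not spell it out here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical token bucket that never creates its own timer.
public sealed class SketchTokenBucket
{
    private readonly object _lock = new object();
    private readonly int _tokenLimit;
    private readonly int _tokensPerPeriod;
    private int _tokens;

    public SketchTokenBucket(int tokenLimit, int tokensPerPeriod)
    {
        _tokenLimit = tokenLimit;
        _tokensPerPeriod = tokensPerPeriod;
        _tokens = tokenLimit;
    }

    public int GetAvailablePermits()
    {
        lock (_lock) { return _tokens; }
    }

    public bool TryAcquire(int permitCount = 1)
    {
        lock (_lock)
        {
            if (_tokens < permitCount) return false;
            _tokens -= permitCount;
            return true;
        }
    }

    // Assumption: returns true when at least one token was added.
    public bool TryReplenish()
    {
        lock (_lock)
        {
            int before = _tokens;
            _tokens = Math.Min(_tokenLimit, _tokens + _tokensPerPeriod);
            return _tokens > before;
        }
    }
}

// Hypothetical helper: one Timer drives replenishment for many limiters.
public sealed class SharedReplenisher : IDisposable
{
    private readonly List<SketchTokenBucket> _limiters;
    private readonly Timer _timer;

    public SharedReplenisher(IEnumerable<SketchTokenBucket> limiters, TimeSpan period)
    {
        _limiters = new List<SketchTokenBucket>(limiters);
        _timer = new Timer(_ => ReplenishAll(), null, period, period);
    }

    public void ReplenishAll()
    {
        foreach (SketchTokenBucket limiter in _limiters)
        {
            limiter.TryReplenish();
        }
    }

    public void Dispose() => _timer.Dispose();
}
```

The sketch assumes all limiters share the same replenishment period; limiters with different periods would need either separate groups or per-limiter bookkeeping inside the shared callback.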

Subclasses can override default behaviour

Instead of exposing the two APIs, we can make the class extensible and allow subclasses to add the TryReplenish() method as well as the external replenishment functionality. However, the AutoReplenishment option still needs to exist so the default implementation knows whether a Timer needs to be created.

Heuristics-based replenishment

We can rely on recomputing the permit count, based on the time elapsed since the last replenishment, on every invocation of Acquire, WaitAsync and GetAvailablePermits. However, we'll still need to allocate a Timer to process queued WaitAsync calls.
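The recomputation described above could look roughly like the sketch below. SketchLazyTokenBucket is a hypothetical type for illustration; the injected clock delegate is an assumption made so the behaviour can be exercised deterministically, and queued waiters (which, as noted, would still need a Timer) are left out.

```csharp
using System;

// Hypothetical token bucket that refills lazily instead of using a timer.
public sealed class SketchLazyTokenBucket
{
    private readonly object _lock = new object();
    private readonly int _tokenLimit;
    private readonly int _tokensPerPeriod;
    private readonly TimeSpan _period;
    private readonly Func<DateTimeOffset> _clock; // assumption: injectable time source
    private int _tokens;
    private DateTimeOffset _lastReplenishment;

    public SketchLazyTokenBucket(int tokenLimit, int tokensPerPeriod, TimeSpan period, Func<DateTimeOffset> clock)
    {
        _tokenLimit = tokenLimit;
        _tokensPerPeriod = tokensPerPeriod;
        _period = period;
        _clock = clock;
        _tokens = tokenLimit;
        _lastReplenishment = clock();
    }

    public int GetAvailablePermits()
    {
        lock (_lock)
        {
            Recompute();
            return _tokens;
        }
    }

    public bool TryAcquire(int permitCount = 1)
    {
        lock (_lock)
        {
            Recompute();
            if (_tokens < permitCount) return false;
            _tokens -= permitCount;
            return true;
        }
    }

    // Adds the tokens earned by every whole period that has elapsed since the last refill.
    private void Recompute()
    {
        long periods = (_clock() - _lastReplenishment).Ticks / _period.Ticks;
        if (periods <= 0) return;

        long refilled = _tokens + periods * _tokensPerPeriod;
        _tokens = (int)Math.Min(_tokenLimit, refilled);

        // Advance by whole periods only, so a partial period still counts toward the next refill.
        _lastReplenishment += TimeSpan.FromTicks(periods * _period.Ticks);
    }
}
```

Advancing the last-replenishment timestamp by whole periods rather than to the current time keeps the long-run refill rate stable regardless of how often the limiter is called.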

Separate abstractions for rate and concurrency limits

A design where rate limits and concurrency limits were expressed by separate abstractions was considered. That design more clearly expresses the intended use pattern, where rate limits do not need to return a RateLimitLease and do not possess release semantics. In comparison, in the proposed design the release semantics for rate limits are a no-op.

However, this design has a drawback for consumers of rate limits, since there are two possible limiter types that can be specified by the user. To alleviate some of the complexity, a wrapper for rate limits was considered. However, the complexity of this design was deemed undesirable and a unified abstraction for rate and concurrency limits was preferred.

A struct instead of class for RateLimitLease

This approach was considered since allocating a new RateLimitLease for each acquisition request is considered to be a performance bottleneck. The design evolved to the following:

// Represents a permit lease obtained from the limiter. The user disposes this type to release the acquired permits.
public struct RateLimitLease : IDisposable
{
  // This represents whether permit acquisition was successful
  public bool IsAcquired { get; }
  // This represents the count of permits obtained in the lease
  public int Count { get; }
  // This represents additional metadata that can be returned as part of a call to Acquire/AcquireAsync
  // Potential uses could include a RetryAfter value or an error code.
  public object? State { get; }
  // Private fields to be used by `onDispose()`, this is not a public API shown here for completeness
  private RateLimiter? _rateLimiter;
  private Action<RateLimiter?, int, object?>? _onDispose;
  // Constructor which sets all the readonly values
  public RateLimitLease(
    bool isAcquired, 
    int count, 
    object? state, 
    RateLimiter? rateLimiter, 
    Action<RateLimiter?, int, object?>? onDispose);
  // Return the acquired permits, calls onDispose with RateLimiter
  // This can only be called once; it's a user error if called more than once
  public void Dispose();
}

However, this design became problematic with the consideration of including an AggregatedRateLimiter<TKey>, which necessitates the existence of another struct RateLimitLease<TKey> with a private reference to the AggregatedRateLimiter<TKey>. This bifurcation of the return types of Acquire and WaitAsync between the AggregatedRateLimiter<TKey> and RateLimiter makes it very difficult to consume aggregated and simple limiters in a consistent manner. Additional complexity in defining an API to store and retrieve additional metadata is also a concern, see below. For this reason, it is better to make RateLimitLease a class instead of a struct and require implementations to pool if optimization for performance is required.

Additional concerns that needed to be resolved for a struct RateLimitLease are elaborated below:

Permit as reference ID

There was an alternative proposal where the struct only contains a reference ID, and additional APIs on the RateLimiter instance are used to return permits and obtain additional metadata. This is equivalent to the RateLimiter internally tracking outstanding permit leases and allowing permit release via RateLimiter.Release(RateLimitLease.ID) or retrieval of additional metadata via RateLimiter.TryGetMetadata(RateLimitLease.ID, MetadataName). This shifts the need to pool data structures for tracking idempotency of Dispose and additional metadata to the RateLimiter implementation itself. This additional indirection doesn't resolve the bifurcation issue mentioned previously and necessitates additional APIs on the RateLimiter that are hard to use and implement; as such, this alternative was not chosen.

RateLimitLease state

The current proposal uses an object State to communicate additional information on a rate limit decision. This is the most general way to provide additional information, since the RateLimiter can add any arbitrary type or collection via object State. However, there is a tradeoff between the generality and flexibility of this approach and its usability. For example, we have gotten feedback from ATS that they want a simpler way to specify a set of values such as RetryAfter, error codes, or percentage of permits used. As such, here are several design alternatives.

Interfaces

One option to support access to values is to keep the object State but require limiters to set a state that implements different interfaces. For example, there could be an IRateLimiterRetryAfterHeaderValue interface that looks like:

public interface IRateLimiterRetryAfterHeaderValue
{
    string RetryAfter { get; }
}

Consumers of the RateLimiter would then check whether the State object implements the interface before retrieving the value. It also puts a burden on the implementers of RateLimiters, since they should also define a set of interfaces to represent commonly used values.

Property bags

Property bags like Activity.Baggage and Activity.Tags are very well suited to store the values that were identified by the ATS team. For web workloads, where these values are likely to be header name and header value pairs, this is a good way to express the State field on RateLimitLease. Specifically, the type would be either:

Option 1: IReadOnlyDictionary<string,string?> State

However, there is a drawback here in terms of generality since it would mean that we are opinionated about the type of keys and values as strings. Alternatively we can modify this to be:

Option 2: IReadOnlyDictionary<string,object?> State

This is slightly more flexible since the value can be any type. However, to use these values, the user would need to know ahead of time what the value types for specific keys are and downcast the object accordingly. Going one step further:

Option 3: IReadOnlyDictionary<object,object?> State

This gives the most flexibility in the property bag, since we are no longer opinionated about the key type. But the same issue as with option 2 remains, and it's unclear whether this generality of key type would actually be useful.

Feature collection

Another way to represent the State would be something like an IFeatureCollection. The benefit of this interface is that it is general enough to contain any type of value, while specific implementations can optimize for commonly accessed fields by accessing them directly (e.g. https://github.com/dotnet/aspnetcore/blob/52eff90fbcfca39b7eb58baad597df6a99a542b0/src/Http/Http/src/DefaultHttpContext.cs).

A `bool` returned by `TryAcquire` to indicate success/failure and throw for `WaitAsync` to indicate failure

An earlier iteration proposed the following API instead:

namespace System.Threading.RateLimiting
{
  public abstract class RateLimiter
  {
    // An estimated count of permits. Potential uses include diagnostics.
    public abstract int GetAvailablePermits();

    // Fast synchronous attempt to acquire permits.
    // Set requestedCount to 0 to get whether permit limit has been reached.
    public abstract bool TryAcquire(int requestedCount, out RateLimitLease lease);

    // Wait until the requested permits are available.
    // Set requestedCount to 0 to wait until permits are replenished.
    // An exception is thrown if permits cannot be obtained.
    public abstract ValueTask<RateLimitLease> WaitAsync(int requestedCount, CancellationToken cancellationToken = default);
  }

  public struct RateLimitLease: IDisposable
  {
    // This represents additional metadata that can be returned as part of a call to TryAcquire/WaitAsync
    // Potential uses could include a RetryAfter value.
    public object? State { get; init; }

    // Constructor
    public RateLimitLease(object? state, Action<RateLimitLease>? onDispose);

    // Return the acquired permits
    public void Dispose();

    // This static field can be used for rate limiters that do not require release semantics or for failed concurrency limiter acquisition requests.
    public static RateLimitLease NoopSuccess = new RateLimitLease(null, null);
  }
}

This was proposed since the method name TryAcquire seemed to convey the idea that it is a quick synchronous check. However, this also impacted the shape of the API to return bool by convention and return additional information via out parameters. If a limiter wants to communicate a failure for a WaitAsync, it would throw an exception. This may occur if the limiter has reached the hard cap. The drawback here is that these failures, which may be frequent depending on the workload, will necessitate the allocation of an Exception.

Another alternative was identified with WaitAsync returning a tuple, i.e. ValueTask<(bool, RateLimitLease)> WaitAsync(...). The consumption pattern would then look like:

(bool successful, RateLimitLease lease) = await limiter.WaitAsync(1);
if (successful)
{
    // Dispose the lease when processing completes to return the permits
    using (lease)
    {
        // continue processing
    }
}
else
{
    // limit reached
}

Release APIs on RateLimiter

Instead of using RateLimitLease to track the release of permits, an alternative approach proposes adding a void Release(int releaseCount) method on RateLimiter and requiring users to call this method explicitly. However, this requires the user to call release with the correct count, which can be error prone, so the RateLimitLease approach was preferred.

Partial acquisition and release

Currently, the acquisition and release of permits is all-or-nothing.

Additional APIs will be needed to allow for the ability to acquire a part of the requested permits. For example, 5 permits are requested, but the caller is willing to accept a subset of them if not all 5 are available.

Similarly, additional APIs can be added to RateLimitLease to facilitate the release of a part of the acquired permits. For example, 5 permits are obtained, but as processing continues, each permit can be released individually.

These APIs are not included in this proposal since no concrete use cases have been identified so far.

Risks

This is a proposal for a new API, and the main concerns include:

Consumption patterns of rate limiters should be simple and idiomatic to prevent pitfalls.
The default rate and/or concurrency limiters should suffice in most general use cases.
The abstraction should be expressive enough to allow for customized rate limiters.

Tagging subscribers to this area: @tarekgh, @buyaa-n, @krwq See info in area-owners.md if you want to be subscribed.

Issue Details

## Background and Motivation

Outages caused when system activities exceed the system’s capacity are a leading concern in system design. The ability to handle system activity efficiently, and to gracefully limit the execution of activities before the system is under stress, is fundamental to system resiliency. .NET does not have a standardized means for expressing and managing the resource limiting logic needed to produce a resilient system. This adds complexity to designing and developing resilient software in .NET by introducing an easy vector for competing resource limiting logic and anti-patterns. A standardized interface in .NET for limiting activities will make it easier for developers to build resilient systems for all scales of deployment and workload.

Users will interact with the proposed APIs in order to ensure rate and/or concurrency limits are enforced. This abstraction requires explicit release semantics to accommodate non self-replenishing (i.e. concurrency) resource limits, similar to how Semaphores operate. The abstraction also accounts for self-replenishing (i.e. rate) resource limits, where no explicit release semantics are needed because the resource is replenished automatically over time. This component encompasses the TryAcquire/AcquireAsync mechanics (i.e. check vs wait behaviours), and default implementations will be provided for select accounting methods (fixed window, sliding window, token bucket, simple concurrency). The return type is a `Resource` type which manages the lifecycle of the acquired resources.

## Proposed API

```c#
public interface IResourceLimiter
{
    // An estimated count of resources. Potential uses include diagnostics.
    long EstimatedCount { get; }

    // Fast synchronous attempt to acquire resources.
    // Set requestedCount to 0 to get whether resource limit has been reached.
    bool TryAcquire(long requestedCount, out Resource resource);

    // Wait until the requested resources are available.
    // Set requestedCount to 0 to wait until resource is replenished.
    // An exception is thrown if resources cannot be obtained.
    ValueTask<Resource> AcquireAsync(long requestedCount, CancellationToken cancellationToken = default);
}

public struct Resource : IDisposable
{
    // This represents additional metadata that can be returned as part of a call to TryAcquire/AcquireAsync
    // Potential uses could include a RetryAfter value.
    public object? State { get; init; }

    // Constructor
    public Resource(long count, object? state, Action<Resource>? onDispose);

    // Return the acquired resources
    public void Dispose();

    // This static field can be used for rate limiters that do not require release semantics or for failed concurrency limiter acquisition requests.
    public static Resource NoopResource = new Resource(null, null);
}

// Extension methods
public static class ResourceLimiterExtensions
{
    public static bool TryAcquire(this IResourceLimiter limiter, out Resource resource)
    {
        return limiter.TryAcquire(1, out resource);
    }

    public static ValueTask<Resource> AcquireAsync(this IResourceLimiter limiter, CancellationToken cancellationToken = default)
    {
        return limiter.AcquireAsync(1, cancellationToken);
    }
}
```

The struct `Resource` is used to facilitate the release semantics of resource limiters. That is, for non self-replenishing limiters, the resources obtained via TryAcquire/AcquireAsync are returned by disposing the `Resource`. This ensures that the user can't release more resources than were obtained.

## Usage Examples

For components enforcing limits, the standard usage pattern will be:

```c#
if (limiter.TryAcquire(1, out var resource))
{
    // Resource obtained successfully.
    using (resource)
    {
        // Continue with processing
        // Resource released when disposed
    }
}
else
{
    // Limit exceeded, no resources obtained
}
```

In cases where it is known that the resource limiter is a rate limit with no-op release semantics, the usage can be simplified to:

```c#
if (limiter.TryAcquire(1, out _))
{
    // Resource obtained successfully.
    // Continue with processing
}
else
{
    // Limit exceeded, no resources obtained
}
```

This API will be useful in implementing limits for various BCL types including:

- Channels
- Pipelines
- Streams
- HttpClient

Work is ongoing to prototype the intended usage in these BCL types and default implementations for the Fixed Window, Sliding Window and TokenBucket algorithms. For example, a rate limit applied to a BoundedChannel:

```c#
// Rate limiter added to options
var rateLimiter = new FixedWindowRateLimiter(resourcePerSecond: 5);
var rateLimitedChannel = Channel.CreateBounded<string>(new BoundedChannelOptions(5) { WriteRateLimiter = rateLimiter });

// This channel will now only write 5 times per second
rateLimitedChannel.Writer.TryWrite("New message");
```

Experiments in ASP.NET Core for application in Kestrel server limits and a middleware for enforcing limits on request processing are ongoing at https://github.com/dotnet/aspnetcore/tree/johluo/rate-limits.

We also expect adoption for enforcing limits in YARP, as well as conversion of existing implementations in ATS and ACR.

## Alternative Designs

Major variants considered

### Separate abstractions for rate and concurrency limits

A design where rate limits and concurrency limits were expressed by separate abstractions was considered. That design more clearly expresses the intended use pattern, where rate limits do not need to return a `Resource` and do not possess release semantics. In comparison, in the proposed design the release semantics for rate limits are a no-op.

However, this design has a drawback for consumers of resource limits, since there are two possible limiter types that can be specified by the user. To alleviate some of the complexity, a wrapper for rate limits was considered. However, the complexity of this design was deemed undesirable and a unified abstraction for rate and concurrency limits was preferred.

### Release APIs on IResourceLimiter

Instead of using the `Resource` struct to track the release of resources, an alternative approach proposes adding a `void Release(long releaseCount)` method on `IResourceLimiter` and requiring users to call this method explicitly. However, this requires the user to call release with the correct count, which can be error prone, so the `Resource` approach was preferred.

### A class instead of struct for Resource

This approach allows for subclassing to include additional metadata instead of an `object? State` property on the struct. However, it was deemed that potentially allocating a new `Resource` for each acquisition request is too much and a struct was preferred.

### Partial acquisition and release

Currently, the acquisition and release of resources is all-or-nothing.

Additional APIs will be needed to allow for the ability to acquire a part of the requested resources. For example, 5 resources are requested, but the caller is willing to accept a subset of them if not all 5 are available.

Similarly, additional APIs can be added to `Resource` to facilitate the release of a part of the acquired resources. For example, 5 resources are obtained, but as processing continues, each resource can be released individually.

These APIs are not included in this proposal since no concrete use cases have been identified so far.

## Risks

This is a proposal for a new API, and the main concerns include:

- Consumption patterns of resource limiters should be simple and idiomatic to prevent pitfalls.
- The default rate and/or concurrency limiters should suffice in most general use cases.
- The abstraction should be expressive enough to allow for customized resource limiters.

Author:	JunTaoLuo
Assignees:	-
Labels:	`api-suggestion`, `area-System.Resources`, `untriaged`
Milestone:	-

@JunTaoLuo - ~what namespace and assembly are you proposing these new APIs should be added to?~

Oh, I see it buried in the proposal:

> These APIs will likely be added to a new namespace and assembly, potentially System.Threading.ResourceLimits.

Can you put the proposed namespace in the Proposed API section?

Do we expect any types in dotnet/runtime will implement IResourceLimiter? How do we think users will get instances of IResourceLimiter objects?

> Can you put the proposed namespace in the Proposed API section?

Will do!

> Do we expect any types in dotnet/runtime will implement IResourceLimiter?

Yes, I expect we'll be adding default implementations to the BCL such as Rate Limiters (Fixed Window, Sliding Window, Token Bucket) and potentially a semaphore-based Concurrency Limiter. I've been prototyping these implementations but I don't have anything reviewable yet. I will share more details soon. I expect that as a result of these prototypes, there will be some additional APIs, such as an enum/option to configure between stack/queue when waiting via AcquireAsync.

While this does seem like a good idea to have, this API can be misused. And, most systems can already manage their resources very efficiently.

> While this does seem like a good idea to have, this API can be misused. And, most systems can already manage their resources very efficiently.

This feedback isn't specific enough to comment on. Can you clarify?

> This feedback isn't specific enough to comment on. Can you clarify?

In very poorly designed multithreaded applications, this could cause problems when thread A is expecting a value from thread B and can't get it because thread B is being resource limited. As this is very machine-specific and testing for this would be pretty hard, I think this might be an issue in extremely remote cases. Anyways, I absolutely don't think this alone should be a valid reason to scrap this whole API idea, I just think it's worthwhile to mention.

@mangod9 - Can you comment on why you moved this issue to the System.Threading.Channels area? While this API proposal is related to Channels, it isn't specific to Channels. From the description above, it is also related to:

Pipelines
Streams
HttpClient

Basically any resource that could be "limited".

So putting it in the System.Threading.Channels area doesn't seem appropriate to me. I assume if this gets implemented as proposed, this will create a whole new "area".

Moved back to threading. We discussed this with @stephentoub and it's generic enough.

sure, seems reasonable.

Do you expect that e.g. HttpClient will depend on these interfaces?

From the description so far, this looks like one of those abstractions that meets with the rest of the stack in the app model specific libraries. It is why I have put it into Extensions initially.

It's not extensions. We discussed HttpClient, Sockets, Pipelines and Channels. The intent is to design it so that it can be used that low in the stack.

Ok, it sounds like the design needs a lot more work and it will be a lot more complex than what's here so far. A doc in http://github.com/dotnet/designs may be a better place to iterate on it.

We've been iterating for months 😄 with various teams. @JunTaoLuo can you put the design notes on dotnet/designs?

I've added a PR to dotnet/designs: https://github.com/dotnet/designs/pull/215. I took what I had in https://github.com/aspnet/specs/tree/main/design-notes/ratelimit and formatted it according to the template in the repo but I'll probably spend some more time adding more details and polishing some sections throughout the day. It's ready for a first look though.

A few notes on our discussions on 5/21, I'll update the proposal document and the API and samples shortly:

Naming of Resource
- This was thought to be too general, consider something like ResourceLease or ResourcePermits
The API with bool on ResourceLease/ResourcePermits is considered more consistent and easy to use
Namespace and assembly
- We want to place this in the System. namespace since we think other System. components may adopt these APIs, and these APIs may therefore eventually move into the shared framework. Currently the consideration is System.Threading.ResourceLimits but we are open to suggestions on where to put these APIs. We were hesitant to adopt a namespace like System.ResourceLimits since we are not sure these APIs warrant creating a new second level area.
- We don't want to put these APIs in Microsoft.Extensions.* since this would prevent us from being able to use them in components such as System.Threading.Channels.
Deliverables in 6.0
- We won't commit to using this in Channels/Pipelines/Streams/HttpClient in .NET 6.0 since there's no explicit use case yet. We will prototype usages to ensure the API can be adopted in these components in the future.
- We aim to ship this as an OOB package in dotnet/runtime in .NET 6. It will not be in the Microsoft.NETCore.App shared framework
- The APIs will target netstandard2.0

@JunTaoLuo no need to file a new item -- I forgot that you already filed this one. I've marked it as ready for review and blocking so it shows up on our backlog.

Video

API Review notes:

The current design says that Dispose must be called only once. That is against our general IDisposable guidelines, because sometimes the state of things gets tied together. (The first Dispose should do the release semantics, subsequent calls should no-op).
- There was discussion that it should look like IValueTaskSource, with the struct just having an incremented ID that represents which request it is. (e.g. public long LeaseId { get; })
- Another alternative is just making ResourceLease be a class, especially if "state" is allocated for each call to Acquire.
The public object State on the lease doesn't feel like it's set up to be successful in solving the scenarios that it is projected for.
- One suggested approach is an IReadOnlyDictionary<SomeKey,object>, where SomeKey is a strongly typed string.
"EstimatedCount" isn't really clear (available?, total? currently consumed?)
- Something like EstimatedAvailability would better say what it is a count of.
EstimatedCount also probably shouldn't be a property, so that it can better convey it's a point-in-time (and thus shouldn't be used in a for loop condition).
- Together, this becomes something like GetEstimatedAvailability()
"ResourceLimiter" feels a bit over-reaching. "RateLimiter" seems more appropriate to the described situations. (Though "RateLease" does feel odd)
The namespace has both concern with "Threading" and "ResourceLimits", the "ResourceLimits" being the same over-reaching concern of ResourceLimiter.
Next time @bartonjs will probably suggest that Acquire and WaitAsync use the template method pattern.
- public ResourceLease Acquire(int requestedCount) { if (requestedCount < 0) { throw new ArgumentOutOfRangeException(nameof(requestedCount)); } return AcquireCore(requestedCount); }
- protected abstract ResourceLease AcquireCore(int requestedCount);
- Same for WaitAsync
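
To illustrate the first note above (the first Dispose releases, later calls no-op), here is a minimal sketch of an idempotent lease. `SketchLease` and its `onDispose` callback are hypothetical names made up for this example and are not part of the proposal; it only assumes the lease is a class that holds a release callback.

```csharp
#nullable enable
using System;
using System.Threading;

// Hypothetical lease type for illustration only; not the proposed API.
public sealed class SketchLease : IDisposable
{
    // Release callback supplied by the limiter; cleared once it has been taken.
    private Action? _onDispose;

    public SketchLease(Action? onDispose) => _onDispose = onDispose;

    public void Dispose()
    {
        // Atomically swap the callback with null so that only the first
        // caller (even under concurrent Dispose calls) obtains it.
        Action? callback = Interlocked.Exchange(ref _onDispose, null);
        callback?.Invoke();
    }
}
```

With this shape, calling `Dispose()` twice on the same lease runs the release callback once, which is the behaviour the IDisposable guidelines ask for.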

Hey all, I've updated the proposal after trying out some of the recommendations from API review. The two biggest changes are:

Naming

From the feedback from API review, I've renamed ResourceLimiter to RateLimiter and moved it to the System.Runtime.RateLimits namespace. I've also updated other names to better suit this change in naming such as ResourceLease to PermitLease. I've also changed EstimatedCount to AvailablePermits() which is now a function instead of a property.

PermitLease changed from struct to abstract class

This change was made after realizing that it will create problems when used with aggregated rate limiters. Specifically, the aggregated rate limiters will need to return a PermitLease<T> since it needs to hold a reference to AggregatedRateLimiter<T>. Changing it to an abstract class also resolves a lot of concerns such as idempotent dispose and data structure for storing additional metadata. In terms of performance, it is now the responsibility of the rate limiter to pool these leases when needed. I've included some additional notes in the API proposal section and alternative designs with additional reasons for this decision.

Additional metadata will now be retrieved via TryGetMetadata(MetadataName metadataName, [NotNullWhen(true)] out object? metadata) where MetadataName is a wrapper for string with a set of well defined values for commonly used metadata. Each defined MetadataName will have a corresponding extension method to extract the value.

Template method pattern

Is the goal to implement a default argument check? How about uint instead of int?

I'm still working on updating the design doc so that will remain out of date for a little while.

> Template method pattern
>
> Is the goal to implement a default argument check?

It's to ensure that all derived implementations have the same baseline argument validation, and enables them to only write the core logic that matters. While Framework Design Guidelines doesn't outright say "no public virtual or public abstract members" it does suggest that they're often regretted later and recommends the Template Method Pattern.
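
As a rough sketch of what that looks like in practice: the public method is non-virtual and does the shared validation, and derived types only override the protected `*Core` method. `SketchLimiter` and `CountingLimiter` are hypothetical names for this example, and returning `bool` instead of a lease type is a simplification to keep it short.

```csharp
using System;

// Hypothetical base type for illustration; the real proposal returns a lease.
public abstract class SketchLimiter
{
    // Public, non-virtual entry point: every implementation gets the same
    // baseline argument validation.
    public bool Acquire(int permitCount = 1)
    {
        if (permitCount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(permitCount));
        }

        return AcquireCore(permitCount);
    }

    // Derived types write only the core accounting logic.
    protected abstract bool AcquireCore(int permitCount);
}

// Hypothetical derived limiter with a fixed pool of permits.
// Not thread-safe; a real limiter would need synchronization.
public sealed class CountingLimiter : SketchLimiter
{
    private int _available;

    public CountingLimiter(int permits) => _available = permits;

    protected override bool AcquireCore(int permitCount)
    {
        if (permitCount > _available)
        {
            return false;
        }

        _available -= permitCount;
        return true;
    }
}
```

A caller passing a negative count gets an `ArgumentOutOfRangeException` from the base class without `CountingLimiter` having to repeat the check.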

> How about uint instead of int?

That would solve this particular problem, but isn't very .NET-ty.

I see, I'll update the proposal and samples to use the template method pattern.

Video

API Review notes:

Looking at System.Runtime. again, we feel that implies something too low level, so we changed back to System.Threading.
Now that it's in System.Threading, we believe that System.Threading.RateLimiting rolls off the tongue better than System.Threading.RateLimits.
All of the extension methods should be non-virtual instance methods, or default parameters
AvailablePermits needs a verb => GetAvailablePermits
We felt that the self-documenting behavior of default parameters was better for Acquire and WaitAsync, which justified keeping the template method pattern.
We did a pretty hefty adjustment to the PermitLease.TryGetMetadata shape to allow the well known keys to indicate what types the metadata value for each key uses.
We made PermitLease.Dispose follow the Dispose pattern, because that's the pattern.
We renamed PermitLease to RateLimitLease to give it better affinity to the RateLimiter type.
- There may be a better name for the permitCount parameters given this rename, but we don't have
There's a slightly open question as to whether "MetadataName" is overly general, and warrants an affinitized name.

Package name: System.Threading.RateLimiting
Primary namespace: System.Threading.RateLimiting

namespace System.Threading.RateLimiting
{
    public abstract class RateLimiter
    {
        public abstract int GetAvailablePermits();
        public RateLimitLease Acquire(int permitCount = 1);
        protected abstract RateLimitLease AcquireCore(int permitCount);
        public ValueTask<RateLimitLease> WaitAsync(int permitCount = 1, CancellationToken cancellationToken = default);
        protected abstract ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken);
    }
    public abstract class RateLimitLease : IDisposable
    {
        public abstract bool IsAcquired { get; }

        public abstract bool TryGetMetadata(string metadataName, out object? metadata);
        public bool TryGetMetadata<T>(MetadataName<T> metadataName, [MaybeNullWhen(false)] out T metadata);            

        public abstract IEnumerable<string> MetadataNames { get; }
        public virtual IEnumerable<KeyValuePair<string, object?>> GetAllMetadata();

        public void Dispose() { Dispose(true); GC.SuppressFinalize(this); }
        protected virtual void Dispose(bool disposing);
    }
    public static class MetadataName
    {
        public static MetadataName<TimeSpan> RetryAfter { get; } = Create<TimeSpan>("RETRY_AFTER");
        public static MetadataName<string> ReasonPhrase { get; } = Create<string>("REASON_PHRASE");

        public static MetadataName<T> Create<T>(string name) => new MetadataName<T>(name);
    }
    public sealed class MetadataName<T> : IEquatable<MetadataName<T>>
    {
        public MetadataName(string name);
        public string Name { get; }
    }
}
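
For readers wondering how the typed `TryGetMetadata<T>` overload could sit on top of the string-keyed one, here is a minimal sketch under stated assumptions. `SketchMetadataName<T>` and `SketchMetadataLease` are hypothetical stand-ins written for this example (the dictionary-backed storage in particular is an assumption), not the approved types.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

// Hypothetical stand-in for MetadataName<T>: a name plus a compile-time value type.
public sealed class SketchMetadataName<T>
{
    public SketchMetadataName(string name) => Name = name;
    public string Name { get; }
}

// Hypothetical lease that stores metadata in a dictionary (an assumption).
public class SketchMetadataLease
{
    private readonly Dictionary<string, object?> _metadata;

    public SketchMetadataLease(Dictionary<string, object?> metadata) => _metadata = metadata;

    // Untyped lookup by string key.
    public bool TryGetMetadata(string metadataName, out object? metadata)
        => _metadata.TryGetValue(metadataName, out metadata);

    // Typed lookup: succeeds only if the key exists and the value is a T.
    public bool TryGetMetadata<T>(SketchMetadataName<T> metadataName, [MaybeNullWhen(false)] out T metadata)
    {
        if (TryGetMetadata(metadataName.Name, out object? raw) && raw is T typed)
        {
            metadata = typed;
            return true;
        }

        metadata = default;
        return false;
    }
}
```

The well-known key carries the value type, so a caller can write `lease.TryGetMetadata(retryAfter, out TimeSpan delay)` without casting from `object`.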

Video

We decided to rename the QueueProcessingOrder members from { ProcessOldest, ProcessNewest } to { OldestFirst, NewestFirst }
Instead of having a base RateLimiterOptions class, just copy the properties into the derived types.
The mutability of the options types raised some supportability questions. So we recommend sealing all the options types and removing the property setters.
- There was further discussion around removing the options types altogether and collapsing them into ctor parameters.
TokenBucket's settings shouldn't mix the names "permit" and "token". Unify on "token".
There was a discussion around the exposed options and the limits to the Acquire method.
- Since the rate limiters do not expose the PermitLimit (max) value a call to Acquire that exceeds the PermitLimit shouldn't be an ArgumentException (ArgumentExceptions generally mean "caller bug", and in this case the caller couldn't know the input was out of bounds)
- InvalidOperationException is OK
- Returning the RateLimitLease saying that it couldn't be acquired is potentially also OK, but there was a suggestion that might lead to infinite/excessive retries on a case that will never succeed.
TokenBucketLimiter.Replenish isn't authoritative (whether it replenishes or not depends on configuration), but the method name doesn't suggest that.
- It became public bool TryReplenish()
During discussion the need to pulse/refresh the fixed and sliding window timers came up, which added TryRefresh methods and AutoRefresh configuration properties.
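
To make the `TryReplenish` note concrete, here is a minimal token bucket sketch in which replenishment only happens when a full period has elapsed, and the return value says whether it did. `SketchTokenBucket`, the injected `clock` delegate, and the choice to add one period's worth of tokens per call are all assumptions made for this example; this is not the approved implementation.

```csharp
using System;

// Hypothetical token bucket for illustration only.
public sealed class SketchTokenBucket
{
    private readonly object _lock = new object();
    private readonly int _tokenLimit;
    private readonly int _tokensPerPeriod;
    private readonly TimeSpan _replenishmentPeriod;
    private readonly Func<TimeSpan> _clock; // elapsed-time source, e.g. () => stopwatch.Elapsed
    private int _tokens;
    private TimeSpan _lastReplenishment;

    public SketchTokenBucket(int tokenLimit, int tokensPerPeriod, TimeSpan replenishmentPeriod, Func<TimeSpan> clock)
    {
        _tokenLimit = tokenLimit;
        _tokensPerPeriod = tokensPerPeriod;
        _replenishmentPeriod = replenishmentPeriod;
        _clock = clock;
        _tokens = tokenLimit;          // start with a full bucket
        _lastReplenishment = clock();
    }

    // All-or-nothing attempt to take tokens without waiting.
    public bool TryAcquire(int tokenCount = 1)
    {
        lock (_lock)
        {
            if (tokenCount > _tokens)
            {
                return false;
            }

            _tokens -= tokenCount;
            return true;
        }
    }

    // Returns true only if a full period has elapsed since the last
    // replenishment; in that case up to _tokensPerPeriod tokens are added,
    // capped at the token limit. Otherwise nothing changes.
    public bool TryReplenish()
    {
        lock (_lock)
        {
            TimeSpan now = _clock();
            if (now - _lastReplenishment < _replenishmentPeriod)
            {
                return false;
            }

            _lastReplenishment = now;
            _tokens = Math.Min(_tokenLimit, _tokens + _tokensPerPeriod);
            return true;
        }
    }
}
```

Injecting the clock keeps the sketch deterministic in tests; a caller that disables automatic replenishment would call `TryReplenish()` from its own timer.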

namespace System.Threading.RateLimiting
{
    // This specifies the behaviour of `WaitAsync` when PermitLimit has been reached
    public enum QueueProcessingOrder
    {
        OldestFirst,
        NewestFirst
    }

    public sealed class ConcurrencyLimiterOptions
    {
        public ConcurrencyLimiterOptions(int permitLimit, QueueProcessingOrder queueProcessingOrder, int queueLimit);

        // Specifies the maximum number of permits for the limiter
        public int PermitLimit { get; }
        // Permits exhausted mode, configures `WaitAsync` behaviour
        public QueueProcessingOrder QueueProcessingOrder { get; }
        // Queue limit when queuing is enabled
        public int QueueLimit { get; }
    }

    public sealed class TokenBucketRateLimiterOptions
    {
        public TokenBucketRateLimiterOptions(
            int tokenLimit, 
            QueueProcessingOrder queueProcessingOrder, 
            int queueLimit,
            TimeSpan replenishmentPeriod,
            int tokensPerPeriod,
            bool autoReplenishment = true);

        // Specifies the maximum number of tokens for the limiter
        public int TokenLimit { get; }
        // Permits exhausted mode, configures `WaitAsync` behaviour
        public QueueProcessingOrder QueueProcessingOrder { get; }
        // Queue limit when queuing is enabled
        public int QueueLimit { get; }
        // Specifies the period between replenishments
        public TimeSpan ReplenishmentPeriod { get; }
        // Specifies how many tokens to restore each replenishment
        public int TokensPerPeriod { get; }
        // Whether to create a timer to trigger replenishment automatically
        // This parameter is optional
        public bool AutoReplenishment { get; }
    }

    // Window based rate limiter options
    public sealed class FixedWindowRateLimiterOptions
    {
        public FixedWindowRateLimiterOptions(
            int permitLimit, 
            QueueProcessingOrder queueProcessingOrder, 
            int queueLimit,
            TimeSpan window,
            bool autoRefresh = true);

        // Specifies the maximum number of permits for the limiter
        public int PermitLimit { get; }
        // Permits exhausted mode, configures `WaitAsync` behaviour
        public QueueProcessingOrder QueueProcessingOrder { get; }
        // Queue limit when queuing is enabled
        public int QueueLimit { get; }
        // Specifies the duration of the window where the rate limit is applied
        public TimeSpan Window { get; }

        // Whether to create a timer to refresh the window automatically; when false, call TryRefresh()
        public bool AutoRefresh { get; }
    }

    public sealed class SlidingWindowRateLimiterOptions
    {
        public SlidingWindowRateLimiterOptions(
            int permitLimit, 
            QueueProcessingOrder queueProcessingOrder, 
            int queueLimit,
            TimeSpan window,
            int segmentsPerWindow,
            bool autoRefresh = true);

        // Specifies the maximum number of permits for the limiter
        public int PermitLimit { get; }
        // Permits exhausted mode, configures `WaitAsync` behaviour
        public QueueProcessingOrder QueueProcessingOrder { get; }
        // Queue limit when queuing is enabled
        public int QueueLimit { get; }
        // Specifies the duration of the window where the rate limit is applied
        public TimeSpan Window { get; }
        // Specifies the number of segments the Window should be divided into
        public int SegmentsPerWindow { get; }

        // Whether to create a timer to refresh the window automatically; when false, call TryRefresh()
        public bool AutoRefresh { get; }
    }

    // Limiter implementations
    public sealed class ConcurrencyLimiter : RateLimiter 
    { 
        public ConcurrencyLimiter(ConcurrencyLimiterOptions options);
        public override int GetAvailablePermits();
        protected override RateLimitLease AcquireCore(int permitCount);
        protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
    }
    public sealed class TokenBucketRateLimiter : RateLimiter 
    { 
        public TokenBucketRateLimiter(TokenBucketRateLimiterOptions options);
        // Attempts to replenish the bucket. Returns true if enough time has elapsed and the bucket was replenished; otherwise, false.
        public bool TryReplenish();
        public override int GetAvailablePermits();
        protected override RateLimitLease AcquireCore(int permitCount);
        protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
    }
    public sealed class FixedWindowRateLimiter : RateLimiter 
    { 
        public FixedWindowRateLimiter(FixedWindowRateLimiterOptions options);
        public bool TryRefresh();
        public override int GetAvailablePermits();
        protected override RateLimitLease AcquireCore(int permitCount);
        protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
    }
    public sealed class SlidingWindowRateLimiter : RateLimiter 
    { 
        public SlidingWindowRateLimiter(SlidingWindowRateLimiterOptions options);
        public bool TryRefresh();
        public override int GetAvailablePermits();
        protected override RateLimitLease AcquireCore(int permitCount);
        protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
    }
}
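
To make the interplay of TokenLimit, TokensPerPeriod, ReplenishmentPeriod and TryReplenish() more concrete, here is a minimal sketch of token bucket accounting with external replenishment (the autoReplenishment = false case). It is not the proposed TokenBucketRateLimiter: the TokenBucketSketch type, its bool-returning TryAcquire, and the injected clock are assumptions made for illustration. The sketch is not thread-safe, and it omits queuing and the zero-permit probe.

```csharp
using System;

// Hypothetical stand-in for illustration only; not the proposed TokenBucketRateLimiter.
public sealed class TokenBucketSketch
{
    private readonly int _tokenLimit;
    private readonly int _tokensPerPeriod;
    private readonly TimeSpan _replenishmentPeriod;
    private readonly Func<TimeSpan> _now; // injected clock (an assumption, keeps the sketch deterministic)
    private TimeSpan _lastReplenishment;
    private int _tokens;

    public TokenBucketSketch(int tokenLimit, TimeSpan replenishmentPeriod, int tokensPerPeriod, Func<TimeSpan> now)
    {
        if (tokenLimit <= 0) throw new ArgumentOutOfRangeException(nameof(tokenLimit));
        if (tokensPerPeriod <= 0) throw new ArgumentOutOfRangeException(nameof(tokensPerPeriod));
        if (replenishmentPeriod <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(replenishmentPeriod));

        _tokenLimit = tokenLimit;
        _tokensPerPeriod = tokensPerPeriod;
        _replenishmentPeriod = replenishmentPeriod;
        _now = now ?? throw new ArgumentNullException(nameof(now));
        _lastReplenishment = _now();
        _tokens = tokenLimit; // the bucket starts full
    }

    // Count of tokens currently in the bucket.
    public int GetAvailablePermits() => _tokens;

    // Takes permitCount tokens if the bucket holds that many; otherwise takes nothing and returns false.
    public bool TryAcquire(int permitCount = 1)
    {
        if (permitCount < 1 || permitCount > _tokenLimit)
            throw new ArgumentOutOfRangeException(nameof(permitCount));
        if (permitCount > _tokens) return false;
        _tokens -= permitCount;
        return true;
    }

    // Adds TokensPerPeriod tokens, capped at TokenLimit, once a full period has elapsed
    // since the last successful replenishment. Returns false if it is too early.
    public bool TryReplenish()
    {
        TimeSpan now = _now();
        if (now - _lastReplenishment < _replenishmentPeriod) return false;
        _lastReplenishment = now;
        _tokens = (int)Math.Min((long)_tokenLimit, (long)_tokens + _tokensPerPeriod);
        return true;
    }
}
```

With a token limit of 2, one token per period and a one-second period, two acquisitions succeed and a third fails. A TryReplenish() call before one second has elapsed returns false. Once a full second has passed it returns true, and one more acquisition succeeds.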

What is going to happen to this now? The labels suggest this will make it into .NET 6, has work on this started already / is it assigned to someone? Would a community contribution be helpful here? I'm very excited to get my hands on this 😄

@HurricanKai you should hopefully see a WIP draft pull request that will reference this issue. Excited for this too!

Looking forward to it!

Moving to 7.0 as these APIs won't be ready for 6.

In many networking scenarios, rate limits change dynamically based on messages being received, e.g. based on retry-after HTTP headers. It seems to me like a "message" needs to be one of the inputs to the APIs computing the limits.

Based on the examples, I can see how this could be considered a solution for rate limiting in HTTP applications. Some limitations I can see are:

It looks as if the same set of permits would be shared across all users. Do you envisage developers needing to maintain a collection of limits per user or is the concept of scope something that could be built-in?
Are you considering how limits may be synchronised across multiple instances of the application?

The need to always check the "did I really get the lease I wanted" property (which is currently IsAcquired, I think?) after every call is a real deal breaker for me. That's far too easy to mess up... to the point that I would probably just keep copy/pasting the same wrapper classes I paste into every project I start rather than adopt this. And in many scenarios, you won't realize you've got incorrect code until you try to run your workloads at scale, since the only interaction will be Acquire/Dispose, and whatever the limit was trying to limit isn't limited... and then you have an outage. That seems like a troubling pattern.
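
The kind of wrapper described in the comment above could look roughly like the following sketch, which centralises the IsAcquired check so that individual call sites cannot forget it. All of the types here (SketchLease, SketchConcurrencyLimiter, the TryRun extension) are hypothetical, single-threaded stand-ins that only mimic the shape of the proposal; they are not the proposed RateLimitLease or ConcurrencyLimiter.

```csharp
using System;

// Hypothetical lease: it may represent a failed acquisition, as in the proposal.
public sealed class SketchLease : IDisposable
{
    private readonly Action _release;
    private bool _disposed;

    public SketchLease(bool isAcquired, Action release)
    {
        IsAcquired = isAcquired;
        _release = release;
    }

    public bool IsAcquired { get; }

    // Returns the permit exactly once; a failed lease releases nothing.
    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        _release();
    }
}

// Hypothetical concurrency limiter without queuing (not thread-safe).
public sealed class SketchConcurrencyLimiter
{
    private int _available;

    public SketchConcurrencyLimiter(int permitLimit)
    {
        if (permitLimit <= 0) throw new ArgumentOutOfRangeException(nameof(permitLimit));
        _available = permitLimit;
    }

    public int GetAvailablePermits() => _available;

    public SketchLease Acquire()
    {
        if (_available == 0) return new SketchLease(false, () => { });
        _available--;
        return new SketchLease(true, () => _available++);
    }
}

public static class SketchLimiterExtensions
{
    // Runs work only if a permit was acquired; returns false otherwise.
    // The lease is disposed on every path, including when work throws.
    public static bool TryRun(this SketchConcurrencyLimiter limiter, Action work)
    {
        using SketchLease lease = limiter.Acquire();
        if (!lease.IsAcquired) return false;
        work();
        return true;
    }
}
```

A call site then reads `if (!limiter.TryRun(DoWork)) { /* reject or retry */ }`, where DoWork is a placeholder for the limited operation. This leaves no way to run the work while ignoring a failed acquisition.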

While working through the API reviews above and implementing the ConcurrencyLimiter and TokenBucketRateLimiter there were a couple changes made that differed from the API reviews:

public bool TryGetMetadata<T>(MetadataName<T> metadataName, [MaybeNullWhen(false)] out T metadata); -> public bool TryGetMetadata<T>(MetadataName<T> metadataName, [MaybeNull] out T metadata); This is because a null metadata value is valid so the attribute should reflect that.

InvalidOperationException from Acquire or WaitAsync for a permit count greater than the permit limit was changed to ArgumentOutOfRangeException. The docs for the exception indicate it is ideal for this scenario, and the base RateLimiter.Acquire/WaitAsync documents ArgumentOutOfRangeException, not InvalidOperationException (since it can't know). See https://github.com/aspnet/AspLabs/pull/387#discussion_r730152043 for the PR comment.

For those interested in trying out the rate limiting APIs and current implementations you can find the package System.Threading.RateLimiting on the NuGet feed https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet6-transport/nuget/v3/index.json. Early feedback is appreciated, let us know what works and what doesn't work!

I think this is very exciting work (also the larger work in ASP.NET).

If a RateLimitLease can be in a "not acquired" state then it's not actually a lease at all. It's a failed attempt at obtaining a lease. If this design is kept, that class could be named more accurately, e.g. RateLimitRequest or RateLimitAcquisition. The issue speaks to this point:

The RateLimitLease.IsAcquired property is used to express whether the acquisition request was successful.

Here it says that it's a request and not a lease per-se. I read the discussion of the alternative design (returning bool). I think we can have the best of both worlds:

class RateLimitRequestResult //Not disposable, just a DTO
{
    public RateLimitLease Lease { get; } //Can return null
    //All the metadata members here
}

class RateLimitLease : IDisposable
{
    //This class is purely a handle to the lease
}

var result = rateLimiter.Acquire(1);
if (result.Lease != null)
{
     using var lease = result.Lease;
     //Use lease and metadata
}
else
{
    //Use metadata only
}

The current design reminds me of the regex design where you get back a Match object with Match.Success == false. I don't think that design was a success, or I must not be understanding the point.

What is GetAvailablePermits supposed to return when the count is entirely unknown? Some algorithms might not have anything useful to tell (for example, a rate limiter obtaining its leases by querying a remote service). Maybe make it int? TryGetAvailablePermits() instead?

dotnet / runtime

Rate limit APIs #52079

Background and Motivation

Proposed API - Abstractions

Usage Examples

Proposed API - Concrete Implementations

Adoption samples

Alternative Designs

Token bucket rate limiter external replenishment

Subclasses can override default behaviour

Heuristics based replenishment.

Separate abstractions for rate and concurrency limits

A struct instead of class for RateLimitLease

Permit as reference ID

RateLimitLease state

Interfaces

Property bags

Feature collection

A `bool` returned by `TryAcquire` to indicate success/failure and throw for `WaitAsync` to indicate failure

Release APIs on RateLimiter

Partial acquisition and release

Risks
