helidon-io / helidon

Java libraries for writing microservices
https://helidon.io
Apache License 2.0

Provide adaptive concurrency limits #8897

Open arouel opened 1 week ago

arouel commented 1 week ago

Why

There may be several reasons why adaptive concurrency limiting is preferred over using a fixed limit:

  1. Dynamic System Conditions: In a distributed system, conditions such as load, resource availability, and topology can change frequently due to factors like auto-scaling, partial outages, code deployments, or fluctuations in traffic patterns. A fixed concurrency limit cannot adapt to these dynamic conditions, leading to either under-utilization of resources or overwhelmed services.

  2. Latency Sensitivity: Different services or use cases may have varying sensitivity to latency. A fixed concurrency limit cannot account for these differences, potentially leading to either excessive queuing and high latency or under-utilization of resources. An adaptive approach can adjust the limit based on observed latencies, maintaining desired performance characteristics.

  3. Simplicity and Autonomy: Manually determining and configuring fixed concurrency limits for every service or instance can be a complex and error-prone process, especially in large-scale distributed systems. An adaptive approach can autonomously and continuously adjust the limit without manual intervention, simplifying operations and reducing the risk of misconfiguration.

  4. Resilience and Self-Healing: By automatically adjusting the concurrency limit based on observed conditions, an adaptive approach promotes resilience and self-healing capabilities. It allows services to shed excessive load during periods of high demand or resource constraints, preventing cascading failures and promoting graceful degradation.

While a fixed concurrency limit may be easier to reason about and configure initially, it lacks the flexibility and adaptability required in modern, dynamic distributed systems. An adaptive approach provides the ability to continuously optimize performance, resource utilization, and resilience in the face of changing conditions, ultimately leading to a more robust and efficient system.

Suggestion

Ideally, a user would be able to describe the limiting algorithm that fits their needs in the [ListenerConfig](https://helidon.io/docs/v4/apidocs/io.helidon.webserver/io/helidon/webserver/ListenerConfig.html#maxConcurrentRequests()) instead of setting a fixed number for maxConcurrentRequests. The Limit and Limiter interfaces from Netflix's concurrency limits library are a good starting point. In the first iteration we should provide the following implementations:

Instead of passing a Semaphore for requests from the ServerListener to the ConnectionHandler, we would pass a Limiter implementation that holds the configured Limit algorithm. The Limiter would be used instead of the Semaphore to acquire a token per request. If no token can be acquired, the limit is exceeded and the request can be rejected.
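
A minimal Java sketch of the token-per-request idea described above. All type and method names here (`Limit`, `Limiter`, `Token`, `FixedLimit`, `SimpleLimiter`) are hypothetical and chosen for illustration only; they are not the actual Netflix or Helidon API.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

/** Supplies the current concurrency limit; an adaptive variant would change it over time. */
interface Limit {
    int currentLimit();
}

/** Hands out one token per request; an empty Optional means the limit is exceeded. */
interface Limiter {
    Optional<Token> tryAcquire();

    /** Held for the duration of a request and released when it completes. */
    interface Token {
        void release();
    }
}

/** A limit that never changes, comparable to a fixed maxConcurrentRequests value. */
final class FixedLimit implements Limit {
    private final int limit;

    FixedLimit(int limit) {
        this.limit = limit;
    }

    @Override
    public int currentLimit() {
        return limit;
    }
}

/** Non-blocking limiter that counts in-flight requests against the configured Limit. */
final class SimpleLimiter implements Limiter {
    private final Limit limit;
    private final AtomicInteger inFlight = new AtomicInteger();

    SimpleLimiter(Limit limit) {
        this.limit = limit;
    }

    @Override
    public Optional<Token> tryAcquire() {
        while (true) {
            int current = inFlight.get();
            if (current >= limit.currentLimit()) {
                return Optional.empty(); // limit exceeded, caller may reject the request
            }
            if (inFlight.compareAndSet(current, current + 1)) {
                // Sketch only: a real implementation should guard against double release.
                Token token = inFlight::decrementAndGet;
                return Optional.of(token);
            }
        }
    }
}
```

Unlike a blocking `Semaphore.acquire()`, this sketch never waits: the caller sees immediately whether a token is available and can decide to reject the request.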

While implementing a Proof of Concept (PoC), I asked myself where we want to place the limiting API. I guess we need a new submodule, concurrency-limits, which holds the Limit and Limiter interfaces and a standard set of implementations. The webserver module would then depend on concurrency-limits.

Another question is how we want to make the various limiting algorithms configurable. Today we have just the single property maxConcurrentRequests, but in the future we want to choose from a set of different implementations, e.g. no limit, fixed limit, AIMD limit, Vegas limit, etc.
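
To illustrate what one of the adaptive options involves, here is a rough Java sketch of an additive-increase/multiplicative-decrease (AIMD) limit. The class name, constructor parameters, and the rule of growing only when at least half of the limit is in use are assumptions made for this example, not a definitive implementation or the API of any existing library.

```java
/**
 * Sketch of an AIMD limit: grow by one on healthy samples,
 * shrink by a ratio on drops or slow responses.
 */
final class AimdLimitSketch {
    private final int minLimit;
    private final int maxLimit;
    private final double backoffRatio;
    private final long timeoutNanos;
    private int limit;

    AimdLimitSketch(int initialLimit, int minLimit, int maxLimit,
                    double backoffRatio, long timeoutNanos) {
        this.limit = initialLimit;
        this.minLimit = minLimit;
        this.maxLimit = maxLimit;
        this.backoffRatio = backoffRatio;
        this.timeoutNanos = timeoutNanos;
    }

    synchronized int currentLimit() {
        return limit;
    }

    /** Called once per finished request with its latency, the in-flight count, and outcome. */
    synchronized void onSample(long rttNanos, int inFlight, boolean dropped) {
        if (dropped || rttNanos > timeoutNanos) {
            // Multiplicative decrease: back off quickly under stress.
            limit = (int) (limit * backoffRatio);
        } else if (inFlight * 2 >= limit) {
            // Additive increase: probe for more capacity only when the limit is in use.
            limit = limit + 1;
        }
        // Keep the limit inside the configured bounds.
        limit = Math.max(minLimit, Math.min(maxLimit, limit));
    }
}
```

Each algorithm in such a set would need its own parameters (here: initial, minimum and maximum limit, backoff ratio, timeout), which is why a single numeric property like maxConcurrentRequests is not enough to configure them.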

When testing the PoC, I noticed that when the access log feature is activated, rejected requests are not logged in the access log file. Is this behavior intentional or is this a bug?

Additionally, extending the metrics (looking at KeyPerformanceIndicatorMetricsImpls) would be helpful for observing how a service is doing. I'm thinking of the following request-limiting metrics:

romain-grecourt commented 1 week ago

@spericas @tomas-langer @danielkec FYI.

tomas-langer commented 2 days ago

Hello, this sounds like a great idea. I will provide a few answers to the questions you posted:

Some other thoughts: