Currently, ARC (Adaptive Request Concurrency) maintains a single, global view of the requests originating from a given sink. However, upstream response behavior can differ between request partitions. For example, in a Datadog Logs sink whose incoming events carry different API keys, one of which is invalid, the failures for that key can cause ARC to back off all requests, not just those in the partition with the invalid API key.
Ideally, ARC would maintain a separate limit for each partition key that can affect upstream responses, so that one partition's failures don't throttle the others.
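To make the idea concrete, here is a minimal sketch of per-partition limit state. It is not Vector's actual ARC implementation: the `PartitionLimit` and `PartitionedLimits` types, the string partition key, and the simplified rule (add one on success, halve on backpressure) are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical concurrency limit for one partition (not a Vector type).
#[derive(Debug, Clone, PartialEq)]
pub struct PartitionLimit {
    pub current: usize,
    max: usize,
}

impl PartitionLimit {
    fn new(initial: usize, max: usize) -> Self {
        Self { current: initial.min(max).max(1), max }
    }

    /// Additive increase: allow one more in-flight request, up to `max`.
    fn on_success(&mut self) {
        self.current = (self.current + 1).min(self.max);
    }

    /// Multiplicative decrease: halve the limit, never dropping below 1.
    fn on_backpressure(&mut self) {
        self.current = (self.current / 2).max(1);
    }
}

/// Hypothetical registry keeping one limit per partition key
/// (for example, one per API key) instead of one limit per sink.
pub struct PartitionedLimits {
    initial: usize,
    max: usize,
    limits: HashMap<String, PartitionLimit>,
}

impl PartitionedLimits {
    pub fn new(initial: usize, max: usize) -> Self {
        Self { initial, max, limits: HashMap::new() }
    }

    /// Record a response for `key` and return that partition's new limit.
    /// Only the state for `key` is touched; other partitions are unaffected.
    pub fn record(&mut self, key: &str, ok: bool) -> usize {
        let (initial, max) = (self.initial, self.max);
        let limit = self
            .limits
            .entry(key.to_string())
            .or_insert_with(|| PartitionLimit::new(initial, max));
        if ok {
            limit.on_success();
        } else {
            limit.on_backpressure();
        }
        limit.current
    }

    /// Current limit for `key`; unseen partitions report the initial limit.
    pub fn limit(&self, key: &str) -> usize {
        self.limits
            .get(key)
            .map(|l| l.current)
            .unwrap_or_else(|| self.initial.min(self.max).max(1))
    }
}
```

With this shape, repeated failures recorded under the invalid API key's partition shrink only that partition's limit, while partitions with valid keys keep growing theirs. The map itself grows with the number of distinct keys, so a real implementation would likely need some eviction policy.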
Open questions:
How should the documentation/configuration for ARC be updated to support the limits being applied per relevant partition key?
Should we tag ARC metrics with the partition key used to segment the state? I think we'd need to, but this could produce high-cardinality metrics, depending on the partition key.
Problem
Broken off from https://github.com/vectordotdev/vector/issues/21402
Version
v0.42.0