hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Support retry delays in Consul Connect routers #10851

Open evandam opened 3 years ago

evandam commented 3 years ago

Feature Description

I'm not sure what the Envoy config would look like, but it would be really helpful if the retry delay and backoff strategy were configurable in a service router.

For example, being able to set an initial delay of 100ms with exponential retries would be nice.

Use Case(s)

Kind = "service-router"
Name = "httpbin"
Routes = [
  {
    Destination {
      ServiceSubset = ""
      RequestTimeout = "1s"
      NumRetries = 2
      RetryOnConnectFailure = true
      RetryOnStatusCodes = [500]
      # NOTE: Fields to add
      RetryDelay = "100ms"
      RetryStrategy = "exponential"
    }
  },
]

Thanks folks!

kisunji commented 3 years ago

Hi @evandam, Envoy already uses a jittered exponential backoff strategy as outlined here.

Could you elaborate on the use case for a RetryDelay? I believe Envoy's default base interval is 25ms, with a max wait of 250ms (10 * base).

evandam commented 3 years ago

Good to know, thanks! I think it's mostly just something I was expecting to be able to set alongside all the other config options. For example, if a service is unavailable and we're getting 503s back, we might want a longer delay before retrying so the upstream has time to recover. Just a thought, thanks!

kisunji commented 3 years ago

Got it! Given the nature of exponential backoff, I think tuning retries with a configurable base/delay is not as valuable (and sometimes misleading if the exponential strategy is not obvious to the operator).

That being said, we'll leave this issue open for contributors to take on.

github.com/envoyproxy/go-control-plane/envoy/config/route/v3.RetryPolicy.RetryBackOff can be injected in this block: https://github.com/hashicorp/consul/blob/v1.10.1/agent/xds/routes.go#L313