Open ronyrv13 opened 3 weeks ago
I don't think a retry policy will cause a different choice from the weighted_clusters
. I believe one of the clusters is chosen, and then all retries will happen within that cluster.
You could probably work around this by having a single cluster with both DNS names, and have different weights within the cluster endpoints.
Thank you, @ggreenway, for your response. I did check that option, but I couldn't test it because each of my backend endpoints requires a different API key in the headers.
Essentially, we need to manage different API keys for different endpoints, and I didn't find any filter or derivative that supports this with a single cluster in Envoy.
If we can address this issue, I could easily switch back to a single cluster with multiple endpoints. Do you have any suggestions for overcoming this header blocker?
you can create two aggregated clusters like: AgCluster1:
In aggregated clusters, sub clusters are in priority order. For example in AgCluster1, aoai_gpt35_cluster_1 is higher priority and all traffic will be sent to it and if it fails traffic will be forwarded to aoai_gpt35_cluster_2 Similarly in AgCluster2, aoai_gpt35_cluster_2 is higher priority with failover possibility to aoai_gpt35_cluster_1
Now you can put these two aggregated clusters in weighted clusters with 100-100 weight:
Traffic will go 50-50 to each, where within each aggregated cluster, highest priority sub-cluster has a backup/failover sub-cluster
I am using Envoy to route traffic between multiple clusters. I've configured Envoy to retry requests when a 429 (Too Many Requests) response is received. However, despite configuring the retry_policy with retriable_status_codes: [429], Envoy does not appear to retry the request to the other cluster when a 429 response is received from the first cluster.
Envoy Configuration
Below is a simplified version of the configuration I am using:
Expected Behavior
When the primary cluster (aoai_gpt35_cluster_1) returns a 429 status code, Envoy should retry the request on the secondary cluster (aoai_gpt35_cluster_2).
Actual Behavior
Envoy does not retry the request on the secondary cluster after receiving a 429 response from the primary cluster. Instead, the request fails after receiving the 429 response, without retrying to the other cluster as expected.
If this is a configuration issue, I would appreciate guidance on how to properly configure retries for multiple clusters. Otherwise, it may be a bug with retry handling in the presence of multiple clusters.