ronyrv13 closed this issue 1 month ago.
I don't think a retry policy will cause a different choice from the weighted_clusters. I believe one of the clusters is chosen, and then all retries happen within that cluster.
You could probably work around this by having a single cluster with both DNS names, and have different weights within the cluster endpoints.
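That workaround might look like the following sketch of a single cluster with two weighted endpoints. The DNS names and weights here are hypothetical placeholders, not taken from the original configuration:

```yaml
clusters:
- name: aoai_gpt35_combined
  type: STRICT_DNS
  connect_timeout: 1s
  load_assignment:
    cluster_name: aoai_gpt35_combined
    endpoints:
    - lb_endpoints:
      # Hypothetical backend DNS names; substitute your real endpoints.
      - endpoint:
          address:
            socket_address:
              address: backend1.example.com
              port_value: 443
        load_balancing_weight: 70
      - endpoint:
          address:
            socket_address:
              address: backend2.example.com
              port_value: 443
        load_balancing_weight: 30
```

With both endpoints in one cluster, retries can be attempted against the other host (e.g. together with a `retry_host_predicate` such as `envoy.retry_host_predicates.previous_hosts` to avoid re-selecting the host that just failed).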
Thank you, @ggreenway, for your response. I did check that option, but I couldn't test it because each of my backend endpoints requires a different API key in the headers.
Essentially, we need to manage different API keys for different endpoints, and I didn't find any filter or derivative that supports this with a single cluster in Envoy.
If we can address this issue, I could easily switch back to a single cluster with multiple endpoints. Do you have any suggestions for working around this header limitation?
You can create two aggregate clusters, e.g. AgCluster1 and AgCluster2. In an aggregate cluster, the sub-clusters are listed in priority order. For example, in AgCluster1, aoai_gpt35_cluster_1 has the higher priority, so all traffic is sent to it first and is forwarded to aoai_gpt35_cluster_2 only if it fails. Similarly, in AgCluster2, aoai_gpt35_cluster_2 has the higher priority, with failover to aoai_gpt35_cluster_1.
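A minimal sketch of the two aggregate clusters, assuming aoai_gpt35_cluster_1 and aoai_gpt35_cluster_2 are already defined as ordinary clusters elsewhere in the config:

```yaml
clusters:
- name: AgCluster1
  connect_timeout: 1s
  lb_policy: CLUSTER_PROVIDED  # required for aggregate clusters
  cluster_type:
    name: envoy.clusters.aggregate
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.clusters.aggregate.v3.ClusterConfig
      clusters:
      # Priority order: all traffic goes to the first sub-cluster,
      # failing over to the second when it is unhealthy.
      - aoai_gpt35_cluster_1
      - aoai_gpt35_cluster_2
- name: AgCluster2
  connect_timeout: 1s
  lb_policy: CLUSTER_PROVIDED
  cluster_type:
    name: envoy.clusters.aggregate
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.clusters.aggregate.v3.ClusterConfig
      clusters:
      # Reversed priority order relative to AgCluster1.
      - aoai_gpt35_cluster_2
      - aoai_gpt35_cluster_1
```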
Now you can reference these two aggregate clusters in weighted_clusters with weights of 100 and 100. Traffic will then be split 50/50 between them, and within each aggregate cluster the highest-priority sub-cluster has a backup/failover sub-cluster.
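The weighted_clusters route described above might be sketched like this (the route match is a placeholder):

```yaml
routes:
- match:
    prefix: "/"
  route:
    weighted_clusters:
      clusters:
      # Equal weights: half the traffic to each aggregate cluster.
      - name: AgCluster1
        weight: 100
      - name: AgCluster2
        weight: 100
```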
I am using Envoy to route traffic between multiple clusters. I've configured Envoy to retry requests when a 429 (Too Many Requests) response is received. However, despite configuring the retry_policy with retriable_status_codes: [429], Envoy does not appear to retry the request to the other cluster when a 429 response is received from the first cluster.
Envoy Configuration
Below is a simplified version of the configuration I am using:
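The original configuration was not included in this excerpt. Based on the cluster names and retry settings described in the surrounding text, it presumably resembled the following sketch (listener and cluster definitions omitted; details are assumptions):

```yaml
route_config:
  virtual_hosts:
  - name: backend
    domains: ["*"]
    routes:
    - match:
        prefix: "/"
      route:
        weighted_clusters:
          clusters:
          - name: aoai_gpt35_cluster_1
            weight: 50
          - name: aoai_gpt35_cluster_2
            weight: 50
        retry_policy:
          # retriable-status-codes must be enabled in retry_on for
          # retriable_status_codes to take effect.
          retry_on: "retriable-status-codes"
          retriable_status_codes: [429]
          num_retries: 2
```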
Expected Behavior
When the primary cluster (aoai_gpt35_cluster_1) returns a 429 status code, Envoy should retry the request on the secondary cluster (aoai_gpt35_cluster_2).
Actual Behavior
Envoy does not retry the request on the secondary cluster after receiving a 429 response from the primary cluster. Instead, the request fails immediately after the 429 response, without being retried on the other cluster as expected.
If this is a configuration issue, I would appreciate guidance on how to properly configure retries for multiple clusters. Otherwise, it may be a bug with retry handling in the presence of multiple clusters.