spring-cloud / spring-cloud-gateway

An API Gateway built on Spring Framework and Spring Boot providing routing and more.
http://cloud.spring.io
Apache License 2.0

Implement Retryable feature of spring cloud LoadBalancer for ReactiveLoadBalancerClientFilter #3045

Open noisette44 opened 1 year ago

noisette44 commented 1 year ago

Currently, the ReactiveLoadBalancerClientFilter, which implements load balancer usage, does not take into account the retry features specific to the load balancer (e.g., the use of maxRetriesOnNextServiceInstance).

It appears that the ReactiveLoadBalancerClientFilter is primarily based on the functionality of the ReactorLoadBalancerExchangeFilterFunction (from spring-cloud-commons), but there is no implementation that incorporates the features of the RetryableLoadBalancerExchangeFilterFunction.

In my case, the use of Spring Cloud Gateway Retry does not necessarily target a different instance than the first one that failed (it depends on concurrency).

Example with a load balancer in front of 2 instances (I1 is DOWN, I2 is UP):

Successful case:
Request 1 => LB chooses I1 => FAILED
Retry of Request 1 => LB chooses I2 => OK

Failed case:
Request 1 => LB chooses I1 => FAILED
Request 2 (concurrent call) => LB chooses I2 => OK
Retry of Request 1 => LB chooses I1 again => FAILED
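
The failed case can be reproduced outside the gateway with a small model. The sketch below assumes a round-robin strategy in which every call, including a retry, advances one shared counter. The class RoundRobinRetryDemo and its methods are hypothetical names for illustration only, not Spring Cloud code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, framework-free model of the scenario; not Spring Cloud code.
class RoundRobinRetryDemo {

    // Position counter shared by all requests, retries included.
    private final AtomicInteger position = new AtomicInteger();
    private final List<String> instances;

    RoundRobinRetryDemo(List<String> instances) {
        this.instances = instances;
    }

    // Picks the next instance in round-robin order.
    String choose() {
        int pos = position.getAndIncrement();
        return instances.get(pos % instances.size());
    }

    // Replays the failed case: a concurrent request is served between
    // the first attempt of request 1 and its retry.
    static List<String> failedCase() {
        RoundRobinRetryDemo lb = new RoundRobinRetryDemo(List.of("I1", "I2"));
        List<String> log = new ArrayList<>();
        log.add("request1 -> " + lb.choose()); // I1 (DOWN)
        log.add("request2 -> " + lb.choose()); // I2 (UP)
        log.add("retry1 -> " + lb.choose());   // I1 again (DOWN)
        return log;
    }

    public static void main(String[] args) {
        failedCase().forEach(System.out::println);
    }
}
```

Because the retry knows nothing about the instance that just failed, whether it reaches I2 depends only on how many other requests advanced the counter in between.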

Thank you for your help.

spring-cloud-gateway ReactiveLoadBalancerClientFilter.java

spring-cloud-commons ReactorLoadBalancerExchangeFilterFunction.java RetryableLoadBalancerExchangeFilterFunction.java

kzander91 commented 7 months ago

Unfortunately, the documentation doesn't mention this limitation. On the contrary, the info box in the section on load balancing states:

Gateway supports all the LoadBalancer features.

zhaozhiguang commented 6 months ago

I also encountered this problem. I configured retry.enabled: true, but the gateway does not seem to perform load-balancer retries.

iLaoke commented 3 months ago

Interesting question. I face a similar problem, but I want to perform some special business operations before the retry: mark the previous service instance as invalid, then exclude that failing instance the next time a service instance is selected. Currently I don't know how to catch the 500 response or the exception from the first attempt and run these operations before the next retry.
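
The behavior described above can be sketched independently of the gateway. This is a minimal illustration, not an existing gateway or load balancer API: the class InstanceAvoidingRetry, the method callWithRetry, and the use of a Predicate to stand in for the downstream call are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch: retry a call while skipping instances that already failed.
class InstanceAvoidingRetry {

    // 'call' stands in for the downstream request; it returns true on success.
    static String callWithRetry(List<String> instances,
                                Predicate<String> call,
                                int maxAttempts) {
        Set<String> failed = new HashSet<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Select the first instance that has not failed for this request.
            String candidate = null;
            for (String instance : instances) {
                if (!failed.contains(instance)) {
                    candidate = instance;
                    break;
                }
            }
            if (candidate == null) {
                break; // every known instance has already failed
            }
            if (call.test(candidate)) {
                return candidate;
            }
            // Hook for the "special business operation": mark as invalid.
            failed.add(candidate);
        }
        throw new IllegalStateException("No instance succeeded; failed: " + failed);
    }

    public static void main(String[] args) {
        // I1 is DOWN, I2 is UP: the retry is forced onto I2.
        String served = callWithRetry(List.of("I1", "I2"), i -> i.equals("I2"), 3);
        System.out.println("served by " + served);
    }
}
```

The set of failed instances is kept per request here, so one failure does not exclude an instance for other requests. Sharing that state across requests would be a separate design decision.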