Open dimzul opened 2 months ago
@spring-cloud-issues Is there any update on this issue? I am facing the same problem described here and am looking for help. I also commented on the open issue https://github.com/spring-cloud/spring-cloud-gateway/issues/561. Could you please share when this is expected to be fixed? I tried all the workarounds mentioned, with no luck.
@dimzul did you find any workarounds for this problem? I am happy to connect with you to discuss further.
@bindupatnaik, unfortunately, no: none of the provided solutions has any effect on the DNS cache TTL in Netty. I debugged it locally and tested it in a real cluster, and got the same result both times: the default TTL was applied. Switching to the JVM built-in resolver also had no effect:
@Override
public HttpClient customize(HttpClient httpClient) {
    httpClient
        .resolver(DefaultAddressResolverGroup.INSTANCE)
        .tcpConfiguration(tcpClient -> tcpClient.resolver(DefaultAddressResolverGroup.INSTANCE));
    return httpClient;
}
If you find a solution, please share it here.
@dimzul
This configuration is not quite correct. Use either HttpClient#resolver or HttpClient#tcpConfiguration, but never both. I would recommend HttpClient#resolver. HttpClient#tcpConfiguration is deprecated, and everything you can configure there can be configured by calling HttpClient directly.
DefaultAddressResolverGroup.INSTANCE is the JDK's built-in domain name lookup mechanism, so you need to use the JDK configuration for the TTL.
I also do not recommend using HttpClient#from, which is also deprecated.
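For reference, the JDK-side TTL mentioned above is controlled through the security properties networkaddress.cache.ttl and networkaddress.cache.negative.ttl. The following is a minimal sketch of setting them programmatically; the class name JdkDnsCacheTtl and the values 30 and 5 are illustrative assumptions, not something taken from this thread. It only affects lookups that go through the JDK resolver, so it is relevant only when DefaultAddressResolverGroup.INSTANCE is actually in use.

```java
import java.security.Security;

public class JdkDnsCacheTtl {

    // Sets the JDK's DNS cache TTLs (in seconds) for successful and failed lookups.
    // Assumption: this is called early at startup, before the first hostname lookup,
    // since the JDK is expected to read these values when its cache policy initializes.
    static void configure(int positiveTtlSeconds, int negativeTtlSeconds) {
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) {
        // Illustrative values only: cache hits for 30 s, failures for 5 s.
        configure(30, 5);
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
        System.out.println(Security.getProperty("networkaddress.cache.negative.ttl"));
    }
}
```

The same properties can also be set in the JDK's java.security file instead of in code.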
Note that the fluent config methods in reactor-netty's HttpClient don't modify the instance; they configure and return a duplicated instance. This has bitten me before, and I ultimately solved it by reassigning the result of each call or returning the entire chain. Try this:
@Override
public HttpClient customize(HttpClient httpClient) {
    return httpClient
        .resolver(DefaultAddressResolverGroup.INSTANCE)
        .tcpConfiguration(tcpClient -> tcpClient.resolver(DefaultAddressResolverGroup.INSTANCE));
}
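The difference between the two customizers can be reproduced without reactor-netty. The sketch below uses a hypothetical immutable FakeClient class (a stand-in written for this illustration, not a reactor-netty type) whose fluent method returns a configured copy, which is the behavior described above.

```java
public class FluentPitfall {

    // Hypothetical stand-in for an immutable fluent client; NOT a reactor-netty type.
    static final class FakeClient {
        final String resolver;

        FakeClient(String resolver) {
            this.resolver = resolver;
        }

        // Returns a configured copy and leaves this instance untouched.
        FakeClient resolver(String name) {
            return new FakeClient(name);
        }
    }

    // Mirrors the first customizer: the result of the fluent call is discarded.
    static FakeClient customizeDiscarding(FakeClient client) {
        client.resolver("jdk");
        return client;
    }

    // Mirrors the suggested fix: the configured copy is returned.
    static FakeClient customizeReturning(FakeClient client) {
        return client.resolver("jdk");
    }

    public static void main(String[] args) {
        FakeClient original = new FakeClient("default");
        System.out.println(customizeDiscarding(original).resolver); // prints "default"
        System.out.println(customizeReturning(original).resolver);  // prints "jdk"
    }
}
```

Only the second variant hands the configured client back to the caller; in the first, the configuration is applied to a copy that is immediately thrown away.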
Problem
In a k8s environment, multiple instances of the same service are hidden behind a k8s Service name (like my-test.my-namespace.svc.cluster.local). The same goes for DNS servers in k8s: multiple instances are hidden behind a k8s Service. When one DNS server instance dies and comes back on a new k8s node with a different IP address, the old, cached IP address is still used for DNS resolution, because Netty (a transitive dependency of project-reactor) caches the IP addresses of DNS servers via DnsNameResolverBuilder and DefaultAuthoritativeDnsServerCache for Integer.MAX_VALUE seconds by default. This results in a request to an IP address with no listening DNS server and causes the following error:
Steps to reproduce
Following suggestions by @violetagg and @spencergibb on customizing the DNS cache TTL and TcpClient in Spring Cloud Gateway, the following configuration was made:
With this configuration, multiple instances of DnsNameResolverBuilder were created: 2 with the configured cache TTL and 2 with the default cache TTL:
But when an actual request comes in, the DnsNameResolverBuilder with the default cache TTL configuration is used, and a DNS cache with the default TTL (2147483647 seconds) is applied:
Expected result
There should be a way to configure the DNS cache TTL via Spring Framework.
Versions
spring boot/spring-cloud-starter-gateway/spring-boot-starter-webflux: 3.2.8
reactor-netty-http: 1.1.21
netty: 4.1.111.Final