spring-projects / spring-boot

Spring Boot
https://spring.io/projects/spring-boot
Apache License 2.0

Springboot cache the IP address of the DNS Endpoint forever #19376

Closed muthu-cs closed 4 years ago

muthu-cs commented 4 years ago

When the IP behind a DNS name changes, Spring applications continue to send requests to the old IP and do not pick up the new IP address, even though the Java DNS cache TTL settings are disabled and DNS resolution at the OS level points to the new IP. It seems Spring Boot resolves the DNS name the very first time, caches the IP, and never refreshes it afterwards.

snicoll commented 4 years ago

@muthu-cs I am not sure I understood what you mean by "DB Endpoint". Did you mean the db health indicator? If so, Spring Boot isn't doing anything more than asking the connection pool for a connection. Can you please clarify what you meant and why you think this is an issue in Spring Boot?

muthu-cs commented 4 years ago

@snicoll You can assume any DNS endpoint. We are using Aurora DB cluster endpoints, which are nothing but Route 53 DNS names. When we fail over a database, the IP of the DB changes, but the DNS name remains the same. Though the DNS name resolves to the new IP at the OS level, Spring applications continue to connect to the old DB on the old IP. It seems Spring caches the IP during application boot-up and never resolves the DNS name again later on. Hope that helps.

snicoll commented 4 years ago

I am afraid it doesn’t. Can you please answer my questions?

muthu-cs commented 4 years ago

During our test, a vanilla Java program trying to access a DNS endpoint always resolves to the latest IP bound to the DNS endpoint. However, Spring applications resolve to an IP at application boot-up time and never resolve to the latest IP unless we reboot the app server. It seems Spring, or some Spring library that does the DNS resolution, caches the IP during initialization and never resolves the DNS name again or honors the DNS cache TTL.

snicoll commented 4 years ago

I am afraid you already said that, and that didn't answer my questions either. I see no evidence of Spring Boot being involved in this (vs. the connection pool configuration, which you haven't shared either). To help us move forward, please share a small sample (zip or GitHub repo) we can run ourselves to reproduce the issue.

spring-projects-issues commented 4 years ago

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

muthu-cs commented 4 years ago

Sure, I will provide a sample program to reproduce the issue tomorrow.

spring-projects-issues commented 4 years ago

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open the issue.

jaswindervirk commented 4 years ago

We recently upgraded our Spring WebFlux application to 2.1.10 and are facing this issue. We use Route 53 for some microservices, but when such a service shifts its load to failover, we still hit its old IP. It seems Spring somehow caches the old IP at the time the application starts. Possible solutions:

  1. Reboot your application.
  2. When making the TCP connection, set keepAlive to false so that a new HTTP connection is made every time.

wilkinsona commented 4 years ago

@jaswindervirk If making a new connection every time fixes the problem, it does not sound like a DNS caching problem. If it were, requiring a new connection every time would not help as the new connection would attempt to connect to the old IP address. It sounds to me like a kept-alive connection is not being closed when you attempt to shift the load. Reuse of that connection will then result in requests being sent to the old IP address.

njssferreira commented 4 years ago

During our test, a vanilla Java program trying to access a DNS endpoint always resolves to the latest IP bound to the DNS endpoint. However, Spring applications resolve to an IP at application boot-up time and never resolve to the latest IP unless we reboot the app server. It seems Spring, or some Spring library that does the DNS resolution, caches the IP during initialization and never resolves the DNS name again or honors the DNS cache TTL.

Hey, did you manage to solve this issue? I'm facing exactly the same issue...

jemmy-dangi12 commented 3 years ago

Facing the exact same issue. Please let us know how you resolved it.

smithapitla commented 3 years ago

We are facing the same issue. Please let us know how to resolve this.

YuLimin commented 2 years ago

Please provide a sample project that reproduces this issue so that the root cause can be analyzed.

agaddamu commented 2 years ago

Experiencing the same issue. Traffic toggle isn't being honored. Old IP address is being cached.

Scenario

East and West regions
East region: blue and green stacks
West region: blue and green stacks
Active stacks: Blue
Inactive stacks: Green

Deployment Action

Toggled 100% traffic from Blue to Green and observed traffic sticking to the Blue stacks.

Remediation

  1. Toggled 100% traffic to Single Region (East) and brought down the traffic to 0% on WEST
  2. Waited for 5 mins
  3. Toggled traffic back to both regions and did not observe sticky ip addresses on WEST
  4. Repeated the same for EAST.

Is there any guidance on the internals of Spring WebClient?

wilkinsona commented 2 years ago

@agaddamu It depends on the underlying HTTP client that you're using with WebClient. The default is Reactor Netty. Without knowing how your traffic toggle behaves, I doubt that anyone will be able to offer much guidance. As said above, it could be a problem with keep-alive connections or it could be something else entirely. If you have any further questions, please follow up on Stack Overflow or Gitter. As mentioned in the guidelines for contributing, we prefer to use GitHub issues only for bugs and enhancements.

/cc @violetagg for awareness

rstoyanchev commented 2 years ago

By WebClient internals, you mean the underlying HTTP client. By default, this is Reactor Netty. Please check the section on host name resolution and the available settings for caching and eviction.

wilkinsona commented 2 years ago

Thanks, @rstoyanchev!

praveenchandran commented 2 years ago

Maybe this could help someone stumbling upon this issue. The issue could be related to the default TTL of the JVM DNS cache. It is set to forever (-1) by default when a security manager is installed. You can set the cache TTL in your jre/lib/security/java.security file as follows:

networkaddress.cache.ttl=60
networkaddress.cache.negative.ttl=60

PS: Setting the value too low could introduce latencies and increase the number of DNS lookups in your application. Ref: https://docs.oracle.com/javase/7/docs/technotes/guides/net/properties.html
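
Where editing the java.security file is not possible, the same two properties can also be set from code. The following is only a minimal sketch, not something proposed by the Spring team in this thread: `DnsCacheSettings` and `apply` are hypothetical names, and it assumes the call runs before the JVM performs its first host name lookup, because the JDK reads these values once when its address cache policy is initialized.

```java
import java.security.Security;

/** Illustrative helper; the class and method names are not from the thread. */
final class DnsCacheSettings {

    private DnsCacheSettings() {
    }

    /**
     * Sets the JVM-wide DNS cache TTLs, in seconds, as security properties.
     * Assumption: this is called before the first host name lookup, since
     * the JDK reads the values only once.
     */
    static void apply(int positiveTtlSeconds, int negativeTtlSeconds) {
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) {
        apply(60, 60); // same values as in the java.security example above
        System.out.println("networkaddress.cache.ttl="
                + Security.getProperty("networkaddress.cache.ttl"));
        System.out.println("networkaddress.cache.negative.ttl="
                + Security.getProperty("networkaddress.cache.negative.ttl"));
    }
}
```

In an application, `DnsCacheSettings.apply(60, 60)` would typically be the first statement of `main`, ahead of any framework startup code.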

ikorchynskyi commented 1 year ago

This may be the keep-alive issue. I had a similar problem with RestTemplate, and it seems that the final fix is to add a Connection: close header to requests to the other microservices.
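
To illustrate what sending that header looks like, here is a hedged sketch that uses the JDK's own `HttpURLConnection` instead of RestTemplate, so that it runs without extra dependencies. The class and method names are hypothetical, and the throwaway local server exists only to show which `Connection` header actually arrives. With RestTemplate or WebClient the header would be set through that client's own API.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative demo; class and method names are hypothetical. */
final class ConnectionCloseDemo {

    /** Sends a GET that asks for the connection to be closed; returns the status code. */
    static int getWithConnectionClose(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestProperty("Connection", "close"); // opt out of keep-alive
            conn.setConnectTimeout(5_000);
            conn.setReadTimeout(5_000);
            try (InputStream body = conn.getInputStream()) {
                body.readAllBytes(); // drain the response
            }
            return conn.getResponseCode();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Starts a throwaway local server and returns the Connection header it received. */
    static String connectionHeaderSeenByLocalServer() {
        AtomicReference<String> seen = new AtomicReference<>();
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                seen.set(exchange.getRequestHeaders().getFirst("Connection"));
                exchange.sendResponseHeaders(204, -1); // no response body
                exchange.close();
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                getWithConnectionClose("http://127.0.0.1:" + port + "/");
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("Connection header seen by server: "
                + connectionHeaderSeenByLocalServer());
    }
}
```

Closing the connection after each request trades some latency for a fresh connection, and therefore a fresh address lookup, on every call.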

viettrung9012 commented 1 year ago

I was facing a similar issue today. Setting the cache TTL programmatically with Security.setProperty("networkaddress.cache.ttl", "0"); could not override the value set in jre/lib/security/java.security. What I found is that when running Spring Boot from IntelliJ IDEA, it runs with the JMX agent by default. When the agent starts up, it calls InetAddressCachePolicy once, so the value is fetched from the java.security file. If we run without the JMX agent, Security.setProperty("networkaddress.cache.ttl", "0") works fine.

TL;DR: an agent can cause the cache policy to be fixed before the Spring Boot application starts. The more reliable way to set the cache setting is to change the jre/lib/security/java.security file.

My setup was SpringBoot 2.7.16, JDK 11

banandh commented 11 months ago

We are also facing the same issue. In UAT we don't have the option to update the JVM security file, so we were setting the values via the command line and System.setProperty, but that's not helping us. We again ended up with the IP address caching issue. We need to understand what is going wrong. Also, how can we verify that the DNS cache is flushed properly?
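
One thing worth checking here: networkaddress.cache.ttl is a security property, so the JDK does not read it from -D flags or System.setProperty. The system property it does consult as a fallback is sun.net.inetaddr.ttl. The sketch below is a hypothetical diagnostic (`DnsProbe` is not an existing tool) that prints both settings and resolves a host name repeatedly, so a change of IP can be observed. It only reflects the cache of the JVM it runs in, so to see what the application sees, the same calls would have to run inside the application's process.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Illustrative diagnostic; class and method names are hypothetical. */
final class DnsProbe {

    /** Returns the addresses this JVM currently resolves for the host, or an empty list on failure. */
    static List<String> resolve(String host) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            System.err.println("Lookup failed for " + host + ": " + e.getMessage());
        }
        return addresses;
    }

    /** Describes the TTL settings that are visible to this JVM (null means not set). */
    static String describeSettings() {
        return "networkaddress.cache.ttl (security property) = "
                + Security.getProperty("networkaddress.cache.ttl")
                + ", sun.net.inetaddr.ttl (system property) = "
                + System.getProperty("sun.net.inetaddr.ttl");
    }

    public static void main(String[] args) throws InterruptedException {
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(describeSettings());
        // Resolve every 5 seconds; trigger the failover while this loop runs.
        for (int i = 0; i < 10; i++) {
            System.out.println(Instant.now() + " " + host + " -> " + resolve(host));
            Thread.sleep(5_000);
        }
    }
}
```

If the printed addresses change after the failover while the application keeps talking to the old IP, the cause is more likely a pooled or kept-alive connection than the JVM DNS cache.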