spring-cloud / spring-cloud-netflix

Integration with Netflix OSS components
http://cloud.spring.io/spring-cloud-netflix/
Apache License 2.0

Eureka Service Discovery, after a network failure, updates the client status to DOWN, but when the HEARTBEAT is done from the client it does not change status to UP #4278

Open viktoralfa19 opened 5 months ago

viktoralfa19 commented 5 months ago

We have the following problem in production. For some reason (we assume network problems), the client asks the Service Discovery server to update the instance status to DOWN. Once the network recovers, the client resumes sending the HEARTBEAT normally with status UP, but the instance in Eureka stays in the DOWN state. As a result, the applications that consume that service receive a 404 error on their requests.

In summary, the Eureka Service Discovery server marks the instance as DOWN and never changes its status back to UP, so the service does not become available again to the applications that need it.

Context (important production values have been moved to a local environment):

The clients are microservices developed in .NET 6.

The current configuration of the services is:

"eureka": {
  "client": {
    "serviceUrl": "http://localhost:8761/eureka/",
    "shouldRegisterWithEureka": true,
    "shouldFetchRegistry": false,
    "health": {
      "enabled": false,
      "checkEnabled": false
    },
    "validateCertificates": false
  },
  "instance": {
    "appName": "iservice-api",
    "port": 5007,
    "leaseRenewalIntervalInSeconds": 20,
    "leaseExpirationDurationInSeconds": 25,
    "instanceEnabledOnInit": true,
    "statusPageUrlPath": "/health",
    "homePageUrlPath": "/health",
    "healthCheckUrlPath": "/health",
    "preferIpAddress": true
  }
}

In the Program class we load the Eureka client like this:

hostBuilder.AddDiscoveryClient()

We have a custom HealthCheck in each client at the path /health.

The Eureka server runs in a Spring Boot application.

This is its configuration:

spring:
  application:
    name: discovery-service

server:
  port: 8761

eureka:
  environment: ${ENVIRONMENT}
  client:
    registerWithEureka: false
    fetchRegistry: false
  server:
    waitTimeInMsWhenSyncEmpty: 2
    enableSelfPreservation: false

We use these packages:

spring-boot-starter-parent, version 3.2.4
spring-cloud-starter-netflix-eureka-server, version 4.1.1

And the entry point is simply this:

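import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.server.EnableEurekaServer;

// Standalone Eureka server; all settings come from the YAML configuration above.
@SpringBootApplication
@EnableEurekaServer
public class ServiceDiscoveryApplication {

    public static void main(String[] args) {
        SpringApplication.run(ServiceDiscoveryApplication.class, args);
    }

}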

We have replicated something similar to the error locally. We start our Eureka server and our clients:

[Screenshot 2024-04-11 at 18 29 31]

The instance registers normally:

[Screenshot 2024-04-11 at 18 31 35]

But then we mark the instance as DOWN:

[Screenshot 2024-04-11 at 18 27 29] [Screenshot 2024-04-11 at 18 29 00]

Once everything is back to normal, the heartbeats are sent normally and received OK, but the service status is still DOWN:

[Screenshot 2024-04-11 at 18 28 24]

We can even replicate all of this with Postman. If you can guide us on what we are doing wrong or what we need to configure, we would greatly appreciate it.
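For readers who want to replay the sequence described above (query the instance, mark it DOWN, send a heartbeat) without Postman, here is a minimal Java sketch that only builds the requests and prints them. It follows the REST operations documented for Netflix Eureka, applied to the serviceUrl from the configuration above. The instance ID used in main is hypothetical, since the thread does not show the real one, and the class and method names are illustrative only. In that documentation, a status set through PUT .../status is described as an override, and DELETE .../status is described as the way to remove it. Whether that is the cause of the behavior reported here is not established in this thread.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;
import java.util.List;

public class EurekaRestSketch {
    // Base URL taken from the "serviceUrl" value in the client configuration.
    static final String BASE = "http://localhost:8761/eureka/";

    // Resource of a single registered instance: apps/{appName}/{instanceId}.
    public static URI instanceUri(String appName, String instanceId) {
        return URI.create(BASE + "apps/" + appName + "/" + instanceId);
    }

    // Query one instance, to inspect the status the server currently reports.
    public static HttpRequest query(String appName, String instanceId) {
        return HttpRequest.newBuilder(instanceUri(appName, instanceId))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Heartbeat (lease renewal): PUT on the instance resource, no body.
    public static HttpRequest heartbeat(String appName, String instanceId) {
        return HttpRequest.newBuilder(instanceUri(appName, instanceId))
                .PUT(BodyPublishers.noBody())
                .build();
    }

    // Status change request: PUT .../status?value={status}, for example DOWN.
    public static HttpRequest overrideStatus(String appName, String instanceId, String status) {
        URI uri = URI.create(instanceUri(appName, instanceId) + "/status?value=" + status);
        return HttpRequest.newBuilder(uri).PUT(BodyPublishers.noBody()).build();
    }

    // Removal of a status override: DELETE .../status?value={fallbackStatus}.
    public static HttpRequest removeOverride(String appName, String instanceId, String fallbackStatus) {
        URI uri = URI.create(instanceUri(appName, instanceId) + "/status?value=" + fallbackStatus);
        return HttpRequest.newBuilder(uri).DELETE().build();
    }

    public static void main(String[] args) {
        String app = "iservice-api";               // appName from the client configuration
        String id = "localhost:iservice-api:5007"; // hypothetical instance ID, not from the thread

        List<HttpRequest> sequence = List.of(
                query(app, id),
                overrideStatus(app, id, "DOWN"),
                heartbeat(app, id),
                query(app, id),
                removeOverride(app, id, "UP"));

        // Print only; sending them would require java.net.http.HttpClient and a running server.
        for (HttpRequest request : sequence) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

Comparing the two query responses, before and after the heartbeat, shows whether the status reported by the server changed. The final DELETE is included only so the documented counterpart of the status PUT can be tried in the same local reproduction.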

shawnplay commented 4 months ago

Have you set the eureka.instance.initial-status configuration and what is its value?

viktoralfa19 commented 3 months ago

This is the configuration in the client:

image

OlgaMaciaszek commented 5 days ago

Hello @viktoralfa19, thanks for reporting the issue. Please learn how to properly format code and logs. Please provide a minimal, complete, verifiable example that reproduces the issue - in the form of a link to a separate repo with an executable app and steps to perform in order to recreate the issue.