spring-cloud / spring-cloud-netflix

Integration with Netflix OSS components
http://cloud.spring.io/spring-cloud-netflix/
Apache License 2.0
4.87k stars 2.44k forks source link

Eureka initial status should be Starting #2834

Open twicksell opened 6 years ago

twicksell commented 6 years ago

Suggested enhancement: Default eureka.instance.initialStatus to Starting rather than Up. This may add a bit of registration lag in the case of demos, but I think it is the generally safer approach. Having an initial status of Starting means that the application should not receive traffic until it completely finishes initialization.

In the case where an application is slow to start (due to cache warming, off box calls, etc.) an application would be unable to successfully respond to incoming requests during this time, leading to service instability any time new instances are launched.

This should also cover the case where an application's health check returns a negative value due to some misconfiguration. Instead of registering as Up and then transitioning to Down while taking some traffic in between, the app would transition from Starting (no traffic) to Down (still no traffic).

ryanjbaxter commented 6 years ago

So does it then become the developers responsibility to set the status to UP?

twicksell commented 6 years ago

The HealthCheckHandler will start reporting UP/DOWN as soon as its attached, so the developer shouldn't need to get involved. Although the way we've done it internally (prior to using Spring Cloud Netflix) was that the healtcheck handler would only be attached at the time of the last ApplicationContextEvent.

One way or another, you don't want to be registered as UP with the remote eureka server until the context has been fully created. In the case of long startup times, its helpful to get an initial registration of Starting earlier rather than later so you can leverage Eureka registry data in tooling. You can now observe how long apps take to start. But if you didn't care about that, you could just ensure all Eureka registration happens after app context creation.

twicksell commented 6 years ago

Maybe a useful point of context, internally we've had developers request some extra lifecycle such that they could manually transition their service from Starting to Up, and then let the health check handler manage status from that point forward. We had that feature for quite a while and ended up retiring it because of the additional complexity. We instead ask developers to constrain any initialization within the lifecycle of the AppContext / Injector (in the case of Guice). That way there is only one application lifecycle to worry about.

ryanjbaxter commented 6 years ago

I thought it was the heartbeat that determined whether an app was UP by default. If actuator is on the path and you set

eureka:
  client:
    healthcheck:
      enabled: true

then we will propagate the actuator health status. I could be wrong....

twicksell commented 6 years ago

The heartbeat is just replicating what the local instance considers its status to be out to the remote Eureka server. If the healthcheck handler hasn't been attached yet, and we change this default, heartbeat will replicate Starting to the remote servers. If you attach the healthcheck handler, it'll replicate Up or Down based on what the healthcheck says.

spencergibb commented 6 years ago

eureka.instance.initial-status=UP would be the backward compatible option.

spencergibb commented 6 years ago

@twicksell what would attaching the healthcheck handler later look like? I think we're in favor of this with changes to the healthcheck handler.

twicksell commented 6 years ago

I'd think something along the lines of:

    @Autowired
    EurekaClient client;

    @Autowired
    HealthCheckHandler handler; // should be org.springframework.cloud.netflix.eureka.EurekaHealthCheckHandler

    @EventListener(ApplicationReadyEvent.class)
    public void attachHealthCheckHandlerToEureka() {
        client.registerHealthCheck(handler);
    }

The only edge case (which I think is handled, but pointing out for completeness) is that if the user disables all Spring Boot healthchecks, we would still want some default "always-healthy" fallback behavior for that health check handler. Otherwise, the application would remain in the Starting status forever.

brenuart commented 6 years ago

Could be I missed something, but using an Eureka HealthCheckHandler is not required to support instances starting with a STARTING status then UP once fully started.

We do have the same requirement and simply did the following: First, configure eureka with eureka.instance.initial-status = STARTING Second, add the following to change the status to UP when the application is fully started:

@Configuration
static class EurekaUpWhenStarted {
    @Autowired
    private ApplicationInfoManager appInfoManager;

    @EventListener
    public void onApplicationReadyEvent(ApplicationReadyEvent event) {
        appInfoManager.setInstanceStatus(InstanceStatus.UP);
    }

    @EventListener
    public void onContextStartedEvent(ContextStartedEvent event) {
        appInfoManager.setInstanceStatus(InstanceStatus.UP);
    }
}

Note that the status reported by the AppInfoManager will be discarded in favour of the status reported by any registered health check. So with that small config you have the best of both worlds:

But be aware that using an Eureka HealthCheckHandler may lead to strange surprises as described in #1571.

Working like a charm at least up to Edgware.SR3

twicksell commented 6 years ago

You're correct @brenuart that health check is not required. We've just always used some form of healthcheck, and consistently knowing the status is coming from a HealthCheck handler (even if its a no-op one) feels a little cleaner. But that's just an opinion, functionally what you described would be the same if you do want to disable all health check integration with Eureka.

brenuart commented 6 years ago

As I said, the status will always come from the healthcheck if one is registered and returns a non-null value - otherwise the status from AppInfoManager is taken.

The bottom line is STARTING/UP scenario and supporting healthcheck are two different issues. And it is true that currently STARTING/UP is not supported, but the solution should not involve nor require a healthcheck to be registered.

Sorry if I misunderstood your point - it's getting late already here :( I just wanted to point out that the simple configuration shown above (+ a fix for #1571) was enough for us to support STARTING/UP + Healthchecks.

spencergibb commented 6 years ago

We are to far gone for 2.0, maybe a later major release.