spring-cloud / spring-cloud-commons

Common classes used in different Spring Cloud implementations
Apache License 2.0
707 stars 702 forks source link

Calling refresh on actuator refresh endpoint results in Application run/startup failed #380

Open bowczarek opened 6 years ago

bowczarek commented 6 years ago

Hi there, In my current project we're making use of spring cloud configuration where every configuration client within particular application/micro service, is refreshing configuration in time interval manner, lets say each 10 minutes. Please bear in mind that we're also using failFast=true.

In order to handle scheduled refreshes, we have more less below solution:

@Configuration
@EnableScheduling
public class RefreshContextConfig {

    private RefreshEndpoint refreshEndpoint;

    public RefreshContextConfig(RefreshEndpoint refreshEndpoint) {
        this.refreshEndpoint = refreshEndpoint;
    }

    @Scheduled(fixedDelay = 600000, initialDelay = 600000)
    public void refreshContextPeriodically() {
        refreshEndpoint.refresh();
    }
}

Now the problem we're having is, if in any case configuration server is not available (or git server where we keep the configuration) and this refreshEndpoint.refresh(); fails, it's always logged as (using boot 1.5.x):

2018-06-08 13:23:42.796 INFO  [my_api,28997f9536412494,28997f9536412494]  [pool-2-thread-1] c.c.c.ConfigServicePropertySourceLocator :Fetching config from server at: http://localhost:8888
2018-06-08 13:23:42.811 ERROR [my_api,28997f9536412494,28997f9536412494]  [pool-2-thread-1] o.s.boot.SpringApplication               :Application startup failed

java.lang.IllegalStateException: Could not locate PropertySource and the fail fast property is set, failing

The error is misleading, it says that Application startup failed but in fact the refresh failed instead, application is still running, so it should say Refresh failed instead.

It looks like there's some overlook in the design, unless I'm missing something. Namely, failFast is probably meant to be evaluated during an application startup, to fail the application immediately (as docs say). However, the programmatic refresh or rest call refresh to /refresh endpoint while application is up and running still results in the same misleading message and the application is not stopped. I would expect that either validation of failFast is skipped while calling refresh and should be only considered at bootstrap or the application should actually be stopped. I'm not sure what was the design purpose but it's not clear at the moment.

If we take a look into the code, we can clearly see that the refresh method always registers BootstrapApplicationListener, even if it's actually called after the application is already bootstrapped (it looks odd from design standpoint).

If we also take a look into the documentation of actuator endpoints we can find following:

POST to /env to update the Environment and rebind @ConfigurationProperties and log levels
/refresh for re-loading the boot strap context and refreshing the @RefreshScope beans
/restart for closing the ApplicationContext and restarting it (disabled by default)
/pause and /resume for calling the Lifecycle methods (stop() and start() on the ApplicationContext)

so I would assume that even though it says re-loading the boot strap context apart from refreshing the @RefreshScope beans it should behave in a different way as I described above.

I also tried with spring boot 2.x, and here seems we have a small improvement, this time log looks like:

2018-06-08 13:25:30.736 INFO  [my_api,cdfafc417d45d831,cdfafc417d45d831]  [pool-3-thread-1] c.c.c.ConfigServicePropertySourceLocator :Fetching config from server at: http://localhost:8888
2018-06-08 13:18:59.943 ERROR [my_api,9d2a7bef3b11bff1,9d2a7bef3b11bff1]  [pool-3-thread-1] o.s.boot.SpringApplication               :Application run failed
java.lang.IllegalStateException: Could not locate PropertySource and the fail fast property is set, failing

So now, we're getting Application run failed instead of Application startup failed :) However, it's still misleading.

Finally, if we disable failFast (for the sake of test, we still need it), then of course it does not throw exception this time, instead it logs following warn:

2018-06-08 14:22:05.996 INFO  [my_api,5e70755697f2d008,5e70755697f2d008]  [pool-2-thread-1] s.c.a.AnnotationConfigApplicationContext :Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@4ac44338: startup date [Fri Jun 08 14:22:05 CEST 2018]; root of context hierarchy
2018-06-08 14:22:06.036 INFO  [my_api,5e70755697f2d008,5e70755697f2d008]  [pool-2-thread-1] trationDelegate$BeanPostProcessorChecker :Bean 'configurationPropertiesRebinderAutoConfiguration' of type [org.springframework.cloud.autoconfigure.ConfigurationPropertiesRebinderAutoConfiguration$$EnhancerBySpringCGLIB$$b882a29e] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2018-06-08 14:22:06.149 INFO  [my_api,5e70755697f2d008,5e70755697f2d008]  [pool-2-thread-1] c.c.c.ConfigServicePropertySourceLocator :Fetching config from server at: http://localhost:8888
2018-06-08 14:22:06.153 WARN  [my_api,5e70755697f2d008,5e70755697f2d008]  [pool-2-thread-1] c.c.c.ConfigServicePropertySourceLocator :Could not locate PropertySource: I/O error on GET request for "http://localhost:8888/my_api/development": Connection refused (Connection refused); nested exception is java.net.ConnectException: Connection refused (Connection refused)
2018-06-08 14:22:06.154 INFO  [my_api,5e70755697f2d008,5e70755697f2d008]  [pool-2-thread-1] o.s.boot.SpringApplication               :The following profiles are active: development
2018-06-08 14:22:06.156 INFO  [my_api,5e70755697f2d008,5e70755697f2d008]  [pool-2-thread-1] s.c.a.AnnotationConfigApplicationContext :Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@3b88507d: startup date [Fri Jun 08 14:22:06 CEST 2018]; parent: org.springframework.context.annotation.AnnotationConfigApplicationContext@4ac44338
2018-06-08 14:22:06.194 INFO  [my_api,5e70755697f2d008,5e70755697f2d008]  [pool-2-thread-1] o.s.boot.SpringApplication               :Started application in 0.395 seconds (JVM running for 48.556)

This is also misleading if you pay attention to it, unless it's by design. Namely, it says :Started application in 0.395 seconds (JVM running for 48.556) which is not the case I guess. It suggests that application was restarted, like we would've called /restart endpoint or something like that.

bowczarek commented 6 years ago

As suggested by @wilkinsona I moved the issue here (old one was raised in spring-boot https://github.com/spring-projects/spring-boot/issues/13427).

davinderrai commented 5 years ago

Vuican