spring-cloud / spring-cloud-commons

Common classes used in different Spring Cloud implementations
Apache License 2.0
707 stars 704 forks source link

MapPropertySources created by EnvironmentPostProcessor run twice and 2nd run wins #476

Closed jxblum closed 5 years ago

jxblum commented 5 years ago

In a nutshell, when the org.springframework.cloud:spring-cloud-services-starter-service-registry dependency is added to the CLASSPATH of a Spring Boot, Pivotal GemFire ClientCache application deployed to PCF, connected to PCC, the dependency loses critical, contextual information stored in the Environment by Spring Boot for Pivotal GemFire (SBDG) using a custom PropertySource.

As a result, the Spring Boot, Pivotal GemFire ClientCache application (based on SBDG) is unable to authenticate and connect to the PCC instance deployed in PCF.

You can successfully reproduce this issue by doing the following:

  1. Clone the SBDG project.

  2. Create a topic branch from 1.0.0.M3 (here).

NOTE: You cannot simply branch from the HEAD of master since I have already checked in a "workaround" to this issue in commit 2e26325.

  1. Add the org.springframework.cloud:spring-cloud-services-starter-service-registry to the CLASSPATH of the SBDG spring-geode-autoconfigure module (for example).

  2. Then, run the AutoConfiguredCloudSecurityContextIntegrationTest class and see the test fail!

Without the org.springframework.cloud:spring-cloud-services-starter-service-registry dependency on the CLASSPATH, the AutoConfiguredCloudSecurityContextIntegrationTest passes as expected.

After in-depth analysis, my debugging revealed the main culprit in this issue involves the BootstrapApplicationListener class.

In summary...

The BootstrapApplicationListener, as its name implies, creates a new (temporary) "bootstrap" ApplicationContext along with a new (also, temporary) Environment object. This "bootstrap" context becomes the "parent" of the main ApplicationContext started/initialized by the Spring Boot application on startup. The BootstrapApplicationListener additionally copies/removes critical information (e.g. a SBDG custom PropertySource) from the main Environment to the "bootstrap" Environment object, which is later discarded from the "bootstrap" Environment. This "bootstrap" Environment is somehow passed to the SDG Security configuration handling classes, thus causing the underlying SDG @Configuration class used to authenticate the GemFire ClientCache app instance with PCC in PCF to not have the necessary Security (auth) credentials available to it upon inspecting the Environment, leading to an eventual authentication error.

Anyway, I tracked the beginning of this issue down to this snippet of code.
Because SBDG used a PropertiesPropertySource to capture the Security (auth) credentials from the VCAP environment into a custom named PropertySource, and the Spring PropertiesPropertySource is an EnumerablePropertySource, this is how the SBDG custom name PropertySource gets copied and lost, unlike other PropertySources, which Spring Boot adds, that won't get lost because they are not EnumerablePropertySources. :(

Additional details for this problem may/will be provided below, in follow up comments, as necessary.

scottfrederick commented 5 years ago
  1. Add the org.springframework.cloud:spring-cloud-services-starter-service-registry to the CLASSPATH of the SBDG spring-geode-autoconfigure module (for example).

I was able to follow the steps above but add this Spring Cloud dependency instead of the Spring Cloud Services dependency:

testRuntime("org.springframework.cloud:spring-cloud-starter-netflix-eureka-server:2.0.2.RELEASE") {
    exclude group: "org.springframework.boot", module: "spring-boot-starter-logging"
}

This also reproduces the problem, and reduces the surface area a bit to Spring Cloud OSS libs.

spencergibb commented 5 years ago

This in fact is the minimum required

testRuntime("org.springframework.cloud:spring-cloud-starter:2.0.2.RELEASE")
spencergibb commented 5 years ago

See https://github.com/spring-cloud/spring-cloud-commons/commit/12b9bfbe for the original commits of related functionality.

spencergibb commented 5 years ago

I am unable to reproduce this in a small standalone project. I added VCAP_APPLICATION and VCAP_SERVICES locally and it works fine. Something about running in PCF and John's test makes the problem occurr (In john's test the env post processor runs 4 times, all my tests run it just twice). If anyone can help repdocuce this locally, that would help. /cc @wxlund

wxlund commented 5 years ago

Why isn't the test case described by @jxblum adequate for confirming the bug where the PCC/PCF env is mocked? We know the test case of pushing to the app to PCF breaks every time its executed and this bug caused a lot of lost time in discovering and isolating.

spencergibb commented 5 years ago

@wxlund I need a way to reproduce it locally so we can find a fix (we still don't know why it happens in john's test vs locally) and test against regression.

smanukyan commented 5 years ago

@spencergibb - we have seen this issue consistently on local as well as in PCF, here is the github code to replicate the issue https://github.com/wxlund/spring-boot-for-gemfire-geode-samples

spencergibb commented 5 years ago

@smanukyan That is the project I was using locally.

smanukyan commented 5 years ago

@spencergibb - I just tried it locally, and it failed again. The app was running locally and pointing to remote PCC instance running as a service in PCF via VCAP_SERVICES local env var. Have you been running the PCC server locally or in PCF?

spencergibb commented 5 years ago

@smanukyan a local geode.

smanukyan commented 5 years ago

@spencergibb - can you please confirm, you can replicate the issue when your local app is connected to PCC running in PCF?

spencergibb commented 5 years ago

I don't know how to do that

spencergibb commented 5 years ago

https://github.com/spring-projects/spring-boot-data-geode/pull/22

jxblum commented 5 years ago

And then... https://github.com/spring-projects/spring-boot-data-geode/pull/22#issuecomment-462470763

jxblum commented 5 years ago

Additionally, my test is local, FTR. It only mocks a PCF (with PCC) environment.

jxblum commented 5 years ago

Finally, I would also add... 1) why does SCC's BootstrapApplicationListener only care about EnumerablePropertySource objects (?), and 2) why does this class remove EnumerablePropertySource objects from the main Environment object (which it copied to the bootstrapEnvironment) when it finishes its work?

jxblum commented 5 years ago

1 more thing... I wrote a example/test Spring Boot application with Spring Cloud on the classpath reproducing the issue. This example/test does not use SBDG.

See here for more details: https://github.com/jxblum/spring-boot-cloud-example

smanukyan commented 5 years ago

thank you for fixing the bug! we are currently using spring-cloud-commons version: 2.1.0.M1 , will this issue be fixed there as well? or we will need to fall back to 2.0.3.RELEASE?

spencergibb commented 5 years ago

why are you using a milestone? This will be part of 2.0.3.RELEASE and 2.1.1.RELEASE

markpollack commented 5 years ago

FWIW, I ran into the same issue with environmentpostprocessors for the new java-cfenv project and didn't report it since it was tied up in a larger project to reproduce (data flow). Does this mean env-post-processors will no longer run twice?

I'm hacking around this now for logging by keeping an invocation count

https://github.com/pivotal-cf/java-cfenv/blob/master/java-cfenv-boot/src/main/java/io/pivotal/cfenv/spring/boot/CfDataSourceEnvironmentPostProcessor.java#L71

spencergibb commented 5 years ago

@markpollack no it doesn't.