spring-projects / spring-data-couchbase

Provides support to increase developer productivity in Java when using Couchbase. Uses familiar Spring concepts such as template classes for core API usage and lightweight repository-style data access.
https://spring.io/projects/spring-data-couchbase
Apache License 2.0

ConfigException when used together with spring boot devtools #1567

Open tallinn1960 opened 2 years ago

tallinn1960 commented 2 years ago

In a Spring Boot application (Spring Boot 2.7.3) that uses both spring-boot-starter-couchbase and spring-boot-starter-devtools, this exception is thrown during application startup:

com.couchbase.client.core.error.ConfigException: Could not locate a single global configuration

The application runs fine nevertheless and is passing all tests.

[ Edit: later discovered it was not related to devtools ] However, if I boot the application with devtools disabled (spring.devtools.add-properties=false), the exception does not happen. So it looks like spring devtools and spring boot data couchbase interfere with each other.

This is with a dockerized couchbase cluster setup and a connection string contacting each node via localhost on different management ports. This is the connection string:

spring.couchbase.connection-string=couchbase://127.0.0.1?io.networkResolution=external&timeout.kvTimeout=10s,\
  couchbase://127.0.0.1:11210?io.networkResolution=external&timeout.kvTimeout=10s,\
  couchbase://127.0.0.1:12210?io.networkResolution=external&timeout.kvTimeout=10s

mikereiche commented 2 years ago

AbstractCouchbaseConfiguration has the @Configuration annotation.

It might be useful to investigate your class that starts the spring application.

tallinn1960 commented 2 years ago

Whether I use AbstractCouchbaseConfiguration or not makes no difference. What seemingly makes a difference is the tide, the weather, the phase of the moon, or all of them together, as my application sometimes comes up with that exception and sometimes without it. I looked into the code and found a remark on race conditions in DefaultConfigurationProvider::loadAndRefreshGlobalConfig, which is where the exception is thrown. I also noticed that the exception happens even with devtools disabled. Because of its intermittent nature, I missed that in the first post.

I simplified the cluster setup to bind to standard ports on 127.0.0.1-4, all bound to loopback on the docker host, as there are a number of issues both on the client and the server side when changing the standard ports of a node, but those are unrelated to the topic here. So the connection string is now just couchbase://127.0.0.1,127.0.0.2,127.0.0.3. The fourth node on 127.0.0.4 is just offering backup services and not supposed to handle clients. If I do not separate the backup service to the fourth node and remove that service from the other nodes, rebalancing the cluster is no longer working. I assign an alternate address to the backup node nevertheless, as not doing so is causing a Nullpointer exception in the couchbase drivers. I notified Couchbase about this and they confirmed the issue. This is all due to using the "alternate address setup" as described in the couchbase documentation, which appears to be full of bugs.

And this is the exception, with some lines added to make clear where it happens, when it happens:

com.couchbase.client.core.error.ConfigException: Could not locate a single global configuration
    at com.couchbase.client.core.config.DefaultConfigurationProvider.lambda$loadAndRefreshGlobalConfig$19(DefaultConfigurationProvider.java:411) ~[core-io-2.3.3.jar:na]
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:44) ~[reactor-core-3.4.22.jar:3.4.22]
    at reactor.core.publisher.Mono.subscribe(Mono.java:4397) ~[reactor-core-3.4.22.jar:3.4.22]
    at reactor.core.publisher.Mono.subscribeWith(Mono.java:4512) ~[reactor-core-3.4.22.jar:3.4.22]
    at reactor.core.publisher.Mono.subscribe(Mono.java:4368) ~[reactor-core-3.4.22.jar:3.4.22]
    at reactor.core.publisher.Mono.subscribe(Mono.java:4304) ~[reactor-core-3.4.22.jar:3.4.22]
    at reactor.core.publisher.Mono.subscribe(Mono.java:4276) ~[reactor-core-3.4.22.jar:3.4.22]
    at com.couchbase.client.core.Core.initGlobalConfig(Core.java:404) ~[core-io-2.3.3.jar:na]
    at com.couchbase.client.java.AsyncCluster.<init>(AsyncCluster.java:265) ~[java-client-3.3.3.jar:na]
    at com.couchbase.client.java.Cluster.<init>(Cluster.java:300) ~[java-client-3.3.3.jar:na]
    at com.couchbase.client.java.Cluster.connect(Cluster.java:261) ~[java-client-3.3.3.jar:na]
    at org.springframework.data.couchbase.config.AbstractCouchbaseConfiguration.couchbaseCluster(AbstractCouchbaseConfiguration.java:133)

Maybe it's a bug to report to Couchbase?

mikereiche commented 2 years ago

Thanks for the stacktrace. I've created CBSE-12736 for you. The same issue has also been reported by another organization.

mikereiche commented 2 years ago

> I assign an alternate address to the backup node nevertheless, as not doing so is causing a Nullpointer exception in the couchbase drivers. I notified Couchbase about this and they confirmed the issue. This is all due to using the "alternate address setup" as described in the couchbase documentation, which appears to be full of bugs.

There is an open issue, https://issues.couchbase.com/browse/JCBC-1883: if an alternate address is used, the client assumes that every node has an alternate address. If you are experiencing a different issue, let me know.

> This is all due to using the "alternate address setup" as described in the couchbase documentation, which appears to be full of bugs.

I didn't find any open issues on the alternate address setup documentation. Can you please let me know what needs to be fixed? Thanks.

mikereiche commented 2 years ago

> The application runs fine nevertheless and is passing all tests.

According to the other CBSE, it's just an informational message that should be output at DEBUG without the stack trace. Here's the text from the other CBSE. It's not the same as your situation, but the result is the same.

> After looking into this, I think this is harmless but we can improve our code to print it at debug level I think.
>
> So here is what's going on:
>
> They only provide one bootstrap hostname which is not the one that the cluster host uses. So they bootstrap off of "oc-cb-01-srv.eu-oc-xxxxx-yyyyy.svc" but the hostname in the config is "oc-cb-01-0000.oc-cb-01.eu-oc-xxxxx-yyyyy.svc". We try to retrieve a config from the original seed node, but in the meantime a bucket config arrives and swaps out the seed node. Since those are just strings the client has no idea that this host is the same as the other just with a different name and cancels the operation. That's why you see "Reason: TARGET_NODE_REMOVED". We'll gracefully handle that and try with the new one. The warning there is originally in place to tell the user that we could not get a global config at all, which here is only the case temporarily. Again, we handle it gracefully and it is harmless but I think we can add more logic to promote this to debug level if we encounter this exact scenario
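
The hostname mismatch described in the quoted text can be illustrated with a minimal Java sketch. It is not the SDK's code: NodeRegistry and applyConfig are hypothetical names invented for this illustration, and the only assumption taken from the quote is that node names are compared as plain strings. Under that assumption, the seed name disappears from the known-node set as soon as a config advertising the other name arrives, even though both names refer to the same machine.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class SeedNodeSwapSketch {

    /** Hypothetical stand-in for the client's view of the nodes it may talk to. */
    static final class NodeRegistry {
        private final Set<String> nodes = new LinkedHashSet<>();

        NodeRegistry(Set<String> seeds) {
            nodes.addAll(seeds);
        }

        /**
         * Replaces the known nodes with those named in a newly arrived config
         * and returns the names that were dropped. Names are compared with
         * String.equals only; no DNS lookup is involved.
         */
        Set<String> applyConfig(Set<String> configNodes) {
            Set<String> removed = new LinkedHashSet<>(nodes);
            removed.removeAll(configNodes);
            nodes.clear();
            nodes.addAll(configNodes);
            return removed;
        }

        boolean contains(String host) {
            return nodes.contains(host);
        }
    }

    public static void main(String[] args) {
        // Hostnames taken from the quoted explanation.
        String seed = "oc-cb-01-srv.eu-oc-xxxxx-yyyyy.svc";
        String advertised = "oc-cb-01-0000.oc-cb-01.eu-oc-xxxxx-yyyyy.svc";

        NodeRegistry registry = new NodeRegistry(Set.of(seed));
        Set<String> removed = registry.applyConfig(Set.of(advertised));

        // prints: removed=[oc-cb-01-srv.eu-oc-xxxxx-yyyyy.svc]
        System.out.println("removed=" + removed);
        // prints: seedStillKnown=false
        System.out.println("seedStillKnown=" + registry.contains(seed));
    }
}
```

A request still pending against the seed name at that moment has lost its target, which would match the "TARGET_NODE_REMOVED" reason mentioned in the quote.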

tallinn1960 commented 2 years ago

> I didn't find any open issues on the alternate address setup documentation. Can you please let me know what needs to be fixed? Thanks.

Oh, that is a misunderstanding. It is not that the documentation is "full of bugs", but that setting up alternate addresses and ports reveals many bugs in the software, at least when using a Java client. One thing I noticed is that the Java client uses only the alternate hostnames, not the alternate port assignments. That is why I switched to a setup with 127.0.0.1-4 and standard ports on all nodes. There are server issues as well with such a setup, as I mentioned. As with the Nullpointer exception I reported those on the Couchbase forum, but issues are yet unconfirmed.

mikereiche commented 2 years ago

> I reported those on the Couchbase forum, but issues are yet unconfirmed

Aside from this issue and the NPE/JCBC-1883, I found one other post:

https://forums.couchbase.com/t/rebalance-issues-working-with-a-dockerized-couchbase-cluster-setup/34518

btw - you should be able to report bugs directly at issues.couchbase.com. forums.couchbase.com is mainly for user discussion.

If you are an Enterprise customer, you should contact support directly.

hantsy commented 5 months ago

After upgrading to Spring Boot 3.3, I also encountered this issue: Could not locate a single global configuration

mikereiche commented 5 months ago

@hantsy - "Could not locate a single global configuration" is just a warning message, and the SDK will retry on its own.
It appears that the server was shut down, which results in the "remote side disconnected unexpectedly" errors.
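
To show why a logged failure during startup does not have to mean the operation failed overall, here is a minimal Java sketch of a "log quietly, then retry" loop. It is not the SDK's implementation: TransientConfigException, loadWithRetry, and the attempt limit are hypothetical names and values chosen for illustration, and the sketch retries immediately where a real client would typically wait between attempts.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

public class TransientRetrySketch {
    private static final Logger LOG = Logger.getLogger(TransientRetrySketch.class.getName());

    /** Hypothetical marker for a failure that is expected to clear up on retry. */
    static final class TransientConfigException extends RuntimeException {
        TransientConfigException(String message) {
            super(message);
        }
    }

    /**
     * Calls the loader up to maxAttempts times. Each transient failure is
     * logged at FINE (java.util.logging's debug level) without a stack trace;
     * only the last failure is rethrown if every attempt fails.
     */
    static <T> T loadWithRetry(Callable<T> loader, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransientConfigException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return loader.call();
            } catch (TransientConfigException e) {
                last = e;
                LOG.log(Level.FINE, "attempt {0} failed: {1}", new Object[] {attempt, e.getMessage()});
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated loader: fails twice with the message from this thread, then succeeds.
        String config = loadWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new TransientConfigException("Could not locate a single global configuration");
            }
            return "global-config";
        }, 5);
        // prints: global-config after 3 attempts
        System.out.println(config + " after " + calls[0] + " attempts");
    }
}
```

With the default java.util.logging configuration the FINE messages are not shown, so the caller sees only the successful result.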