Closed akraskovski closed 4 years ago
Thanks for bringing this up @akraskovski. We've noticed several hiccups with regards to reconnecting the AxonServerQueryBus
, like #1397 and #1386. Especially #1397 points towards what you are suggesting as well.
My guess is that your issue will thus be resolved with that fix in place. If you have the time, making a local snapshot build of framework would allow you to verify reconnects work with this in place. If you do not have the time to build a snapshot yourself, I'd suggest to wait for 4.3.2 to be released, somewhere this week.
Hope this clarifies your options @akraskovski!
Thanks, @smcvb for the answer!
We'll wait for this fix in 4.3.2
@akraskovski, I am writing because I've just released 4.3.2. Would you mind giving it a spin and seeing whether the same issue still occurs?
@smcvb Hi! Just checked it using tagged 4.3.2 revision and it seems that everything is working!
Logs from Spring application after AxonServer restart now are the following: ` 19:24:01 [AxonServerConnectionManager-0] INFO o.a.a.c.AxonServerConnectionManager - Reusing existing channel
19:24:01 [AxonServerConnectionManager-0] INFO o.a.a.c.AxonServerConnectionManager - Re-subscribing commands and queries
19:24:01 [AxonServerConnectionManager-0] INFO o.a.a.c.command.AxonServerCommandBus - Resubscribing Command handlers with AxonServer
19:24:01 [AxonServerConnectionManager-0] INFO o.a.a.c.command.AxonServerCommandBus - Creating new command stream subscriber
19:24:01 [AxonServerConnectionManager-0] INFO o.a.a.c.query.AxonServerQueryBus - Creating new query stream subscriber `
And in the dashboard screen, I become able to see registered query handlers from this app
It's really good news, that solution was found so quickly. Thank you! Will wait for uploaded 4.3.2 spring-boot-starter release in maven central :)
That's great news @akraskovski! I will close this issue as resolved then.
Thanks again for filing this concern with us and hope to see/chat with you in the future!
Hello @smcvb,
I faced with the same issue but with latest version of Axon
AxonServer 4.4.9 (running in k8s) Spring Boot app with axon-spring-boot-starter 4.4.6
Spring Boot app logs:
17:35:52.642 WARN i.a.a.c.impl.AxonServerManagedChannel : Connecting to AxonServer node [localhost:8124] failed: UNAVAILABLE: io exception
17:35:52.642 INFO i.a.a.c.impl.AxonServerManagedChannel : Failed to get connection to AxonServer. Scheduling a reconnect in 2000ms
17:35:54.647 INFO i.a.a.c.impl.AxonServerManagedChannel : Requesting connection details from localhost:8124
17:35:55.878 INFO i.a.a.c.impl.AxonServerManagedChannel : Successfully connected to localhost:8124
17:35:56.250 INFO i.a.a.connector.impl.ControlChannelImpl : Connected instruction stream for context 'default'. Sending client identification
17:35:56.250 INFO i.a.a.c.command.impl.CommandChannelImpl : CommandChannel for context 'default' connected, 12 command handlers registered
17:36:09.782 INFO o.a.e.TrackingEventProcessor : Fetched token: IndexTrackingToken{globalIndex=8534} for segment: Segment[0/0]
17:36:09.782 INFO o.a.e.TrackingEventProcessor : Fetched token: IndexTrackingToken{globalIndex=8534} for segment: Segment[0/0]
17:36:09.787 INFO o.a.e.TrackingEventProcessor : Fetched token: IndexTrackingToken{globalIndex=8534} for segment: Segment[0/0]
There are no logs like this (about Query Channel) for example:
16:42:38.153 INFO i.a.a.c.query.impl.QueryChannelImpl : QueryChannel for context 'default' connected, 6 registrations resubscribed
And also on Axon dashboard there are no Query handlers.
Hi @dimazelinskyi,
do you have any occurrences of QueryChannel for context 'default' connected
in the beginning of the logs, where the application started? It's probably literally QueryChannel for context 'default' connected, 0 registrations resubscribed
.
The reconnect logic for both the Command and Query channels are identical. The only situation I currently see where one connects and the other doesn't is when there are no registered Query Handlers.
Or do you see anything like Error on QueryChannel for context
being logged? That should happen around the time where the connection is logged.
Hi @abuijze,
Thanks for your answer.
It's probably literally QueryChannel for context 'default' connected, 0 registrations resubscribed. The only situation I currently see where one connects and the other doesn't is when there are no registered Query Handlers.
I think it was because query handlers
were correctly working and returning expected values. (Unfortunately for now I do not have whole logs from starting app)
Error on QueryChannel for context
I do not have such message in logs.
This is what happens from reconnecting point:
Fetched token: IndexTrackingToken{globalIndex=8534} for segment: Segment[0/0]
Released claim
Released claim
Releasing claim on token and preparing for retry in 60s
Released claim
Requesting connection details from axon:8124
Connecting to AxonServer node [axon:8124] failed: UNAVAILABLE: io exception
Failed to get connection to AxonServer. Scheduling a reconnect in 2000ms
Requesting connection details from axon:8124
Connecting to AxonServer node [axon:8124] failed: UNAVAILABLE: io exception
Failed to get connection to AxonServer. Scheduling a reconnect in 2000ms
Requesting connection details from axon:8124
....
Successfully connected to axon:8124
Connected instruction stream for context 'default'. Sending client identification
CommandChannel for context 'default' connected, 12 command handlers registered
Fetched token: IndexTrackingToken{globalIndex=8534} for segment: Segment[0/0]
It looks like it totally skip reconnecting QueryHandlers (no query handlers on dashboard)
As temporary solution I downgraded to version axonserver:4.4.8
@abuijze if you have any suggestions how to trace such cases or any other steps that can help with this it would be really good.
Thanks!
I don't think the issue is related to the version of AxonServer you're using. Especially the difference between 4.4.8 and 4.4.9 should not be able to influence the connectivity.
Instead, I think the issue is in the connector. However, looking at the code, I currently don't see a path where it could fail to attempt a reconnect on the QueryChannel. If you have any logs that you can share which are related to losing connectivity, that would be very helpful.
Do you have any other axon related dependencies in your project, other than the Axon SpringBoot Starter?
I don't think the issue is related to the version of AxonServer you're using. Especially the difference between 4.4.8 and 4.4.9 should not be able to influence the connectivity.
Instead, I think the issue is in the connector. However, looking at the code, I currently don't see a path where it could fail to attempt a reconnect on the QueryChannel. If you have any logs that you can share which are related to losing connectivity, that would be very helpful.
All logs related to reconnecting issue that I have I added above in the comments.
Do you have any other axon related dependencies in your project, other than the Axon SpringBoot Starter?
Except from Spring-Boot-Starter
we have
+--- org.axonframework:axon-test:4.4.6
| \--- org.axonframework:axon-eventsourcing:4.4.6 (*)
This is our Spring-Boot-Starter
+--- org.axonframework:axon-spring-boot-starter:4.4.6
| +--- org.axonframework:axon-spring-boot-autoconfigure:4.4.6
| | +--- org.axonframework:axon-spring:4.4.6
| | | +--- org.axonframework:axon-configuration:4.4.6
| | | | +--- org.axonframework:axon-eventsourcing:4.4.6
| | | | | +--- org.axonframework:axon-modelling:4.4.6
| | | | | | +--- org.axonframework:axon-messaging:4.4.6
| | | | | | | +--- org.slf4j:slf4j-api:1.7.28 -> 1.7.30
| | | | | | | +--- com.thoughtworks.xstream:xstream:1.4.14
| | | | | | | \--- xpp3:xpp3:1.1.4c
| | | | | | \--- org.slf4j:slf4j-api:1.7.28 -> 1.7.30
| | | | | \--- org.axonframework:axon-messaging:4.4.6 (*)
| | | | +--- org.axonframework:axon-disruptor:4.4.6
| | | | | +--- org.axonframework:axon-eventsourcing:4.4.6 (*)
| | | | | \--- com.lmax:disruptor:3.4.2
| | | | +--- org.axonframework:axon-modelling:4.4.6 (*)
| | | | \--- org.axonframework:axon-messaging:4.4.6 (*)
| | | +--- org.slf4j:slf4j-api:1.7.28 -> 1.7.30
| | | +--- org.axonframework:axon-eventsourcing:4.4.6 (*)
| | | +--- org.axonframework:axon-modelling:4.4.6 (*)
| | | \--- org.axonframework:axon-messaging:4.4.6 (*)
| | \--- org.axonframework:axon-server-connector:4.4.6
| | \--- io.axoniq:axonserver-connector-java:4.4.6
| | +--- org.slf4j:slf4j-api:1.7.30
| | +--- io.grpc:grpc-protobuf:1.33.0
| | | +--- io.grpc:grpc-api:1.33.0
| | | | +--- io.grpc:grpc-context:1.33.0
| | | | +--- com.google.code.findbugs:jsr305:3.0.2
| | | | +--- com.google.guava:guava:29.0-android
| | | | | +--- com.google.guava:failureaccess:1.0.1
| | | | | +--- com.google.guava:listenablefuture:9999.0-empty-to-avoid-conflict-with-guava
| | | | | +--- com.google.code.findbugs:jsr305:3.0.2
| | | | | +--- org.checkerframework:checker-compat-qual:2.5.5
| | | | | +--- com.google.errorprone:error_prone_annotations:2.3.4
| | | | | \--- com.google.j2objc:j2objc-annotations:1.3
| | | | +--- com.google.errorprone:error_prone_annotations:2.3.4
| | | | \--- org.codehaus.mojo:animal-sniffer-annotations:1.18
| | | +--- com.google.code.findbugs:jsr305:3.0.2
| | | +--- com.google.protobuf:protobuf-java:3.12.0
| | | +--- com.google.api.grpc:proto-google-common-protos:1.17.0
| | | +--- io.grpc:grpc-protobuf-lite:1.33.0
| | | | +--- io.grpc:grpc-api:1.33.0 (*)
| | | | +--- com.google.code.findbugs:jsr305:3.0.2
| | | | +--- com.google.guava:guava:29.0-android (*)
| | | | +--- com.google.errorprone:error_prone_annotations:2.3.4
| | | | \--- org.codehaus.mojo:animal-sniffer-annotations:1.18
| | | +--- com.google.guava:guava:29.0-android (*)
| | | +--- com.google.errorprone:error_prone_annotations:2.3.4
| | | \--- org.codehaus.mojo:animal-sniffer-annotations:1.18
| | +--- io.grpc:grpc-stub:1.33.0
| | | +--- io.grpc:grpc-api:1.33.0 (*)
| | | +--- com.google.guava:guava:29.0-android (*)
| | | +--- com.google.errorprone:error_prone_annotations:2.3.4
| | | +--- org.codehaus.mojo:animal-sniffer-annotations:1.18
| | | \--- com.google.code.findbugs:jsr305:3.0.2
| | +--- io.grpc:grpc-netty:1.33.0
| | | +--- io.grpc:grpc-core:1.33.0
| | | | +--- io.grpc:grpc-api:1.33.0 (*)
| | | | +--- com.google.code.gson:gson:2.8.6
| | | | +--- com.google.android:annotations:4.1.1.4
| | | | +--- org.codehaus.mojo:animal-sniffer-annotations:1.18
| | | | +--- com.google.errorprone:error_prone_annotations:2.3.4
| | | | +--- com.google.guava:guava:29.0-android (*)
| | | | +--- io.perfmark:perfmark-api:0.19.0
| | | | | \--- com.google.code.findbugs:jsr305:3.0.2
| | | | \--- com.google.code.findbugs:jsr305:3.0.2
| | | +--- io.netty:netty-codec-http2:4.1.51.Final -> 4.1.50.Final (*)
| | | +--- io.netty:netty-handler-proxy:4.1.51.Final -> 4.1.50.Final (*)
| | | +--- com.google.guava:guava:29.0-android (*)
| | | +--- com.google.errorprone:error_prone_annotations:2.3.4
| | | +--- io.perfmark:perfmark-api:0.19.0 (*)
| | | +--- org.codehaus.mojo:animal-sniffer-annotations:1.18
| | | \--- com.google.code.findbugs:jsr305:3.0.2
| | \--- io.netty:netty-tcnative-boringssl-static:2.0.31.Final
| +--- org.axonframework:axon-spring:4.4.6 (*)
| +--- org.axonframework:axon-configuration:4.4.6 (*)
| +--- org.axonframework:axon-eventsourcing:4.4.6 (*)
| +--- org.axonframework:axon-modelling:4.4.6 (*)
| \--- org.axonframework:axon-messaging:4.4.6 (*)
It looks like dependencies are clean
We've yesterday released a new version of the axonserver-connector-java
, which is a bit more thorough in performing reconnects if the channel states to be unavailable.
Might be worth giving this a try to see whether it resolves your problem @dimazelinskyi.
You can tap into this, by updating your Axon Framework version to 4.4.7.
This will automatically also pull in axonserver-connector-java
4.4.7.
We've updated with latest version that you suggested. Will see. Thanks!
Hi guys! I'm duplicating the issue message from google groups since I'm not the only one who caught this problem :)
AxonServer 4.3.1 (running in k8s stateful set. manifest was used from the official doc) Spring Boot app with axon-spring-boot-starter 4.3.1
The situation by steps(for running instance of AxonServer):
In Axon dashboard I'm able to see available list of commands and queries. Then AxonServer restarts
Spring Boot app becomes reconnect to AxonServer, and when connection is established prints the following logs:
Here we didn't see a message like "Creating new query stream subscriber", so after successful reconnection, there are no queries in the Axon dashboard. Since I'm no able to push queries via queryGateway because no @QueryHandler is registered after the restart.
P.S with command subscribers, everything is good. It shows commands in the dashboard and handlers still working.
P.S The solution which helped to me(changed directly in AxonFramework source code) is:
AxonServerCommandBus.CommandProcessor.getSubscriberObserver#onError
(line 792) - has line"subscriberStreamObserver = null;"
, which means, that new observer will be created on next connection.AxonServerQueryBus.QueryProcessor.getSubscriberObserver#onError
(line 721) - doesn't have "setting to null" the observer. So, it never re-creates the observer.Hope you'll help to find the answer!
Thanks!