GreenRover closed this issue 1 year ago
Ping @Mrc0113
The solace binder health indicator currently only captures the health of the binder's PubSub+ session. It does not currently capture any binding health statuses, since the bindings' flows are configured to try reconnecting forever. But it seems that some cases (like this one) aren't covered by the flow reconnect feature.
To add this, we'll need to do some digging to see if there's a good way to add binding statuses to the binder's health indicator (since it's not a composite health indicator), or to find out if and how other SCSt binders capture binding health statuses.
If there's no good way to add this to the existing indicator, we might have to add a custom config option that changes the solace binder health indicator into a composite health indicator which, when enabled, would show the health statuses for both the binder's session and all of its bindings.
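To make the composite idea concrete, here is a minimal plain-Java sketch of the aggregation it implies: one status for the session, one per binding, and an overall status equal to the worst of them. The class, enum, and method names are hypothetical and are not the binder's or Spring Boot's actual API; a real implementation would most likely build on Spring Boot Actuator's health contributor types instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only: names are illustrative, not the Solace binder's API.
final class CompositeBinderHealth {
    // Ordered from healthiest to least healthy, so a higher ordinal is "worse".
    enum Status { UP, RECONNECTING, DOWN }

    private Status sessionStatus = Status.UP;
    private final Map<String, Status> bindingStatuses = new LinkedHashMap<>();

    synchronized void setSessionStatus(Status status) {
        this.sessionStatus = status;
    }

    synchronized void setBindingStatus(String bindingName, Status status) {
        bindingStatuses.put(bindingName, status);
    }

    // The overall status is the worst status among the session and all bindings.
    synchronized Status overall() {
        Status worst = sessionStatus;
        for (Status status : bindingStatuses.values()) {
            if (status.ordinal() > worst.ordinal()) {
                worst = status;
            }
        }
        return worst;
    }

    // Snapshot of the per-binding statuses, e.g. for a detailed health response.
    synchronized Map<String, Status> bindingDetails() {
        return new LinkedHashMap<>(bindingStatuses);
    }
}
```

With this shape, a healthy session no longer masks a broken binding: a single binding reported as DOWN is enough to make the overall status DOWN.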
This has been logged in the Solace Jira and we are targeting Q1CY2023 for a fix.
@GreenRover Just double checking, but did you post the correct stacktrace?
Looking again at the one you posted, this looks like an error on the producer side. But you shouldn't get this error by just deleting the queue. Deleting the queue should result in a different error, and it would be one on the consumer.
I was only able to reproduce a similar stacktrace by killing the session (e.g. session reconnect attempts exhausted). But the health for that is already captured by the existing health indicator (i.e. the PubSub+ session health).
I have now retested with:
Is the queue that you're deleting the one the input binding is consuming messages from? Or do you have a queue subscribed to the output binding destination, tms/monitoring/monalesy/p/v1/serviceState/request, and that is the queue you are deleting?
The queue I delete is the one from the input binding.
With release 2.5.0, when a queue is deleted manually, the service logs:
2023-05-05T10:46:20.725+0200 WARN Received error while trying to read message from endpoint scst/wk/sensor.XXX/plain/sensor/FOOO/_/_
com.solacesystems.jcsmp.JCSMPErrorResponseException: 503: Unknown Queue
at com.solacesystems.jcsmp.impl.flow.BindRequestTask.execute(BindRequestTask.java:211) ~[sol-jcsmp-10.16.0.jar:?]
at com.solacesystems.jcsmp.impl.flow.SubFlowManagerImpl.handleAssuredCtrlMessage(SubFlowManagerImpl.java:570) ~[sol-jcsmp-10.16.0.jar:?]
at com.solacesystems.jcsmp.protocol.impl.TcpClientChannel.handleAssuredCtrlMsg(TcpClientChannel.java:1768) ~[sol-jcsmp-10.16.0.jar:?]
at com.solacesystems.jcsmp.protocol.impl.TcpClientChannel.handleMessage(TcpClientChannel.java:1733) ~[sol-jcsmp-10.16.0.jar:?]
at com.solacesystems.jcsmp.protocol.nio.impl.SubscriberMessageReader.processRead(SubscriberMessageReader.java:98) ~[sol-jcsmp-10.16.0.jar:?]
at com.solacesystems.jcsmp.protocol.nio.impl.SubscriberMessageReader.read(SubscriberMessageReader.java:140) ~[sol-jcsmp-10.16.0.jar:?]
at com.solacesystems.jcsmp.protocol.smf.SimpleSmfClient.read(SimpleSmfClient.java:1206) ~[sol-jcsmp-10.16.0.jar:?]
at com.solacesystems.jcsmp.protocol.nio.impl.SyncEventDispatcherReactor.processReactorChannels(SyncEventDispatcherReactor.java:206) ~[sol-jcsmp-10.16.0.jar:?]
at com.solacesystems.jcsmp.protocol.nio.impl.SyncEventDispatcherReactor.eventLoop(SyncEventDispatcherReactor.java:157) ~[sol-jcsmp-10.16.0.jar:?]
at com.solacesystems.jcsmp.protocol.nio.impl.SyncEventDispatcherReactor$SEDReactorThread.run(SyncEventDispatcherReactor.java:338) ~[sol-jcsmp-10.16.0.jar:?]
at java.lang.Thread.run(Thread.java:833) ~[?:?]
But /actuator/health still reports status UP, and the application remains a zombie forever, because the queue is never recreated and the application is never killed.
@GreenRover , apologies for taking so long to address this issue. Did you also log an RT for this? If so, do you know the ticket reference #?
Hello Andreaw, no, there is no related RT. The retest was successful.
Scenario:
Result:
Expected: The SolaceBinderHealthIndicator of the application changes to unhealthy.
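As a rough illustration of the expected behavior, the sketch below tracks the health of a single input binding: a flow error such as the "503: Unknown Queue" from the log above marks the binding as DOWN, and a successful rebind marks it as UP again. All names here are hypothetical and are not part of JCSMP or the Solace binder; how the binder would actually hook this into its flow error handling is an open assumption.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch only: not part of JCSMP or the Solace binder.
final class InputBindingHealth {
    enum Status { UP, DOWN }

    private final AtomicReference<Status> status = new AtomicReference<>(Status.UP);
    private final AtomicReference<String> lastError = new AtomicReference<>();

    // Assumed to be called when reading from the endpoint fails,
    // e.g. with an exception whose message is "503: Unknown Queue".
    void onFlowError(Exception error) {
        lastError.set(error.getMessage());
        status.set(Status.DOWN);
    }

    // Assumed to be called once the flow has been bound (or rebound) successfully.
    void onFlowBound() {
        lastError.set(null);
        status.set(Status.UP);
    }

    Status status() {
        return status.get();
    }

    // Message of the most recent flow error, or null if the binding is healthy.
    String lastError() {
        return lastError.get();
    }
}
```

A health indicator could then report the binding as unhealthy, with the last error as a detail, instead of staying UP while the queue is missing.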
Because my last 10 pull requests were not merged, I am only creating an issue.