Closed: zatlodan closed this issue 4 months ago
Thanks for the report. Best for you to upgrade to the latest patch version, 2.10.11. If the issue persists, let us know.
Will leave open for now.
I was facing this with nats-server version 2.9.25; after upgrading to version 2.10.12 the issue was resolved.
Thanks for the reply. We will be updating NATS on our prod environment this week; I will post an update as soon as I can.
We have updated all our NATS server environments to version 2.10.12.
We have cleared the affected streams of any messages and recreated the consumers.
The issue is still there, but it presents differently. After a week of monitoring we have a hanging message in 3 of our 10 streams; currently it's just a single message.
The difference now is that only one of the instances sees the message as stuck. In some cases it's the leader of the stream/consumer and in some cases it's not.
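For reference, a minimal sketch of what clearing a stream and recreating its consumer could look like with the nats.js client; the server URL and the durable name "worker" are assumptions for illustration, not details from this report:

```ts
import { connect, AckPolicy } from "nats";

// Connect to the cluster (URL is an assumption for this sketch).
const nc = await connect({ servers: "nats://localhost:4222" });
const jsm = await nc.jetstreamManager();

// Remove all messages from the affected work-queue stream.
await jsm.streams.purge("STREAM_B_Q");

// Recreate the consumer; the durable name "worker" is hypothetical.
await jsm.consumers.delete("STREAM_B_Q", "worker").catch(() => {});
await jsm.consumers.add("STREAM_B_Q", {
  durable_name: "worker",
  ack_policy: AckPolicy.Explicit,
});

await nc.drain();
```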
I think https://github.com/nats-io/nats-server/pull/5270 fixes this, and it is available in 2.10.14.
Thanks for the update @zatlodan, that is a condition that we were able to reproduce and was addressed in the v2.10.14 release from last week.
Okay, thank you for the response. We will update to 2.10.14 and let you know.
Seems that the issue is no longer present after the update to 2.10.14.
Thank you all for the help; I will now close this issue :+1:
Observed behavior
A stream (STREAM_B_Q) with a single consumer and retention set to WorkQueue reports a non-zero message count after all messages have been consumed and acknowledged by that consumer. This stream (STREAM_B_Q) sources from another stream (STREAM_A) whose retention is set to Limits. The behavior occurred after a large amount of data was inserted into the source stream (STREAM_A).
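For illustration, a minimal sketch of this topology using the nats.js client mentioned below; the server URL, subject, and replica count are assumptions, since the actual configs are in the collapsed Config sections:

```ts
import { connect, RetentionPolicy, StorageType } from "nats";

const nc = await connect({ servers: "nats://localhost:4222" });
const jsm = await nc.jetstreamManager();

// Source stream with limits-based retention (subject is hypothetical).
await jsm.streams.add({
  name: "STREAM_A",
  subjects: ["a.>"],
  retention: RetentionPolicy.Limits,
  storage: StorageType.File,
  num_replicas: 3, // assumed; the cluster has 4 nodes
});

// Work-queue stream sourcing from STREAM_A; messages should be
// deleted as soon as the single consumer acknowledges them.
await jsm.streams.add({
  name: "STREAM_B_Q",
  retention: RetentionPolicy.Workqueue,
  storage: StorageType.File,
  num_replicas: 3, // assumed
  sources: [{ name: "STREAM_A" }],
});

await nc.drain();
```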
Some more details:
STREAM_A
This is the source stream into which the data were published.
Config
State
STREAM_B_Q
This is the WorkQueue stream with the issue.
Config
State
Consumer
View from metrics
This is the jetstream_stream_total_messages metric on the stream STREAM_B_Q at the time the issue arose. You can see 0 messages in the stream before the bulk publish and 1980 after.
Cluster info
4 nodes, all with the same version and HW specs, on the same private network. No leaf nodes connected.
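To cross-check that metric against the stream's own accounting, something like the following nats.js sketch could be used (server URL assumed):

```ts
import { connect } from "nats";

const nc = await connect({ servers: "nats://localhost:4222" });
const jsm = await nc.jetstreamManager();

// Compare what the server itself reports with what the metric shows.
const si = await jsm.streams.info("STREAM_B_Q");
console.log("messages:", si.state.messages);       // metric showed 1980
console.log("first_seq:", si.state.first_seq);
console.log("last_seq:", si.state.last_seq);
console.log("num_deleted:", si.state.num_deleted); // interior deletes
console.log("leader:", si.cluster?.leader);

await nc.drain();
```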
Expected behavior
All messages are removed from the stream after acknowledgement. Stream reporting 0 total messages.
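A sketch of verifying that expectation, using the newer consumer API available in recent nats.js 2.x releases; the durable name "worker" and the fetch limits are assumptions:

```ts
import { connect } from "nats";

const nc = await connect({ servers: "nats://localhost:4222" });
const js = nc.jetstream();
const jsm = await nc.jetstreamManager();

// Drain the work queue through the (hypothetical) durable "worker".
const consumer = await js.consumers.get("STREAM_B_Q", "worker");
const msgs = await consumer.fetch({ max_messages: 2000, expires: 5_000 });
for await (const m of msgs) {
  m.ack();
}

// With WorkQueue retention, every acked message should be gone.
const si = await jsm.streams.info("STREAM_B_Q");
console.log("messages after ack:", si.state.messages); // expected: 0

await nc.drain();
```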
Server and client version
Server: Version 2.10.5, Git Commit 0883d32, Go Version go1.21.4
Consuming JS client: https://www.npmjs.com/package/nats, Version 2.15.1
CLI used to check: Version 0.0.35
Host environment
No response
Steps to reproduce
The issue is quite flaky and occurs randomly throughout the month, but it seems to be triggered by sudden spikes in published data in the source stream.
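A rough reproduction sketch under those assumptions; the burst size, subject, and payload are made up, since the actual workload is not described here:

```ts
import { connect } from "nats";

const nc = await connect({ servers: "nats://localhost:4222" });
const js = nc.jetstream();
const jsm = await nc.jetstreamManager();

// Publish a sudden burst into the source stream (subject is hypothetical).
const payload = new Uint8Array(1024);
for (let i = 0; i < 100_000; i += 1_000) {
  // Batch the publishes so the burst arrives quickly.
  await Promise.all(
    Array.from({ length: 1_000 }, () => js.publish("a.data", payload)),
  );
}

// ...then consume and ack everything from STREAM_B_Q and check whether
// the work queue really drained to zero:
const si = await jsm.streams.info("STREAM_B_Q");
console.log("leftover messages:", si.state.messages); // bug: sometimes > 0

await nc.drain();
```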