Open Ann-Geo opened 1 year ago
This issue still occurs even when max_msgs is increased 10x. Over multiple runs, the messages in the streams are not fully consumed. Unconsumed messages accumulate until they reach the stream's max_msgs limit, at which point subscribers stop receiving messages. I am also seeing multiple warnings related to the consumer of the blocked stream:
Jetstream consumer .... is not current
RAFT .... Resetting WAL state
Consumer ... error on store update from snapshot entry: old update ignored
RAFT ... 20000 append entries pending
However, I am not sure whether any of these are related to the stream-blocking behaviour I am observing. Sometimes triggering a cluster step-down for the blocked stream with the nats CLI tool clears the blockage, but all the messages in the stream are lost. It is also not a permanent fix: the blocking recurs in later runs. Once the stream is blocked, consumer info cannot be retrieved either through the js_GetConsumerInfo API (it returns a timeout) or through the nats command-line tool (it gives a "context deadline exceeded" error).
Is there any resolution for this issue?
Will loop in @levb to take a look since it is the C client.
This problem appears to occur only when the stream uses memory storage. When configured to use the file store, the consumer/stream does not report any errors or show blocking behaviour.
Any updates on this issue?
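For anyone who wants to try the same file-store workaround, here is a minimal sketch of a stream definition with file storage. Everything in it except the `storage` setting and the three replicas is an assumption for illustration: the stream name, the subject, the retention and discard policies, and the max_msgs value are placeholders, not the configuration used in this report.

```json
{
  "name": "EXAMPLE_STREAM",
  "subjects": ["example.>"],
  "retention": "limits",
  "storage": "file",
  "max_msgs": 100000,
  "discard": "old",
  "num_replicas": 3
}
```

Saved to a file, this could be loaded with something like `nats stream add --config stream.json`. As far as I know, the storage type cannot be changed on an existing stream, so this would apply to a newly created stream rather than an update of the blocked one.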
Have you tried the latest server version, 2.9.19?
nats-server version used: 2.9.17
nats.c client library version: 3.6.1
Issue: The JetStream streams appear to block subscribers partway through a run, preventing them from receiving any messages from the streams. When this happens, the server log shows a warning on the consumer for the blocked stream: "Consumer ... error on store update from snapshot entry: old update ignored."
Would it be possible to know what is blocking the subscription from the stream?
More info: A three-server JetStream cluster is used for these runs. Server configuration for one of the servers in the cluster:
Stream configuration:
Consumer configuration for the stream: