Is the server version really 2.3.2? The latest server release is 2.10.5.
nats-server: v2.9.22
Client lib in go.mod: require ( github.com/nats-io/nats.go v1.31.0 )
ok latest on 2.9.x is 2.9.24.
Can we get more information?
Why no ack on the stream? Do you see memory growth in the server? How fast are messages entering the stream? What rate for msgs and bytes? How many different subjects?
Can you share the output of nats stream info for the above stream?
What does nats consumer report show for that stream as well?
Setting "NoAck" to true is just an attempt to fight the issue. Our theory was that the ACKs simply take too much time. In reality, "NoAck" doesn't provide an observable boost in performance.
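For reference, this is roughly how the stream looks with acknowledgments disabled in nats.go (a minimal sketch only; the URL, TLS, and error handling are simplified, not our exact setup):

package main

import (
    "log"
    "time"

    "github.com/nats-io/nats.go"
)

func main() {
    // Sketch only: connection URL is illustrative, TLS omitted.
    nc, err := nats.Connect("nats://devops-multi8nodes-nats-px-01:4222")
    if err != nil {
        log.Fatal(err)
    }
    defer nc.Close()

    js, err := nc.JetStream()
    if err != nil {
        log.Fatal(err)
    }

    // NoAck: true disables publish acknowledgments for the whole stream.
    _, err = js.AddStream(&nats.StreamConfig{
        Name:     "A2S_NOTIFICATIONS",
        Subjects: []string{"A2S_NOTIFICATIONS.*"},
        MaxAge:   time.Hour,
        NoAck:    true,
    })
    if err != nil {
        log.Fatal(err)
    }
}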
Output of nats stream info:
Subjects: A2S_NOTIFICATIONS.*
Replicas: 1
Storage: File

Options:
    Retention: Limits
    Acknowledgments: false
    Discard Policy: Old
    Duplicate Window: 2m0s
    Allows Msg Delete: true
    Allows Purge: true
    Allows Rollups: false

Limits:
    Maximum Messages: unlimited
    Maximum Per Subject: unlimited
    Maximum Bytes: unlimited
    Maximum Age: 1h0m0s
    Maximum Message Size: unlimited
    Maximum Consumers: unlimited

Cluster Information:
    Name: SOME_CLUSTER
    Leader: devops-multi8nodes-nats-px-02

State:
    Messages: 29,408
    Bytes: 130 MiB
    First Sequence: 1 @ 2023-11-16 03:21:23 UTC
    Last Sequence: 29,408 @ 2023-11-16 03:46:05 UTC
    Active Consumers: 1
    Number of Subjects: 1
Output of nats consumer info:
? Select a Stream A2S_NOTIFICATIONS
? Select a Consumer queue-metric

Information for Consumer A2S_NOTIFICATIONS > queue-metric created 2023-11-16T03:35:46+03:00

Configuration:
    Name: queue-metric
    Delivery Subject: _INBOX.H1xRi0QIWDD7xYZZgmK0BU
    Filter Subject: A2S_NOTIFICATIONS.*
    Deliver Policy: Last Per Subject
    Deliver Queue Group: queue-metric
    Ack Policy: Explicit
    Ack Wait: 30.00s
    Replay Policy: Instant
    Max Ack Pending: 1,000
    Flow Control: false

Cluster Information:
    Name: tantor
    Leader: devops-multi8nodes-nats-px-02

State:
    Last Delivered Message: Consumer sequence: 12,824 Stream sequence: 30,267 Last delivery: -473.001ms ago
    Acknowledgment Floor: Consumer sequence: 12,824 Stream sequence: 30,267 Last Ack: -474.442ms ago
    Outstanding Acks: 0 out of maximum 1,000
    Redelivered Messages: 0
    Unprocessed Messages: 0
    Active Interest: Active using Queue Group queue-metric
Publishing into a stream with no-ack is not recommended unless there is a specific reason to do so. Normally it allows the server to receive messages at a higher rate than can be stored in the stream. We track this internally and reserve the right to drop these messages if the server is under pressure.
For your consumer, does a normal non-queue group consumer with normal delivery work? Meaning eliminate the queue group and the deliver last by subject?
Also, possible to upgrade to 2.10.5?
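Something like this, as a rough sketch (assuming the jStreamContext JetStream context and the stream's subject; an ephemeral push consumer with the default deliver-all policy and no queue group):

sub, err := jStreamContext.Subscribe(
    "A2S_NOTIFICATIONS.*",
    func(msg *nats.Msg) {
        // Process and acknowledge each message explicitly.
        msg.Ack()
    },
)
if err != nil {
    log.Fatal(err)
}
defer sub.Unsubscribe()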
Upgraded:
# nats-server version
nats-server: v2.10.5
The following code works, but the message rate is still very low.
qSubscription, err := jStreamContext.Subscribe(
    subjectName,
    func(msg *nats.Msg) {
        m, _ := msg.Metadata()
        log.Info(m.Sequence.Stream)
        msg.Ack()
    },
    nats.DeliverLastPerSubject(),
)
NATS configuration file:
server_name: devops-multi8nodes-nats-px-01

# Client port of 4222 on all interfaces
port: 4222
monitor_port: 8222

logfile: /opt/logs/nats/nats.log
logfile_size_limit: 1GB

jetstream {
    store_dir=/opt/data/nats
}

cluster {
    name: tantor
    listen: devops-multi8nodes-nats-px-01:6222
    connect_retries: 10
    authorization: {
        user: natsAdmin   # you can use any name
        password: ****
        timeout: 0
    }
    routes = [
        nats://natsAdmin:****@devops-multi8nodes-nats-px-02:6222
        nats://natsAdmin:****@devops-multi8nodes-nats-px-03:6222
    ]
}

tls {   # certificates location
    cert_file: "/opt/nats/certs/fullchain.pem"
    key_file: "/opt/nats/certs/privkey.pem"
    timeout: 10.0
}
NATS bench
# nats bench --stream="A2S_NOTIFICATIONS" "queue-metric.*" --server devops-multi8nodes-nats-px-02 --sub 1 --pub 1
12:43:01 Starting Core NATS pub/sub benchmark [subject=queue-metric.*, multisubject=false, multisubjectmax=100000, msgs=100,000, msgsize=128 B, pubs=1, subs=1, pubsleep=0s, subsleep=0s]
12:43:01 Starting subscriber, expecting 100,000 messages
12:43:01 Starting publisher, publishing 100,000 messages
Finished 0s [============================================================================================================================================================================] 100%
Finished 0s [============================================================================================================================================================================] 100%
NATS Pub/Sub stats: 1,499,928 msgs/sec ~ 183.10 MB/sec
Pub stats: 1,099,047 msgs/sec ~ 134.16 MB/sec
Sub stats: 754,115 msgs/sec ~ 92.06 MB/sec
But the sample given here (https://github.com/nats-io/go-nats-examples/blob/main/api-examples/subscribe_async/main.go) barely receives 100 msg/second
@derekcollison we just noticed that our messages have a 40K payload. Could that be the reason?
What is the recommended approach: many small messages or a few big ones?
Have you tried removing the log.Info line from your sample program to see if that has an effect?
@derekcollison sorry for the false alarm,
we finally found the issue. It was a mixture of factors: 1) our publishers send very large messages, 2) the max ack pending parameter used by the queue groups was set to far too small a value, 3) slow message processing logic.
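For anyone hitting the same thing, raising that limit on the queue subscription looks roughly like this (a sketch; the value is illustrative, not the one we settled on):

qSubscription, err := jStreamContext.QueueSubscribe(
    subjectName,
    "queue-metric",
    func(msg *nats.Msg) {
        msg.Ack()
    },
    nats.DeliverLastPerSubject(),
    // Allow more unacknowledged messages in flight than the 1,000
    // shown in the consumer info above.
    nats.MaxAckPending(50000),
)
if err != nil {
    log.Fatal(err)
}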
After fixing the mentioned issues we observe a perfect processing pace. NATS JetStream is just amazing!! Thank you for the good product!
Observed behavior
Hi NATS guys,
we are facing extremely low performance of JetStream consumers.
The following code fragment processes approximately 100 messages per second, then hangs for a couple of seconds.
The corresponding jStreamContext is created as:
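(a minimal sketch; the URL is illustrative and connection options are simplified)

nc, err := nats.Connect("nats://devops-multi8nodes-nats-px-01:4222")
if err != nil {
    log.Fatal(err)
}
// JetStream context used by the subscriber above.
jStreamContext, err := nc.JetStream()
if err != nil {
    log.Fatal(err)
}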
We have an increasing number of JetStream messages which we can't process in time.
Expected behavior
Messages are at least pushed by the server at a higher rate.
Server and client version
Nats 2.3.2
Host environment
Linux/Ubuntu
Steps to reproduce
No response