Open mikluko opened 2 years ago
Thanks, would it be possible to privately send us the complete storage directory for each server?
I encountered the same problem today. I made thousands of MQTT connections to a NATS cluster, and minutes later one node crashed. When I restarted the crashed node, I noticed it took too long (more than 60 seconds) to start listening on TCP ports 4222, 8222, and 1883 after the process started. I stopped testing, so there was no heavy workload any more, only a few NATS requests made by other programs. A few minutes later, another node crashed.
It seems something is wrong with the raft logic.
Here is the log:
panic: [server226-S-R2F-ziqQhGPV] Placed an entry at the wrong index, ae is &{leader:1txmfIKU term:184 commit:0 pterm:0 pindex:0 entries: 1}, seq is 4, n.pindex is 0
goroutine 14561 [running]:
github.com/nats-io/nats-server/server.(*raft).storeToWAL(0xc000dee2c0, 0xc00673ce80, 0xf, 0xc0364cca40)
        /home/travis/gopath/src/github.com/nats-io/nats-server/server/raft.go:2774 +0x357
github.com/nats-io/nats-server/server.(*raft).processAppendEntry(0xc000dee2c0, 0xc00673ce80, 0xc0066c26c0)
        /home/travis/gopath/src/github.com/nats-io/nats-server/server/raft.go:2636 +0x8c5
github.com/nats-io/nats-server/server.(*raft).runAsFollower(0xc000dee2c0)
        /home/travis/gopath/src/github.com/nats-io/nats-server/server/raft.go:1464 +0x1ce
github.com/nats-io/nats-server/server.(*raft).run(0xc000dee2c0)
        /home/travis/gopath/src/github.com/nats-io/nats-server/server/raft.go:1416 +0xed
created by github.com/nats-io/nats-server/server.(*Server).startGoRoutine
        /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:2839 +0xc5
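For context, the panic above encodes a simple write-ahead-log invariant: an append entry must be stored at the sequence immediately after the node's previous index (pindex). A minimal sketch of that check, with values taken from the panic message; the helper below is hypothetical, written only for illustration, not the actual nats-server code in (*raft).storeToWAL:

```go
package main

import "fmt"

// checkAppendEntry sketches the WAL consistency check that trips this
// panic: an entry must land at sequence pindex+1, i.e. immediately after
// the node's previous index. Hypothetical helper for illustration.
func checkAppendEntry(seq, pindex uint64) error {
	if seq != pindex+1 {
		return fmt.Errorf("placed an entry at the wrong index: seq is %d, n.pindex is %d", seq, pindex)
	}
	return nil
}

func main() {
	// The values from the panic above: seq 4 arriving while n.pindex
	// is 0, so the entry is 3 positions ahead of where the WAL expects.
	fmt.Println(checkAppendEntry(4, 0))
	// The first entry after index 0 would be fine.
	fmt.Println(checkAppendEntry(1, 0))
}
```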
Any other details on what led up to the first crash? This is a forced panic in our code due to state corruption. Were you killing terminals on Windows in this test as well?
This is on Ubuntu 18.04, a 3-node cluster on 3 machines. I was load-testing MQTT connections, and I did not shut down any node on purpose before the panic happened. I am not sure what led to the first crash; maybe the MQTT connections came too frequently, or maybe the raft state got corrupted. The crashed node had received about 5000 TCP connections before it crashed. Another node then failed to accept more TCP connections, although it did not crash. The last node continued to run well; it accepted 15K connections before I stopped the test.
And server version 2.4.0 correct?
Yes, Linux amd64, server version 2.4.0.
Thanks.
Maybe after the last issue I reported is fixed, this one will disappear as well. I will test again after the new server version comes out. Thanks for your efforts.
@derekcollison I left the office with no testing clients running. 12 hours later, when I came back, one node had crashed. This time the panic log is different; it is worth a look. You can download it from http://101.200.84.208/nohup.txt (this download link expires in 3 days).
It seems the system you are running on has a hard limit of 1000 threads; that is shown at the top of the log. What system is this on?
How many consumers were you trying to create?
And how many streams? That seems like a very large number of both.
What does the nats cli report via the following?
nats stream report
@derekcollison If the question is to @carr123, the report says that many MQTT connections were created. As you know, each MQTT session will have its own stream (max msgs of 1), so that may explain the number of streams that you are seeing?
Ubuntu 18.04, amd64. Not 1000 but 10,000 threads. (Are these OS threads? If so, that seems like too many for a single process.)
I tested MQTT connections 12 hours ago. There are 5327 folders under the jetstream directory, apparently one folder per connection, each named like $MQTT_sess_zlNVCJTT.
nats stream report (executed on a live node, not the crashed one; server227 is the one that crashed) results: http://101.200.84.208/stream.txt
I restarted the crashed node server227 and watched its thread count grow slowly from fewer than 10 to near 200, and it kept growing.
server227's thread count grew up to 428 and stopped growing, but then server226 began to crash. I noticed this yesterday as well: server226 and server227 take turns crashing, while the third node, server225, is always healthy.
server226 panic log:
panic: [server226-S-R2F-TM5kP7eZ] Placed an entry at the wrong index, ae is &{leader:1txmfIKU term:182 commit:0 pterm:179 pindex:2 entries: 1}, seq is 5, n.pindex is 2
goroutine 21454 [running]:
github.com/nats-io/nats-server/server.(*raft).storeToWAL(0xc00a48d080, 0xc36268c080, 0xf, 0xc36e659fa0)
        /home/travis/gopath/src/github.com/nats-io/nats-server/server/raft.go:2774 +0x357
github.com/nats-io/nats-server/server.(*raft).processAppendEntry(0xc00a48d080, 0xc36268c080, 0xc16002fbc0)
        /home/travis/gopath/src/github.com/nats-io/nats-server/server/raft.go:2636 +0x8c5
github.com/nats-io/nats-server/server.(*raft).runAsFollower(0xc00a48d080)
        /home/travis/gopath/src/github.com/nats-io/nats-server/server/raft.go:1464 +0x1ce
github.com/nats-io/nats-server/server.(*raft).run(0xc00a48d080)
        /home/travis/gopath/src/github.com/nats-io/nats-server/server/raft.go:1416 +0xed
created by github.com/nats-io/nats-server/server.(*Server).startGoRoutine
        /home/travis/gopath/src/github.com/nats-io/nats-server/server/server.go:2839 +0xc5
Hope those logs give you some clues.
After server226 crashed, I restarted it and watched its threads.
The thread count grew up to 2084 on server226, and then server227 crashed again.
This was captured on server226: http://101.200.84.208/thread226.png
Maybe you should restrict the number of threads one process can create.
Your cluster and the machines you are running do not have enough resources currently to run that many MQTT connections.
We have plans to mux streams for MQTT persistence, but until then you need larger machines.
Hi, I just downloaded the nats-server main branch and built my own binary. nats-server: Ubuntu 18.04, 3-node cluster (server225, server226, server227), 8 GB RAM and 4 cores per node. MQTT client: Windows. I deleted the old jetstream data to make a clean cluster, and I did not kill any node on purpose during the following test.
The client code looks like this:

func PressureTest() {
	for i := 0; i < 10000; i++ {
		idx := rand.Intn(3)
		if idx == 0 {
			go connectNode1()
		} else if idx == 1 {
			go connectNode2()
		} else {
			go connectNode3()
		}
		time.Sleep(time.Millisecond * 100)
		if i%200 == 0 {
			log.Println(i, " connects")
			time.Sleep(time.Second)
		}
	}
	time.Sleep(time.Hour * 100)
}
When the test started, the thread counts on each node were reasonable (12, 11, and 12 threads). As the connection count increased, the thread counts grew slowly; with 2861 connections on node1, the thread counts were 20, 25, and 19, which still looks reasonable.
Suddenly, server225 crashed, and then the thread count on server226 jumped to 854 and on server227 to 153. So I guess some code path on server226 and server227 causes the thread count to jump too high after a node crashes (with no node crashed, the counts stay low).
server225 crash log: http://101.200.84.208/nats_nohup.txt. The first line in the log says "SIGQUIT: quit"; I did not send this signal manually, I was just watching the thread counts. There are 431136 goroutines shown in the log, which seems too many for fewer than 5000 client connections.
Currently an MQTT connection has its own stream, which we can now update and improve by using the K/V functionality of JetStream to mux 1 stream for all MQTT clients on a server. This work is being prioritized.
Great, I won't test against MQTT until it's improved.
Will keep you posted. Thanks.
@derekcollison Hello, first, congratulations on the nats-server v2.5.0 release. I tested MQTT against nats-server 2.5.0 linux-amd64 (I built my own binary in order to add pprof) and made 1000 MQTT client connections.
You mentioned "mux 1 stream for all MQTT clients on a server", and the sessions are now indeed merged into a single $MQTT_sess folder. I wonder whether the obs subfolders could be merged too, so that goroutines and memory usage are saved.
thanks.
Below is a goroutine dump from http/pprof: https://github.com/carr123/natsjsmdemo/blob/main/goroutine.txt
@carr123, @kozlovic did most of the work here so will let him chime in regarding architecture.
Go routines are designed to be very lightweight, so not too overly concerned about using them for dedicated I/O. IIRC stack memory footprint is 8k per.
Well, a connection costing N goroutines is an OK design philosophy. What I saw is that 4000 MQTT connections (with no data sent yet) made one node consume 8 GB of RAM; that's too expensive, I think. Even after the 4000 clients disconnected and the node restarted, NATS still held that much memory. Another thing I saw is a possible goroutine leak: when one node crashes, another node creates 50000 goroutines before it panics and exits.
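Taking the ~8 KB per-goroutine stack figure quoted earlier in the thread at face value, the goroutine counts reported here translate into nontrivial stack memory on their own. Rough arithmetic only; real Go stacks start small and grow on demand, so these are order-of-magnitude estimates:

```go
package main

import "fmt"

func main() {
	// Assumes ~8 KB of stack per goroutine, the figure mentioned in
	// this thread; treat the results as order-of-magnitude estimates.
	const stackKB = 8
	leaked := 50000  // goroutines seen on a node before it panicked
	dumped := 431136 // goroutines in server225's SIGQUIT dump

	fmt.Printf("%d goroutines -> ~%d MB of stacks\n", leaked, leaked*stackKB/1024)
	fmt.Printf("%d goroutines -> ~%d MB of stacks\n", dumped, dumped*stackKB/1024)
}
```

So stacks alone would account for roughly 390 MB and 3.3 GB respectively, before any heap allocations, which is consistent with goroutine count being a major driver of the memory observations above.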
Let's see what @kozlovic has to say next week.
@carr123 Are those connections each creating a subscription? That is the only reason I can think of that you would see a file per connection under $MQTT_msgs/obs. If your connections are not created with the "clean session" flag, the server has to maintain the state when shutting down, so it is normal that they persist.
The reason for so many go routines and memory usage is that we have 2 go routines per connection (the read loop and the write loop, regardless of MQTT client or standard NATS), then you have go routines for the RAFT-related things for consumer leaders, etc. Each consumer (that is, one per MQTT subscription) has a raft group associated with it. Each group creates many things, including these kinds of objects:
propc: make(chan *Entry, 8192),
entryc: make(chan *appendEntry, 32768),
respc: make(chan *appendEntryResponse, 32768),
applyc: make(chan *CommittedEntry, 32768),
and maybe there are some buffers for each file store (where messages/consumers/etc. are kept), which could explain the memory usage. When running in standalone mode, I don't see a "lot" of memory usage for 4,000 connections with 1 sub each. In clustered mode, much more, but creating those 4,000 connections is very slow: I am only at 2,200 as I type, and each server is already at 2.5 GB.
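A back-of-the-envelope estimate on the channel capacities listed above: on a 64-bit platform, a buffered channel of pointers reserves roughly cap * 8 bytes for its ring buffer alone, before any entries are actually allocated. This sketch just multiplies that out; it ignores everything else a raft group allocates, so it is a lower bound, not a measurement:

```go
package main

import "fmt"

func main() {
	// Assumes 8-byte pointers (64-bit); each buffered channel of
	// pointers reserves about cap * 8 bytes for its ring buffer.
	const ptrSize = 8
	// propc + entryc + respc + applyc capacities from the snippet above.
	perGroup := (8192 + 32768 + 32768 + 32768) * ptrSize

	fmt.Printf("per raft group: ~%d KB of channel slots\n", perGroup/1024)
	fmt.Printf("4000 consumer groups: ~%d MB\n", perGroup*4000/(1024*1024))
}
```

That works out to roughly 832 KB per raft group and ~3.2 GB for 4,000 consumers from the channel buffers alone, which lines up with the multi-GB memory usage reported in this thread.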
I will defer to @derekcollison for that.
Thanks for your detailed description of the NATS design.
I created MQTT connections in a loop; each connection has a fixed client_id (c1, c2, c3, ...), subscribes to 1 topic with clean_session=false, and they connect evenly to a 3-node cluster, emulating a real-world scenario. When I said 1000 connections, I meant 1000 per node.
I haven't tested on a standalone server yet; next week I will give it a try.
I agree with you that there may be buffers that occupy a lot of memory. If they are optimized, a single node can sustain more connections. I expect one node to hold 100K clients with light data traffic; that would be nice.
If you want to reproduce the other issue I mentioned above, just continue making more connections until one node crashes, then watch what happens on the other nodes.
Defect
Versions of nats-server and affected client libraries used: NATS Server v2.3.2, deployed by Helm chart v0.8.4
OS/Container environment: RKE 1.2.9, Kubernetes 1.2.80, Docker 20.10.7
Steps or code to reproduce the issue: It's a 5-node cluster. The cluster was under use for quite some time before it was restarted with kubectl rollout restart statefulset/nats.
Expected result: All cluster pods restart one by one, in a rolling manner.
Actual result: 4 out of 5 pods restart successfully. One goes into a crash cycle with the following logging output tail:
Full log