Closed bhzhu203 closed 5 years ago
@bhzhu203 Are you using a local proxy connecting to EMQ? If so, I think this might be caused by a short timeout value on the connection between EMQ and your proxy. Tuning that timeout configuration might help.
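For reference, a minimal frpc.ini sketch of the kind of tuning meant here, assuming the frp client sits in front of the broker (the proxy name `mqtt`, the addresses, and the ports are illustrative; `heartbeat_interval` / `heartbeat_timeout` are the frp client options that control how the control connection is kept alive):

```ini
[common]
server_addr = x.x.x.x        # illustrative frps address
server_port = 7000
# keep the control connection alive so idle tunnels are not torn down early
heartbeat_interval = 30
heartbeat_timeout = 90

[mqtt]                       # illustrative proxy name
type = tcp
local_ip = 127.0.0.1
local_port = 1883            # emqttd listener
remote_port = 1883
```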
@bhzhu203 any update on this problem?
I finally found out that it is not a "TIME_WAIT" problem.
emqttd has quite high CPU usage and low memory usage. With only about 300 clients (about 400 ESTABLISHED connections), there is 20% user CPU and 10% sys CPU at load 1.3. With about 800 clients, it shows 85% user CPU, 15% sys CPU, and load 3.8 on a 4-core 8 GB machine (an Aliyun network-performance instance).
I think there are some emqtt performance issues, because another project of ours runs emqtt with about 1000 ESTABLISHED connections at only 0.3 load on a 1-core 1 GB machine.
System: CentOS 6, 4-core 8 GB machine; Erlang MQTT Broker 2.1.2 (also reproduced on 2.3.7)
With 300 clients (about 400 ESTABLISHED connections), there is 20% user CPU and 10% sys CPU:
```
perf top -g --children -p 2097
strace -cfp 2097
```
```
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.73 3215.192496        3711    866288     56852 futex
  0.17    5.402413           1   7152597           clock_gettime
  0.04    1.443275           2    873275        12 writev
  0.04    1.305460           1   1393424    619149 recvfrom
  0.02    0.619867           1   1238423       185 epoll_ctl
  0.00    0.055394           1    103198           epoll_wait
  0.00    0.005733           1      4556           fcntl
  0.00    0.005134           1      4500      2222 accept
  0.00    0.004677           2      2455           close
  0.00    0.003018           1      3506      1050 getpeername
  0.00    0.000598          11        55           munmap
  0.00    0.000457           1       653           sched_yield
  0.00    0.000383           1       396           gettimeofday
  0.00    0.000244           8        32           mmap
  0.00    0.000146           2        81           stat
  0.00    0.000131           2        81           open
  0.00    0.000044           1        81           lseek
  0.00    0.000000           0        85           read
  0.00    0.000000           0         3           restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00 3224.039470             11643689    679470 total
```
By the way, CentOS 6 has some problems showing the correct system load in the `top` command.
Each of our sessions often carries 10000+ messages within minutes.
@bhzhu203
What is the current throughput?
How many subscribers do you have? And what kind of topics are the subscribers subscribed to?
This project is a real-time communication game between teachers and students over international networks.
@terry-xiaoyu Do you know how to solve this?
I think the I/O might be a bottleneck of EMQ.
The throughput is relatively low on multi-core systems. My test shows that EMQ consumes about 110% CPU under a traffic of 4000 TPS, on a CentOS system with 8 cores / 8 GB RAM.
It is interesting that the same testing on a 1 core 2G MEM system only consumes about 30% CPU.
I used 400 publishers sending messages to 400 subscribers, each at a speed of 10 messages per second. The payload size was only 1 byte in my test.
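A load of this shape can be reproduced with the emqtt_bench tool; the exact flags below are an assumption from its common usage (`-c` client count, `-I` publish interval in ms, `-t` topic pattern, `-s` payload bytes), so check `emqtt_bench pub --help` on your version:

```
# 400 subscribers, one topic each
emqtt_bench sub -c 400 -t bench/%i

# 400 publishers, 10 msg/s each (100 ms interval), 1-byte payload
emqtt_bench pub -c 400 -I 100 -t bench/%i -s 1
```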
I also tried to inspect lock contention using lcnt; it shows a lot of collisions on pollset_rm, though I'm not sure that is what caused the problem. The following result was collected under 10K TPS:
```
(emq@127.0.0.1)39> lcnt:conflicts().
           lock    id   #tries  #collisions  collisions [%]  time [us]  duration [%]
          -----   ---  -------  -----------  --------------  ---------  ------------
     pollset_rm     1   456777        73056         15.9938    3454782       19.2976
   drv_ev_state    16   849586        14508          1.7077    2438503       13.6209
      run_queue    10  3536389        33629          0.9509     148472        0.8293
      proc_main  4355  1648354            1          0.0001      42583        0.2379
   db_hash_slot   576  1985625         1273          0.0641       6205        0.0347
port_sched_lock  2019   549248           14          0.0025       1077        0.0060
pollset_rm_list     1   205728          260          0.1264         89        0.0005
```
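For anyone reproducing this: `lcnt` only works on an emulator built with lock counting enabled, and a typical collection session looks like the sketch below (the function names are the standard OTP `lcnt` API):

```erlang
%% run on a lock-counting emulator
lcnt:start().      % start the lcnt server
lcnt:clear().      % reset counters before the measurement window
%% ... drive the broker at the target TPS for a while ...
lcnt:collect().    % snapshot lock statistics from the runtime
lcnt:conflicts().  % print the contended locks
```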
We're trying to improve the throughput in the next major version of EMQ. We'd appreciate any ideas on this.
Are you talking about disk I/O? As you can see, the 'wa' value in `top` is 0%; I think it is network I/O.
We run EMQTT on Alicloud machines, which may use virtio network drivers. Is there any known problem with virtio?
Also, in some situations I find that the 'sy' CPU in `top` is quite high. /proc/softirqs shows that some counters on one of the CPUs are much higher than on the others.
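A sketch of how that imbalance can be checked and mitigated (the device name `eth0` and the CPU mask `f` for 4 cores are assumptions for illustration; the RPS setting is per RX queue under /sys/class/net):

```
# Watch which CPU the NET_RX softirq counters grow on; one hot column
# means all receive processing lands on a single core.
watch -d cat /proc/softirqs

# Spread RX processing across CPUs with Receive Packet Steering
# (mask "f" = CPUs 0-3 on a 4-core box).
echo f > /sys/class/net/eth0/queues/rx-0/rps_cpus
```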
Yes, I meant the network I/O, but I am not sure that is the root cause. Lock contention on ETS tables could also cause the same issue.
I don't think it is the fault of virtio, as we've seen similar results on other systems.
We're trying to build EMQ 3.0 on top of OTP 21 to see if it brings some throughput improvement.
Yes, I also find that EMQTT shows a performance decline on multi-core machines.
It is the smp option of the Erlang VM, which is meant to take advantage of multi-processor hardware architectures, that causes the performance decline. To some extent it is inevitable. The EMQX 3.0.1 release should perform better.
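If the futex-heavy sys CPU seen earlier comes from scheduler busy-waiting, the usual knobs are the emulator flags in etc/vm.args; the values below are an illustrative sketch, not a tuned recommendation:

```
## Stop schedulers from busy-waiting for work, which shows up as
## futex churn and high sys CPU on mostly idle cores:
+sbwt none

## Match schedulers (online:available) to the core count, e.g. 4 cores:
+S 4:4
```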
Environment
emqttd 2.1.2
Description
Too many TIME_WAIT connections and high CPU usage after disabling "net.ipv4.tcp_tw_recycle". An frp (fatedier/frp on GitHub) service connects to emqtt, showing the following:
```
netstat -n | grep TIME_WAIT
tcp        0      0 127.0.0.1:53168      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53412      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53054      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53194      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:52952      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53120      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53052      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53192      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53222      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53164      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:52912      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:52942      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53090      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53300      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53372      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53198      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53224      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53330      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53386      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:52950      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53292      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53268      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53088      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:52984      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53190      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53332      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53122      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53130      127.0.0.1:1883       TIME_WAIT
tcp        0      0 127.0.0.1:53026      127.0.0.1:1883       TIME_WAIT
```
.......100+ TIME_WAIT connections
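Rather than eyeballing the list, the sockets can be counted; a small awk filter over the netstat output does it (shown here against a captured sample so the pipeline is self-contained):

```shell
# Count TIME_WAIT sockets whose peer is the broker port 1883.
sample='tcp        0      0 127.0.0.1:53168  127.0.0.1:1883  TIME_WAIT
tcp        0      0 127.0.0.1:53412  127.0.0.1:1883  TIME_WAIT
tcp        0      0 127.0.0.1:40000  127.0.0.1:22    ESTABLISHED'

# In the real case, replace the sample with the live output of: netstat -n
printf '%s\n' "$sample" |
  awk '$6 == "TIME_WAIT" && $5 ~ /:1883$/ {c++} END {print c+0}'   # → 2
```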
Before disabling "net.ipv4.tcp_tw_recycle", CPU usage was quite low and performance was excellent.
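Note that `tcp_tw_recycle` is known to break clients behind NAT (and was removed entirely in Linux 4.12), so re-enabling it is not a good long-term fix. A sketch of safer sysctl alternatives that keep TIME_WAIT from hurting (the values are illustrative; verify each key exists on your kernel):

```
# /etc/sysctl.conf (apply with: sysctl -p)

# Allow reusing TIME_WAIT sockets for new outgoing connections:
net.ipv4.tcp_tw_reuse = 1

# Widen the ephemeral port range the proxy can draw from:
net.ipv4.ip_local_port_range = 10240 65535

# Cap how many TIME_WAIT sockets are kept at once:
net.ipv4.tcp_max_tw_buckets = 65536
```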