emqx / emqx

The most scalable open-source MQTT broker for IoT, IIoT, and connected vehicles
https://www.emqx.com/

too many time_wait connections and high CPU usage after disabling "net.ipv4.tcp_tw_recycle" #1566

Closed bhzhu203 closed 5 years ago

bhzhu203 commented 6 years ago

Environment

emqttd 2.1.2

Description

After disabling "net.ipv4.tcp_tw_recycle", there are too many TIME_WAIT connections and high CPU usage. An frp service (fatedier/frp on GitHub) connects to emqttd, and netstat shows the following:

netstat -n | grep TIME_WAIT
tcp 0 0 127.0.0.1:53168 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53412 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53054 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53194 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:52952 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53120 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53052 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53192 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53222 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53164 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:52912 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:52942 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53090 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53300 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53372 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53198 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53224 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53330 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53386 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:52950 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53292 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53268 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53088 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:52984 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53190 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53332 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53122 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53130 127.0.0.1:1883 TIME_WAIT
tcp 0 0 127.0.0.1:53026 127.0.0.1:1883 TIME_WAIT

....... 100+ TIME_WAIT connections

Before disabling "net.ipv4.tcp_tw_recycle", CPU usage was quite low and performance was excellent.
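The TIME_WAIT pile-up above can be summarized per peer port; a sketch using awk over `netstat -n` output (field positions assumed from the netstat format shown above):

```shell
# Count TIME_WAIT sockets per peer port (diagnostic sketch).
# On a live box the input would come straight from netstat:
#   netstat -n | awk '$6 == "TIME_WAIT" { ... }'
# Demo on two captured lines from the report above:
printf '%s\n' \
  'tcp 0 0 127.0.0.1:53168 127.0.0.1:1883 TIME_WAIT' \
  'tcp 0 0 127.0.0.1:53412 127.0.0.1:1883 TIME_WAIT' \
  | awk '$6 == "TIME_WAIT" { split($5, a, ":"); cnt[a[2]]++ }
         END { for (p in cnt) print p, cnt[p] }'   # → 1883 2
```

As background: tcp_tw_recycle was known to break clients behind NAT (and was removed entirely in Linux 4.12), so disabling it is correct; tcp_tw_reuse is generally the safer knob for reclaiming outgoing-connection TIME_WAIT slots.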

terry-xiaoyu commented 6 years ago

@bhzhu203 Are you using a local proxy to connect to EMQ? If so, this might be caused by a short timeout on the connection between EMQ and your proxy. Tuning that timeout configuration should help.

terry-xiaoyu commented 6 years ago

@bhzhu203 any update on this problem?

bhzhu203 commented 6 years ago

I finally found out it is not a "TIME_WAIT" problem.

emqttd has quite high CPU usage but low memory usage. With only about 300 clients (about 400 ESTABLISHED connections), there is 20% CPU usage, 10% sys CPU, and a load of 1.3. With about 800 clients, it shows 85% CPU usage, 15% sys CPU, and a load of 3.8 on a 4-core 8 GB machine (an Aliyun network-performance instance).

I think there are some emqtt performance issues, because another project of ours runs emqtt with ~1000 ESTABLISHED connections at a load of only 0.3 on a 1-core 1 GB machine.

System: CentOS 6, 4 cores, 8 GB. Erlang MQTT Broker 2.1.2 (also seen on 2.3.7).

With 300 clients (about 400 ESTABLISHED connections), there is 20% CPU usage and 10% sys CPU:

perf top -g --children -p 2097

strace -cfp 2097

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.73 3215.192496        3711    866288     56852 futex
  0.17    5.402413           1   7152597           clock_gettime
  0.04    1.443275           2    873275        12 writev
  0.04    1.305460           1   1393424    619149 recvfrom
  0.02    0.619867           1   1238423       185 epoll_ctl
  0.00    0.055394           1    103198           epoll_wait
  0.00    0.005733           1      4556           fcntl
  0.00    0.005134           1      4500      2222 accept
  0.00    0.004677           2      2455           close
  0.00    0.003018           1      3506      1050 getpeername
  0.00    0.000598          11        55           munmap
  0.00    0.000457           1       653           sched_yield
  0.00    0.000383           1       396           gettimeofday
  0.00    0.000244           8        32           mmap
  0.00    0.000146           2        81           stat
  0.00    0.000131           2        81           open
  0.00    0.000044           1        81           lseek
  0.00    0.000000           0        85           read
  0.00    0.000000           0         3           restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00 3224.039470              11643689    679470 total


By the way, CentOS 6 has some problems showing the correct system load in the "top" command.

Each of our sessions often has 10,000+ messages within minutes.

terry-xiaoyu commented 6 years ago

@bhzhu203

What is the current throughput?

How many subscribers do you have? And what kind of topics are they subscribed to?

bhzhu203 commented 6 years ago


This project is a real-time communication game between teachers and students over international networks.

bhzhu203 commented 6 years ago

@terry-xiaoyu Do you know how to solve this?

terry-xiaoyu commented 6 years ago

I think the I/O might be a bottleneck in EMQ.

The throughput is relatively low on multi-core systems. My test result shows that EMQ consumes about 110% CPU under a traffic of 4000 TPS, on a CentOS system with 8 cores / 8 GB memory.

It is interesting that the same test on a 1-core 2 GB memory system only consumes about 30% CPU.

I used 400 publishers sending messages to 400 subscribers, each at a speed of 10 messages per second. The payload size was only 1 byte in my test.
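A load of this shape can be generated with emqtt_bench (a sketch; the host name and flag spellings are assumptions from emqtt_bench's README, not taken from this thread):

```shell
# Hypothetical reproduction of the described test: 400 subscribers,
# 400 publishers at 10 msg/s each, 1-byte payloads.
#   emqtt_bench sub -h broker.example -c 400 -t 'bench/%i'
#   emqtt_bench pub -h broker.example -c 400 -I 100 -t 'bench/%i' -s 1
# The aggregate rate implied by those numbers matches the quoted 4000 TPS:
pubs=400; interval_ms=100
echo "$(( pubs * 1000 / interval_ms )) msg/s aggregate"   # → 4000 msg/s aggregate
```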

I also tried to inspect lock contention using lcnt; it shows that there are lots of collisions in pollset_rm. I'm not sure that caused the problem though:

The following result was collected under 10K TPS:

(emq@127.0.0.1)39> lcnt:conflicts().
            lock   id  #tries  #collisions  collisions [%]  time [us]  duration [%]
           -----  --- ------- ------------ --------------- ---------- -------------
      pollset_rm    1  456777        73056         15.9938    3454782       19.2976
    drv_ev_state   16  849586        14508          1.7077    2438503       13.6209
       run_queue   10 3536389        33629          0.9509     148472        0.8293
       proc_main 4355 1648354            1          0.0001      42583        0.2379
    db_hash_slot  576 1985625         1273          0.0641       6205        0.0347
 port_sched_lock 2019  549248           14          0.0025       1077        0.0060
 pollset_rm_list    1  205728          260          0.1264         89        0.0005
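For reference, data like this comes from a lock-counting emulator (lcnt:start/0, lcnt:collect/0, and lcnt:conflicts/0 in the Erlang shell; on OTP 21+ the prebuilt lcnt flavor can be selected with `erl -emu_type lcnt`). As a quick sanity check on the table, the pollset_rm collision percentage follows directly from its #tries and #collisions columns:

```shell
# Sanity check on the pollset_rm row above: 73056 collisions out of
# 456777 tries is the reported ~15.99 % collision rate.
awk 'BEGIN { printf "%.4f\n", 73056 / 456777 * 100 }'   # → 15.9938
```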

We're trying to improve the throughput in the next major version of EMQ. We'd appreciate it if you have any ideas on this.

bhzhu203 commented 6 years ago

Are you talking about disk IO? As you can see, the 'wa' value in the top command is 0%. I think it is network IO.

We run EMQTT on Alicloud machines, which may use virtio network drivers. Is there any problem with virtio?

And in some situations, I find that the 'sy' CPU value is quite high in the top command. /proc/softirqs shows that some counters on one of the CPUs are much higher than on the others.
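That per-CPU skew can be spotted mechanically; a sketch that flags softirq rows where one CPU's counter dwarfs another's (the /proc/softirqs layout is assumed: a header row of CPUs, then one row per softirq type):

```shell
# Flag softirq types whose per-CPU counters differ by more than 10x.
# Live usage:  awk 'NR > 1 { ... }' /proc/softirqs
# Demo on a captured two-CPU sample where NET_RX is pinned to CPU0:
printf '%s\n' \
  '        CPU0     CPU1' \
  'NET_TX:  120      110' \
  'NET_RX:  901234   1045' \
  | awk 'NR > 1 {
           max = $2; min = $2
           for (i = 2; i <= NF; i++) { if ($i > max) max = $i; if ($i < min) min = $i }
           if (min > 0 && max / min > 10) print $1, "imbalanced"
         }'   # → NET_RX: imbalanced
```

When receive processing lands on a single CPU like this, the usual mitigations are spreading it with RPS (writing a CPU mask to /sys/class/net/&lt;dev&gt;/queues/rx-0/rps_cpus) or distributing the NIC's IRQs across cores.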

terry-xiaoyu commented 6 years ago

Yes, I meant the network IO. But I am not sure that is the root cause. Lock contention on ETS tables could also cause the same issue.

I don't think it is the fault of virtio, as we've seen similar results on other systems.

We're trying to build EMQ 3.0 on top of OTP 21 to see if it brings some throughput improvement.

bhzhu203 commented 6 years ago

Yes, I have also found that EMQTT's performance declines on multi-core CPU machines.

gilbertwong96 commented 5 years ago

It is the SMP option of the Erlang VM, which is used to take advantage of multi-processor hardware architectures, that causes the performance decline. It is inevitable. The EMQX 3.0.1 release will have better performance.
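If the multi-core penalty is scheduler-related, one common mitigation is limiting and binding Erlang schedulers via the node's vm.args. This is a sketch using standard erl emulator flags with illustrative values; whether it helps in this particular case is untested:

```
## vm.args sketch (standard erl emulator flags; values illustrative)
## run 4 schedulers, matching the 4-core box from the report
+S 4:4
## bind schedulers to cores (default-bind strategy)
+sbt db
## reduce scheduler busy-wait spinning, which can inflate apparent CPU use
+sbwt none
```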