iqiyi / dpvs

DPVS is a high performance Layer-4 load balancer based on DPDK.
Other
3k stars 723 forks source link

dpip adds show may fails #156

Open icymoon opened 6 years ago

icymoon commented 6 years ago

Here is log, several types: multicast_msg_send: mc_msg(type:12, cid:0->slaves) timeout(2000 us), drop... SAPOOL: sa_pool_stats: mc msg send fail: msg dropped MSGMGR: multicast_msg_send: repeated sequence number for multicast msg: type 12, seq 0 SAPOOL: sa_pool_stats: mc msg send fail: invalid parameter MSGMGR: multicast_msg_send: repeated sequence number for multicast msg: type 12, seq 0 SAPOOL: sa_pool_stats: mc msg send fail: invalid parameter

NETIF: [netif_sockopt_get] msg dropped for netif sockmsg opt 502)

MSGMGR: sockopt_ctl: socket msg callback failed MSGMGR: [sockopt_msg_send:msg#502] errcode set in sockopt msg reply: msg dropped

beacer commented 6 years ago

@icymoon How about your script to monitor sa_pool or other details ? is it easy to reproduce or not ? @ywc689 pls help to check the msg issue.

icymoon commented 6 years ago

It easy to reproduce with workload. Monitors call dpip, ipvsadm command with different parameters.

icymoon commented 6 years ago

每段开始出问题的时间,都是以这样的消息开头的 MSGMGR: multicast_msg_send: mc_msg(type:11, cid:0->slaves) timeout(2000 us), drop... SERVICE: dp_vs_copy_stats: send message fail. 或者 msg_send: uc_msg(type:8, cid:0->13, flags=2) timeout(2000 us), drop... NETIF: [netif_sockopt_get] msg dropped for netif sockmsg opt 502)

前者的概率更大些似乎

beacer commented 6 years ago

As discussed with @icymoon , to reproduce it:

  1. test with 2G bps througput.
  2. run dpip addr show with dead loop: while [ true ]; dpip addr show; done > xxx.log
icymoon commented 6 years ago

reproduced with this work load: # ./wrk -t26 -c5000 -d15m http://xxxx/ Running 15m test @ http://xxx/ 26 threads and 5000 connections Thread Stats Avg Stdev Max +/- Stdev Latency 4.01ms 48.57ms 2.00s 99.11% Req/Sec 34.96k 19.43k 155.07k 73.96% 813153656 requests in 15.00m, 583.85GB read Socket errors: connect 3332, read 0, write 0, timeout 16561 Requests/sec: 903409.67 Transfer/sec: 664.22MB

ywc689 commented 6 years ago
  1. DPVS lcore multicast msg is master-to-all-slaves round trip message , i.e. the master lcore send messages to all slaves, and wait until all slaves reply with a response message if the message is blockable. So the message would time out if any lcore get slower than 2000us/2, which is very likely to happen in the stress test. Increase the msg timeout in the dpvs.conf may helps.
  2. The error log repeated sequence number is a result of that dpvs is too busy to process the frequently generated dpip commands of the same type. We can make use of the seq field in dpvs_msg to solve the problem. But the root cause still is dpvs get slower than dpip.
icymoon commented 6 years ago

谢谢@ywc689 或者我只尝试读per core的计数然后在用户态加起来,可以规避这个问题?

ywc689 commented 6 years ago

@icymoon 你把dpvs.conf配置文件中ctrl_defs/lcore_msg/sync_msg_timeout_us这个超时时间调大一些,不让系统出现msg timeout,就应该可以拿到统计数据了。 这个问题可能是由于个别lcore处理循环中出现了卡顿或者压力过大,我们会进一步排查原因。

icymoon commented 6 years ago

谢谢@ywc689 统计数据抖一下没关系,最多是图不好看,我最怕的是处理卡顿意味着有稳定性或者性能隐患。。。