tyheist opened this issue 5 years ago
f-stack.conf
[dpdk]
## Hexadecimal bitmask of cores to run on.
## 2 lcore 14,15
lcore_mask=0xc000
channel=4
promiscuous=1
numa_on=1
## TCP segmentation offload (TSO), default: disabled.
tso=0
## HW vlan strip, default: enabled.
vlan_strip=1
port_list=0
## Port config section
## Corresponds to dpdk.port_list's index: port0, port1...
[port0]
addr=1.1.1.2
netmask=255.255.255.0
broadcast=1.1.1.255
gateway=1.1.1.1
## lcore list used to handle this port
## the format is same as port_list
lcore_list=14,15
## Packet capture path, this will hurt performance
#pcap=./a.pcap
#pcap=/home/ty/tcp6.pcap
## Kni config: if enabled and method=reject,
## all packets that do not belong to the following tcp_port and udp_port
## will transmit to kernel; if method=accept, all packets that belong to
## the following tcp_port and udp_port will transmit to kernel.
#[kni]
#enable=1
#method=reject
## The format is same as port_list
#tcp_port=80,443
#udp_port=53
## FreeBSD network performance tuning configurations.
## Most native FreeBSD configurations are supported.
[freebsd.boot]
hz=100
## Block out a range of descriptors to avoid overlap
## with the kernel's descriptor space.
## You can increase this value according to your app.
fd_reserve=1024
kern.ipc.maxsockets=262144
net.inet.tcp.syncache.hashsize=4096
net.inet.tcp.syncache.bucketlimit=100
net.inet.tcp.tcbhashsize=65536
kern.ncallout=262144
[freebsd.sysctl]
kern.ipc.somaxconn=32768
kern.ipc.maxsockbuf=16777216
net.link.ether.inet.maxhold=5
net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.sendspace=16384
net.inet.tcp.recvspace=8192
net.inet.tcp.nolocaltimewait=1
net.inet.tcp.cc.algorithm=cubic
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendbuf_auto=1
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.sendbuf_inc=16384
net.inet.tcp.recvbuf_inc=524288
net.inet.tcp.sack.enable=1
net.inet.tcp.blackhole=1
net.inet.tcp.msl=2000
net.inet.tcp.delayed_ack=0
net.inet.udp.blackhole=1
net.inet.ip.redirect=0
nginx.conf
# root account is necessary.
user root;
# should be equal to the lcore count of `dpdk.lcore_mask` in f-stack.conf.
## 2 lcore
worker_processes 2;
# path of f-stack configuration file, default: $NGX_PREFIX/conf/f-stack.conf.
fstack_conf f-stack.conf;
events {
worker_connections 102400;
use kqueue;
}
daemon off;
http {
include mime.types;
default_type application/octet-stream;
sendfile off;
#keepalive_timeout 0;
keepalive_timeout 65;
#gzip on;
server {
listen 99;
location /1byte {
access_log off;
proxy_pass http://1.1.1.3/1byte;
}
}
}
I changed the code: I moved the packet-capture logic, which previously captured per port, so that it captures each queue separately.
ff_dpdk_if.c, init_lcore_conf(), around line 338:

        printf("lcore: %u, port: %u, queue: %u\n", lcore_id, port_id, queueid);
        uint16_t nb_rx_queue = lcore_conf.nb_rx_queue;
        lcore_conf.rx_queue_list[nb_rx_queue].port_id = port_id;
        lcore_conf.rx_queue_list[nb_rx_queue].queue_id = queueid;
        lcore_conf.nb_rx_queue++;

        lcore_conf.tx_queue_id[port_id] = queueid;
        lcore_conf.tx_port_id[lcore_conf.nb_tx_port] = port_id;
        lcore_conf.nb_tx_port++;

        /* capture packets per queue: append the queue id to the pcap path */
        char tmp[128] = {0};
        snprintf(tmp, sizeof(tmp), "%s-%d.pcap", pconf->pcap, queueid);
        lcore_conf.pcap[port_id] = strdup(tmp);
        ff_enable_pcap(lcore_conf.pcap[port_id]);
        //lcore_conf.pcap[port_id] = pconf->pcap;
        lcore_conf.nb_queue_list[port_id] = pconf->nb_lcores;
    }

    if (lcore_conf.nb_rx_queue == 0) {
        rte_exit(EXIT_FAILURE, "lcore %u has nothing to do\n", lcore_id);
    }

    return 0;
}
I found that f-stack nginx sends the SYN to the back-end on queue 1, but receives the SYN/ACK from the back-end on queue 0.
-- checking status with netstat confirms the SYN was sent on queue 1:
root@ty:/home/ty/code/f-stack/tools/sbin# ./netstat -an -P 1
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp4 0 0 1.1.1.2.15754 1.1.1.3.80 SYN_SENT
tcp4 0 0 1.1.1.2.99 1.1.1.4.54206 ESTABLISHED
tcp4 0 0 *.99 *.* LISTEN
tcp4 0 0 *.80 *.* LISTEN
udp4 0 0 *.* *.*
-- queue 0: pcap
-- queue 1: pcap
f-stack's ff_rss_check() selects a source port, uses it as one of the hash inputs, and checks whether the resulting hash value maps to the current queue, so I don't know where the problem is.
By the way, I have tested the Intel I210, I350, and X710 NICs. With the I210/I350, each single request via curl is OK, but a wrk test does not work. With the X710, curl fails after several requests.
When connecting to a remote side, f-stack uses ff_rss_check to select a source port such that the return packets will be received on the current queue. But it seems there is something wrong with ff_rss_check on these NICs.
I printed the input RSS hash from struct rte_mbuf.hash.rss, and the output RSS hash computed in ff_rss_check(), and found that the output RSS hash differs from the input RSS hash.
Checking the <Intel Ethernet Controller X710/XXV710/XL710 Datasheet>, section 7.1.10 says the X710/XXV710/XL710 supports both a Microsoft Toeplitz-based hash and a simple hash. Selection between the two schemes is controlled by the HTOEP bit of the global GLQF_CTL register (section 10.2.2.19.21). I guess my NIC's HTOEP is 0, not 1.
I did some tests with the X710. According to the X710 datasheet, the hash key is 52 bytes, so I used the hash key from datasheet section 7.1.10.1.2. The X710's input RSS hash and output RSS hash are the same now.
But when I use wrk to test, the f-stack nginx (reverse proxy) performance is poor. It is like using the I350: the curl test is OK but wrk is poor.
Update: -- with the I350/I210, the input RSS hash and output RSS hash are the same -- with the X710 using the 52-byte hash key, the input RSS hash and output RSS hash are the same
Now the issue is the wrk test: running multiple lcores the performance is poor, but running one lcore the performance is perfect.
F-Stack uses a 40-byte hash key by default; this may be the point. You can use the modifications below to do some tests.
// Intel's i40e PMD default RSS key
static uint8_t default_rsskey_52bytes[52] = {
0x44, 0x39, 0x79, 0x6b, 0xb5, 0x4c, 0x50, 0x23,
0xb6, 0x75, 0xea, 0x5b, 0x12, 0x4f, 0x9f, 0x30,
0xb8, 0xa2, 0xc0, 0x3d, 0xdf, 0xdc, 0x4d, 0x02,
0xa0, 0x8c, 0x9b, 0x33, 0x4a, 0xf6, 0x4a, 0x4c,
0x05, 0xc6, 0xfa, 0x34, 0x39, 0x58, 0xd8, 0x55,
0x7d, 0x99, 0x58, 0x3a, 0xe1, 0x38, 0xc9, 0x2e,
0x81, 0x15, 0x03, 0x66
};
// in function init_port_start(), near line 660.
if (dev_info.hash_key_size == 52) {
port_conf.rx_adv_conf.rss_conf.rss_key = default_rsskey_52bytes;
port_conf.rx_adv_conf.rss_conf.rss_key_len = 52;
} else {
port_conf.rx_adv_conf.rss_conf.rss_key = default_rsskey_40bytes;
port_conf.rx_adv_conf.rss_conf.rss_key_len = 40;
}
I have tested it: with the X710 using the 52-byte hash key, the input RSS hash and output RSS hash are the same.
The new issue is: when f-stack nginx runs as a reverse proxy and connects to the backend with short connections, the wrk result is poor, only 1000+ req/sec; when it connects to the backend with long connections, the wrk result is 130000+ req/sec.
I guess the above phenomenon is not related to the hash value.
By the way, I did some performance tests with f-stack nginx and openresty nginx running as web servers:
-- one lcore: openresty nginx 338087 req/sec, f-stack nginx 337140 req/sec
-- two lcores: openresty nginx 449649 req/sec, f-stack nginx 337185 req/sec
I think the root cause is that the DPDK i40e driver disables the default ATR mode. I encountered the same issue on an XL710 NIC, but nginx worked OK as a reverse proxy on an X520 NIC. When nginx actively connects to the real server, RSS is unable to steer the packets of the same session back to the right original queue; this needs the ATR mode in the NIC hardware. https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/intel-ethernet-flow-director.pdf says, "ATR, the default mode, implements an algorithm that samples transmit traffic and learns to send receive traffic with the corresponding header information (source and destination reversed) to the core where the transmitted data came from."
@zhanghaisen Thank you for your reply; I will continue to test.
I tested with an Intel 82599ES NIC; it works fine as a reverse proxy.
Environment:
client(1.1.1.4) ----> f-stack nginx(1.1.1.2) ---------> back-end(1.1.1.3)
Case 1:
f-stack nginx runs one lcore and one port.
-- run as reverse proxy: wrk test result 58504 requests/sec.
Case 2:
f-stack nginx runs two lcores and one port.
-- run as reverse proxy: wrk test result only 1025 requests/sec; using curl to test, everything works fine.
-- run as web server: wrk test result 337185 requests/sec.
Checking network status with netstat:
-- run as reverse proxy: the client establishes all its connections in one of the two nginx processes, and there are few connections to the back-end server.
-- run as web server: the client establishes connections to both processes, half and half.
In issue #62, @whl739 said f-stack nginx supports reverse proxy. I tested with versions 1.11/1.12 and the master/dev branches; the result is the same. -- my configuration: f-stack.conf.txt nginx.conf.txt