F-Stack / f-stack

F-Stack is a user space network development kit with high performance, based on DPDK, the FreeBSD TCP/IP stack, and a coroutine API.
http://www.f-stack.org

Question on the performance of Redis #463


derekbit commented 4 years ago

Hello,

I have a simple setup to test the performance of Redis with F-Stack (v1.20). Two PCs, each equipped with one 10 GbE NIC (Intel X540-AT2), are directly connected by an Ethernet cable.

PC B (which uses the generic Linux network stack) runs the following command to test the two servers' performance:

redis-benchmark -t set,get -h 10.0.0.1 -p 6379 -d 40 -n 100000 -c <N>
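
For reference, these are standard redis-benchmark flags: -t selects the SET/GET tests, -d 40 uses a 40-byte payload, -n 100000 issues 100,000 requests per test, and -c <N> opens N parallel client connections.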

However, I got the following results:

num. of clients   f-stack SET (Req/Sec)   official SET (Req/Sec)   f-stack GET (Req/Sec)   official GET (Req/Sec)
1                 3592.86                 3904.42                  3672.02                  4199.39
50                112981.59               105797.72                117980.18                111370.98
100               100441.94               101235.06                102880.66                101781.17

F-Stack Redis performs about the same as official Redis. Is this expected, or is some setting incorrect?

By the way, I also tested F-Stack Nginx against official Nginx; F-Stack Nginx's performance is roughly three times better.

Thanks!

jfb8856606 commented 4 years ago

Please show your config.ini.

Different scenarios require different configuration parameters to achieve optimal performance.

Compared with official Redis, F-Stack's performance can be up to 20-30% better when using one instance.

derekbit commented 4 years ago

This is my config.ini:

[dpdk]
# Hexadecimal bitmask of cores to run on.
lcore_mask=1

# Number of memory channels.
channel=2

# Specify base virtual address to map.
#base_virtaddr=0x7f0000000000

# Promiscuous mode of nic, default: enabled.
promiscuous=1
numa_on=1

# TX checksum offload skip, default: disabled.
# We need this switch enabled in the following cases:
# -> The application wants to enforce a wrong checksum for testing purposes.
# -> Some cards advertise the offload capability but don't actually calculate the checksum.
tx_csum_offoad_skip=0

# TCP segment offload, default: disabled.
tso=1

# HW vlan strip, default: enabled.
vlan_strip=1

# sleep when no pkts incoming
# unit: microseconds
idle_sleep=0

# send packet delay time (0-100) when sending fewer than 32 pkts.
# default 100 us.
# if set to 0, pkts are sent immediately.
# if set >100, the delay is capped at 100 us.
# unit: microseconds
pkt_tx_delay=0

# enabled port list
#
# EBNF grammar:
#
#    exp      ::= num_list {"," num_list}
#    num_list ::= <num> | <range>
#    range    ::= <num>"-"<num>
#    num      ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
#
# examples
#    0-3       ports 0, 1, 2, 3 are enabled
#    1-3,4,7   ports 1, 2, 3, 4, 7 are enabled
#
# If using bonding, configure the bonding port id in port_list
# and do not configure the slave port ids in port_list.
# For example, if port 0 and port 1 are bonded into bonding port 2,
# set `port_list=2` and configure the `[port2]` section.

port_list=0

# Number of vdev.
nb_vdev=0

# Number of bond.
nb_bond=0

# Port config section
# Correspond to dpdk.port_list's index: port0, port1...
[port0]
addr=10.0.0.1
netmask=255.255.255.0
broadcast=10.0.0.255
gateway=10.0.0.1

# lcore list used to handle this port
# the format is same as port_list
#lcore_list=0

# bonding slave port list used to handle this port
# need to config while this port is a bonding port
# the format is same as port_list
#slave_port_list=0,1

# Packet capture path, this will hurt performance
#pcap=./a.pcap

# Vdev config section
# Correspond to dpdk.nb_vdev's index: vdev0, vdev1...
#    iface : Usually should not be set.
#    path : The vuser device path in container. Required.
#    queues : The max queues of vuser. Optional, default 1, greater than or equal to the number of processes.
#    queue_size : Queue size. Optional, default 256.
#    mac : The mac address of vuser. Optional, default random; if vhost uses a phy NIC, it should be set to the phy NIC's mac.
#    cq : Optional; if queues = 1, default 0; if queues > 1, default 1.
#[vdev0]
##iface=/usr/local/var/run/openvswitch/vhost-user0
#path=/var/run/openvswitch/vhost-user0
#queues=1
#queue_size=256
#mac=00:00:00:00:00:01
#cq=0

# bond config section
# See http://doc.dpdk.org/guides/prog_guide/link_bonding_poll_mode_drv_lib.html
[bond0]
#mode=4
#slave=0000:0a:00.0,slave=0000:0a:00.1
#primary=0000:0a:00.0
#mac=f0:98:38:xx:xx:xx
## opt argument
#socket_id=0
#xmit_policy=l23
#lsc_poll_period_ms=100
#up_delay=10
#down_delay=50

# Kni config: if enabled and method=reject,
# all packets that do not belong to the following tcp_port and udp_port
# will be passed to the kernel; if method=accept, all packets that belong to
# the following tcp_port and udp_port will be passed to the kernel.
#[kni]
#enable=1
#method=reject
# The format is same as port_list
#tcp_port=80,443
#udp_port=53

# FreeBSD network performance tuning configurations.
# Most native FreeBSD configurations are supported.
[freebsd.boot]
hz=100

# Block out a range of descriptors to avoid overlap
# with the kernel's descriptor space.
# You can increase this value according to your app.
fd_reserve=1024

# Maximum number of sockets available.
kern.ipc.maxsockets=262144

# TCP syncache (SYN queue) hash size and per-bucket limit.
net.inet.tcp.syncache.hashsize=4096
net.inet.tcp.syncache.bucketlimit=100

# Size of the TCP control block hash table.
net.inet.tcp.tcbhashsize=65536

# Number of kernel callout (timer) entries.
kern.ncallout=262144

# IPv6 support and related defaults.
kern.features.inet6=1
net.inet6.ip6.auto_linklocal=1
net.inet6.ip6.accept_rtadv=2
net.inet6.icmp6.rediraccept=1
net.inet6.ip6.forwarding=0

[freebsd.sysctl]
# Listen backlog and maximum socket buffer size.
kern.ipc.somaxconn=32768
kern.ipc.maxsockbuf=16777216

net.link.ether.inet.maxhold=5

net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.sendspace=16384
net.inet.tcp.recvspace=131072
#net.inet.tcp.nolocaltimewait=1
net.inet.tcp.cc.algorithm=cubic
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendbuf_auto=1
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.sendbuf_inc=16384
net.inet.tcp.recvbuf_inc=524288
net.inet.tcp.sack.enable=1
net.inet.tcp.blackhole=1
# TCP Maximum Segment Lifetime, in milliseconds.
net.inet.tcp.msl=2000
# Delayed ACK: 0 disables delayed ACKs.
net.inet.tcp.delayed_ack=0

net.inet.udp.blackhole=1
net.inet.ip.redirect=0
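
For completeness, a sketch of how this single-process F-Stack Redis instance might be launched with the config above. It assumes the F-Stack build of redis-server accepts the common F-Stack startup options (--conf, --proc-type, --proc-id) used by other F-Stack applications; the redis.conf path is hypothetical.

# lcore_mask=1 in [dpdk] above binds the single F-Stack process to core 0
./redis-server --conf=config.ini --proc-type=primary --proc-id=0 /path/to/redis.conf
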
lazeintech commented 3 years ago

Any update on this? I am facing the same issue.

jfb8856606 commented 3 years ago

You can try adjusting the values of some parameters, such as pkt_tx_delay or net.inet.tcp.delayed_ack. Different test scenarios will see different performance with different parameter values.

For example, setting pkt_tx_delay to 50 or 100 us contributes to the performance of batched requests.
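
As an illustration only (the right values depend on the workload), the suggestion above corresponds to config.ini entries like the following; delayed_ack=1 is one possible adjustment, not a recommendation from this thread:

[dpdk]
# delay small sends up to 50 us so they can be batched into larger bursts
pkt_tx_delay=50

[freebsd.sysctl]
# 1 re-enables delayed ACKs (the config posted earlier sets this to 0)
net.inet.tcp.delayed_ack=1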

lazeintech commented 3 years ago

> You can try adjusting the values of some parameters, such as pkt_tx_delay or net.inet.tcp.delayed_ack. Different test scenarios will see different performance with different parameter values.
>
> For example, setting pkt_tx_delay to 50 or 100 us contributes to the performance of batched requests.

Thanks! By reducing pkt_tx_delay, it runs much faster.