mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems

mTCP net_ixgbe_vf RSS asymmetric issue? #282

Open vincentmli opened 4 years ago

vincentmli commented 4 years ago

I am testing mTCP in a KVM guest with an IXGBE SR-IOV VF provisioned. epwget runs fine with one core, but if I give it two cores the flows become asymmetric (a connection's SYN is sent from one core while its SYN+ACK arrives on the other). For example:

./apps/example/epwget 10.0.0.2 8 -f ./apps/example/epwget.conf -N 2 -c 2
Configuration updated by mtcp_setconf().
Application configuration:
URL: /
# of total_flows: 8
# of cores: 2
Concurrency: 2
---------------------------------------------------------------------------------
Loading mtcp configuration from : ./apps/example/epwget.conf
Loading interface setting
[probe_all_rte_devices: 128] Could not find pci info on dpdk device: =. Is it a dpdk-attached interface?
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Auto-detected process type: PRIMARY
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: PCI device 0000:00:10.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 8086:10ed net_ixgbe_vf
Total number of attached devices: 1
Interface name: dpdk0
EAL: Auto-detected process type: PRIMARY
Configurations:
Number of CPU cores available: 2
Number of CPU cores to use: 2
Maximum number of concurrency per core: 10000
Maximum number of preallocated buffers per core: 10000
Receive buffer size: 8192
Send buffer size: 8192
TCP timeout seconds: 30
TCP timewait seconds: 0
NICs to print statistics: dpdk0
---------------------------------------------------------------------------------
Interfaces:
name: dpdk0, ifindex: 0, hwaddr: 52:54:00:7D:36:0C, ipaddr: 10.0.0.1, netmask: 255.255.255.0
Number of NIC queues: 2
---------------------------------------------------------------------------------
Loading routing configurations from : config/route.conf
Routes:
Destination: 10.0.0.0/24, Mask: 255.255.255.0, Masked: 10.0.0.0, Route: ifdx-0
Destination: 10.0.0.0/24, Mask: 255.255.255.0, Masked: 10.0.0.0, Route: ifdx-0
---------------------------------------------------------------------------------
Loading ARP table from : config/arp.conf
ARP Table:
IP addr: 10.0.0.2, dst_hwaddr: 90:E2:BA:93:26:0E
---------------------------------------------------------------------------------
Initializing port 0... ixgbevf_dev_configure(): VF can't disable HW CRC Strip
Ethdev port_id=0 tx_queue_id=0, new added offloads 0x8011 must be within pre-queue offload capabilities 0x0 in rte_eth_tx_queue_setup()

Ethdev port_id=0 tx_queue_id=1, new added offloads 0x8011 must be within pre-queue offload capabilities 0x0 in rte_eth_tx_queue_setup()

done: 
[dpdk_load_module: 761] Failed to get flow control info!
[dpdk_load_module: 768] Failed to set flow control info!: errno: -95

Checking link statusdone
Port 0 Link Up - speed 10000 Mbps - full-duplex
Configuration updated by mtcp_setconf().
CPU 1: initialization finished.
[mtcp_create_context:1359] CPU 1 is now the master thread.
Thread 1 handles 4 flows. connecting to 10.0.0.2:80
[CPU 1] dpdk0 flows:      0, RX:       0(pps) (err:     0),  0.00(Gbps), TX:       0(pps),  0.00(Gbps)
[ ALL ] dpdk0 flows:      0, RX:       0(pps) (err:     0),  0.00(Gbps), TX:       0(pps),  0.00(Gbps)
[CPU 1] dpdk0 flows:      0, RX:       0(pps) (err:     0),  0.00(Gbps), TX:       0(pps),  0.00(Gbps)
[ ALL ] dpdk0 flows:      0, RX:       0(pps) (err:     0),  0.00(Gbps), TX:       0(pps),  0.00(Gbps)
CPU 0: initialization finished.
Response size set to 978
Thread 0 handles 4 flows. connecting to 10.0.0.2:80
[CPU 0] dpdk0 flows:      1, RX:       5(pps) (err:     0),  0.00(Gbps), TX:       7(pps),  0.00(Gbps)
[CPU 1] dpdk0 flows:      1, RX:       5(pps) (err:     0),  0.00(Gbps), TX:       7(pps),  0.00(Gbps)
[ ALL ] dpdk0 flows:      2, RX:      10(pps) (err:     0),  0.00(Gbps), TX:      14(pps),  0.00(Gbps)
[CPU 0] dpdk0 flows:      1, RX:       1(pps) (err:     0),  0.00(Gbps), TX:       2(pps),  0.00(Gbps)
[CPU 1] dpdk0 flows:      1, RX:       1(pps) (err:     0),  0.00(Gbps), TX:       2(pps),  0.00(Gbps)
[ ALL ] dpdk0 flows:      2, RX:       2(pps) (err:     0),  0.00(Gbps), TX:       4(pps),  0.00(Gbps)
[CPU 0] dpdk0 flows:      1, RX:       1(pps) (err:     0),  0.00(Gbps), TX:       2(pps),  0.00(Gbps)
[CPU 1] dpdk0 flows:      1, RX:       1(pps) (err:     0),  0.00(Gbps), TX:       2(pps),  0.00(Gbps)

Here is the debug log for core 0 (log_0). Note that the connection with TCP source port 1028 (the SYN) is initiated by core 1, but the SYN+ACK is received by core 0. No existing TCP stream matches that flow on core 0, hence the "[CreateNewFlowHTEntry: 725] Weird packet comes." message (see the sketch after the two logs below).

core 0 log_0:


[MTCPRunThread:1238] CPU 0: initialization finished.
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 0: 10.0.0.1(1026) -> 10.0.0.2(80) (ISS: 1012484)
[ STATE: mtcp_connect: 807] Stream 0: TCP_ST_SYN_SENT
[RunMainLoop: 773] CPU 0: mtcp thread running.
IN 0 1827801836 10.0.0.2(80) -> 10.0.0.1(1028) IP_ID=0 TTL=64 TCP S A seq 191461752 ack 1716955680 WDW=65160 len=74
[CreateNewFlowHTEntry: 725] Weird packet comes.
10.0.0.2(80) -> 10.0.0.1(1028) IP_ID=0 TTL=64 TCP S A seq 191461752 ack 1716955680 WDW=65160 len=60
IN 0 1827801837 10.0.0.2(80) -> 10.0.0.1(1026) IP_ID=0 TTL=64 TCP S A seq 3479957166 ack 1012485 WDW=65160 len=74
[ STATE: Handle_TCP_ST_SYN_SENT: 812] Stream 0: TCP_ST_ESTABLISHED
IN 0 1827801849 10.0.0.2(80) -> 10.0.0.1(1026) IP_ID=4245 TTL=64 TCP A seq 3479957167 ack 1012586 WDW=509 len=66
IN 0 1827801849 10.0.0.2(80) -> 10.0.0.1(1026) IP_ID=4246 TTL=64 TCP F A seq 3479957167 ack 1012586 WDW=509 len=1044
[ STATE: Handle_TCP_ST_ESTABLISHED: 942] Stream 0: TCP_ST_CLOSE_WAIT
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 1: 10.0.0.1(1027) -> 10.0.0.2(80) (ISS: 1716955679)
[ STATE: mtcp_connect: 807] Stream 1: TCP_ST_SYN_SENT
[ STATE: HandleApplicationCalls: 599] Stream 0: TCP_ST_LAST_ACK
IN 0 1827801849 10.0.0.2(80) -> 10.0.0.1(1026) IP_ID=0 TTL=64 TCP A seq 3479958146 ack 1012587 WDW=509 len=66
[ STATE: Handle_TCP_ST_LAST_ACK:1015] Stream 0: TCP_ST_CLOSED
[STREAM: DestroyTCPStream: 413] DESTROY TCP STREAM 0: 10.0.0.1(1026) -> 10.0.0.2(80) (CLOSED)
[STREAM: DestroyTCPStream: 569] Destroyed. Remaining flows: 1
IN 0 1827802327 10.0.0.2(80) -> 10.0.0.1(1028) IP_ID=0 TTL=64 TCP S A seq 199272686 ack 1716955680 WDW=65160 len=74
[CreateNewFlowHTEntry: 725] Weird packet comes.
10.0.0.2(80) -> 10.0.0.1(1028) IP_ID=0 TTL=64 TCP S A seq 199272686 ack 1716955680 WDW=65160 len=60
IN 0 1827803327 10.0.0.2(80) -> 10.0.0.1(1028) IP_ID=0 TTL=64 TCP S A seq 214897367 ack 1716955680 WDW=65160 len=74
[CreateNewFlowHTEntry: 725] Weird packet comes.
10.0.0.2(80) -> 10.0.0.1(1028) IP_ID=0 TTL=64 TCP S A seq 214897367 ack 1716955680 WDW=65160 len=60
IN 0 1827805327 10.0.0.2(80) -> 10.0.0.1(1028) IP_ID=0 TTL=64 TCP S A seq 246146665 ack 1716955680 WDW=65160 len=74
[CreateNewFlowHTEntry: 725] Weird packet comes.
10.0.0.2(80) -> 10.0.0.1(1028) IP_ID=0 TTL=64 TCP S A seq 246146665 ack 1716955680 WDW=65160 len=60

core 1 log_1:


[MTCPRunThread:1238] CPU 1: initialization finished.
[RunMainLoop: 773] CPU 1: mtcp thread running.
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 0: 10.0.0.1(1025) -> 10.0.0.2(80) (ISS: 1012484)
[ STATE: mtcp_connect: 807] Stream 0: TCP_ST_SYN_SENT
IN 0 1827801826 10.0.0.2(80) -> 10.0.0.1(1025) IP_ID=0 TTL=64 TCP S A seq 2092968279 ack 1012485 WDW=65160 len=74
[ STATE: Handle_TCP_ST_SYN_SENT: 812] Stream 0: TCP_ST_ESTABLISHED
IN 0 1827801826 10.0.0.2(80) -> 10.0.0.1(1025) IP_ID=55151 TTL=64 TCP A seq 2092968280 ack 1012586 WDW=509 len=66
IN 0 1827801827 10.0.0.2(80) -> 10.0.0.1(1025) IP_ID=55152 TTL=64 TCP F A seq 2092968280 ack 1012586 WDW=509 len=1044
[ STATE: Handle_TCP_ST_ESTABLISHED: 942] Stream 0: TCP_ST_CLOSE_WAIT
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 1: 10.0.0.1(1028) -> 10.0.0.2(80) (ISS: 1716955679)
[ STATE: mtcp_connect: 807] Stream 1: TCP_ST_SYN_SENT
[ STATE: HandleApplicationCalls: 599] Stream 0: TCP_ST_LAST_ACK
IN 0 1827801827 10.0.0.2(80) -> 10.0.0.1(1025) IP_ID=0 TTL=64 TCP A seq 2092969259 ack 1012587 WDW=509 len=66
[ STATE: Handle_TCP_ST_LAST_ACK:1015] Stream 0: TCP_ST_CLOSED
[STREAM: DestroyTCPStream: 413] DESTROY TCP STREAM 0: 10.0.0.1(1025) -> 10.0.0.2(80) (CLOSED)
[STREAM: DestroyTCPStream: 569] Destroyed. Remaining flows: 1
IN 0 1827801849 10.0.0.2(80) -> 10.0.0.1(1027) IP_ID=0 TTL=64 TCP S A seq 509557464 ack 1716955680 WDW=65160 len=74
[CreateNewFlowHTEntry: 725] Weird packet comes.
10.0.0.2(80) -> 10.0.0.1(1027) IP_ID=0 TTL=64 TCP S A seq 509557464 ack 1716955680 WDW=65160 len=60
IN 0 1827802349 10.0.0.2(80) -> 10.0.0.1(1027) IP_ID=0 TTL=64 TCP S A seq 517358188 ack 1716955680 WDW=65160 len=74
[CreateNewFlowHTEntry: 725] Weird packet comes.
10.0.0.2(80) -> 10.0.0.1(1027) IP_ID=0 TTL=64 TCP S A seq 517358188 ack 1716955680 WDW=65160 len=60
IN 0 1827803349 10.0.0.2(80) -> 10.0.0.1(1027) IP_ID=0 TTL=64 TCP S A seq 532982841 ack 1716955680 WDW=65160 len=74
[CreateNewFlowHTEntry: 725] Weird packet comes.
10.0.0.2(80) -> 10.0.0.1(1027) IP_ID=0 TTL=64 TCP S A seq 532982841 ack 1716955680 WDW=65160 len=60
IN 0 1827805349 10.0.0.2(80) -> 10.0.0.1(1027) IP_ID=0 TTL=64 TCP S A seq 564232134 ack 1716955680 WDW=65160 len=74
[CreateNewFlowHTEntry: 725] Weird packet comes.
10.0.0.2(80) -> 10.0.0.1(1027) IP_ID=0 TTL=64 TCP S A seq 564232134 ack 1716955680 WDW=65160 len=60
IN 0 1827809349 10.0.0.2(80) -> 10.0.0.1(1027) IP_ID=0 TTL=64 TCP S A seq 626730718 ack 1716955680 WDW=65160 len=74
[CreateNewFlowHTEntry: 725] Weird packet comes.
10.0.0.2(80) -> 10.0.0.1(1027) IP_ID=0 TTL=64 TCP S A seq 626730718 ack 1716955680 WDW=65160 len=60
[RunMainLoop: 871] MTCP thread 1 out of main loop.
[MTCPRunThread:1238] CPU 1: initialization finished.
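
To restate what the two logs above show in a self-contained form, here is a rough sketch (made-up names and data, not mTCP's actual tcp_in.c code): each mTCP worker thread keeps its own flow table, so when the VF's RSS steers a SYN+ACK to a different queue/core than the one that sent the SYN, the lookup on the receiving core fails and mTCP prints "Weird packet comes."

/* Illustration only -- not mTCP's real code.  Names and data are made up
 * to show why a SYN+ACK steered to the "wrong" core finds no stream.    */
#include <stdio.h>
#include <stdint.h>

#define NUM_CORES 2
#define MAX_FLOWS 8

/* Each worker thread keeps a private list of the flows it created,
 * keyed here (simplified) by the local TCP source port.              */
static uint16_t flow_table[NUM_CORES][MAX_FLOWS];
static int flow_cnt[NUM_CORES];

static void create_stream(int core, uint16_t sport)
{
    flow_table[core][flow_cnt[core]++] = sport;   /* SYN sent from this core */
}

static int has_stream(int core, uint16_t sport)
{
    for (int i = 0; i < flow_cnt[core]; i++)
        if (flow_table[core][i] == sport)
            return 1;
    return 0;
}

static void receive_synack(int core, uint16_t sport)
{
    if (!has_stream(core, sport))
        /* The situation behind "[CreateNewFlowHTEntry] Weird packet comes.":
         * the reverse-direction RSS decision delivered the SYN+ACK to a core
         * that never initiated the flow.                                     */
        printf("core %d: SYN+ACK for port %u -> weird packet\n", core, sport);
    else
        printf("core %d: SYN+ACK for port %u -> stream found\n", core, sport);
}

int main(void)
{
    create_stream(1, 1028);   /* core 1 sends the SYN for source port 1028 ...    */
    receive_synack(0, 1028);  /* ... but the VF delivers the SYN+ACK to core 0    */
    create_stream(0, 1026);
    receive_synack(0, 1026);  /* the symmetric case matches and proceeds normally */
    return 0;
}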

Is this something that needs to be addressed in mtcp/src/rss.c?


/*-------------------------------------------------------------------*/
/* RSS redirection table is in the little endian byte order (intel)  */
/*                                                                   */
/* idx: 0 1 2 3 | 4 5 6 7 | 8 9 10 11 | 12 13 14 15 | 16 17 18 19 ...*/
/* val: 3 2 1 0 | 7 6 5 4 | 11 10 9 8 | 15 14 13 12 | 19 18 17 16 ...*/
/* qid = val % num_queues */
/*-------------------------------------------------------------------*/
/*
 * IXGBE (Intel X520 NIC) : (Rx queue #) = (7 LS bits of RSS hash) mod N
 * I40E (Intel XL710 NIC) : (Rx queue #) = (9 LS bits of RSS hash) mod N
 */
#define RSS_BIT_MASK_IXGBE              0x0000007F
#define RSS_BIT_MASK_I40E               0x000001FF
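
For reference, this is my reading of those two comments written out as a standalone helper (an illustration only, not a proposed patch): the low bits of the RSS hash index the redirection table (stored in little-endian groups of four), and the table value modulo the number of queues selects the RX queue. mTCP picks source ports so that the reverse flow's hash maps back to the local core, which only works if the device really uses the key and table mTCP assumes; a VF whose RSS seed cannot be programmed breaks that assumption.

/* Illustration of the mapping described in the rss.c comments above; the
 * masks and the little-endian grouping come from that file, but this
 * helper itself is not part of mTCP.                                     */
#include <stdint.h>
#include <stdio.h>

#define RSS_BIT_MASK_IXGBE 0x0000007F   /* ixgbe: 7 LS bits of the hash */
#define RSS_BIT_MASK_I40E  0x000001FF   /* i40e:  9 LS bits of the hash */

/* idx: 0 1 2 3 | 4 5 6 7 | ...    val: 3 2 1 0 | 7 6 5 4 | ...          */
static int redirection_entry(int idx)
{
    return (idx & ~0x3) | (3 - (idx & 0x3));  /* swap within each group of 4 */
}

static int rx_queue_for_hash(uint32_t rss_hash, int num_queues)
{
    int idx = (int)(rss_hash & RSS_BIT_MASK_IXGBE);  /* ixgbe-style indexing   */
    return redirection_entry(idx) % num_queues;      /* qid = val % num_queues */
}

int main(void)
{
    /* If the VF applies its own fixed key or table instead of this one,
     * the queue computed here no longer matches where the SYN+ACK lands. */
    printf("hash 0x12345 -> queue %d of 2\n", rx_queue_for_hash(0x12345, 2));
    return 0;
}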

ajamshed commented 4 years ago

@vincentmli,

As far as I know, we cannot program the RSS seed inside VFs. So if I were you, I would create 'n' VFs (where 'n' is the number of CPU cores you want to use) and then program each core to read from a different VF (CPU 0 reads from dpdk0 q0, CPU 1 reads from dpdk1 q0, and so on). dpdk_module.c may need some updates so that each worker thread reads from queue 0 of its own port. This needs a patch; I may work on improving this part as soon as I get free time.
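
Roughly, I am thinking of something along these lines in dpdk_module.c (an untested sketch with illustrative names, not the current mTCP code): instead of every worker thread polling its own queue of a single shared port, worker thread i polls queue 0 of port i, where port i is the VF assigned to that core.

/* Untested sketch of the per-VF idea above -- not actual mTCP code.  It
 * only shows the DPDK calls involved: with one VF per core, the worker
 * for 'core_id' polls RX queue 0 of "its" port instead of queue
 * 'core_id' of a shared port.                                           */
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define MAX_PKT_BURST 64

uint16_t recv_burst_per_vf(uint16_t core_id, struct rte_mbuf **pkts)
{
    uint16_t port_id  = core_id;  /* assumes VF #core_id is DPDK port #core_id */
    uint16_t queue_id = 0;        /* each VF is set up with a single RX queue  */

    return rte_eth_rx_burst(port_id, queue_id, pkts, MAX_PKT_BURST);
}

uint16_t send_burst_per_vf(uint16_t core_id, struct rte_mbuf **pkts,
                           uint16_t nb_pkts)
{
    /* TX follows the same pattern: queue 0 of the per-core port. */
    return rte_eth_tx_burst(core_id, 0, pkts, nb_pkts);
}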

vincentmli commented 4 years ago

Asim, thanks for answering the question. If I want to use 2 cores in the VM, I should provision 2 VFs to the VM. I will give it a try.

ajamshed commented 4 years ago

Yes. Please do try and let me know if it worked.

vincentmli commented 4 years ago

Hi Asim,

It has been a while, but I finally got time to test this, although in a different environment. This time I am not running on SR-IOV; instead I run mTCP in an Ubuntu 18.04 VM on top of an ovs-dpdk vhost-user client port, following http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/. I provisioned 2 virtio queues to the Ubuntu 18.04 VM and hit the same problem as in this issue.

Here is the ovs-dpdk bridge and the DPDK vhost-user client port:

root@vli-lab:~# ovs-vsctl show
f7569f8c-2118-4fec-8197-2affe5dc581b
    Bridge ovs-br0
        datapath_type: netdev
        Port dpdkvhostuserclient0
            Interface dpdkvhostuserclient0
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/dpdkvhostuserclient0"}
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:04:00.0", n_rxq="16", n_txq="16"}
        Port ovs-br0
            Interface ovs-br0
                type: internal
    ovs_version: "2.12.90"

Here is the VM XML for the vhost-user client port configured with 2 queues:

<interface type='vhostuser'>
  <mac address='00:00:00:00:00:09'/>
  <source type='unix' path='/tmp/dpdkvhostuserclient0' mode='server'/>
  <target dev='dpdkvhostuserclient0'/>
  <model type='virtio'/>
  <driver queues='2'>
    <host mrg_rxbuf='on'/>
  </driver>
  <alias name='net1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</interface>

Here is the test example output:


root@ubuntu1804:/usr/src/mtcp# ./apps/example/epwget 10.0.0.3 8 -f ./apps/example/epwget.conf -N 2
Configuration updated by mtcp_setconf().
Application configuration:
URL: /
# of total_flows: 8
# of cores: 2
Concurrency: 0
---------------------------------------------------------------------------------
Loading mtcp configuration from : ./apps/example/epwget.conf
Loading interface setting
[probe_all_rte_devices: 128] Could not find pci info on dpdk device: =. Is it a dpdk-attached interface?
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Auto-detected process type: PRIMARY
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: PCI device 0000:00:04.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 1af4:1000 net_virtio
Total number of attached devices: 1
Interface name: dpdk0
EAL: Auto-detected process type: PRIMARY
Configurations:
Number of CPU cores available: 2
Number of CPU cores to use: 2
Maximum number of concurrency per core: 10000
Maximum number of preallocated buffers per core: 10000
Receive buffer size: 8192
Send buffer size: 8192
TCP timeout seconds: 30
TCP timewait seconds: 0
NICs to print statistics: dpdk0
---------------------------------------------------------------------------------
Interfaces:
name: dpdk0, ifindex: 0, hwaddr: 00:00:00:00:00:09, ipaddr: 10.0.0.16, netmask: 255.255.255.0
Number of NIC queues: 2
---------------------------------------------------------------------------------
Loading routing configurations from : config/route.conf
Routes:
Destination: 10.0.0.0/24, Mask: 255.255.255.0, Masked: 10.0.0.0, Route: ifdx-0
Destination: 10.0.0.0/24, Mask: 255.255.255.0, Masked: 10.0.0.0, Route: ifdx-0
---------------------------------------------------------------------------------
Loading ARP table from : config/arp.conf
ARP Table:
IP addr: 10.0.0.2, dst_hwaddr: 90:E2:BA:93:26:0E
IP addr: 10.0.0.3, dst_hwaddr: 90:E2:BA:93:26:0C
---------------------------------------------------------------------------------
Initializing port 0... ethdev port_id=0 requested Tx offloads 0xe doesn't match Tx offloads capabilities 0x8001 in rte_eth_dev_configure()

Ethdev port_id=0 tx_queue_id=0, new added offloads 0x8011 must be within pre-queue offload capabilities 0x0 in rte_eth_tx_queue_setup()

Ethdev port_id=0 tx_queue_id=1, new added offloads 0x8011 must be within pre-queue offload capabilities 0x0 in rte_eth_tx_queue_setup()

done: 
rte_eth_dev_flow_ctrl_get: Function not supported
[dpdk_load_module: 761] Failed to get flow control info!
rte_eth_dev_flow_ctrl_set: Function not supported
[dpdk_load_module: 768] Failed to set flow control info!: errno: -95

Checking link statusdone
Port 0 Link Up - speed 10000 Mbps - full-duplex
Configuration updated by mtcp_setconf().
CPU 0: initialization finished.
[mtcp_create_context:1359] CPU 0 is now the master thread.
[CPU 0] dpdk0 flows:      0, RX:       0(pps) (err:     0),  0.00(Gbps), TX:       0(pps),  0.00(Gbps)
[ ALL ] dpdk0 flows:      0, RX:       0(pps) (err:     0),  0.00(Gbps), TX:       0(pps),  0.00(Gbps)
[CPU 0] dpdk0 flows:      0, RX:       0(pps) (err:     0),  0.00(Gbps), TX:       0(pps),  0.00(Gbps)
[ ALL ] dpdk0 flows:      0, RX:       0(pps) (err:     0),  0.00(Gbps), TX:       0(pps),  0.00(Gbps)
CPU 1: initialization finished.
Thread 0 handles 4 flows. connecting to 10.0.0.3:80
Thread 1 handles 4 flows. connecting to 10.0.0.3:80
Response size set to 978

Here is the epwget debug log. Core 0 created TCP streams for ports 1026, 1027, 1030, and 1031, but it received a SYN+ACK for port 1025, for which core 0 did not create a stream, so it logs "Weird packet comes."

[MTCPRunThread:1238] CPU 0: initialization finished.
[RunMainLoop: 773] CPU 0: mtcp thread running. 
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 0: 10.0.0.16(1026) -> 10.0.0.3(80) (ISS: 1012484)
[ STATE: mtcp_connect: 807] Stream 0: TCP_ST_SYN_SENT 
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 1: 10.0.0.16(1027) -> 10.0.0.3(80) (ISS: 1716955679)
[ STATE: mtcp_connect: 807] Stream 1: TCP_ST_SYN_SENT
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 2: 10.0.0.16(1030) -> 10.0.0.3(80) (ISS: 1792309082)
[ STATE: mtcp_connect: 807] Stream 2: TCP_ST_SYN_SENT
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 3: 10.0.0.16(1031) -> 10.0.0.3(80) (ISS: 229610924)
[ STATE: mtcp_connect: 807] Stream 3: TCP_ST_SYN_SENT
IN 0 2627580171 90:E2:BA:93:26:0C -> FF:FF:FF:FF:FF:FF protocol 0806  len=60
ARP header: 
Hardware type: 1 (len: 6), protocol type: 2048 (len: 4), opcode: 1 
Sender IP: 10.0.0.3, haddr: 90:E2:BA:93:26:0C
Target IP: 10.0.0.16, haddr: 00:00:00:00:00:00
ARP header: 
Hardware type: 1 (len: 6), protocol type: 2048 (len: 4), opcode: 2 
Sender IP: 10.0.0.16, haddr: 00:00:00:00:00:09
Target IP: 10.0.0.3, haddr: 90:E2:BA:93:26:0C
IN 0 2627580171 10.0.0.3(80) -> 10.0.0.16(1026) IP_ID=0 TTL=64 TCP S A seq 3773126738 ack 1012485 WDW=65160 len=74
[ STATE: Handle_TCP_ST_SYN_SENT: 812] Stream 0: TCP_ST_ESTABLISHED 
IN 0 2627580171 10.0.0.3(80) -> 10.0.0.16(1030) IP_ID=0 TTL=64 TCP S A seq 3920638880 ack 1792309083 WDW=65160 len=74
[ STATE: Handle_TCP_ST_SYN_SENT: 812] Stream 2: TCP_ST_ESTABLISHED
IN 0 2627580171 10.0.0.3(80) -> 10.0.0.16(1031) IP_ID=0 TTL=64 TCP S A seq 3750667222 ack 229610925 WDW=65160 len=74
[ STATE: Handle_TCP_ST_SYN_SENT: 812] Stream 3: TCP_ST_ESTABLISHED
IN 0 2627580171 10.0.0.3(80) -> 10.0.0.16(1027) IP_ID=0 TTL=64 TCP S A seq 320366513 ack 1716955680 WDW=65160 len=74
[ STATE: Handle_TCP_ST_SYN_SENT: 812] Stream 1: TCP_ST_ESTABLISHED 
IN 0 2627580171 10.0.0.3(80) -> 10.0.0.16(1025) IP_ID=0 TTL=64 TCP S A seq 2664655853 ack 1012485 WDW=65160 len=74
[CreateNewFlowHTEntry: 725] Weird packet comes.

I am wondering why this issue does not happen in a VMware ESXi VM, but does happen in a KVM/QEMU VM with SR-IOV and in a KVM/QEMU VM with a vhost virtio port.

vincentmli commented 4 years ago

Do you think it is because the DPDK virtio driver does not support RSS? I see this commit in the DPDK virtio driver:

# git show 13b3137f3b7c8
commit 13b3137f3b7c8f866947a9b34e06a8aec0d084f7
Author: Dilshod Urazov 
Date:   Wed Oct 9 13:32:07 2019 +0100

    net/virtio: reject unsupported Rx multi-queue modes

    This driver supports none of DCB, RSS or VMDQ modes, therefore must
    check and return error if configured incorrectly.

    Virtio can distribute Rx packets across multi-queue, but there is
    no controls (algorithm, redirection table, hash function) except
    number of Rx queues and ETH_MQ_RX_NONE is the best fit meaning
    no method is enforced on how to route packets to MQs.

    Fixes: c1f86306a026 ("virtio: add new driver")
    Cc: stable@dpdk.org

    Signed-off-by: Dilshod Urazov 
    Signed-off-by: Andrew Rybchenko 
    Reviewed-by: Maxime Coquelin 

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 0a2ed2e50..76bd40a3e 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -2066,6 +2066,13 @@ virtio_dev_configure(struct rte_eth_dev *dev)
        PMD_INIT_LOG(DEBUG, "configure");
        req_features = VIRTIO_PMD_DEFAULT_GUEST_FEATURES;

+       if (rxmode->mq_mode != ETH_MQ_RX_NONE) {
+               PMD_DRV_LOG(ERR,
+                       "Unsupported Rx multi queue mode %d",
+                       rxmode->mq_mode);
+               return -EINVAL;
+       }
+
        if (dev->data->dev_conf.intr_conf.rxq) {
                ret = virtio_init_device(dev, hw->req_guest_features);
                if (ret < 0)
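
One way I could confirm this at runtime (a standalone sketch using standard DPDK calls, not something wired into mTCP) is to query the device capabilities: if the PMD reports a zero RSS redirection table size and no supported RSS offload types, the multi-queue spread is outside the application's control, which would explain the asymmetric delivery.

/* Standalone sketch (not part of mTCP): print whether the attached PMD
 * advertises any RSS capability.  For the virtio PMD I would expect
 * reta_size == 0 and flow_type_rss_offloads == 0, consistent with the
 * commit quoted above.                                                  */
#include <stdio.h>
#include <rte_eal.h>
#include <rte_ethdev.h>

int main(int argc, char **argv)
{
    if (rte_eal_init(argc, argv) < 0) {
        fprintf(stderr, "EAL init failed\n");
        return 1;
    }

    uint16_t port_id;
    RTE_ETH_FOREACH_DEV(port_id) {
        struct rte_eth_dev_info info;
        rte_eth_dev_info_get(port_id, &info);
        printf("port %u driver=%s reta_size=%u rss_offload_types=0x%llx\n",
               port_id, info.driver_name, info.reta_size,
               (unsigned long long)info.flow_type_rss_offloads);
    }
    return 0;
}
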
vincentmli commented 4 years ago

This is the debug log on core 1. No SYN+ACK is received there, so I strongly suspect that the DPDK virtio driver's lack of RSS support causes this problem:

[MTCPRunThread:1238] CPU 1: initialization finished.
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 0: 10.0.0.16(1025) -> 10.0.0.3(80) (ISS: 1012484)
[ STATE: mtcp_connect: 807] Stream 0: TCP_ST_SYN_SENT
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 1: 10.0.0.16(1028) -> 10.0.0.3(80) (ISS: 1716955679)
[ STATE: mtcp_connect: 807] Stream 1: TCP_ST_SYN_SENT
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 2: 10.0.0.16(1029) -> 10.0.0.3(80) (ISS: 1792309082)
[ STATE: mtcp_connect: 807] Stream 2: TCP_ST_SYN_SENT
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 3: 10.0.0.16(1032) -> 10.0.0.3(80) (ISS: 229610924)
[ STATE: mtcp_connect: 807] Stream 3: TCP_ST_SYN_SENT
[RunMainLoop: 773] CPU 1: mtcp thread running.
[RunMainLoop: 871] MTCP thread 1 out of main loop.
[MTCPRunThread:1238] CPU 1: initialization finished.
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 0: 10.0.0.16(1025) -> 10.0.0.3(80) (ISS: 1012484)
[ STATE: mtcp_connect: 807] Stream 0: TCP_ST_SYN_SENT
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 1: 10.0.0.16(1028) -> 10.0.0.3(80) (ISS: 1716955679)
[ STATE: mtcp_connect: 807] Stream 1: TCP_ST_SYN_SENT
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 2: 10.0.0.16(1029) -> 10.0.0.3(80) (ISS: 1792309082)
[ STATE: mtcp_connect: 807] Stream 2: TCP_ST_SYN_SENT
[STREAM: CreateTCPStream: 372] CREATED NEW TCP STREAM 3: 10.0.0.16(1032) -> 10.0.0.3(80) (ISS: 229610924)
[ STATE: mtcp_connect: 807] Stream 3: TCP_ST_SYN_SENT
[RunMainLoop: 773] CPU 1: mtcp thread running.
[RunMainLoop: 871] MTCP thread 1 out of main loop.
[RunMainLoop: 874] MTCP thread 1 flushed logs.