mcusim / freebsd-src

sys/dev/dpaa2 drivers work-in-progress
https://www.FreeBSD.org/

[ten64 branch] Dataflow stops with multiple port traffic #17

Closed: mcbridematt closed this issue 1 year ago

mcbridematt commented 1 year ago

Commit: 5f6b8b38dfebacb69aad59122e86d18484e7e3c1

In my test suite, I run iperf3 through the FreeBSD host, which functions as a router.

For example: iperf3 server <-> dpniX (FreeBSD) dpniX+1 <-> iperf3 client

The test system is another Ten64 running Linux. It runs each iperf3 instance in a container with one of the ethX ports transferred into it.

So dpni0 on FreeBSD -> eth0 on test system, dpni1<->eth1, dpni2<->eth2 etc.

(I'll publish the scripts another time, they need a bit of cleanup)

cat /etc/rc.conf
hostname="freebsd-ten64"
ifconfig_dpni0="192.168.13.1 netmask 255.255.255.0"
ifconfig_dpni1="192.168.14.1 netmask 255.255.255.0"
ifconfig_dpni2="192.168.15.1 netmask 255.255.255.0"
ifconfig_dpni3="192.168.16.1 netmask 255.255.255.0"
ifconfig_dpni6="DHCP inet6 accept_rtadv"
growfs_enable="YES"
dhcpd_enable="YES"                          # dhcpd enabled?
dhcpd_flags="-q"                            # command option(s)
dhcpd_conf="/usr/local/etc/dhcpd.conf"      # configuration file
dhcpd_ifaces="dpni1 dpni3"                  # ethernet interface(s)
dhcpd_withumask="022"                       # file creation mask
gateway_enable="YES"
sshd_enable="YES"

dpni6 is the interface to my LAN for management

Server 1 is attached to dpni0 on 192.168.13.2. Client 1 is on dpni1, gets an IP via DHCP and initiates an iperf3 -R -c 192.168.13.2. Server 2 is on 192.168.15.2, Client 2 on 192.168.16.X, and so on.
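
As a quick cross-check of the addressing, here is a minimal sketch that maps a test address to the router port whose subnet contains it. The addresses are copied from the rc.conf above; the helper name `port_for` and the dictionary are made up for illustration and are not part of the test scripts.

```python
import ipaddress

# Interface addresses copied from the rc.conf above (all /24).
ROUTER_PORTS = {
    "dpni0": "192.168.13.1/24",
    "dpni1": "192.168.14.1/24",
    "dpni2": "192.168.15.1/24",
    "dpni3": "192.168.16.1/24",
}


def port_for(addr):
    """Return the dpni port whose subnet contains addr, or None."""
    ip = ipaddress.ip_address(addr)
    for port, cidr in ROUTER_PORTS.items():
        if ip in ipaddress.ip_interface(cidr).network:
            return port
    return None


if __name__ == "__main__":
    # Hypothetical demo: the two iperf3 server addresses from the text.
    for host in ("192.168.13.2", "192.168.15.2"):
        print(host, "->", port_for(host))
```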

For this initial test, I will run just one flow.

On this branch, the dataflow completely stops almost immediately:

udhcpc: started, v1.34.1
udhcpc: broadcasting discover                                                   
udhcpc: broadcasting select for 192.168.14.10, server 192.168.14.1
udhcpc: lease of 192.168.14.10 obtained from 192.168.14.1, lease time 600
Connecting to host 192.168.13.2, port 5201
Reverse mode, remote host 192.168.13.2 is sending
[  5] local 192.168.14.10 port 53100 connected to 192.168.13.2 port 5201
[ ID] Interval           Transfer     Bitrate                                   
[  5]   0.00-1.00   sec  14.1 KBytes   116 Kbits/sec
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec

In this case, dpni1 won't receive any traffic to the iperf3 server (192.168.13.2), but will receive other frames:


root@freebsd-ten64:/dev # tcpdump -i dpni1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on dpni1, link-type EN10MB (Ethernet), capture size 262144 bytes
07:03:08.616665 ARP, Request who-has 192.168.14.1 tell 192.168.14.10, length 46
07:03:32.278905 IP 192.168.14.10 > 192.168.13.1: ICMP echo request, id 9, seq 0, length 64
# 192.168.13.2 missing
07:03:34.073692 IP 192.168.14.10 > 192.168.13.3: ICMP echo request, id 11, seq 0, length 64
07:03:34.858869 IP 192.168.14.10 > 192.168.13.4: ICMP echo request, id 12, seq 0, length 64
07:03:35.533436 IP 192.168.14.10 > 192.168.13.5: ICMP echo request, id 13, seq 0, length 64
07:03:36.293349 IP 192.168.14.10 > 192.168.13.6: ICMP echo request, id 14, seq 0, length 64
07:03:37.348211 IP 192.168.14.10 > 192.168.13.7: ICMP echo request, id 15, seq 0, length 64
07:03:38.298136 IP 192.168.14.10 > 192.168.13.8: ICMP echo request, id 16, seq 0, length 64
07:03:39.223962 IP 192.168.14.10 > 192.168.13.9: ICMP echo request, id 17, seq 0, length 64
# 192.168.13.10 MISSING
07:03:41.163779 IP 192.168.14.10 > 192.168.13.11: ICMP echo request, id 19, seq 0, length 64
07:03:42.098482 IP 192.168.14.10 > 192.168.13.12: ICMP echo request, id 20, seq 0, length 64
07:03:42.853497 IP 192.168.14.10 > 192.168.13.13: ICMP echo request, id 21, seq 0, length 64
# Other IPs not being received: 192.168.13.17, 26, 29, so looks like one of the queues is not processing
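
The gaps in the sweep can also be found mechanically rather than by eye. The following is a minimal sketch, assuming the capture was saved as text in the format shown above (ideally with `tcpdump -n` so addresses are not resolved to names); `missing_hosts` and the regular expression are hypothetical helpers, not part of any existing tool.

```python
import re

# Matches the destination of lines such as
# "... IP 192.168.14.10 > 192.168.13.4: ICMP echo request, ..."
DST_RE = re.compile(r"IP \S+ > (\d+\.\d+\.\d+\.\d+): ICMP echo request")


def missing_hosts(capture_text, prefix, first, last):
    """List last octets in [first, last] never seen as a ping destination."""
    seen = set()
    for line in capture_text.splitlines():
        m = DST_RE.search(line)
        if m and m.group(1).startswith(prefix + "."):
            seen.add(int(m.group(1).rsplit(".", 1)[1]))
    return [n for n in range(first, last + 1) if n not in seen]


if __name__ == "__main__":
    # Three lines taken from the capture above; .2 is absent.
    sample = """\
07:03:32.278905 IP 192.168.14.10 > 192.168.13.1: ICMP echo request, id 9, seq 0, length 64
07:03:34.073692 IP 192.168.14.10 > 192.168.13.3: ICMP echo request, id 11, seq 0, length 64
07:03:34.858869 IP 192.168.14.10 > 192.168.13.4: ICMP echo request, id 12, seq 0, length 64
"""
    print(missing_hosts(sample, "192.168.13", 1, 4))
```

A fixed set of unreachable destinations across repeated sweeps would be consistent with flows being hashed onto one queue that is not serviced, though the sketch itself only reports which addresses never appeared.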

vmstat:

vmstat -i | grep dpaa2
its0,140: dpaa2_io0                                      17          0
its0,141: dpaa2_io1                                     798          2
its0,142: dpaa2_io2                                      13          0
its0,143: dpaa2_io3                                      53          0
its0,144: dpaa2_io4                                       4          0
its0,145: dpaa2_io5                                       4          0
its0,146: dpaa2_io6                                      21          0
its0,147: dpaa2_io7                                      38          0
its0,148: dpaa2_ni0                                       1          0
its0,149: dpaa2_ni1                                       1          0
its0,150: dpaa2_ni2                                       1          0
its0,151: dpaa2_ni3                                       1          0
its0,154: dpaa2_ni6                                       1          0

I do a few more vmstats:

its0,140: dpaa2_io0                                      17          0
its0,141: dpaa2_io1                                    1429          1
its0,142: dpaa2_io2                                      25          0
its0,143: dpaa2_io3                                     104          0
its0,144: dpaa2_io4                                      11          0
its0,145: dpaa2_io5                                      16          0
its0,146: dpaa2_io6                                      35          0
its0,147: dpaa2_io7                                      56          0
its0,148: dpaa2_ni0                                       1          0
its0,149: dpaa2_ni1                                       1          0
its0,150: dpaa2_ni2                                       1          0
its0,151: dpaa2_ni3                                       1          0
its0,154: dpaa2_ni6                                       1          0

The its0,140: dpaa2_io0 counter has not changed (still 17). Is it stuck?
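
Comparing snapshots by hand is error-prone, so here is a minimal sketch that diffs two `vmstat -i` outputs and lists the counters that did not move. It assumes the four-column line shape shown above; `parse_vmstat` and `unchanged` are hypothetical helper names. The prefix filter is there because the dpaa2_niX counters also stay at 1 in both snapshots.

```python
def parse_vmstat(text):
    """Map interrupt name -> total count from `vmstat -i` style lines."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: "its0,140:", "dpaa2_io0", total, rate
        if len(fields) >= 4 and fields[-2].isdigit():
            counts[fields[-3]] = int(fields[-2])
    return counts


def unchanged(before, after, prefix="dpaa2_io"):
    """Names starting with prefix whose total is equal in both snapshots."""
    a, b = parse_vmstat(before), parse_vmstat(after)
    return sorted(
        name for name in a
        if name.startswith(prefix) and name in b and a[name] == b[name]
    )


if __name__ == "__main__":
    # A few lines from each snapshot above.
    first = """\
its0,140: dpaa2_io0                                      17          0
its0,141: dpaa2_io1                                     798          2
its0,148: dpaa2_ni0                                       1          0
"""
    second = """\
its0,140: dpaa2_io0                                      17          0
its0,141: dpaa2_io1                                    1429          1
its0,148: dpaa2_ni0                                       1          0
"""
    print(unchanged(first, second))
```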

dpaa2 niX counters:

sysctl dev.dpaa2_ni.0
dev.dpaa2_ni.0.stats.in_all_frames: 66
dev.dpaa2_ni.0.stats.in_all_bytes: 62368
dev.dpaa2_ni.0.stats.in_multi_frames: 0
dev.dpaa2_ni.0.stats.eg_all_frames: 36
dev.dpaa2_ni.0.stats.eg_all_bytes: 2471
dev.dpaa2_ni.0.stats.eg_multi_frames: 0
dev.dpaa2_ni.0.stats.in_filtered_frames: 0
dev.dpaa2_ni.0.stats.in_discarded_frames: 0
dev.dpaa2_ni.0.stats.in_nobuf_discards: 0
dev.dpaa2_ni.0.stats.buf_free: 0
dev.dpaa2_ni.0.stats.buf_num: 2800
dev.dpaa2_ni.0.%parent: dpaa2_rc0
dev.dpaa2_ni.0.%pnpinfo:
dev.dpaa2_ni.0.%location:
dev.dpaa2_ni.0.%driver: dpaa2_ni
dev.dpaa2_ni.0.%desc: DPAA2 Network Interface
root@freebsd-ten64:/dev # sysctl dev.dpaa2_ni.1
dev.dpaa2_ni.1.stats.in_all_frames: 584
dev.dpaa2_ni.1.stats.in_all_bytes: 56208
dev.dpaa2_ni.1.stats.in_multi_frames: 0
dev.dpaa2_ni.1.stats.eg_all_frames: 195
dev.dpaa2_ni.1.stats.eg_all_bytes: 32414
dev.dpaa2_ni.1.stats.eg_multi_frames: 0
dev.dpaa2_ni.1.stats.in_filtered_frames: 0
dev.dpaa2_ni.1.stats.in_discarded_frames: 0
dev.dpaa2_ni.1.stats.in_nobuf_discards: 0
dev.dpaa2_ni.1.stats.buf_free: 0
dev.dpaa2_ni.1.stats.buf_num: 2800
dev.dpaa2_ni.1.%parent: dpaa2_rc0
dev.dpaa2_ni.1.%pnpinfo:
dev.dpaa2_ni.1.%location:
dev.dpaa2_ni.1.%driver: dpaa2_ni
dev.dpaa2_ni.1.%desc: DPAA2 Network Interface
root@freebsd-ten64:/dev # sysctl dev.dpaa2_ni.6
dev.dpaa2_ni.6.stats.in_all_frames: 1103
dev.dpaa2_ni.6.stats.in_all_bytes: 95670
dev.dpaa2_ni.6.stats.in_multi_frames: 667
dev.dpaa2_ni.6.stats.eg_all_frames: 10
dev.dpaa2_ni.6.stats.eg_all_bytes: 978
dev.dpaa2_ni.6.stats.eg_multi_frames: 0
dev.dpaa2_ni.6.stats.in_filtered_frames: 2
dev.dpaa2_ni.6.stats.in_discarded_frames: 0
dev.dpaa2_ni.6.stats.in_nobuf_discards: 0
dev.dpaa2_ni.6.stats.buf_free: 0
dev.dpaa2_ni.6.stats.buf_num: 2800
dev.dpaa2_ni.6.%parent: dpaa2_rc0
dev.dpaa2_ni.6.%pnpinfo:
dev.dpaa2_ni.6.%location:
dev.dpaa2_ni.6.%driver: dpaa2_ni
dev.dpaa2_ni.6.%desc: DPAA2 Network Interface

dsalychev commented 1 year ago

@mcbridematt Same as https://github.com/mcusim/freebsd-src/issues/8#issuecomment-1655446455

mcbridematt commented 1 year ago

Excellent! I've tested with four ports active (GENERIC-NODEBUG) and it has gone 24 hours without issues.

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-86400.00 sec  7.46 TBytes   760 Mbits/sec  360862             sender
[  5]   0.00-86400.00 sec  7.46 TBytes   760 Mbits/sec                  receiver
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-86400.00 sec  6.78 TBytes   690 Mbits/sec  95783             sender
[  5]   0.00-86400.00 sec  6.78 TBytes   690 Mbits/sec                  receiver
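
The transfer totals and bitrates above can be sanity-checked against each other. This is a small sketch under two assumptions that are not stated in the thread: that iperf3's "TBytes" uses a binary prefix (1024^4 bytes) and that "Mbits/sec" is decimal (10^6 bits). The function name `mbits_per_sec` is made up for illustration.

```python
TIB = 1024 ** 4  # assumption: iperf3 "TBytes" is a binary prefix


def mbits_per_sec(tbytes, seconds):
    """Average rate in decimal Mbit/s for a transfer given in TBytes."""
    return tbytes * TIB * 8 / seconds / 1e6


if __name__ == "__main__":
    for tbytes in (7.46, 6.78):
        rate = mbits_per_sec(tbytes, 86400)
        print(f"{tbytes} TBytes over 86400 s is about {rate:.1f} Mbit/s")
```

Under those assumptions the two flows come out within about 1 Mbit/s of the reported 760 and 690 Mbits/sec; the small difference is expected because the transfer totals are rounded to three digits.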

I'll move my 'production' FreeBSD machine to this version and see how it goes.

dsalychev commented 1 year ago

> I'll move my 'production' FreeBSD machine to this version and see how it goes.

I hope I'll still have a chance to commit those changes before 14.0. Thanks for the testing and the help!