tantm3 opened 1 month ago
I have one more test case to add from my research. I wanted to know whether the bonding interface causes the CPU overload, so I removed the bonding interface and ran Katran in shared mode directly on two physical interfaces.
1: lo: ....
2: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp/id:2118 qdisc mq state UP group default qlen 1000
link/ether d4:f5:ef:ac:ac:f0 brd ff:ff:ff:ff:ff:ff
altname enp18s0f0
inet 10.50.73.55/24 brd 10.50.73.255 scope global ens1f0
valid_lft forever preferred_lft forever
inet6 fe80::d6f5:efff:feac:acf0/64 scope link
valid_lft forever preferred_lft forever
3: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp/id:2123 qdisc mq state UP group default qlen 1000
link/ether d4:f5:ef:ac:ac:f8 brd ff:ff:ff:ff:ff:ff
altname enp18s0f1
inet 10.50.73.52/24 brd 10.50.73.255 scope global ens1f1
valid_lft forever preferred_lft forever
inet6 fe80::d6f5:efff:feac:acf8/64 scope link
valid_lft forever preferred_lft forever
....
8: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
9: ipip0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
10: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN group default qlen 1000
link/tunnel6 :: brd :: permaddr f27f:220f:4e91::
11: ipip60@NONE: <NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN group default qlen 1000
link/tunnel6 :: brd :: permaddr baba:33a1:a4e6::
inet6 fe80::b8ba:33ff:fea1:a4e6/64 scope link
valid_lft forever preferred_lft forever
sudo ./build/example_grpc/katran_server_grpc -balancer_prog ./deps/bpfprog/bpf/balancer.bpf.o -default_mac 58:e4:34:56:46:e0 -forwarding_cores=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 -numa_nodes=0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1 -healthchecker_prog ./deps/bpfprog/bpf/healthchecking_ipip.o -intf=ens1f0 -ipip_intf=ipip0 -ipip6_intf=ipip60 -lru_size=1000000 -map_path /sys/fs/bpf/jmp_ens1 -prog_pos=2
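For context on the shared-mode flags in that command: `-map_path` points at the pinned root prog array and `-prog_pos` is the slot the balancer occupies in it. A hedged way to verify that the pinned map exists and holds the balancer (the pin path is taken from the command above):

```
sudo bpftool map show pinned /sys/fs/bpf/jmp_ens1
sudo bpftool map dump pinned /sys/fs/bpf/jmp_ens1   # slot 2 should map to the balancer's prog id
```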
i40e-ens1f0-TxRx-0(44) is affinitive with 00,00000001
i40e-ens1f0-TxRx-1(45) is affinitive with 00,00000002
i40e-ens1f0-TxRx-2(46) is affinitive with 00,00000004
i40e-ens1f0-TxRx-3(47) is affinitive with 00,00000008
i40e-ens1f0-TxRx-4(48) is affinitive with 00,00000010
i40e-ens1f0-TxRx-5(49) is affinitive with 00,00000020
i40e-ens1f0-TxRx-6(50) is affinitive with 00,00000040
i40e-ens1f0-TxRx-7(51) is affinitive with 00,00000080
i40e-ens1f0-TxRx-8(52) is affinitive with 00,00000100
i40e-ens1f0-TxRx-9(53) is affinitive with 00,00000200
i40e-ens1f1-TxRx-0(103) is affinitive with 00,00000400
i40e-ens1f1-TxRx-1(104) is affinitive with 00,00000800
i40e-ens1f1-TxRx-2(105) is affinitive with 00,00001000
i40e-ens1f1-TxRx-3(106) is affinitive with 00,00002000
i40e-ens1f1-TxRx-4(107) is affinitive with 00,00004000
i40e-ens1f1-TxRx-5(108) is affinitive with 00,00008000
i40e-ens1f1-TxRx-6(109) is affinitive with 00,00010000
i40e-ens1f1-TxRx-7(110) is affinitive with 00,00020000
i40e-ens1f1-TxRx-8(111) is affinitive with 00,00040000
i40e-ens1f1-TxRx-9(112) is affinitive with 00,00080000
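Each hex mask in the listing above is a CPU bitmask, so each TxRx queue's IRQ is pinned to exactly one core. As a hedged illustration of how such pinning is done, using IRQ numbers from the listing (not necessarily the exact commands used here):

```
echo 00,00000001 | sudo tee /proc/irq/44/smp_affinity    # i40e-ens1f0-TxRx-0 -> CPU 0
echo 00,00000002 | sudo tee /proc/irq/45/smp_affinity    # i40e-ens1f0-TxRx-1 -> CPU 1
echo 00,00000400 | sudo tee /proc/irq/103/smp_affinity   # i40e-ens1f1-TxRx-0 -> CPU 10
```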
[screenshots: `perf top` and `perf report` output]
There is a slight performance improvement when running on physical interfaces (according to Katran's output), but CPU usage is still maxed out.
Hi @avasylev @tehnerd,
Could you guys share some thoughts on my setup? I'm still struggling with this.
sudo sysctl -a | grep bpf ?
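(For context, the key knob in that output is most likely the BPF JIT sysctl; a one-line check:)

```
sysctl net.core.bpf_jit_enable   # 0 = off, 1 = JIT on, 2 = JIT on with debug output
```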
Hi @tehnerd,
Here is the output:
Hmm. Strange. Please collect:
1. `perf record -a -F 23 -- sleep 10`
2. the same perf as before (in your previous pictures; I guess you were using -g as well).

When looking into the report, move to the balancer's bpf program and use the 'a' shortcut. That would show the assembly code, so we would understand where exactly in the bpf program we consume CPU.
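As runnable commands, the two collections requested above look something like this (a sketch; durations taken from the runs elsewhere in this thread):

```
sudo perf record -a -F 23 -- sleep 10   # 1: system-wide, low frequency
sudo perf record -ag -- sleep 20        # 2: with call graphs, as in the previous pictures
sudo perf report                        # select the balancer bpf prog, press 'a' for assembly
```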
Also, what does the traffic pattern look like? Are they real TCP streams or just random packets?
Please collect: (1) `perf record -a -F 23 -- sleep 10`; (2) the same perf as before (in your previous pictures; I guess you were using -g as well)
- I have a little trouble getting the assembly code from perf report.
- With the bpf program, it shows an error.
- With other processes, I used the 'a' shortcut and it returned output like this one.
Anyway, I'm sharing all the output I collected from those commands.
perf record -a -F 23 -- sleep 10
Assembly code when I jump (press Enter) to dive deeper into Katran's bpf program:
perf record -ag -- sleep 20
perf top --sort comm,dso
Again, it shows an error when I press the 'a' shortcut on the bpf program.
Also, what does the traffic pattern look like? Are they real TCP streams or just random packets?
I used Pktgen to generate traffic; here is the configuration:
In short, I want to simulate a SYN flood and send it to Katran. I used the xdpdump tool to capture the packets, and they look like this:
06:24:05.724378 IP 49.213.85.169.14397 > 49.213.85.152.80: Flags [S], seq 74616:74622, win 8192, length 6: HTTP
06:24:05.724396 IP 49.213.85.169.14997 > 49.213.85.152.80: Flags [S], seq 74616:74622, win 8192, length 6: HTTP
06:24:05.724401 IP 49.213.85.169.14984 > 49.213.85.152.80: Flags [S], seq 74616:74622, win 8192, length 6: HTTP
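For reference, a hedged sketch of taking a capture like the one above with xdpdump (from xdp-tools); the interface name comes from the setup earlier in this thread, and flags may differ slightly between versions:

```
sudo xdpdump -i ens1f0 -w /tmp/katran_rx.pcap                       # packets seen by the XDP program
tcpdump -nn -r /tmp/katran_rx.pcap 'tcp[tcpflags] & tcp-syn != 0'   # show only SYNs
```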
katran_goclient -l -server localhost:8080
2024/09/26 06:26:30 vips len 2
VIP: 49.213.85.153 Port: 80 Protocol: tcp
Vip's flags:
->49.213.85.171 weight: 1 flags:
VIP: 49.213.85.152 Port: 80 Protocol: tcp
Vip's flags:
->49.213.85.171 weight: 1 flags:
exiting
UPDATE:
When trying to figure out the cause, I stopped at this thread, which describes the i40e NIC driver dropping packets at a rate of 10 Mpps: https://www.spinics.net/lists/xdp-newbies/msg01918.html
My server has a similar NIC driver:
driver: i40e
version: 5.15.0-122-generic
firmware-version: 10.53.7
expansion-rom-version:
bus-info: 0000:37:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
And I saw a lot of packet drops on the rx flow:
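(A direct way to read those drop counters from the driver is ethtool's NIC statistics; i40e exposes per-queue and port-level drop counters:)

```
ethtool -S ens1f0 | grep -iE 'drop|miss'
```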
Feels like the bpf program is not jitted. Could you please run `bpftool prog list` and `bpftool map list`?
Also, in the perf report (the one taken with -ag), please show the output filtered by the bpf keyword (press / and then type bpf).
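A hedged note on what to look for in that listing: a JIT-compiled program reports both a translated and a jited size, so an entry without a jited size would support the not-jitted theory. The example line below is illustrative, not taken from this system:

```
sudo bpftool prog list
# a jitted entry looks roughly like:
#   2118: xdp  name balancer_ingress  tag ...  xlated 8744B  jited 4936B  ...
```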
Yes, sure. Here is the output of those commands.
bpftool prog list
bpftool map list
perf report with the output filtered by the bpf keyword
UPDATE: Here is the interface tag. It seems like the bpf program is jitted, at least from the outside view; I hope this information is helpful.
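One hedged way to cross-check from the interface side: take the xdp program id shown on the link and query it with bpftool (the id below is the one from the ip output at the top of this thread):

```
ip link show ens1f0 | grep xdp     # e.g. "xdp/id:2118"
sudo bpftool prog show id 2118     # a "jited NNNB" field here means the program is JIT-compiled
```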
Hi @tehnerd,
Would you happen to have any updates on this issue? Feel free to ask me to provide more information or do some tests!
No idea. For some reason the bpf program seems slow in the bpf code itself. At this point the only idea is to build perf with bpf support (linked against the library required to disassemble bpf) and to check where the CPU time is spent inside bpf.
You mentioned that it feels like the bpf program is not jitted. But, according to your answer, that does not seem to be the case here, right? So the next step is building perf with bpf support and checking inside the bpf program.
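A hedged sketch of that build, assuming a kernel source tree is at hand: perf's bpf disassembly typically comes via libbfd/libopcodes from binutils, and the build prints a feature summary showing whether it was found:

```
sudo apt-get install -y binutils-dev libelf-dev libdw-dev   # Debian/Ubuntu package names
cd linux-source/tools/perf
make                  # check the feature summary for "libbfd: [ on ]"
sudo ./perf report    # 'a' on the bpf program should now disassemble it
```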
Hi everyone!
I am currently running Katran as an L3 director load balancer for our services. I would like to run Katran with a bonding interface because I believe it's easier to add more network interfaces than whole servers when scaling Katran's workload. I followed this issue (https://github.com/facebookincubator/katran/issues/13) and got Katran working normally in shared mode on a bonding interface with these commands:
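(The exact commands are not reproduced above. As a purely hypothetical sketch, modeled on the physical-interface invocation quoted earlier in this thread, a shared-mode run on a bond might look like the following; `bond0` and the pin path are assumptions, not the actual commands used:)

```
sudo ./build/example_grpc/katran_server_grpc \
  -balancer_prog ./deps/bpfprog/bpf/balancer.bpf.o \
  -default_mac 58:e4:34:56:46:e0 \
  -forwarding_cores=0,1,2,3,4,5,6,7,8,9 \
  -healthchecker_prog ./deps/bpfprog/bpf/healthchecking_ipip.o \
  -intf=bond0 -ipip_intf=ipip0 -ipip6_intf=ipip60 \
  -lru_size=1000000 -map_path /sys/fs/bpf/jmp_bond0 -prog_pos=2
```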
I mapped IRQs to CPUs:
The problem arose when I looked at Katran's statistics: they showed a 100% LRU miss rate.