facebookincubator / katran

A high performance layer 4 load balancer

High CPU usage when running Katran in shared mode with bonding interface #235

tantm3 opened this issue 1 month ago

tantm3 commented 1 month ago

Hi everyone!

I am currently running Katran as an L3 director load balancer for our services. I would like to run Katran on a bonding interface because I believe it is easier to add more network interfaces than more servers when scaling Katran's workload. I followed this issue (https://github.com/facebookincubator/katran/issues/13) and got Katran working in shared mode on a bonding interface with the following commands:

# Network config
1: lo: ...
2: ens2f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 xdp/id:1900 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 02:96:77:09:2b:73 brd ff:ff:ff:ff:ff:ff permaddr d4:f5:ef:36:1a:60
    altname enp55s0f0
3: ens2f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 xdp/id:1905 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 02:96:77:09:2b:73 brd ff:ff:ff:ff:ff:ff permaddr d4:f5:ef:36:1a:68
    altname enp55s0f1
4: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
5: ipip0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
6: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN group default qlen 1000
    link/tunnel6 :: brd :: permaddr 16dd:8fa:927c::
7: ipip60@NONE: <NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN group default qlen 1000
    link/tunnel6 :: brd :: permaddr 42d6:82ed:7cf5::
    inet6 fe80::40d6:82ff:feed:7cf5/64 scope link 
       valid_lft forever preferred_lft forever
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 02:96:77:09:2b:73 brd ff:ff:ff:ff:ff:ff
    inet 10.50.73.53/24 brd 10.50.73.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet6 fe80::96:77ff:fe09:2b73/64 scope link 
       valid_lft forever preferred_lft forever
## For the xdp root program I edited the install_xdproot.sh script.
## Then I run a single command to add the Katran load balancer xdp program:
sudo ./build/example_grpc/katran_server_grpc -balancer_prog ./deps/bpfprog/bpf/balancer.bpf.o -default_mac 58:e4:34:56:46:e0  -forwarding_cores=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 -numa_nodes=0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1 -healthchecker_prog ./deps/bpfprog/bpf/healthchecking_ipip.o -intf=ens2f0 -ipip_intf=ipip0 -ipip6_intf=ipip60 -lru_size=100000 -map_path /sys/fs/bpf/jmp_ens2 -prog_pos=2
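
Side note: since both slaves show an xdp id in the `ip` output above, the root program appears to be attached to each of them. A minimal way to double-check the attachments, assuming only the interface names from that output:

```
# Show the xdp program attached to each bond slave
ip -details link show ens2f0 | grep -i xdp
ip -details link show ens2f1 | grep -i xdp

# bpftool lists all xdp/tc attachments in one place
sudo bpftool net show
```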
## Katran VIP + REAL config
2024/09/08 04:23:39 vips len 1
VIP:        49.213.85.151 Port:     80 Protocol: tcp
Vip's flags: 
 ->49.213.85.171     weight: 1 flags: 
exiting

- All 20 CPUs are consumed by `ksoftirqd`:
![image](https://github.com/user-attachments/assets/a8cd637e-f93b-4432-ae70-2af7cfe68891)

- Here is a screenshot showing the output of `perf report`:
![image](https://github.com/user-attachments/assets/f9a8b574-e73c-45c3-9062-a16f6b0b9301)
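
For completeness, a couple of standard ways to confirm how much of each CPU is spent in softirq context (generic tooling, not specific to Katran; `mpstat` is part of the sysstat package):

```
# %soft column = share of each CPU spent servicing softirqs
mpstat -P ALL 1 5

# Raw per-CPU counters; NET_RX is the row of interest for receive processing
watch -d -n1 cat /proc/softirqs
```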

I am not sure whether this performance issue is related to Katran or not, so I am posting this question here to find some clues.

Feel free to ask me to provide more information!
tantm3 commented 3 weeks ago

I added one more test case from my research. I wanted to know whether the bonding interface causes the CPU overload, so I removed the bonding interface and ran Katran in shared mode directly on the two physical interfaces.

There is a slight performance improvement when running on the physical interfaces (according to Katran's output), but CPU usage is still at 100%.

tantm3 commented 1 week ago

Hi @avasylev @tehnerd,

Could you share some thoughts on my setup? I am still struggling with this.

tehnerd commented 1 week ago

sudo sysctl -a | grep bpf ?
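
Presumably the interesting knob in that output is the BPF JIT. A minimal way to check and, if needed, enable it (standard sysctl names only):

```
# 1 = JIT enabled, 0 = interpreter only, 2 = JIT enabled with debug output
sysctl net.core.bpf_jit_enable

# Turn the JIT on if it reports 0
sudo sysctl -w net.core.bpf_jit_enable=1
```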

tantm3 commented 1 week ago

Hi @tehnerd,

Here is the output:

(screenshot: output of `sudo sysctl -a | grep bpf`)
tehnerd commented 1 week ago

Hmm. Strange. Please collect:

1. `perf record -a -F 23 -- sleep 10`
2. The same perf as before (as in your previous pictures; I guess you were using `-g` as well).

When looking into the report, move to the balancer's bpf program and use the 'a' shortcut. That will show the assembly code, so we would understand where exactly in the bpf program we consume CPU.
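
Spelled out as commands (only the flags mentioned above, nothing Katran-specific):

```
# 1. Low-frequency, whole-system sample
sudo perf record -a -F 23 -- sleep 10
sudo perf report

# 2. Call-graph sample, as in the earlier screenshots
sudo perf record -a -g -- sleep 10
sudo perf report
# In the report TUI: highlight the balancer bpf program entry and press 'a' to annotate
```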

tehnerd commented 1 week ago

Also, what does the traffic pattern look like? Are these real TCP streams or just random packets?

tantm3 commented 1 week ago

> Please collect 1. `perf record -a -F 23 -- sleep 10` 2. the same perf as before (as in your previous pictures; I guess you were using `-g` as well)

- I have a little trouble getting the assembly code from `perf report`.
- With the bpf program, it shows an error:

(screenshot: `perf report` error when selecting the bpf program)

Again, it shows an error when I press the 'a' shortcut on the bpf program:

(screenshot: `perf report` annotate error)
tantm3 commented 1 week ago

> Also, what does the traffic pattern look like? Are these real TCP streams or just random packets?

06:24:05.724378 IP 49.213.85.169.14397 > 49.213.85.152.80: Flags [S], seq 74616:74622, win 8192, length 6: HTTP
06:24:05.724396 IP 49.213.85.169.14997 > 49.213.85.152.80: Flags [S], seq 74616:74622, win 8192, length 6: HTTP
06:24:05.724401 IP 49.213.85.169.14984 > 49.213.85.152.80: Flags [S], seq 74616:74622, win 8192, length 6: HTTP

UPDATE:

(screenshot)
tehnerd commented 1 week ago

Feels like the bpf program is not jitted. Could you please run `bpftool prog list` and `bpftool map list`?
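
For reference, a jitted program shows both an `xlated` and a `jited` size in the `bpftool` listing; a quick way to check (the example entry below is purely illustrative):

```
sudo bpftool prog show
# A jitted entry looks roughly like this (values illustrative):
#   42: xdp  name balancer_ingres  tag 1234abcd5678ef90
#       xlated 24480B  jited 13992B  memlock 28672B  map_ids 5,6,7
```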

tehnerd commented 1 week ago

Also, in the perf report (which was taken with `-ag`), please show the output filtered with the bpf keyword (press `/` and then type `bpf`).
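
A non-interactive way to get roughly the same filtered view, in case the TUI search is awkward (standard perf options only):

```
sudo perf report --stdio | grep -i bpf
```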

tantm3 commented 1 week ago

Yes, sure. Here is the output of those commands.

`bpftool prog list`:

(screenshot: `bpftool prog list` output)

`bpftool map list`:

(screenshots: `bpftool map list` output)

`perf report` with the output filtered with the bpf keyword:

(screenshot: `perf report` filtered with `bpf`)

UPDATE: Here is the interface tag. It seems the bpf program is jitted, at least from the outside view; I hope this information is helpful.

(screenshot: interface details showing the xdp program id/tag)

tantm3 commented 4 days ago

Hi @tehnerd,

Would you happen to have any updates on this issue? Feel free to ask me to provide more information or do some tests!

tehnerd commented 2 days ago

No idea. For some reason the bpf program seems slow in the bpf code itself. At this point the only idea is to build perf with bpf support (linked against the library required to disassemble bpf) and to check where the CPU is being spent inside the bpf program.
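
A rough sketch of such a build, assuming a Debian/Ubuntu box and a kernel source tree; the important part is that the feature summary printed by `make` reports the BFD-based disassembler as available, since that is what perf uses to annotate bpf programs:

```
# Build dependencies (package names are an assumption for Debian/Ubuntu)
sudo apt-get install -y flex bison libelf-dev libdw-dev libcap-dev \
    binutils-dev libslang2-dev libzstd-dev python3-dev

# Build perf from the kernel source tree
cd linux/tools/perf
make
# Check the "Auto-detecting system features" output for: libbfd: [ on ]

# Re-profile with the freshly built binary and annotate the bpf program
sudo ./perf record -a -g -- sleep 10
sudo ./perf report
```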

tantm3 commented 2 days ago

You mentioned that it feels like the bpf program is not jitted. But, according to your answer, that does not seem to be the case here, right? So the next step is building perf with bpf support and checking inside the bpf program.