xdp-project / xdp-tutorial

XDP tutorial

Trying to write XDP for redirecting packets in veth inside kubernetes #399

Open vincent5753 opened 7 months ago

vincent5753 commented 7 months ago

Hi, I am a newbie to XDP. I was inspired by this tutorial and wanted to test redirecting packets between veths on Kubernetes.

I used hard-coded IP and MAC addresses in the XDP program, and tried returning `bpf_redirect()`, `XDP_TX`, and `XDP_REDIRECT`, but none of them worked.

I traced XDP using perf, and here are the results:

➜  ebpf sudo perf trace --event 'xdp:*'
# bpf_redirect(ifindex2, 0)
     0.000 ping/936614 xdp:xdp_redirect_err:prog_id=1426 action=REDIRECT ifindex=66 to_ifindex=67 err=-6
# XDP_TX
 15562.756 ping/936800 xdp:xdp_bulk_tx:ifindex=66 action=TX sent=0 drops=1 err=-6
# XDP_REDIRECT
 27807.646 ping/936940 xdp:xdp_redirect_err:prog_id=1428 action=REDIRECT ifindex=66 to_ifindex=0 err=-22
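For context, the redirect program is roughly this kind of minimal sketch (the ifindex and MAC values here are placeholders, not my exact ones):

```c
/* Minimal XDP redirect sketch: rewrite the destination MAC and
 * redirect every packet to a hard-coded peer ifindex. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define PEER_IFINDEX 67 /* placeholder: target veth ifindex */

SEC("xdp")
int xdp_redirect_prog(struct xdp_md *ctx)
{
	void *data = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;
	unsigned char *eth = data;

	/* bounds check required by the verifier */
	if (data + 14 > data_end)
		return XDP_PASS;

	/* placeholder destination MAC of the peer device */
	unsigned char dst[6] = {0x02, 0x00, 0x00, 0x00, 0x00, 0x01};
	__builtin_memcpy(eth, dst, 6);

	return bpf_redirect(PEER_IFINDEX, 0);
}

char _license[] SEC("license") = "GPL";
```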

My environment setup is as follows:

OS: Ubuntu 20.04(VM on Proxmox VE)
Kernel: 5.4.0-144-generic
Kubernetes: 1.23.17
CNI: flannel
clang: Ubuntu clang version 11.0.0-2~ubuntu20.04.1

Test Pod YAML:

apiVersion: v1
kind: Pod
metadata:
  name: ping-1-privileged
spec:
  containers:
  - name: ping-1-privileged
    image: ubuntu20.04
    command: ["sleep", "infinity"]
    securityContext:
      privileged: true
---
apiVersion: v1
kind: Pod
metadata:
  name: ping-2-privileged
spec:
  containers:
  - name: ping-2-privileged
    image: ubuntu20.04
    command: ["sleep", "infinity"]
    securityContext:
      privileged: true
---
apiVersion: v1
kind: Pod
metadata:
  name: ping-3-privileged
spec:
  containers:
  - name: ping-3-privileged
    image: ubuntu20.04
    command: ["sleep", "infinity"]
    securityContext:
      privileged: true

Can anyone point out what I did wrong or what I'm missing?

tohojo commented 7 months ago

Looks like XDP is not enabled on the peer veth device. This can be done in one of two ways:

- attaching a dummy XDP program (one that just returns `XDP_PASS`) to the peer veth device, or
- enabling GRO on the veth device with ethtool, which puts the device into the NAPI mode that XDP needs.

I'm not sure if the second method works on a kernel that old, though, so going for the first option is probably safer...
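The dummy program for the first option can be as small as this sketch; its only purpose is that its presence enables the XDP path on the veth peer so redirected frames are accepted:

```c
/* Dummy XDP program: passes every packet unchanged. Attaching it to
 * the peer veth enables the XDP/NAPI receive path on that device. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_pass(struct xdp_md *ctx)
{
	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```

Assuming the object is compiled with `clang -O2 -target bpf`, it can be attached with something like `ip link set dev ${vethname} xdp obj xdp_pass.o sec xdp`.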

vincent5753 commented 7 months ago

@tohojo Thanks for the reply. Can you provide your kernel version, so I can test this on a newer kernel?

vincent5753 commented 7 months ago

Here's how I set up my environment: I am trying to implement an LB like lb-from-scratch from lizrice. I started 4 containers for client / LB / nginx1 (backend) / nginx2 (backend),

loaded a dummy XDP program onto the veths of client / nginx1 (backend) / nginx2 (backend), and the LB XDP program onto the veth of the LB. I enabled NAPI using `ethtool -K ${vethname} gro on` and disabled rp_filter using `echo 0 | sudo tee /proc/sys/net/ipv4/conf/${vethname}/rp_filter`.

I originally tried this on Ubuntu 20.04 (with kernel 5.4.0-144-generic), but I also tried on Ubuntu 22.04 (with kernel 5.15.0-94-generic) yesterday, and I could not figure out what was missing.

My setup can be found here: https://github.com/vincent5753/MASTER-VP/tree/main/eBPF/redirect_icmp/docker

tohojo commented 7 months ago

The GRO thing was added in kernel 5.13, so on that 5.15 kernel it should be enough to just enable GRO on both ends of the veth device (i.e., both outside and inside the container). No idea how to get docker to do this for you, but it should be possible to just do it manually with ethtool after starting up the containers...
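Concretely, that would be something like the following after the containers are up (the veth name and container name below are placeholders for this setup; `nsenter` avoids needing ethtool installed inside the container image):

```shell
# Host side: enable GRO on the host end of the veth pair
sudo ethtool -K veth1234abcd gro on    # placeholder veth name

# Container side: enable GRO on the peer (eth0 inside the container)
# by entering the container's network namespace from the host
PID=$(docker inspect -f '{{.State.Pid}}' mycontainer)   # placeholder name
sudo nsenter -t "$PID" -n ethtool -K eth0 gro on
```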

vincent5753 commented 7 months ago

I tried entering the container and manually installing ethtool after starting up the containers, then ran `ethtool -K eth0 gro on` inside the container and `ethtool -K ${vethname} gro on` on the host. But it did not work either.

Maybe this doesn't work in a docker env because of the veth pair? ref: Re: Veth pair swallow packets for XDP_TX operation. But I tried loading the dummy program onto the other containers, and that did not work either.

I am wondering if I can attach eBPF to tc to redirect packets to other interfaces instead? Maybe it sacrifices some performance, but works?

tohojo commented 7 months ago


> I tried entering the container and manually installing ethtool after starting up the containers, then ran `ethtool -K eth0 gro on` inside the container and `ethtool -K ${vethname} gro on` on the host. But it did not work either.

Yeah, that should work (on the 5.15 kernel). Did the error codes change?

> Maybe this doesn't work in a docker env because of the veth pair? ref: Re: Veth pair swallow packets for XDP_TX operation. But I tried loading the dummy program onto the other containers, and that did not work either.

> I am wondering if I can attach eBPF to tc to redirect packets to other interfaces instead? Maybe it sacrifices some performance, but works?

Sure, you can use TC instead. If you're only doing things on veth devices that won't hurt performance either; XDP on veth only really makes sense performance-wise if you're redirecting frames into the veth devices from a physical NIC using XDP. If you're just sending traffic around between containers, just use TC :)
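A TC classifier version of the redirect is essentially the same helper called from a different hook — a sketch, with the target ifindex again as a placeholder:

```c
/* TC (clsact) program redirecting every packet to a hard-coded ifindex.
 * bpf_redirect() returns TC_ACT_REDIRECT on success, so the helper's
 * return value can be returned directly from the program. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define PEER_IFINDEX 67 /* placeholder: target veth ifindex */

SEC("tc")
int tc_redirect(struct __sk_buff *skb)
{
	return bpf_redirect(PEER_IFINDEX, 0);
}

char _license[] SEC("license") = "GPL";
```

Attached with iproute2, for example: `tc qdisc add dev ${vethname} clsact` followed by `tc filter add dev ${vethname} ingress bpf da obj tc_redirect.o sec tc`.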