nberlee / talos

Friendly fork for Turing RK1 on Talos
https://www.talos.dev
Mozilla Public License 2.0

Example With Cilium #1

Open camrossi opened 8 months ago

camrossi commented 8 months ago

The asciinema video is great and gives a very good idea of how to deploy this. Perhaps you could consider adding an example of how to deploy this with Cilium + eBPF + kube-proxy replacement and the L2 advertisement feature.

This is what I did in my cluster:

  1. I use a patch to generate the control-plane config. If you have more than 3 nodes, you will also need to patch/edit the worker config accordingly. I add the extensions and select the right interface for the VIP advertisement. It is a must to use the busPath, because the interface names are derived from the MAC address and therefore differ on every node.

    - op: add
      path: /machine/install/extensions
      value:
        - image: ghcr.io/nberlee/rk3588:v1.6.3
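
For the VIP part mentioned above, a machine-config fragment selecting the interface by bus path could look like this (a sketch: the busPath value and VIP address are placeholders, not from this thread; check the real bus path with `talosctl get links`):

```yaml
# Hypothetical fragment: pin the VIP to the interface matched by its
# bus path, since MAC-derived interface names differ on every node.
machine:
  network:
    interfaces:
      - deviceSelector:
          busPath: "fe1c0000.ethernet"   # placeholder; verify with `talosctl get links`
        dhcp: true
        vip:
          ip: 192.168.1.50               # placeholder cluster VIP
```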
  2. Generate the configs and customize them:
talosctl gen config turing https://<VIP>:6443 --config-patch-control-plane @cp.patch.yaml
cp  controlplane.yaml rk1-1.yaml
cp  controlplane.yaml rk1-2.yaml
cp  controlplane.yaml rk1-3.yaml
  3. Edit the files and set the hostname:

    machine:
      network:
        hostname: rk1-1
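
Instead of hand-editing each copy, the hostname can also be injected per node with `talosctl machineconfig patch` (a sketch; the node names follow the rk1-N convention used in this thread):

```shell
# Generate one config per node, each with its own hostname patched in
for i in 1 2 3; do
  talosctl machineconfig patch controlplane.yaml \
    --patch "[{\"op\": \"add\", \"path\": \"/machine/network/hostname\", \"value\": \"rk1-$i\"}]" \
    --output "rk1-$i.yaml"
done
```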
  4. Bootstrap the cluster as per your video

  5. The cluster now comes up without a CNI, since I disabled it, but we can just deploy Cilium with Helm:

helm install \
    cilium \
    cilium/cilium \
    --version 1.14.6 \
    --namespace kube-system \
    --set ipam.mode=kubernetes \
    --set=kubeProxyReplacement=true \
    --set=securityContext.capabilities.ciliumAgent="{CHOWN,KILL,NET_ADMIN,NET_RAW,IPC_LOCK,SYS_ADMIN,SYS_RESOURCE,DAC_OVERRIDE,FOWNER,SETGID,SETUID}" \
    --set=securityContext.capabilities.cleanCiliumState="{NET_ADMIN,SYS_ADMIN,SYS_RESOURCE}" \
    --set=cgroup.autoMount.enabled=false \
    --set=cgroup.hostRoot=/sys/fs/cgroup \
    --set l2announcements.enabled=true \
    --set loadBalancer.acceleration=native \
    --set k8sServiceHost=127.0.0.1  \
    --set k8sServicePort=7445 \
    --set bpf.masquerade=true
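
To check that kube-proxy replacement and XDP actually came up, the agent's status output can be inspected (a sketch, assuming the default `cilium` DaemonSet in kube-system):

```shell
# Query one Cilium agent pod for its datapath status
kubectl -n kube-system exec ds/cilium -- cilium status --verbose | \
  grep -iE 'KubeProxyReplacement|XDP|Masquerading'
```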
  6. Follow the Cilium L2 announcement guide to expose services with the new L2 functionality.
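
For the L2 advertisement step, the setup boils down to two CRs: an IP pool for LoadBalancer services and an announcement policy. A minimal sketch for Cilium 1.14 (the names, CIDR, and interface regex are placeholders, not from this thread):

```yaml
apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: lb-pool                  # placeholder name
spec:
  cidrs:
    - cidr: 192.168.1.240/28     # placeholder range on the local L2 segment
---
apiVersion: "cilium.io/v2alpha1"
kind: CiliumL2AnnouncementPolicy
metadata:
  name: l2-policy                # placeholder name
spec:
  loadBalancerIPs: true
  interfaces:
    - ^end[0-9]+                 # placeholder; match your uplink interface names
```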
nberlee commented 8 months ago

I still need to find a nice place for a quickstart guide. This issue will stay open in the meantime.

k8sServiceHost should be 127.0.0.1 as of Talos 1.6.2, and loadBalancer.acceleration=native should be added for native XDP support.

camrossi commented 8 months ago

Hi @nberlee, thanks for the loadBalancer.acceleration=native, I totally missed that.

I tried it and it worked fine on my 3x RK1, but it is failing on my RPi4:

level=fatal msg="Failed to compile XDP program" error="program cil_xdp_entry: attaching XDP program to interface enxe45f01c7527b: operation not supported" subsys=datapath-loader

Seems I need some more kernel parameters... something I will check later, but for the scope of this issue I have added it to my original comment :)

As for the k8sServiceHost, the Talos docs specify localhost: https://www.talos.dev/v1.6/kubernetes-guides/network/deploying-cilium/#without-kube-proxy. Are you sure they are not interchangeable?

nberlee commented 8 months ago

Yes, if you have a mixed cluster, you cannot set native XDP in Helm.

The Pi4 has a brcmgenet driver which does not support XDP native.

There is also generic XDP, which uses the kernel instead of the driver (a bit slower), but it is only supported in Talos 1.7, where XDP_SOCKETS is enabled.

So in order to have the RK1s in native XDP mode and your Pi disabled, you can try the XDP selective config.

Have it disabled in your Helm values, add the label to all your RK1 nodes (kubectl label node rk1-1 io.cilium.xdp-offload=true), restart the Cilium DaemonSet pods, and then only the RK1s will have native XDP enabled.
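
The labeling sequence described above, spelled out as commands (node names are taken from this thread; the DaemonSet name assumes a default Helm install):

```shell
# Enable native XDP only on the RK1 nodes via the selective-XDP label
kubectl label node rk1-1 rk1-2 rk1-3 io.cilium.xdp-offload=true
# Restart the Cilium agents so they re-evaluate the label
kubectl -n kube-system rollout restart daemonset/cilium
```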

As for k8sServiceHost, see https://github.com/siderolabs/talos/commit/8fa6e93f0. I am certain the docs will change to 127.0.0.1 in this regard. It makes IPv6 clusters avoid the issue in https://github.com/siderolabs/talos/issues/8112.

camrossi commented 8 months ago

I was just reading about the XDP selective config! Will give it a try later today!

Thank you so much for the in-depth explanation, I have updated it to 127.0.0.1!

nberlee commented 8 months ago

Now it says k8sServiceHost and k8sServicePort twice :)

camrossi commented 8 months ago

Upsies... I think it is fixed now!

RealKelsar commented 7 months ago

If I follow this, the cluster never gets ready; as soon as I enable the CNI it works.

Ok, if I had set the port for the cluster address it would have worked, so feel free to ignore...

bguijt commented 6 months ago

I embedded this configuration in my Talos setup script: https://github.com/bguijt/turingpi2/tree/main/projects/talos/shell - thanks!