vpsfreecz / vpsadminos

Host for Linux system containers based on NixOS, ZFS and LXC
https://vpsadminos.org
MIT License

Support K8s in os containers #32

Closed snajpa closed 3 years ago

snajpa commented 4 years ago

K8s needs to think it owns these netfilter settings:

{"log":"I0226 12:57:33.323050       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 1310720\n","stream":"stderr","time":"2020-02-26T12:57:33.323181344Z"}
{"log":"F0226 12:57:33.323111       1 server.go:485] open /proc/sys/net/netfilter/nf_conntrack_max: no such file or directory\n","stream":"stderr","time":"2020-02-26T12:57:33.323210509Z"}
nf_conntrack_tcp_timeout_established
nf_conntrack_tcp_timeout_close_wait
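A quick way to confirm this from inside an os container (a hedged check, assuming the procps sysctl utility is installed; the exact error text may differ):

sysctl net.netfilter.nf_conntrack_max
# sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_max: No such file or directory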
tomassrnka commented 4 years ago
k8s on vpsAdminOS / Ubuntu 18.04

https://phoenixnap.com/kb/install-kubernetes-on-ubuntu

#### Kube on vpsAdminOS

apt install -y apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable"
apt update
apt-get install -y docker-ce
apt-get install -y iptables arptables ebtables

wget -q https://storage.googleapis.com/golang/getgo/installer_linux
chmod +x installer_linux
./installer_linux
source /root/.bash_profile

update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
update-alternatives --set arptables /usr/sbin/arptables-legacy
update-alternatives --set ebtables /usr/sbin/ebtables-legacy
apt-get update &&  apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF | tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF

apt-get update
apt-get install -y kubelet kubeadm kubectl

 # Prepare fake, writable copies of the /proc and /sys entries that kubelet,
 # kubeadm and kube-proxy expect to be able to read and write
 mkdir fake
 cd fake

 # kubelet wants to set these kernel sysctls; bind-mount writable fakes over them
 echo 0 > panic
 mount --bind panic /proc/sys/kernel/panic
 echo 0 > panic_on_oops
 mount --bind panic_on_oops /proc/sys/kernel/panic_on_oops
 echo 0 > overcommit_memory
 mount --bind overcommit_memory /proc/sys/vm/overcommit_memory

 # kube-proxy writes net/netfilter/nf_conntrack_max; give it a fake netfilter dir
 mkdir -p netfilter/nf_log
 mount --bind netfilter /proc/sys/net/netfilter/
 echo 1310720 > netfilter/nf_conntrack_max
 echo 16384 > netfilter/hashsize

 # Hide swap from kubelet by presenting an empty /proc/swaps
 echo "Filename            Type     Size  Used  Priority" > swaps
 mount --bind swaps /proc/swaps

 # Present an empty /sys/block to the container
 mkdir block
 mount -o bind block/ /sys/block/

 # kubelet needs shared mount propagation on /
 mount --make-rshared /
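 # Hedged addition (not part of the original steps): the two conntrack timeout
 # sysctls from the first report could presumably be faked the same way.
 # The seeded values are just the usual kernel defaults; kube-proxy overwrites
 # them through the writable bind mount anyway.
 echo 432000 > netfilter/nf_conntrack_tcp_timeout_established
 echo 60 > netfilter/nf_conntrack_tcp_timeout_close_wait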

 cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

systemctl daemon-reload
systemctl restart docker

kubeadm config images pull
kubeadm init --pod-network-cidr=10.244.0.0/16

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

## TODO: Flannel
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
VeeeneX commented 4 years ago

Another issue is with swap: the main Kubernetes process, kubelet, is failing, and so is kubeadm.

A possible solution is to set kubelet not to use swap and to ignore the swap errors on kubeadm init:

kubeadm reset 
echo 'Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false"' >> /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

systemctl daemon-reload
systemctl restart kubelet

kubeadm init
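# If kubeadm still aborts on its swap preflight check, it can presumably be
# skipped explicitly (untested here; uses kubeadm's --ignore-preflight-errors flag):
kubeadm init --ignore-preflight-errors=Swap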

I'll update this after trying the setup guide.

tomassrnka commented 4 years ago

@VeeeneX keep in mind that it is not running properly. CoreDNS keeps failing due to the original bug.
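A quick way to watch that, assuming a standard kubeadm deployment where the CoreDNS pods carry the k8s-app=kube-dns label:

kubectl -n kube-system get pods -l k8s-app=kube-dns
kubectl -n kube-system logs -l k8s-app=kube-dns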

VeeeneX commented 4 years ago

That's true, so we just have to wait for the fix in the kernel.

VeeeneX commented 4 years ago

Any updates? :)

snajpa commented 4 years ago

This will have to wait a while; the sysctl() syscall disappeared in kernel 5.6, so we'll be able to think further about faking access to kernfs-based files once we get to that kernel.

snajpa commented 4 years ago

The downside is that old distros potentially depend on the sysctl() syscall being present, so we have to evaluate the situation further before we make any substantial move.

VeeeneX commented 4 years ago

Today I tried k3s, which seems to be working. On the other hand, k8s is still not working as expected; somehow it can't change nf_conntrack_max, but my assumptions might be wrong.

root@testing-k8s:~/fake# kubectl logs kube-proxy-kh6jh -n kube-system
W0423 10:30:02.840576       1 proxier.go:625] Failed to read file /lib/modules/5.4.28/modules.builtin with error open /lib/modules/5.4.28/modules.builtin: no such file or directory. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W0423 10:30:02.848766       1 proxier.go:635] Failed to load kernel module ip_vs with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W0423 10:30:02.853877       1 proxier.go:635] Failed to load kernel module ip_vs_rr with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W0423 10:30:02.862592       1 proxier.go:635] Failed to load kernel module ip_vs_wrr with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W0423 10:30:02.873454       1 proxier.go:635] Failed to load kernel module ip_vs_sh with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W0423 10:30:02.884237       1 proxier.go:635] Failed to load kernel module nf_conntrack with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W0423 10:30:02.892797       1 server_others.go:559] Unknown proxy mode "", assuming iptables proxy
I0423 10:30:02.909239       1 node.go:136] Successfully retrieved node IP: 37.205.14.39
I0423 10:30:02.909311       1 server_others.go:186] Using iptables Proxier.
I0423 10:30:02.909763       1 server.go:583] Version: v1.18.2
I0423 10:30:02.912223       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 1048576
F0423 10:30:02.912273       1 server.go:497] open /proc/sys/net/netfilter/nf_conntrack_max: no such file or directory

Maybe related to: https://github.com/rancher/k3s/issues/364?

tomassrnka commented 4 years ago

kube-proxy keeps failing while checking hashsize in /sys. I pinpointed the crashing code in the k8s code base, located in cmd/kube-proxy/app/conntrack.go around lines 60-65, and made a test scenario, see below.

Steps to reproduce:

  1. Create any VPS that runs the latest Docker under vpsadminos
  2. Run a privileged Ubuntu container: docker run -it --rm --privileged ubuntu
  3. Inside it, install Go: apt update ; apt -y install golang

Run:

package main

import (
        "io/ioutil"
        "strconv"
        "strings"
        "fmt"
)

func main() {
        max := 1310720
        hashsize, err := readIntStringFile("/sys/module/nf_conntrack/parameters/hashsize")
        if err != nil {
                panic(err)
        }
        if hashsize >= (max / 4) {
        panic("Hashsize not /4")
        }

        fmt.Println("hashsize OK")
}

func readIntStringFile(filename string) (int, error) {
        b, err := ioutil.ReadFile(filename)
        if err != nil {
                return -1, err
        }
        return strconv.Atoi(strings.TrimSpace(string(b)))
}

Outputs:

root@42e6bfe436b4:/# go run test.go
panic: open /sys/module/nf_conntrack/parameters/hashsize: permission denied

goroutine 1 [running]:
main.main()
    /test.go:14 +0xf5
exit status 2

The code works in Docker under KVM VPS:

root@5d4621f65463:/# go run test.go
hashsize OK
tomassrnka commented 4 years ago

Setting 644 permissions on the host's OS in vpsadminos (= bare metal) solves this issue for the test scenario. Let's try k8s now.

# chmod 644 /sys/module/nf_conntrack/parameters/hashsize
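A minimal re-check, reusing the privileged Ubuntu container from the test scenario above (the read should now succeed rather than fail with permission denied):

docker run --rm --privileged ubuntu cat /sys/module/nf_conntrack/parameters/hashsize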