rancher / rke

Rancher Kubernetes Engine (RKE), an extremely simple, lightning fast Kubernetes distribution that runs entirely within containers.

flanneld container shipped by Rancher is not working with SELinux #2662

Closed: debackerl closed this issue 3 years ago

debackerl commented 3 years ago

RKE version: RKE from the Terraform provider (source = "rancher/rke", version = "1.2.3")

Docker version: (docker version,docker info preferred)

Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 9
  Running: 7
  Paused: 0
  Stopped: 2
 Images: 11
 Server Version: 20.10.7
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: /usr/libexec/docker/docker-init
 containerd version: 
 runc version: 4c62ef789fd7a2963bf61ffbf421ce9646063648
 init version: 
 Security Options:
  selinux
  cgroupns
 Kernel Version: 5.13.4-200.fc34.x86_64
 Operating System: Fedora CoreOS 34.20210725.3.0
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 1.873GiB
 Name: static.196.187.130.94.clients.your-server.de
 ID: I5Z3:F7ZZ:6PSP:REWU:FRAB:H6BB:DVID:TWZA:WHSI:GTHE:Q2MY:HCP4
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: true

Operating system and kernel: (cat /etc/os-release, uname -r preferred) Fedora CoreOS 5.13.4-200.fc34.x86_64

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO) Mix of Hetzner Cloud and Bare-metal on Hetzner.

cluster.yml file:

network:
  plugin: flannel
  options:
    flannel_backend_type: vxlan 

ingress:
  provider: none

Terraform resource:

resource "rke_cluster" "main" {
    cluster_yaml = file("artifacts/config-files/rke.yaml")
    kubernetes_version = "v1.20.8-rancher1-1"

    services {
        kube_controller {
            cluster_cidr = "10.42.0.0/16"
            extra_args = {
                "flex-volume-plugin-dir" = "/opt/kubernetes/kubelet-plugins/volume/exec/"
            }
        }

        kubelet {
            cluster_dns_server = "10.43.0.10"
            cluster_domain = "cluster.local"
        }
    }

    upgrade_strategy {
        drain = true
        drain_input {
            force = true
            delete_local_data = true
            ignore_daemon_sets = true
            grace_period = -1
            timeout = 60
        }

        max_unavailable_worker = "20%"
    }
}

Steps to Reproduce:

  1. Install RKE by applying the Terraform plan.
  2. Open logs of one flanneld container.

Results:

E0816 16:03:16.039524 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: running [/sbin/iptables -t filter -C FORWARD -s 10.42.0.0/16 -j ACCEPT --wait]: exit status 4: Fatal: can't open lock file /run/xtables.lock: Permission denied
E0816 16:03:21.014400 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: running [/sbin/iptables -t nat -C POSTROUTING -s 10.42.0.0/16 -d 10.42.0.0/16 -j RETURN --wait]: exit status 4: Fatal: can't open lock file /run/xtables.lock: Permission denied
E0816 16:03:21.040395 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: running [/sbin/iptables -t filter -C FORWARD -s 10.42.0.0/16 -j ACCEPT --wait]: exit status 4: Fatal: can't open lock file /run/xtables.lock: Permission denied
I0816 16:03:26.018463 1 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I0816 16:03:26.018480 1 iptables.go:167] Deleting iptables rule: -s 10.42.0.0/16 -d 10.42.0.0/16 -j RETURN
I0816 16:03:26.019372 1 iptables.go:167] Deleting iptables rule: -s 10.42.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0816 16:03:26.020268 1 iptables.go:167] Deleting iptables rule: ! -s 10.42.0.0/16 -d 10.42.4.0/24 -j RETURN
I0816 16:03:26.088286 1 iptables.go:167] Deleting iptables rule: ! -s 10.42.0.0/16 -d 10.42.0.0/16 -j MASQUERADE --random-fully

If I disable SELinux on the host and reinstall RKE, everything works.

I already contacted CoreOS and Lucab gave me this explanation:

It looks like you are using a modified version of this upstream manifest: https://github.com/flannel-io/flannel/blob/v0.14.0/Documentation/kube-flannel-old.yaml

However, your version differs in some respects; for example, it is missing privileged: true in its securityContext, which I think is exactly what is relevant to this ticket. Please get in touch with the vendor or with the author of this manifest to double-check which permissions it expects.

See my ticket over there: https://github.com/coreos/fedora-coreos-tracker/issues/927
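
For readers unfamiliar with where that setting lives, here is a minimal sketch of a DaemonSet pod template with a privileged container. It is illustrative only: it is not the manifest that RKE actually deploys, and the container name `kube-flannel` is an assumption borrowed from the upstream flannel manifests. Whether this alone resolves the `/run/xtables.lock` denial under SELinux has not been verified here.

```yaml
# Illustrative fragment, NOT the RKE-shipped manifest.
# Shows where a container-level securityContext sits in a DaemonSet.
apiVersion: apps/v1
kind: DaemonSet
spec:
  template:
    spec:
      containers:
        - name: kube-flannel      # assumed name, taken from upstream flannel manifests
          securityContext:
            privileged: true      # the setting reported as missing in the explanation above
```

A privileged container normally runs with an unconfined SELinux label under Docker, which would be consistent with the observation that the errors disappear when SELinux is disabled on the host.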

superseb commented 3 years ago

Pretty sure this is the same as https://github.com/rancher/rke/issues/2636?

stale[bot] commented 3 years ago

This issue/PR has been automatically marked as stale because it has not had activity (commit/comment/label) for 60 days. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.