weaveworks / weave

Simple, resilient multi-host container networking and more.
https://www.weave.works
Apache License 2.0
6.62k stars 670 forks

Weave as k8s AddOn does not run on worker nodes #2881

Open deitch opened 7 years ago

deitch commented 7 years ago

I have been banging my head on this (or something like it) for several hours now. I am trying to do something that should be really simple: start a kube cluster with just weavenet networking. As simple as:

kubectl apply -f https://git.io/weave-kube-1.6

And yet:

  1. No weave containers are run on the worker node (kubelet host)
  2. kubelet never enters ready state; the message is just like @weitzj's: Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

I do not know if the two are related - or if it is connected to https://github.com/weaveworks/weave/issues/2826 - but I just cannot get "simple one-step install" to be, well, simple. :-)

bboreham commented 7 years ago

Are you sure no weave containers are running? What does kubectl get pods --namespace=kube-system -o wide show?

If they are actually running and dying, can you get the logs of one of the dead containers, please?

bboreham commented 7 years ago

actually it might be https://github.com/kubernetes/kubernetes/issues/43815 - can you make sure you have Kubernetes 1.6.1 please?
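
For anyone following along, a generic way to confirm which versions are actually in play (not commands from the original exchange, just standard checks):

# on the master: client and API server versions
kubectl version

# on the worker: the kubelet binary itself
kubelet --version

# per-node status (recent versions also show the kubelet version column)
kubectl get nodes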

deitch commented 7 years ago

Hey @bboreham thanks for getting back so quickly.

Yeah, I am sure none is running. I have a single worker node (systemctl stop kubelet on the others) so I can focus on where stuff is running and debug.

On master:

ip-10-50-21-250 core # kubectl get pods --namespace=kube-system -o wide
NAME                       READY     STATUS    RESTARTS   AGE       IP           NODE
kube-dns-321336704-9gq2p   2/4       Error     23         2h        172.17.0.2   ip-10-50-22-42.ec2.internal
kube-dns-321336704-z349d   2/4       Error     24         2h        172.17.0.3   ip-10-50-22-42.ec2.internal

On worker:

ip-10-50-22-42 kubernetes # docker ps -a | grep weave
ip-10-50-22-42 kubernetes # journalctl  -l --no-pager -u kubelet.service -f
-- Logs begin at Tue 2017-04-04 12:49:41 UTC. --
Apr 04 17:29:45 ip-10-50-22-42.ec2.internal kubelet[15748]: E0404 17:29:45.816864   15748 kubelet.go:2067] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 04 17:29:50 ip-10-50-22-42.ec2.internal kubelet[15748]: I0404 17:29:50.413517   15748 qos_container_manager_linux.go:285] [ContainerManager]: Updated QoS cgroup configuration
Apr 04 17:29:50 ip-10-50-22-42.ec2.internal kubelet[15748]: W0404 17:29:50.821379   15748 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Apr 04 17:29:50 ip-10-50-22-42.ec2.internal kubelet[15748]: E0404 17:29:50.822066   15748 kubelet.go:2067] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 04 17:29:52 ip-10-50-22-42.ec2.internal kubelet[15748]: E0404 17:29:52.414519   15748 pod_workers.go:182] Error syncing pod bf621c17-194a-11e7-be58-0e94e95c9de0 ("kube-dns-321336704-z349d_kube-system(bf621c17-194a-11e7-be58-0e94e95c9de0)"), skipping: network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized]
Apr 04 17:29:52 ip-10-50-22-42.ec2.internal kubelet[15748]: E0404 17:29:52.416158   15748 pod_workers.go:182] Error syncing pod bf6236ac-194a-11e7-be58-0e94e95c9de0 ("kube-dns-321336704-9gq2p_kube-system(bf6236ac-194a-11e7-be58-0e94e95c9de0)"), skipping: network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized]
Apr 04 17:29:55 ip-10-50-22-42.ec2.internal kubelet[15748]: W0404 17:29:55.416988   15748 prober.go:98] No ref for container "docker://bf3428d623ce4740f712161ed284990487381fd8b32f840e117cbb98ef0c5c28" (kube-dns-321336704-z349d_kube-system(bf621c17-194a-11e7-be58-0e94e95c9de0):kubedns)
Apr 04 17:29:55 ip-10-50-22-42.ec2.internal kubelet[15748]: I0404 17:29:55.417581   15748 prober.go:106] Readiness probe for "kube-dns-321336704-z349d_kube-system(bf621c17-194a-11e7-be58-0e94e95c9de0):kubedns" failed (failure): Get http://172.17.0.3:8081/readiness: dial tcp 172.17.0.3:8081: getsockopt: connection refused
Apr 04 17:29:55 ip-10-50-22-42.ec2.internal kubelet[15748]: W0404 17:29:55.823492   15748 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Apr 04 17:29:55 ip-10-50-22-42.ec2.internal kubelet[15748]: E0404 17:29:55.823618   15748 kubelet.go:2067] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

Ignoring the DNS pod problems, there is just nothing there. For additional detail, from the master:

ip-10-50-21-250 core # kubectl describe daemonset weave-net --namespace=kube-system
Name:       weave-net
Selector:   name=weave-net
Node-Selector:  <none>
Labels:     name=weave-net
Annotations:    kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"extensions/v1beta1","kind":"DaemonSet","metadata":{"annotations":{},"name":"weave-net","namespace":"kube-system"},"spec":{"template":{"m...
Desired Number of Nodes Scheduled: 0
Current Number of Nodes Scheduled: 0
Number of Nodes Scheduled with Up-to-date Pods: 0
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 0
Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:       name=weave-net
  Service Account:  weave-net
  Containers:
   weave:
    Image:  weaveworks/weave-kube:1.9.4
    Port:
    Command:
      /home/weave/launch.sh
    Requests:
      cpu:      10m
    Liveness:       http-get http://127.0.0.1:6784/status delay=30s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /host/etc from cni-conf (rw)
      /host/home from cni-bin2 (rw)
      /host/opt from cni-bin (rw)
      /host/var/lib/dbus from dbus (rw)
      /lib/modules from lib-modules (rw)
      /weavedb from weavedb (rw)
   weave-npc:
    Image:  weaveworks/weave-npc:1.9.4
    Port:
    Requests:
      cpu:      10m
    Environment:    <none>
    Mounts:     <none>
  Volumes:
   weavedb:
    Type:   EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
   cni-bin:
    Type:   HostPath (bare host directory volume)
    Path:   /opt
   cni-bin2:
    Type:   HostPath (bare host directory volume)
    Path:   /home
   cni-conf:
    Type:   HostPath (bare host directory volume)
    Path:   /etc
   dbus:
    Type:   HostPath (bare host directory volume)
    Path:   /var/lib/dbus
   lib-modules:
    Type:   HostPath (bare host directory volume)
    Path:   /lib/modules
Events:
  FirstSeen LastSeen    Count   From        SubObjectPath   Type        Reason      Message
  --------- --------    -----   ----        -------------   --------    ------      -------
  2h        5m      27  daemon-set          Warning     FailedCreate    Error creating: pods "" is forbidden: pod.Spec.SecurityContext.SELinuxOptions is forbidden

Not sure what that daemonset/SELinux error is. Running stock CoreOS Stable.

deitch commented 7 years ago

actually it might be kubernetes/kubernetes#43815 - can you make sure you have Kubernetes 1.6.1 please?

Running 1.6.0, but not kubeadm. Downloaded and installed all kube components manually. Happy to try, though.

bboreham commented 7 years ago

Maybe the kubelet logs will show something?

Or possibly kubectl describe node <your-node>

deitch commented 7 years ago

Just downloading and installing 1.6.1 now, then will check.

deitch commented 7 years ago

Maybe the kubelet logs will show something?

Yeah, those were the logs from journalctl. Doesn't kubelet just spew to stdout/stderr?

Well, 1.6.1 doesn't appear to solve it.

Apr 04 17:41:31 ip-10-50-22-42.ec2.internal kubelet[20511]: W0404 17:41:31.045714   20511 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Apr 04 17:41:31 ip-10-50-22-42.ec2.internal kubelet[20511]: E0404 17:41:31.045904   20511 kubelet.go:2067] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

I don't get it. If I pass --network-plugin=cni --network-plugin-dir=/etc/cni/net.d, it looks for network configs there. How does it actually load them? Weave installs as a daemonset (excellent, by the way), but it looks like the node doesn't even get to the ready state because it has nothing in CNI?
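
For context, an assumption about how this is wired up rather than something stated above: the weave-kube container, once it actually starts, is what writes the CNI config into /etc/cni/net.d on the host, and that file is what kubelet is polling for. A quick check on the worker would be something like (file name and contents vary by Weave Net version):

# the directory kubelet polls for CNI network configs
ls -l /etc/cni/net.d

# once the weave pod is up, a config roughly like this should appear
cat /etc/cni/net.d/10-weave.conf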

deitch commented 7 years ago

Oops, you asked for describe node

ip-10-50-21-250 bin # kubectl describe no ip-10-50-22-42.ec2.internal
Name:           ip-10-50-22-42.ec2.internal
Role:
Labels:         beta.kubernetes.io/arch=amd64
            beta.kubernetes.io/os=linux
            kubernetes.io/hostname=ip-10-50-22-42.ec2.internal
Annotations:        node.alpha.kubernetes.io/ttl=0
            volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:         <none>
CreationTimestamp:  Tue, 04 Apr 2017 14:46:55 +0000
Phase:
Conditions:
  Type          Status  LastHeartbeatTime           LastTransitionTime          Reason              Message
  ----          ------  -----------------           ------------------          ------              -------
  OutOfDisk         False   Tue, 04 Apr 2017 17:45:54 +0000     Tue, 04 Apr 2017 17:45:04 +0000     KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure    False   Tue, 04 Apr 2017 17:45:54 +0000     Tue, 04 Apr 2017 17:45:04 +0000     KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure      False   Tue, 04 Apr 2017 17:45:54 +0000     Tue, 04 Apr 2017 17:45:04 +0000     KubeletHasNoDiskPressure    kubelet has no disk pressure
  Ready         False   Tue, 04 Apr 2017 17:45:54 +0000     Tue, 04 Apr 2017 17:45:04 +0000     KubeletNotReady         runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:      10.50.22.42,10.50.22.42,ip-10-50-22-42.ec2.internal
Capacity:
 cpu:       2
 memory:    8178308Ki
 pods:      110
Allocatable:
 cpu:       2
 memory:    8075908Ki
 pods:      110
System Info:
 Machine ID:            22728a39e0794116afb356b59fdb9751
 System UUID:           EC2BB4F9-1532-7105-796A-D8256882EF5D
 Boot ID:           a80a3638-2f50-4c94-9384-a57aa205d3ff
 Kernel Version:        4.9.16-coreos-r1
 OS Image:          Container Linux by CoreOS 1298.7.0 (Ladybug)
 Operating System:      linux
 Architecture:          amd64
 Container Runtime Version: docker://1.12.6
 Kubelet Version:       v1.6.1
 Kube-Proxy Version:        v1.6.1
PodCIDR:            10.200.0.0/24
ExternalID:         ip-10-50-22-42.ec2.internal
Non-terminated Pods:        (2 in total)
  Namespace         Name                    CPU Requests    CPU Limits  Memory Requests Memory Limits
  ---------         ----                    ------------    ----------  --------------- -------------
  kube-system           kube-dns-321336704-9gq2p        260m (13%)  0 (0%)      140Mi (1%)  220Mi (2%)
  kube-system           kube-dns-321336704-z349d        260m (13%)  0 (0%)      140Mi (1%)  220Mi (2%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests Memory Limits
  ------------  ----------  --------------- -------------
  520m (26%)    0 (0%)      280Mi (3%)  440Mi (5%)
Events:
  FirstSeen LastSeen    Count   From                    SubObjectPath   Type        Reason          Message
  --------- --------    -----   ----                    -------------   --------    ------          -------
  44m       44m     1   kubelet, ip-10-50-22-42.ec2.internal            Warning     ImageGCFailed       unable to find data for container /
  44m       44m     2   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasSufficientDisk   Node ip-10-50-22-42.ec2.internal status is now: NodeHasSufficientDisk
  44m       44m     2   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasSufficientMemory Node ip-10-50-22-42.ec2.internal status is now: NodeHasSufficientMemory
  44m       44m     2   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasNoDiskPressure   Node ip-10-50-22-42.ec2.internal status is now: NodeHasNoDiskPressure
  44m       44m     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      Starting        Starting kubelet.
  40m       40m     1   kube-proxy, ip-10-50-22-42.ec2.internal         Normal      Starting        Starting kube-proxy.
  38m       38m     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasNoDiskPressure   Node ip-10-50-22-42.ec2.internal status is now: NodeHasNoDiskPressure
  38m       38m     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      Starting        Starting kubelet.
  38m       38m     1   kubelet, ip-10-50-22-42.ec2.internal            Warning     ImageGCFailed       unable to find data for container /
  38m       38m     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasSufficientDisk   Node ip-10-50-22-42.ec2.internal status is now: NodeHasSufficientDisk
  38m       38m     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasSufficientMemory Node ip-10-50-22-42.ec2.internal status is now: NodeHasSufficientMemory
  38m       38m     1   kube-proxy, ip-10-50-22-42.ec2.internal         Normal      Starting        Starting kube-proxy.
  23m       23m     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      Starting        Starting kubelet.
  23m       23m     1   kubelet, ip-10-50-22-42.ec2.internal            Warning     ImageGCFailed       unable to find data for container /
  23m       23m     1   kube-proxy, ip-10-50-22-42.ec2.internal         Normal      Starting        Starting kube-proxy.
  23m       23m     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasSufficientDisk   Node ip-10-50-22-42.ec2.internal status is now: NodeHasSufficientDisk
  23m       23m     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasNoDiskPressure   Node ip-10-50-22-42.ec2.internal status is now: NodeHasNoDiskPressure
  23m       23m     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasSufficientMemory Node ip-10-50-22-42.ec2.internal status is now: NodeHasSufficientMemory
  6m        6m      1   kubelet, ip-10-50-22-42.ec2.internal            Normal      Starting        Starting kubelet.
  6m        6m      1   kubelet, ip-10-50-22-42.ec2.internal            Warning     ImageGCFailed       unable to find data for container /
  5m        5m      1   kube-proxy, ip-10-50-22-42.ec2.internal         Normal      Starting        Starting kube-proxy.
  6m        5m      14  kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasSufficientDisk   Node ip-10-50-22-42.ec2.internal status is now: NodeHasSufficientDisk
  6m        5m      14  kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasSufficientMemory Node ip-10-50-22-42.ec2.internal status is now: NodeHasSufficientMemory
  6m        5m      14  kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasNoDiskPressure   Node ip-10-50-22-42.ec2.internal status is now: NodeHasNoDiskPressure
  5m        5m      1   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeNotReady        Node ip-10-50-22-42.ec2.internal status is now: NodeNotReady
  1m        1m      1   kube-proxy, ip-10-50-22-42.ec2.internal         Normal      Starting        Starting kube-proxy.
  50s       50s     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      Starting        Starting kubelet.
  50s       50s     1   kubelet, ip-10-50-22-42.ec2.internal            Warning     ImageGCFailed       unable to find data for container /
  50s       50s     3   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasSufficientDisk   Node ip-10-50-22-42.ec2.internal status is now: NodeHasSufficientDisk
  50s       50s     3   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasSufficientMemory Node ip-10-50-22-42.ec2.internal status is now: NodeHasSufficientMemory
  50s       50s     3   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeHasNoDiskPressure   Node ip-10-50-22-42.ec2.internal status is now: NodeHasNoDiskPressure
  50s       50s     1   kube-proxy, ip-10-50-22-42.ec2.internal         Normal      Starting        Starting kube-proxy.
  50s       50s     1   kubelet, ip-10-50-22-42.ec2.internal            Normal      NodeNotReady        Node ip-10-50-22-42.ec2.internal status is now: NodeNotReady
deitch commented 7 years ago

And FWIW, my kubelet systemd unit:

[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
ExecStart=/opt/local/bin/kubelet --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --allow-privileged=true   --cloud-provider=   --cluster-dns=10.100.0.5   --cluster-domain=cluster.local   --container-runtime=docker   --docker=unix:///var/run/docker.sock    --kubeconfig=/var/lib/kubelet/kubeconfig   --register-node=true   --require-kubeconfig=true   --serialize-image-pulls=false   --tls-cert-file=/var/lib/kubernetes/kubernetes-worker.pem   --tls-private-key-file=/var/lib/kubernetes/kubernetes-worker-key.pem   --v=2

Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
deitch commented 7 years ago

Oh damn it! My own darn stupidity! I had used a config with --admission-control=SecurityContextDeny (among others), which caused the weave deployment (whose pod spec sets SELinux options) to fail. Sigh.

I have no idea how to configure kube so that weave (as a privileged daemonset) can have full privileges, but typical user pods and containers cannot. Should it actually fail with SecurityContextDeny?
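
For reference, the change that eventually fixed this (see the cloudinit later in the thread) is just dropping SecurityContextDeny from the API server's admission controller list; a sketch, with the rest of the flag left as whatever the cluster already uses:

# before: weave-net pods rejected with
#   pods "" is forbidden: pod.Spec.SecurityContext.SELinuxOptions is forbidden
#   --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,SecurityContextDeny
# after: SecurityContextDeny removed, so the DaemonSet's seLinuxOptions are allowed
#   --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota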

Still leaving this open because:

  1. This really should be in the docs. Save some other poor soul my wasted day.
  2. Still seeing a problem. On the master:
ip-10-50-21-250 bin # kubectl describe pod weave-net-rc838 -n kube-system
Name:       weave-net-rc838
Namespace:  kube-system
Node:       ip-10-50-22-42.ec2.internal/10.50.22.42
Start Time: Tue, 04 Apr 2017 18:25:43 +0000
Labels:     name=weave-net
        pod-template-generation=1
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"DaemonSet","namespace":"kube-system","name":"weave-net","uid":"d04b85c7-1935-11e7-b540-026890ffec58","apiV...
Status:     Pending
IP:     10.50.22.42
Controllers:    DaemonSet/weave-net
Containers:
  weave:
    Container ID:
    Image:      weaveworks/weave-kube:1.9.4
    Image ID:
    Port:
    Command:
      /home/weave/launch.sh
    State:      Waiting
      Reason:       ContainerCreating
    Ready:      False
    Restart Count:  0
    Requests:
      cpu:      10m
    Liveness:       http-get http://127.0.0.1:6784/status delay=30s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /host/etc from cni-conf (rw)
      /host/home from cni-bin2 (rw)
      /host/opt from cni-bin (rw)
      /host/var/lib/dbus from dbus (rw)
      /lib/modules from lib-modules (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from weave-net-token-82nn3 (ro)
      /weavedb from weavedb (rw)
  weave-npc:
    Container ID:
    Image:      weaveworks/weave-npc:1.9.4
    Image ID:
    Port:
    State:      Waiting
      Reason:       ContainerCreating
    Ready:      False
    Restart Count:  0
    Requests:
      cpu:      10m
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from weave-net-token-82nn3 (ro)
Conditions:
  Type      Status
  Initialized   True
  Ready     False
  PodScheduled  True
Volumes:
  weavedb:
    Type:   EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  cni-bin:
    Type:   HostPath (bare host directory volume)
    Path:   /opt
  cni-bin2:
    Type:   HostPath (bare host directory volume)
    Path:   /home
  cni-conf:
    Type:   HostPath (bare host directory volume)
    Path:   /etc
  dbus:
    Type:   HostPath (bare host directory volume)
    Path:   /var/lib/dbus
  lib-modules:
    Type:   HostPath (bare host directory volume)
    Path:   /lib/modules
  weave-net-token-82nn3:
    Type:   Secret (a volume populated by a Secret)
    SecretName: weave-net-token-82nn3
    Optional:   false
QoS Class:  Burstable
Node-Selectors: <none>
Tolerations:    node-role.kubernetes.io/master=:NoSchedule
Events:
  FirstSeen LastSeen    Count   From                    SubObjectPath   Type        Reason      Message
  --------- --------    -----   ----                    -------------   --------    ------      -------
  2m        2m      1   kubelet, ip-10-50-22-42.ec2.internal            Warning     FailedSync  Error syncing pod, skipping: failed to "CreatePodSandbox" for "weave-net-rc838_kube-system(1c6182e5-1964-11e7-b1af-0e94e95c9de0)" with CreatePodSandboxError: "CreatePodSandbox for pod \"weave-net-rc838_kube-system(1c6182e5-1964-11e7-b1af-0e94e95c9de0)\" failed: rpc error: code = 2 desc = failed to start sandbox container for pod \"weave-net-rc838\": Error response from daemon: {\"message\":\"invalid header field value \\\"oci runtime error: container_linux.go:247: starting container process caused \\\\\\\"process_linux.go:359: container init caused \\\\\\\\\\\\\\\"write /proc/self/task/2287/attr/exec: invalid argument\\\\\\\\\\\\\\\"\\\\\\\"\\\\n\\\"\"}"
# last line repeated many times
deitch commented 7 years ago

One more piece of the puzzle. SELinux?

Apr 05 08:06:37 ip-10-50-22-42.ec2.internal kubelet[29296]: I0405 08:06:37.633880   29296 kuberuntime_manager.go:384] No ready sandbox for pod "weave-net-rc838_kube-system(1c6182e5-1964-11e7-b1af-0e94e95c9de0)" can be found. Need to start a new one
Apr 05 08:06:37 ip-10-50-22-42.ec2.internal kubelet[29296]: I0405 08:06:37.634046   29296 kuberuntime_manager.go:458] Container {Name:weave Image:weaveworks/weave-kube:1.9.4 Command:[/home/weave/launch.sh] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:10 scale:-3} d:{Dec:<nil>} s:10m Format:DecimalSI}]} VolumeMounts:[{Name:weavedb ReadOnly:false MountPath:/weavedb SubPath:} {Name:cni-bin ReadOnly:false MountPath:/host/opt SubPath:} {Name:cni-bin2 ReadOnly:false MountPath:/host/home SubPath:} {Name:cni-conf ReadOnly:false MountPath:/host/etc SubPath:} {Name:dbus ReadOnly:false MountPath:/host/var/lib/dbus SubPath:} {Name:lib-modules ReadOnly:false MountPath:/lib/modules SubPath:} {Name:weave-net-token-82nn3 ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/status,Port:6784,Host:127.0.0.1,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:30,TimeoutSeconds:1,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Apr 05 08:06:37 ip-10-50-22-42.ec2.internal kubelet[29296]: I0405 08:06:37.634091   29296 kuberuntime_manager.go:458] Container {Name:weave-npc Image:weaveworks/weave-npc:1.9.4 Command:[] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:10 scale:-3} d:{Dec:<nil>} s:10m Format:DecimalSI}]} VolumeMounts:[{Name:weave-net-token-82nn3 ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Apr 05 08:06:37 ip-10-50-22-42.ec2.internal kernel: SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)
Apr 05 08:06:37 ip-10-50-22-42.ec2.internal containerd[1341]: time="2017-04-05T08:06:37.835787158Z" level=error msg="containerd: start container" error="oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:359: container init caused \\\"write /proc/self/task/14661/attr/exec: invalid argument\\\"\"\n" id=dcafd3ef35a51333219b32148f7afa1aa733fe978066402f0d0627f703df0f1c
Apr 05 08:06:37 ip-10-50-22-42.ec2.internal dockerd[1342]: time="2017-04-05T08:06:37.836307141Z" level=error msg="Create container failed with error: invalid header field value \"oci runtime error: container_linux.go:247: starting container process caused \\\"process_linux.go:359: container init caused \\\\\\\"write /proc/self/task/14661/attr/exec: invalid argument\\\\\\\"\\\"\\n\""

But no idea how to resolve this. Does weave+kube+coreos not work as a combo?

pronix commented 7 years ago

@deitch please check how to allow the restricted calls: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security-Enhanced_Linux/sect-Security-Enhanced_Linux-Fixing_Problems-Allowing_Access_audit2allow.html

deitch commented 7 years ago

@pronix

ip-10-50-22-42 core # sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             mcs
Current mode:                   permissive
Mode from config file:          permissive
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      30
ip-10-50-22-42 core # getenforce
Permissive

Fully permissive. I could disable it entirely and reboot, but if it is permissive, should it matter?
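
One generic thing to check here (not from the original exchange): even with the system in permissive mode, dockerd still applies SELinux labels if it was started with --selinux-enabled, which is what the CoreOS docker unit does by default:

# does the docker daemon itself have SELinux support turned on?
docker info 2>/dev/null | grep -i 'security options'

# which SELinux domains are the daemons actually running in?
ps -eZ | grep -E 'dockerd|containerd|kubelet'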

pronix commented 7 years ago

@deitch permissive is just notification, so the problem is somewhere else

deitch commented 7 years ago

permissive is just notification, so the problem is somewhere else

Um, yeah. It means it notifies but does not enforce. Here, though, something is actually being blocked. Or, more correctly, the call is failing (possibly because of SELinux, possibly that is a red herring).

deitch commented 7 years ago

Well, disabling SELinux entirely - including removing the default --selinux-enabled option from dockerd on CoreOS - makes the container run. It then fails trying to get kubernetes info:

E0405 08:54:23.706286       1 reflector.go:214] github.com/weaveworks/weave/vendor/k8s.io/client-go/tools/cache/reflector.go:109: Failed to list *v1.Namespace: Get https://10.100.0.1:443/api/v1/namespaces?resourceVersion=0: dial tcp 10.100.0.1:443: getsockopt: connection refused
E0405 08:54:23.717651       1 reflector.go:214] github.com/weaveworks/weave/vendor/k8s.io/client-go/tools/cache/reflector.go:109: Failed to list *v1beta1.NetworkPolicy: Get https://10.100.0.1:443/apis/extensions/v1beta1/networkpolicies?resourceVersion=0: dial tcp 10.100.0.1:443: getsockopt: connection refused
E0405 08:54:24.707660       1 reflector.go:214] github.com/weaveworks/weave/vendor/k8s.io/client-go/tools/cache/reflector.go:109: Failed to list *v1.Pod: Get https://10.100.0.1:443/api/v1/pods?resourceVersion=0: dial tcp 10.100.0.1:443: getsockopt: connection refused

That appears to be the kubernetes cluster service:

ip-10-50-21-250 core # kubectl get svc -oyaml kubernetes
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: 2017-04-04T12:54:08Z
  labels:
    component: apiserver
    provider: kubernetes
  name: kubernetes
  namespace: default
  resourceVersion: "24"
  selfLink: /api/v1/namespaces/default/services/kubernetes
  uid: cafee572-1935-11e7-b540-026890ffec58
spec:
  clusterIP: 10.100.0.1
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: 6443
  sessionAffinity: ClientIP
  type: ClusterIP
status:
  loadBalancer: {}
pronix commented 7 years ago

there is an example of how to handle this with SELinux enabled: https://github.com/weaveworks/weave/issues/293

bboreham commented 7 years ago

dial tcp 10.100.0.1:443: getsockopt: connection refused

This means it did manage to contact a host, but there was nothing listening at that port.

You should check that kube-proxy is mapping port 443 to 6443 and mapping to the real address of the api-server. More tips at kubernetes.io/docs/tasks/debug-application-cluster/debug-service/
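
A couple of quick checks along those lines (a generic sketch, not commands from the original exchange; <master-ip> is a placeholder):

# the endpoints behind the "kubernetes" service should be the API servers on 6443
kubectl get endpoints kubernetes

# on the node, confirm kube-proxy wrote DNAT rules for the cluster IP
iptables-save -t nat | grep 'default/kubernetes'

# and that something is actually listening on the target port on a master
# (may return 401/403 depending on the API server's auth settings)
curl -k https://<master-ip>:6443/healthz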

deitch commented 7 years ago

This means it did manage to contact a host, but there was nothing listening at that port

Obviously. :-)

So all of those, through many layers of logs and debugging, come down to some SELinux interaction between Docker/CoreOS/Kubernetes/maybe Weave?

I dug through the DaemonSet in the weave spec; it asks for certain SELinux capabilities. Should that not handle it?

there is example how to handle with enabled selinux #293

Does that handle the mqueue issue? Also, are there official docs on "running weave in an SELinux environment using ____"?

deitch commented 7 years ago

there is example how to handle with enabled selinux #293

That only covers running it as a systemd unit, not as a k8s add-on.

deitch commented 7 years ago

Obviously. :-)

Oops, sorry @bboreham, that came across as snarky. Completely unintentional.

Curious: weave uses the kubernetes service to reach the API server, which makes sense. But what if the cluster has 3 API servers and only one is functioning? I manually stopped 2 out of 3 (systemctl stop kube-apiserver), yet the kube-proxy-generated iptables rules show:

-A KUBE-SEP-7ENDL6QSPNVRD6RQ -s 10.50.22.124/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-7ENDL6QSPNVRD6RQ -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-7ENDL6QSPNVRD6RQ --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 10.50.22.124:6443
-A KUBE-SEP-IX4LL7XNRMTXIUD2 -s 10.50.20.186/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-IX4LL7XNRMTXIUD2 -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-IX4LL7XNRMTXIUD2 --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 10.50.20.186:6443
-A KUBE-SEP-QLFYKHFTR2K3O732 -s 10.50.21.250/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-QLFYKHFTR2K3O732 -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-QLFYKHFTR2K3O732 --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 10.50.21.250:6443
-A KUBE-SERVICES -d 10.100.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-IX4LL7XNRMTXIUD2 --mask 255.255.255.255 --rsource -j KUBE-SEP-IX4LL7XNRMTXIUD2
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-QLFYKHFTR2K3O732 --mask 255.255.255.255 --rsource -j KUBE-SEP-QLFYKHFTR2K3O732
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-7ENDL6QSPNVRD6RQ --mask 255.255.255.255 --rsource -j KUBE-SEP-7ENDL6QSPNVRD6RQ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m statistic --mode random --probability 0.33332999982 -j KUBE-SEP-IX4LL7XNRMTXIUD2
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-QLFYKHFTR2K3O732
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-7ENDL6QSPNVRD6RQ

And, yes, those are the 3 IPs of the master nodes, and they are running at 6443.

deitch commented 7 years ago

Yep, restarting the other nodes gets it to respond. It isn't a service problem per se, but an inability by kube-proxy to recognize loss of an API server.

And with every layer of the onion:

E0405 09:38:38.966305       1 reflector.go:214] github.com/weaveworks/weave/vendor/k8s.io/client-go/tools/cache/reflector.go:109: Failed to list *v1.Pod: Get https://10.100.0.1:443/api/v1/pods?resourceVersion=0: x509: certificate signed by unknown authority
E0405 09:38:38.971602       1 reflector.go:214] github.com/weaveworks/weave/vendor/k8s.io/client-go/tools/cache/reflector.go:109: Failed to list *v1.Namespace: Get https://10.100.0.1:443/api/v1/namespaces?resourceVersion=0: x509: certificate signed by unknown authority
E0405 09:38:38.992128       1 reflector.go:214] github.com/weaveworks/weave/vendor/k8s.io/client-go/tools/cache/reflector.go:109: Failed to list *v1beta1.NetworkPolicy: Get https://10.100.0.1:443/apis/extensions/v1beta1/networkpolicies?resourceVersion=0: x509: certificate signed by unknown authority

How does Weave handle connecting to the api server with its certs signed by a private CA?

Ah, yes, the cert in /var/run/secrets/, so debugging that now.
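
Roughly what the in-cluster client does can be reproduced by hand from inside the pod, using the mounted service-account credentials (a sketch against the same endpoint the reflector is failing on):

SA=/var/run/secrets/kubernetes.io/serviceaccount
TOKEN=$(cat $SA/token)

curl --cacert $SA/ca.crt -H "Authorization: Bearer $TOKEN" \
  https://10.100.0.1:443/api/v1/namespaces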

bboreham commented 7 years ago

kube-proxy ought to see those master nodes are bad and take them out of the iptables rules.

Possibly this takes some time for the remaining master nodes to notice; that would be a question for Kubernetes.

marccarre commented 7 years ago

How does Weave handle connecting to the api server with its certs signed by a private CA?

Weave Net neither knows nor cares about these: it uses InClusterConfig (see: /prog/kube-peers/main.go#L13), which handles this transparently.

I have no idea how to configure kube so that weave (as a privileged daemonset) can have full privileges

and

I dug through the DaemonSet from the weave spec, it asks for certain SELinux capabilities. Should that not handle it?

I believe that is indeed taken care of by the YAML file available at https://git.io/weave-kube-1.6 already, see:

      securityContext:
        seLinuxOptions:
          type: spc_t

I know close to nothing about SELinux, but I would probably start by checking which SELinux domains the relevant components are running under (systemd, Kubernetes/kubelet, and Docker) and make sure they can talk to each other.

This may be relevant as well: https://www.weave.works/docs/net/latest/installing-weave/systemd/

deitch commented 7 years ago

kubeproxy ought to see those master nodes are bad and take them out of the iptables rules. Possibly this takes some time for the remaining master nodes to notice; that would be a question for Kubernetes.

@bboreham yes it should. That definitely is not a Weave question. If I can replicate it, I will open a k8s issue.

@marccarre wrote:

I know close to nothing about SELinux

Ha! Join the club. Everywhere enables it, and few actually know how to use it. All I know is that every place I have been has had to disable it because stuff just didn't work. Sometimes I wonder if it is like Plato's ideal. It is a security system that works perfectly only in theory, but few use it in the concrete. :-)

I think the core issue is that mqueue one, but I really don't know what it is about. I will check those links.

So despite the interim issues, in the end, this boils down to: please put a big warning in the docs that SELinux can get in the way, whether it is enabled at the OS level or in dockerd?

deitch commented 7 years ago

Oh, and still struggling with the certificates. Apparently it is due to SNI, which I have enabled on the API server: an internal CA with internal dynamic certs for internal access, and an externally provided cert for external API access.

deitch commented 7 years ago

Confirmed, there is an SNI issue. The API server has 2 certs configured under SNI. The one signed by the local CA (whose ca.crt is at /var/run/secrets/kubernetes.io/serviceaccount/ca.crt) is delivered when I access using the load balancer, the private IP of the specific master node 10.50.22.57, or the service IP 10.100.0.1. I can confirm it by doing the following inside the container. Any of the below works and has openssl reporting the server cert as verified:

openssl s_client -connect 10.50.22.57:6443 -servername 10.50.22.57 -CAfile /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
openssl s_client -connect 10.50.22.57:6443 -servername 10.100.0.1 -CAfile /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
openssl s_client -connect 10.100.0.1:443 -servername 10.50.22.57 -CAfile /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
openssl s_client -connect 10.100.0.1:443 -servername 10.100.0.1 -CAfile /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

But docker logs on the host shows:

E0405 11:54:00.144769       1 reflector.go:214] github.com/weaveworks/weave/vendor/k8s.io/client-go/tools/cache/reflector.go:109: Failed to list *v1.Pod: Get https://10.100.0.1:443/api/v1/pods?resourceVersion=0: x509: certificate signed by unknown authority
E0405 11:54:00.150515       1 reflector.go:214] github.com/weaveworks/weave/vendor/k8s.io/client-go/tools/cache/reflector.go:109: Failed to list *v1beta1.NetworkPolicy: Get https://10.100.0.1:443/apis/extensions/v1beta1/networkpolicies?resourceVersion=0: x509: certificate signed by unknown authority
E0405 11:54:00.154787       1 reflector.go:214] github.com/weaveworks/weave/vendor/k8s.io/client-go/tools/cache/reflector.go:109: Failed to list *v1.Namespace: Get https://10.100.0.1:443/api/v1/namespaces?resourceVersion=0: x509: certificate signed by unknown authority
deitch commented 7 years ago

Confirmed. I wiresharked the comms from the API server side. Even though the API server supports SNI (as do kubectl, kubelet, kube-proxy, etc. as clients), weave's request does not include the TLS server_name (SNI) extension, causing the API server to serve up the wrong cert.
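
That difference is easy to reproduce with openssl, since -servername is what sets the SNI extension; compare this (no -servername, so no server_name extension, mimicking weave's client) with the earlier commands that did pass it:

openssl s_client -connect 10.100.0.1:443 \
  -CAfile /var/run/secrets/kubernetes.io/serviceaccount/ca.crt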

marccarre commented 7 years ago

Thanks for the update @deitch. I had another look at Weave Net's sources and at how we call Kubernetes using their client; it doesn't look like we can provide anything (argument, flag, etc.) to change this behaviour, so this looks like a bug to raise with Kubernetes.

deitch commented 7 years ago

@marccarre can you point me to where in the source? I am happy to dig into it. I know that kubelet and kube-proxy succeed with it, and they should use the same code, so want to see where it is different.

marccarre commented 7 years ago

@deitch, I didn't trace the full call tree but the error you see seems to happen in NewReflector and the only places where weave-kube interacts with Kubernetes' API directly seem to be:

deitch commented 7 years ago

Yeah, I tracked it down to here https://github.com/weaveworks/weave/blob/master/prog/weave-npc/main.go#L116 and https://github.com/weaveworks/weave/blob/master/prog/weave-npc/main.go#L119

Those don't give much choice, so I opened https://github.com/kubernetes/client-go/issues/173

Now I have no idea how I will make this all work. The problem is a kubernetes one. The masters have an internal CA that generates certs for etcd<->etcd comms, worker<->API server, etc. This is internally generated and dynamic, and is distinct from what clients use, since that needs to be controlled by an outside CA (real admin person dishing out certs).

I used SNI so that the API server would use the internal auto-generated on server startup cert for all internal comms (including from weave and kubelet and kube-proxy). That one is valid for the internal ELB, and the private IP of the API server (determined at boot time because in the cloud it is dynamic), and the service IP for kubernetes service (which weave uses), so it cannot be generated in advance. The external one is generated at cluster creation time.

Now I need a way to solve this without SNI. I know it isn't a weave problem - the only weave issue here appears to be the selinux issue, which isn't really Weave's fault, but probably should be documented - but a more general issue.

Maybe I can use a single cert for everything, have it generated in real time, and have the external clients (including kubectl) trust the CA that is used internally and dynamically? For the API server to authenticate external clients, it would only be via the external CA. Hmm....
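
If the single-cert route is taken, the main requirement is that the one server certificate carries SANs for every name and address it will be reached by. A minimal openssl sketch; the DNS names and IPs below are placeholders standing in for this cluster's ELB name, node IP, and service IP:

cat > san.cnf <<'EOF'
[req]
distinguished_name = dn
req_extensions = v3_req
[dn]
[v3_req]
subjectAltName = DNS:kubernetes,DNS:kubernetes.default,DNS:internal-elb.example.com,IP:10.50.22.57,IP:10.100.0.1
EOF

# generate key + CSR carrying the SANs, then sign the CSR with the internal CA
openssl req -new -nodes -newkey rsa:2048 -keyout apiserver-key.pem \
  -subj "/CN=kube-apiserver" -out apiserver.csr -config san.cnf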

bboreham commented 7 years ago

Thanks for tracking down this certificate issue! Look forward to the response on https://github.com/kubernetes/client-go/issues/173

deitch commented 7 years ago

@bboreham quite welcome.

Would be nice to figure out the selinux issue, though, so happy to take pointers.

Separately: any chance you are at Continuous Lifecycle in London in a month? Flying in for it.

marccarre commented 7 years ago

For the record: feature and [component/docs] labels added as we should improve documentation for CoreOS + SELinux. Related issue: #1458 (documentation for CentOS + SELinux)


Would be nice to figure out the selinux issue, though, so happy to take pointers.

@deitch did you find anything suspicious when looking at the SELinux domains for systemd, Kubernetes and Docker? (see: https://github.com/weaveworks/weave/issues/2881#issuecomment-291823577)

deitch commented 7 years ago

did you find anything suspicious when looking at the SELinux domains for systemd, Kubernetes and Docker?

Not beyond that mqueue issue, which I suspect is more of a docker+coreos thing. I will dig deeper, but might need to wait a bit...

weitzj commented 7 years ago

I could not install Weave 1.9.4 on Kubernetes 1.6.2 with Ubuntu 16.04.2 due to pod.Spec.SecurityContext.SELinuxOptions is forbidden

kubectl describe daemonsets/weave-net -n kube-system

Events:
  FirstSeen     LastSeen        Count   From            SubObjectPath   Type            Reason                  Message
  ---------     --------        -----   ----            -------------   --------        ------                  -------
  1d            12m             531     daemon-set                      Warning         FailedCreate            Error creating: pods "" is forbidden: pod.Spec.SecurityContext.SELinuxOptions is forbidden

Probably because I do not have SELinux installed.

After removing

-      securityContext:
-        seLinuxOptions:
-          type: spc_t

Weave launches.

But this is probably not the intended approach.

Now I have to dig into how to allow this SELinux option.

marccarre commented 7 years ago

@weitzj, which YAML did you use? This one: https://github.com/weaveworks/weave/releases/download/v1.9.4/weave-daemonset-k8s-1.6.yaml?

weitzj commented 7 years ago

@marccarre Yes, this one. I have modified it a bit to incorporate a secret, so here is the whole diff:

61a62,63
>           args:
>             - --log-level=warning
67a70,79
>           env:
>             - name: CHECKPOINT_DISABLE
>               value: "1"
>             - name: IPALLOC_RANGE
>               value: 172.20.0.0/16
>             - name: WEAVE_PASSWORD
>               valueFrom:
>                 secretKeyRef:
>                   name: weave-passwd
>                   key: weave-passwd
98,100d109
<       securityContext:
<         seLinuxOptions:
<           type: spc_t
burdiyan commented 7 years ago

@deitch Did you find a way to run Weave Net with Kubernetes on CoreOS? I continue to get this when describing the weave-net pod:

  3m    3m  1   {kubelet 10.135.65.230}     Warning FailedSync  Error syncing pod, skipping: failed to "CreatePodSandbox" for "weave-net-pkl1d_kube-system(a64e9767-2a91-11e7-9963-d2b6d3081ec8)" with CreatePodSandboxError: "CreatePodSandbox for pod \"weave-net-pkl1d_kube-system(a64e9767-2a91-11e7-9963-d2b6d3081ec8)\" failed: rpc error: code = 2 desc = failed to start sandbox container for pod \"weave-net-pkl1d\": Error response from daemon: {\"message\":\"invalid header field value \\\"oci runtime error: container_linux.go:247: starting container process caused \\\\\\\"process_linux.go:359: container init caused \\\\\\\\\\\\\\\"write /proc/self/task/8630/attr/exec: invalid argument\\\\\\\\\\\\\\\"\\\\\\\"\\\\n\\\"\"}"
deitch commented 7 years ago

@burdiyan I did. I had to disable SELinux entirely (ugh), but it worked. I also stopped using SNI because of the early issue. If you want to see my cloudinit, I probably can share a chunk of it.

burdiyan commented 7 years ago

@deitch Would be great if you can share it :) I'm already struggling to set it up for a lot longer than I wish.

deitch commented 7 years ago

I'm already struggling to set it up for a lot longer than I wish.

heh, see my comment above, "Save some other poor soul my wasted day." :-)

Here is a simplified and much reduced version of my master cloudinit, ignoring etcd, etc. We use plain vanilla CoreOS, and then cloudinit to configure it. This makes auto-scaling immensely easier. Actually, we do that for every instance (kube, vpn, ci, you name it).

Also, cloudinit got a little too long for AWS, so we have a cloudinit stub that just installs AWS S3 utils and then downloads everything else from S3 (using IAM roles), including environment, etc.

# get private cloud IP
INTERNAL_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)

# ensure binaries are in path
cat > /etc/profile.d/optlocal.sh <<"EOF"
PATH=$PATH:/opt/local/bin
EOF

# setup etcd2
# note: Use image monsantoco/etcd-aws-cluster to simplify figuring out new vs join

# download kubernetes binaries
# install into /opt/local/bin because /usr is read-only

# download CNI from https://github.com/containernetworking/cni/releases/download
# install into /opt/cni/bin

# create systemd entries for kube-apiserver, kube-controller-manager, kube-scheduler
# Note:
#  - we removed --admission-control=SecurityContextDeny
#  - certs and keys are either passed in or auto-generated
cat > /etc/systemd/system/kube-apiserver.service <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=$BINPATH/kube-apiserver \
  --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota \
  --advertise-address=$INTERNAL_IP \
  --allow-privileged=true \
  --apiserver-count=3 \
  --authorization-mode=ABAC \
  --authorization-policy-file=$AUTH_POLICY_FILE \
  --bind-address=$INTERNAL_IP \
  --secure-port=6443 \
  --insecure-bind-address=127.0.0.1 \
  --insecure-port=8080 \
  --enable-swagger-ui=true \
  --storage-backend=etcd2 \
  --etcd-cafile=/etc/etcd/ca.pem \
  --etcd-certfile=/etc/etcd/etcd.pem \
  --etcd-keyfile=/etc/etcd/etcd-key.pem \
  --kubelet-certificate-authority=$CA_FILE_FULL \
  --etcd-servers=$ETCD_SERVERS \
  --service-account-key-file=$SERVICEACCOUNT_KEY \
  --service-cluster-ip-range=$SERVICE_CIDR \
  --service-node-port-range=30000-32767 \
  --tls-cert-file $APISERVER_CERT \
  --tls-private-key-file $APISERVER_KEY \
  --client-ca-file=$CA_FILE_FULL \
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

# leaving out kube-controller-manager, kube-schedule; ask if you need

# create kube-dns deployment and service, based largely on Kelsey Hightower's templates
# load up with timeout, in case API server not ready yet

# load up weave, same timeout logic 
# also enable retries because this is run on every master, and https://github.com/kubernetes/kubernetes/issues/44165
kubectl apply -f https://git.io/weave-kube-1.6
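# (illustrative sketch, not part of the original cloudinit) the timeout/retry
# logic mentioned above could be as simple as:
#   for i in $(seq 1 30); do
#     kubectl apply -f https://git.io/weave-kube-1.6 && break
#     sleep 10
#   done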

# start cfssl signing server
#   because we run a CA signing for each node

And for worker nodes

# get private cloud IP
INTERNAL_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)

# ensure binaries are in path
cat > /etc/profile.d/optlocal.sh <<"EOF"
PATH=$PATH:/opt/local/bin
EOF

# download kubernetes binaries: kubelet kube-proxy kubectl
# install into /opt/local/bin because /usr is read-only

# download CNI from https://github.com/containernetworking/cni/releases/download
# install into /opt/cni/bin

# create kubeconfig file using certs generated from master earlier in here
cat > $KUBELETROOT/kubeconfig <<EOF
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority: $CA_CERT
    server: $KUBERNETES_API_SERVER_URL
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubelet
  name: kubelet
current-context: kubelet
users:
- name: kubelet
  user:
    client-certificate: $WORKER_CERT
    client-key: $WORKER_KEY
EOF

# REALLY IMPORTANT TO MAKE IT WORK... and I dislike it
# unfortunately disabling selinux is necessary
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/disable_selinux.conf <<EOF
[Service]
Environment=DOCKER_OPTS=--selinux-enabled=false
EOF

# kubelet systemd
cat > /etc/systemd/system/kubelet.service <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
ExecStart=$BINPATH/kubelet \
  --allow-privileged=true \
  --cloud-provider= \
  --network-plugin=cni \
  --cni-conf-dir=/etc/cni/net.d \
  --cni-bin-dir=/opt/cni/bin \
  --cluster-dns=$KUBERNETES_CLUSTER_DNS \
  --cluster-domain=cluster.local \
  --container-runtime=docker \
  --docker=unix:///var/run/docker.sock \
  --kubeconfig=$KUBELETROOT/kubeconfig \
  --register-node=true \
  --require-kubeconfig=true \
  --serialize-image-pulls=false \
  --tls-cert-file=$WORKER_CERT \
  --tls-private-key-file=$WORKER_KEY \
  --v=2

Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

# repeat for kube-proxy

Let me know what more I can do

bboreham commented 7 years ago

So is this issue all about SELinux? I should change the title if so.

deitch commented 7 years ago

@bboreham I wish I could say yes. All I know for sure is that a combination of the following makes it work:

Probably should run each separately and tease out the issue.

bboreham commented 7 years ago

Wonder if #3000 would help here? (basically we change the default to blank SELinux options)

deitch commented 7 years ago

Wonder if #3000 would help here?

Not sure. When I can set a new cluster up, I can reenable everything and try it, but might be a while. Just going through testing after I removed the other workarounds (CRI/hairpin, yaml generator).

Were you able to recreate the issue?

messmerxx commented 6 years ago

I have a similar problem on my Windows Server 2016 node: the weave-net pod cannot be started successfully.

I used Kubernetes 1.9.3 alpha.

https://github.com/kubernetes/kubernetes/issues/56696

Any idea?

Thank you all very much.