planetlabs / draino

Automatically cordon and drain Kubernetes nodes based on node conditions
Apache License 2.0

No actions being taken by draino for the condition Ready=Unknown #48

Open prabhatnagpal opened 5 years ago

prabhatnagpal commented 5 years ago

Please help me out: I am not able to make draino work. Even when a node is in the Unknown state, draino does not drain it. I used kops to spin up a cluster with 1 master and 2 nodes, and I drive a node into the Unknown state by running this script on it:

#!/bin/bash
# Spawn background processes in an infinite loop to exhaust the node's
# resources until the kubelet stops posting status (Ready=Unknown).
for (( ; ; ))
do
    echo "Press CTRL+C to stop............................................."
    nohup ./run.sh &
done
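
Note: a more direct way to force Ready=Unknown, assuming the kubelet runs under systemd, is to stop it on the node itself:

# Run on the node; once the kubelet stops posting status, the node's
# Ready condition transitions to Unknown after the controller manager's
# node-monitor-grace-period (40s by default).
sudo systemctl stop kubelet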

My draino.yaml is this:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels: {component: draino}
  name: draino
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels: {component: draino}
  name: draino
rules:
- apiGroups: ['']
  resources: [events]
  verbs: [create, patch, update]
- apiGroups: ['']
  resources: [nodes]
  verbs: [get, watch, list, update]
- apiGroups: ['']
  resources: [nodes/status]
  verbs: [patch]
- apiGroups: ['']
  resources: [pods]
  verbs: [get, watch, list]
- apiGroups: ['']
  resources: [pods/eviction]
  verbs: [create]
- apiGroups: [extensions]
  resources: [daemonsets]
  verbs: [get, watch, list]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels: {component: draino}
  name: draino
roleRef: {apiGroup: rbac.authorization.k8s.io, kind: ClusterRole, name: draino}
subjects:
- {kind: ServiceAccount, name: draino, namespace: kube-system}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels: {component: draino}
  name: draino
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels: {component: draino}
  template:
    metadata:
      labels: {component: draino}
      name: draino
      namespace: kube-system
    spec:
      containers:
      - name: draino
        image: planetlabs/draino:5e07e93
        command:
        - /draino
        - Ready=Unknown
        livenessProbe:
          httpGet: {path: /healthz, port: 10002}
          initialDelaySeconds: 30
      serviceAccountName: draino

My node-problem-detector.yaml is this:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-problem-detector
  namespace: kube-system
  labels:
    app: node-problem-detector
spec:
  selector:
    matchLabels:
      app: node-problem-detector
  template:
    metadata:
      labels:
        app: node-problem-detector
    spec:
      containers:
      - name: node-problem-detector
        command:
        - /node-problem-detector
        - --logtostderr
        - --system-log-monitors=/config/kernel-monitor.json,/config/docker-monitor.json
        image: k8s.gcr.io/node-problem-detector:v0.6.3
        resources:
          limits:
            cpu: 10m
            memory: 80Mi
          requests:
            cpu: 10m
            memory: 80Mi
        imagePullPolicy: Always
        securityContext:
          privileged: true
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - name: log
          mountPath: /var/log
          readOnly: true
        - name: kmsg
          mountPath: /dev/kmsg
          readOnly: true
        # Make sure node problem detector is in the same timezone
        # with the host.
        - name: localtime
          mountPath: /etc/localtime
          readOnly: true
        - name: config
          mountPath: /config
          readOnly: true
      volumes:
      - name: log
        # Config `log` to your system log directory
        hostPath:
          path: /var/log/
      - name: kmsg
        hostPath:
          path: /dev/kmsg
      - name: localtime
        hostPath:
          path: /etc/localtime
      - name: config
        configMap:
          name: node-problem-detector-config
          items:
          - key: kernel-monitor.json
            path: kernel-monitor.json
          - key: docker-monitor.json
            path: docker-monitor.json

And finally the node-problem-detector-config.yaml is this:

apiVersion: v1
data:
  kernel-monitor.json: |
    {
        "plugin": "kmsg",
        "logPath": "/dev/kmsg",
        "lookback": "5m",
        "bufferSize": 10,
        "source": "kernel-monitor",
        "conditions": [
            {
                "type": "KernelDeadlock",
                "reason": "KernelHasNoDeadlock",
                "message": "kernel has no deadlock"
            },
            {
                "type": "ReadonlyFilesystem",
                "reason": "FilesystemIsReadOnly",
                "message": "Filesystem is read-only"
            },
            {
                "type": "Ready",
                "reason": "NodeStatusUnknown",
                "message": "Kubelet stopped posting node status"
            }

        ],
        "rules": [

            {
                "type": "temporary",
                "reason": "NodeStatusUnknown",
                "pattern": "Kubelet stopped posting node status"
            },
            {
                "type": "temporary",
                "reason": "OOMKilling",
                "pattern": "Kill process \\d+ (.+) score \\d+ or sacrifice child\\nKilled process \\d+ (.+) total-vm:\\d+kB, anon-rss:\\d+kB, file-rss:\\d+kB.*"
            },
            {
                "type": "temporary",
                "reason": "TaskHung",
                "pattern": "task \\S+:\\w+ blocked for more than \\w+ seconds\\."
            },
            {
                "type": "temporary",
                "reason": "UnregisterNetDevice",
                "pattern": "unregister_netdevice: waiting for \\w+ to become free. Usage count = \\d+"
            },
            {
                "type": "temporary",
                "reason": "KernelOops",
                "pattern": "BUG: unable to handle kernel NULL pointer dereference at .*"
            },
            {
                "type": "temporary",
                "reason": "KernelOops",
                "pattern": "divide error: 0000 \\[#\\d+\\] SMP"
            },
            {
                "type": "permanent",
                "condition": "KernelDeadlock",
                "reason": "AUFSUmountHung",
                "pattern": "task umount\\.aufs:\\w+ blocked for more than \\w+ seconds\\."
            },
            {
                "type": "permanent",
                "condition": "KernelDeadlock",
                "reason": "DockerHung",
                "pattern": "task docker:\\w+ blocked for more than \\w+ seconds\\."
            },
            {
                "type": "permanent",
                "condition": "ReadonlyFilesystem",
                "reason": "FilesystemIsReadOnly",
                "pattern": "Remounting filesystem read-only"
            }
        ]
    }
  docker-monitor.json: |
    {
        "plugin": "journald",
        "pluginConfig": {
            "source": "dockerd"
        },
        "logPath": "/var/log/journal",
        "lookback": "5m",
        "bufferSize": 10,
        "source": "docker-monitor",
        "conditions": [],
        "rules": [
            {
                "type": "temporary",
                "reason": "CorruptDockerImage",
                "pattern": "Error trying v2 registry: failed to register layer: rename /var/lib/docker/image/(.+) /var/lib/docker/image/(.+): directory not empty.*"
            }
        ]
    }
kind: ConfigMap
metadata:
  name: node-problem-detector-config
  namespace: kube-system
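
Once these manifests are applied, a quick way to check that node-problem-detector is running and reporting its conditions is something like the following (the node name is a placeholder):

# Confirm the NPD DaemonSet pods are running (matches the labels above).
kubectl -n kube-system get pods -l app=node-problem-detector
# Inspect the node; the condition types configured above should appear
# in the Conditions section of the output.
kubectl describe node <node-name>
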
dliao-tyro-admin commented 4 years ago

I came across something similar. Make sure to upgrade your draino image and the Helm chart to a newer version.

Also, the CLI argument you have supplied, Ready=Unknown, is incorrect in the newer version; it should include a duration, e.g. Ready=Unknown,10m. See https://github.com/planetlabs/draino/issues/33 for more info.
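
Applied to the Deployment above, the container command would look something like this (the 10m grace period is just an example value):

        command:
        - /draino
        - Ready=Unknown,10m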

Note: I'm running release b788331 and this is working as expected.

dbenque commented 4 years ago

@dliao-tyro Did you get a complete draino success (cordon + drain) with such a configuration, Ready=Unknown,10m? https://github.com/planetlabs/draino/issues/33#issuecomment-602466215

dliao-tyro-admin commented 4 years ago

I ran the following experiment to demonstrate the behaviour.

I stopped the kubelet service on one of the nodes. The node then reports back a NotReady status with the following:
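
(For reference, the condition dump below is what you would see from describing the node:)

kubectl describe node ip-12-18-92-92.ap-southeast-2.compute.internal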

Name:               ip-12-18-92-92.ap-southeast-2.compute.internal
Roles:              workload
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m5.large
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=ap-southeast-2
                    failure-domain.beta.kubernetes.io/zone=ap-southeast-2b
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-12-18-92-92.ap-southeast-2.compute.internal
                    kubernetes.io/lifecycle=spot
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/workload=true
Annotations:        node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 23 Mar 2020 17:28:25 +1100
Taints:             node.kubernetes.io/unreachable:NoExecute
                    node.kubernetes.io/unreachable:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  ip-12-18-92-92.ap-southeast-2.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Tue, 24 Mar 2020 08:17:20 +1100
Conditions:
  Type                        Status    LastHeartbeatTime                 LastTransitionTime                Reason                      Message
  ----                        ------    -----------------                 ------------------                ------                      -------
  KernelDeadlock              False     Tue, 24 Mar 2020 08:26:05 +1100   Mon, 23 Mar 2020 18:42:05 +1100   KernelHasNoDeadlock         kernel has no deadlock
  ReadonlyFilesystem          False     Tue, 24 Mar 2020 08:26:05 +1100   Mon, 23 Mar 2020 18:42:05 +1100   FilesystemIsNotReadOnly     Filesystem is not read-only
  CannotKillContainer         False     Tue, 24 Mar 2020 08:26:05 +1100   Mon, 23 Mar 2020 18:42:05 +1100   NoCannotKillContainer       System can stop containers
  FrequentKubeletRestart      False     Tue, 24 Mar 2020 08:26:05 +1100   Mon, 23 Mar 2020 18:42:07 +1100   FrequentKubeletRestart      kubelet is functioning properly
  FrequentDockerRestart       False     Tue, 24 Mar 2020 08:26:05 +1100   Mon, 23 Mar 2020 18:42:09 +1100   FrequentDockerRestart       docker is functioning properly
  FrequentContainerdRestart   False     Tue, 24 Mar 2020 08:26:05 +1100   Mon, 23 Mar 2020 18:42:11 +1100   FrequentContainerdRestart   containerd is functioning properly
  FrequentAwslogsdRestart     False     Tue, 24 Mar 2020 08:26:05 +1100   Mon, 23 Mar 2020 18:42:13 +1100   FrequentAwslogsdRestart     awslogsd is functioning properly
  MemoryPressure              Unknown   Tue, 24 Mar 2020 08:17:06 +1100   Tue, 24 Mar 2020 08:18:04 +1100   NodeStatusUnknown           Kubelet stopped posting node status.
  DiskPressure                Unknown   Tue, 24 Mar 2020 08:17:06 +1100   Tue, 24 Mar 2020 08:18:04 +1100   NodeStatusUnknown           Kubelet stopped posting node status.
  PIDPressure                 Unknown   Tue, 24 Mar 2020 08:17:06 +1100   Tue, 24 Mar 2020 08:18:04 +1100   NodeStatusUnknown           Kubelet stopped posting node status.
  Ready                       Unknown   Tue, 24 Mar 2020 08:17:06 +1100   Tue, 24 Mar 2020 08:18:04 +1100   NodeStatusUnknown           Kubelet stopped posting node status.

10 minutes later, the draino logs output:

{"level":"info","ts":1584998887.2430277,"caller":"kubernetes/eventhandler.go:139","msg":"Cordoned","node":"ip-12-18-92-92.ap-southeast-2.compute.internal"}
{"level":"info","ts":1584998887.2431808,"caller":"kubernetes/eventhandler.go:148","msg":"Scheduled drain","node":"ip-12-18-92-92.ap-southeast-2.compute.internal","after":1584999148.9898398}

I then went back and started the kubelet service on the same node, and draino logged the following output:

{"level":"info","ts":1584998567.4310849,"caller":"draino/draino.go:172","msg":"node watcher is running"}
{"level":"info","ts":1584998887.2430277,"caller":"kubernetes/eventhandler.go:139","msg":"Cordoned","node":"ip-12-18-92-92.ap-southeast-2.compute.internal"}
{"level":"info","ts":1584998887.2431808,"caller":"kubernetes/eventhandler.go:148","msg":"Scheduled drain","node":"ip-10-18-92-92.ap-southeast-2.compute.internal","after":1584999148.9898398}
{"level":"info","ts":1584999250.2038314,"caller":"kubernetes/eventhandler.go:161","msg":"Drained","node":"ip-12-18-92-92.ap-southeast-2.compute.internal"}

So overall, this does seem to be working as expected with the following configuration in the container spec.

      - command:
        - /draino
        - CannotKillContainer
        - DiskPressure
        - FrequentContainerdRestart
        - FrequentDockerRestart
        - FrequentKubeletRestart
        - KernelDeadlock
        - NetworkUnavailable
        - OutOfDisk
        - PIDPressure
        - ReadonlyFilesystem
        - Ready=Unknown,10m

Let me know if you need more information.

dbenque commented 4 years ago

@dliao-tyro OK, but in order to have the drain complete you had to restart the kubelet, which makes the node responsive again (transitioning from Unknown, a.k.a. NotReady, back to Ready). I was expecting the solution to work on a node that remains NotReady. IMO it can't, due to the way the eviction API works: it waits for the kubelet status to confirm pod eviction.
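
A quick way to observe this (a sketch, assuming a node whose kubelet stays down; the node name is a placeholder): after draino schedules the drain, the evicted pods never leave the Terminating state, because no kubelet is left to confirm the deletion:

# List the pods on the affected node.
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name>
# Pods stuck in Terminating mean the eviction was accepted by the API
# server but never confirmed by the (dead) kubelet, so the drain never
# completes.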

dliao-tyro-admin commented 4 years ago

@dbenque, the example above was just to demonstrate that draino would attempt to drain the node once the Ready=Unknown condition was met; I'm not sure there was a quick/simple way to set that particular node condition otherwise.

For the drain to be successful, I think the expectation is that the kubelet remains operational so that it can report on pod status?

yogeek commented 3 years ago

Hi :)

Same problem here: we tried draino with Ready=Unknown,2m by stopping the kubelet, and it was impossible for draino to finish the drain.

Does someone have an idea how to solve this, please?