openebs / charts

OpenEBS Helm Charts and other utilities

Health fails for pod maya-apiserver #262

Open robinhuiser opened 3 years ago

robinhuiser commented 3 years ago

What steps did you take and what happened:

What did you expect to happen:

# 
$ kubectl get pods -n openebs -o wide
NAME                                           READY   STATUS      RESTARTS   AGE
openebs-ndm-x6q6g                              1/1     Running     0          18h
openebs-ndm-zl9jc                              1/1     Running     0          18h
openebs-ndm-operator-7d69c98987-lz2cc          1/1     Running     0          18h
openebs-provisioner-74b57cbdbd-g4tws           1/1     Running     0          18h
openebs-snapshot-operator-5b9dfd4fcd-hnbnn     2/2     Running     0          18h
openebs-admission-server-789b9d6dbd-t2hkt      1/1     Running     1          18h
openebs-localpv-provisioner-776b54f698-flfh4   1/1     Running     0          18h
maya-apiserver-6f79bb87bd-58kp7                0/1     Running     0          15h

The output of the following commands will help us better understand what's going on:

# Describe pod so we can see the reason (timeout on liveness probe)
$ kubectl describe pod -n openebs maya-apiserver-6f79bb87bd-58kp7
Name:         maya-apiserver-6f79bb87bd-58kp7
Namespace:    openebs
Priority:     0
Node:         node-08/10.0.19.18
Start Time:   Wed, 03 Mar 2021 19:25:00 +0000
Labels:       name=maya-apiserver
              openebs.io/component-name=maya-apiserver
              openebs.io/version=2.6.0
              pod-template-hash=6f79bb87bd
Annotations:  cni.projectcalico.org/podIP: 10.1.251.215/32
              cni.projectcalico.org/podIPs: 10.1.251.215/32
Status:       Running
IP:           10.1.251.215
IPs:
  IP:           10.1.251.215
Controlled By:  ReplicaSet/maya-apiserver-6f79bb87bd
Containers:
  maya-apiserver:
    Container ID:   containerd://aacdb0533b88a39d3ca1d06a4be1fcf7e06b9c2c7a605874758d5131118e4b04
    Image:          registry.tekqube.lan/openebs/m-apiserver:2.6.0
    Image ID:       registry.tekqube.lan/openebs/m-apiserver@sha256:16f2a6d8a20d28d1326bae83e7adac2db05b5388b6a10c43e22d93153b963b9d
    Port:           5656/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 03 Mar 2021 19:25:16 +0000
    Ready:          False
    Restart Count:  0
    Liveness:       exec [/usr/local/bin/mayactl version] delay=30s timeout=1s period=60s #success=1 #failure=3
    Readiness:      exec [/usr/local/bin/mayactl version] delay=30s timeout=1s period=60s #success=1 #failure=3
    Environment:
      OPENEBS_NAMESPACE:                             openebs (v1:metadata.namespace)
      OPENEBS_SERVICE_ACCOUNT:                        (v1:spec.serviceAccountName)
      OPENEBS_MAYA_POD_NAME:                         maya-apiserver-6f79bb87bd-58kp7 (v1:metadata.name)
      OPENEBS_IO_CREATE_DEFAULT_STORAGE_CONFIG:      true
      OPENEBS_IO_INSTALL_DEFAULT_CSTOR_SPARSE_POOL:  false
      OPENEBS_IO_JIVA_CONTROLLER_IMAGE:              openebs/jiva:2.6.0
      OPENEBS_IO_JIVA_REPLICA_IMAGE:                 openebs/jiva:2.6.0
      OPENEBS_IO_JIVA_REPLICA_COUNT:                 3
      OPENEBS_IO_CSTOR_TARGET_IMAGE:                 openebs/cstor-istgt:2.6.0
      OPENEBS_IO_CSTOR_POOL_IMAGE:                   openebs/cstor-pool:2.6.0
      OPENEBS_IO_CSTOR_POOL_MGMT_IMAGE:              openebs/cstor-pool-mgmt:2.6.0
      OPENEBS_IO_CSTOR_VOLUME_MGMT_IMAGE:            openebs/cstor-volume-mgmt:2.6.0
      OPENEBS_IO_VOLUME_MONITOR_IMAGE:               openebs/m-exporter:2.6.0
      OPENEBS_IO_CSTOR_POOL_EXPORTER_IMAGE:          openebs/m-exporter:2.6.0
      OPENEBS_IO_HELPER_IMAGE:                       openebs/linux-utils:2.6.0
      OPENEBS_IO_ENABLE_ANALYTICS:                   false
      OPENEBS_IO_INSTALLER_TYPE:                     openebs-operator
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from openebs-maya-operator-token-bt5rb (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  openebs-maya-operator-token-bt5rb:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  openebs-maya-operator-token-bt5rb
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From     Message
  ----     ------     ----                   ----     -------
  Warning  Unhealthy  3m47s (x910 over 15h)  kubelet  Liveness probe errored: rpc error: code = DeadlineExceeded desc = failed to exec in container: timeout 1s exceeded: context deadline exceeded
# Get logs for startup api server
$ kubectl logs -n openebs maya-apiserver-6f79bb87bd-58kp7
+ MAYA_API_SERVER_NETWORK=eth0
+ ip -4 addr show scope global dev eth0
+ grep inet
+ awk '{print $2}'
+ cut -d / -f 1
+ CONTAINER_IP_ADDR=10.1.251.215
+ exec /usr/local/bin/maya-apiserver start '--bind=10.1.251.215'
I0303 19:25:16.338105       1 start.go:148] Initializing maya-apiserver...
I0303 19:25:16.529286       1 start.go:279] Starting maya api server ...
I0303 19:25:20.610703       1 start.go:288] resources applied successfully by installer
I0303 19:25:20.704653       1 start.go:193] Maya api server configuration:
I0303 19:25:20.704744       1 start.go:195]          Log Level: INFO
I0303 19:25:20.704782       1 start.go:195]             Region: global (DC: dc1)
I0303 19:25:20.704816       1 start.go:195]            Version: 2.6.0-released
I0303 19:25:20.704846       1 start.go:201] 
I0303 19:25:20.704876       1 start.go:204] Maya api server started! Log data will stream in below:
I0303 19:25:20.713460       1 runner.go:37] Starting SPC controller
I0303 19:25:20.713523       1 runner.go:40] Waiting for informer caches to sync
I0303 19:25:20.913834       1 runner.go:45] Checking for preupgrade tasks
I0303 19:25:20.948354       1 runner.go:51] Starting SPC workers
I0303 19:25:20.948478       1 runner.go:58] Started SPC workers

Anything else you would like to add:

After attaching a shell to the running pod, I can confirm why the liveness probe fails: the command takes around 12 seconds (!) to finish, far longer than the probe's 1s timeout:

$ time mayactl version
Version: 2.6.0-released
Git commit: 519dc0e567d77f3573e4e5b8096f1450e8928f54
GO Version: go1.14.7
GO ARCH: arm64
GO OS: linux
m-apiserver url:  http://10.1.251.215:5656
m-apiserver status:  running

real    0m12.066s
user    0m0.012s
sys     0m0.046s

When I run the same command, but specify the server and port using parameters, the command finishes within milliseconds:

$ time mayactl -m 10.1.251.215 -p 5656 version
Version: 2.6.0-released
Git commit: 519dc0e567d77f3573e4e5b8096f1450e8928f54
GO Version: go1.14.7
GO ARCH: arm64
GO OS: linux
m-apiserver url:  http://10.1.251.215:5656
m-apiserver status:  running

real    0m0.037s
user    0m0.016s
sys     0m0.023s
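
The comparison above can be turned into a pass/fail check against the probe's time budget. The following is a minimal sketch, not part of the original report: `probe_fits` is a hypothetical helper name, and it assumes a `timeout` utility is available where it runs (GNU coreutils, or a BusyBox build that accepts the `timeout SECONDS COMMAND` form).

```shell
#!/bin/sh
# Hypothetical helper: report whether a command finishes inside a
# probe-style time limit. Relies on the `timeout` utility, which kills the
# command and returns non-zero when the limit is exceeded.
probe_fits() {
  limit="$1"; shift
  if timeout "$limit" "$@" >/dev/null 2>&1; then
    echo "ok: finished within ${limit}s"
  else
    echo "fail: exceeded ${limit}s or exited non-zero"
    return 1
  fi
}
```

Inside the pod, `probe_fits 1 mayactl version` and `probe_fits 1 mayactl -m 10.1.251.215 -p 5656 version` would then mirror the two timings shown above, using the same 1s limit the probe is configured with.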

I assume the CLI uses some of the environment variables to locate its status endpoint, but I am not sure which one is set incorrectly. The pod's environment is included below:

$ env 
KUBERNETES_SERVICE_PORT_HTTPS=443
OPENEBS_NAMESPACE=openebs
KUBERNETES_SERVICE_PORT=443
OPENEBS_IO_ENABLE_ANALYTICS=false
HOSTNAME=maya-apiserver-6f79bb87bd-58kp7
OPENEBS_IO_CSTOR_TARGET_IMAGE=openebs/cstor-istgt:2.6.0
OPENEBS_MAYA_POD_NAME=maya-apiserver-6f79bb87bd-58kp7
ADMISSION_SERVER_SVC_PORT_443_TCP=tcp://10.152.183.18:443
ADMISSION_SERVER_SVC_SERVICE_HOST=10.152.183.18
MAYA_APISERVER_SERVICE_SERVICE_HOST=10.152.183.216
ADMISSION_SERVER_SVC_PORT=tcp://10.152.183.18:443
PWD=/
MAYA_APISERVER_SERVICE_PORT_5656_TCP_PROTO=tcp
OPENEBS_IO_CSTOR_POOL_MGMT_IMAGE=openebs/cstor-pool-mgmt:2.6.0
OPENEBS_IO_HELPER_IMAGE=openebs/linux-utils:2.6.0
OPENEBS_SERVICE_ACCOUNT=openebs-maya-operator
HOME=/root
OPENEBS_IO_CSTOR_VOLUME_MGMT_IMAGE=openebs/cstor-volume-mgmt:2.6.0
OPENEBS_IO_JIVA_CONTROLLER_IMAGE=openebs/jiva:2.6.0
KUBERNETES_PORT_443_TCP=tcp://10.152.183.1:443
MAYA_APISERVER_SERVICE_SERVICE_PORT=5656
MAYA_APISERVER_SERVICE_PORT_5656_TCP_ADDR=10.152.183.216
OPENEBS_IO_CSTOR_POOL_EXPORTER_IMAGE=openebs/m-exporter:2.6.0
MAYA_APISERVER_SERVICE_PORT_5656_TCP=tcp://10.152.183.216:5656
MAYA_APISERVER_SERVICE_SERVICE_PORT_API=5656
OPENEBS_IO_CSTOR_POOL_IMAGE=openebs/cstor-pool:2.6.0
TERM=xterm
ADMISSION_SERVER_SVC_SERVICE_PORT=443
SHLVL=1
ADMISSION_SERVER_SVC_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
ADMISSION_SERVER_SVC_PORT_443_TCP_ADDR=10.152.183.18
KUBERNETES_PORT_443_TCP_ADDR=10.152.183.1
OPENEBS_IO_VOLUME_MONITOR_IMAGE=openebs/m-exporter:2.6.0
OPENEBS_IO_JIVA_REPLICA_COUNT=3
KUBERNETES_SERVICE_HOST=10.152.183.1
MAYA_APISERVER_SERVICE_PORT_5656_TCP_PORT=5656
KUBERNETES_PORT=tcp://10.152.183.1:443
KUBERNETES_PORT_443_TCP_PORT=443
ADMISSION_SERVER_SVC_PORT_443_TCP_PROTO=tcp
MAYA_API_SERVER_NETWORK=eth0
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
OPENEBS_IO_INSTALL_DEFAULT_CSTOR_SPARSE_POOL=false
MAYA_APISERVER_SERVICE_PORT=tcp://10.152.183.216:5656
OPENEBS_IO_CREATE_DEFAULT_STORAGE_CONFIG=true
OPENEBS_IO_JIVA_REPLICA_IMAGE=openebs/jiva:2.6.0
OPENEBS_IO_INSTALLER_TYPE=openebs-operator
_=/usr/bin/env
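
The dump includes the Service variables that Kubernetes injects (`MAYA_APISERVER_SERVICE_SERVICE_HOST` and `MAYA_APISERVER_SERVICE_SERVICE_PORT`). These point at the ClusterIP `10.152.183.216`, not the pod IP `10.1.251.215` used in the fast invocation. Whether `mayactl` actually reads these variables is not confirmed in this thread. As a diagnostic sketch only, the hypothetical helper below builds the explicit invocation for the ClusterIP endpoint, so that path can be timed the same way as the two runs above.

```shell
#!/bin/sh
# Hypothetical helper: print a mayactl invocation that targets the Service
# ClusterIP, using the variable names that appear in the env dump.
# The -m and -p flags are the ones used earlier in this report.
service_probe_cmd() {
  echo "mayactl -m ${MAYA_APISERVER_SERVICE_SERVICE_HOST} -p ${MAYA_APISERVER_SERVICE_SERVICE_PORT} version"
}
```

Running the printed command under `time` inside the pod would show whether the delay is tied to the ClusterIP path or to something else in the default lookup.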

Environment:

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.7-34+fa60fe11bf77d0", GitCommit:"fa60fe11bf77d0c591abbc397e178efe296f83f9", GitTreeState:"clean", BuildDate:"2021-02-11T20:46:36Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/arm64"}
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
robinhuiser commented 3 years ago

Additional information:

robinhuiser commented 3 years ago

Temporary workaround: I updated the spec of the maya-apiserver deployment with:

        env:
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP

        livenessProbe:
          exec:
            command:
            - sh
            - -c
            # sh -c takes the whole command line as a single string, so the
            # binary and its flags must share one list item.
            - /usr/local/bin/mayactl -m "$MY_POD_IP" version
          initialDelaySeconds: 30
          periodSeconds: 60
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - /usr/local/bin/mayactl -m "$MY_POD_IP" version
          initialDelaySeconds: 30
          periodSeconds: 60
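
An alternative that is not proposed in this thread is to leave the probe command alone and raise the probe timeout above the roughly 12 seconds observed. The sketch below is an assumption-laden example: `probe_timeout_patch` is a hypothetical helper, the value 15 is illustrative, and the container index 0 assumes the single-container pod shown in the `describe` output. This only masks the slow lookup rather than explaining it.

```shell
#!/bin/sh
# Hypothetical helper: emit a JSON Patch that sets timeoutSeconds on the
# liveness and readiness probes of the first container in a Deployment's
# pod template.
probe_timeout_patch() {
  secs="$1"
  base="/spec/template/spec/containers/0"
  printf '[{"op":"add","path":"%s/livenessProbe/timeoutSeconds","value":%s},' "$base" "$secs"
  printf '{"op":"add","path":"%s/readinessProbe/timeoutSeconds","value":%s}]\n' "$base" "$secs"
}

# Example use (needs cluster access, so it is left commented out):
# kubectl -n openebs patch deployment maya-apiserver --type=json \
#   -p "$(probe_timeout_patch 15)"
```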
prateekpandey14 commented 3 years ago

@robinhuiser did you try running it with just the following?

        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - /usr/local/bin/mayactl version
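
One detail worth checking in probe commands of this shape: `sh -c` treats only the first argument after `-c` as the command string, and any further arguments become `$0`, `$1`, and so on. A probe that lists the binary and its flags as separate items after `-c` would therefore run the binary without those flags. The sketch below illustrates this with `echo` standing in for `mayactl`; the function names are hypothetical.

```shell
#!/bin/sh
# Split form: "echo" alone is the command string; "hello" becomes $0 and
# "world" becomes $1, so echo runs with no arguments and prints an empty line.
split_form() {
  sh -c echo hello world
}

# Single-string form: the whole line is the command string, so echo
# receives both words.
joined_form() {
  sh -c 'echo hello world'
}
```

Calling `split_form` prints an empty line, while `joined_form` prints `hello world`.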