apache / openwhisk-deploy-kube

The Apache OpenWhisk Kubernetes Deployment repository supports deploying the Apache OpenWhisk system on Kubernetes and OpenShift clusters.
https://openwhisk.apache.org/
Apache License 2.0

getsockopt: connection refused causing liveness probe fail causing controller crashloop #324

Closed wcorbett36 closed 5 years ago

wcorbett36 commented 6 years ago

I am working through a Helm deployment on minikube on Windows 10, and I am stuck on a CrashLoopBackOff error on my controller. The pod list and the controller's describe output are posted below. Please advise. Thanks,

λ kubectl get pods -n openwhisk
NAME                          READY   STATUS             RESTARTS   AGE
apigateway-7b87dd957f-t8j2f   1/1     Running            0          37m
controller-0                  0/1     CrashLoopBackOff   15         37m
couchdb-6b87ccfb78-frvbj      1/1     Running            0          37m
init-couchdb-p4wt7            0/1     Completed          0          37m
install-routemgmt-n8xp9       0/1     Init:0/1           0          37m
invoker-6wtwl                 0/1     Init:1/2           0          37m
kafka-0                       1/1     Running            0          37m
nginx-598c75d7d6-wt25s        1/1     Running            0          37m
redis-5d77674f65-7rq75        1/1     Running            0          37m
zookeeper-0                   1/1     Running            0          37m

λ kubectl describe pods/controller-0 -n openwhisk
Name:           controller-0
Namespace:      openwhisk
Node:           minikube/192.168.86.187
Start Time:     Wed, 24 Oct 2018 21:00:46 -0400
Labels:         controller-revision-hash=controller-69db44c56b
                name=controller
                statefulset.kubernetes.io/pod-name=controller-0
Annotations:
Status:         Running
IP:             172.17.0.8
Controlled By:  StatefulSet/controller
Init Containers:
  wait-for-kafka:
    Container ID:  docker://03f69f049f2cf9a112109933560e69baab1720f354271b742fa5c893e1a05257
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812
    Port:
    Host Port:
    Command:
      sh
      -c
      result=1; until [ $result -eq 0 ]; do OK=$(echo ruok | nc -w 1 zookeeper-0.zookeeper.openwhisk.svc.cluster.local 2181); if [ "$OK" == "imok" ]; then result=0; echo "zookeeper returned imok!"; else echo waiting for zookeeper to be ready; sleep 1; fi done; echo "Zookeeper is up; will wait for 10 seconds to give kafka time to initialize"; sleep 10;
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 24 Oct 2018 21:00:52 -0400
      Finished:     Wed, 24 Oct 2018 21:01:36 -0400
    Ready:          True
    Restart Count:  0
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from ow-core-token-v7kt2 (ro)
  wait-for-couchdb:
    Container ID:  docker://e6de73d420993fde9cdac225ba622bae22af8400fab50cf1fa69376bdb7188bd
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812
    Port:
    Host Port:
    Command:
      sh
      -c
      while true; do echo 'checking CouchDB readiness'; wget -T 5 --spider $READINESS_URL --header="Authorization: Basic d2hpc2tfYWRtaW46c29tZV9wYXNzdzByZA=="; result=$?; if [ $result -eq 0 ]; then echo 'Success: CouchDB is ready!'; break; fi; echo '...not ready yet; sleeping 3 seconds before retry'; sleep 3; done;
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 24 Oct 2018 21:01:39 -0400
      Finished:     Wed, 24 Oct 2018 21:05:32 -0400
    Ready:          True
    Restart Count:  0
    Environment:
      READINESS_URL:  http://couchdb.openwhisk.svc.cluster.local:5984/ow_kube_couchdb_initialized_marker
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from ow-core-token-v7kt2 (ro)
Containers:
  controller:
    Container ID:  docker://363345272c48f82a50a4261a1433dd3e965bcba389d7b91e3300eb25b4797a92
    Image:         openwhisk/controller:latest
    Image ID:      docker-pullable://openwhisk/controller@sha256:79b537a8f9aad7f799d7ff81e9e5a1bd6c74d329ae324ecf792c2a2b7c441dce
    Ports:         8080/TCP, 2552/TCP, 19999/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /bin/bash
      -c
      /init.sh hostname | cut -d'-' -f2
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    143
      Started:      Wed, 24 Oct 2018 21:24:02 -0400
      Finished:     Wed, 24 Oct 2018 21:24:25 -0400
    Ready:          False
    Restart Count:  11
    Liveness:       http-get http://:8080/ping delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
      PORT:                                            8080
      CONFIG_whisk_info_date:                          <set to the key 'whisk_info_date' of config map 'whisk.config'>     Optional: false
      CONFIG_whisk_info_buildNo:                       <set to the key 'whisk_info_buildNo' of config map 'whisk.config'>  Optional: false
      JAVA_OPTS:                                       -Xmx1024M
      CONTROLLER_OPTS:
      RUNTIMES_MANIFEST:                               {
        "runtimes": {
          "nodejs": [
            { "kind": "nodejs", "image": { "prefix": "openwhisk", "name": "nodejsaction", "tag": "latest" }, "deprecated": true },
            { "kind": "nodejs:6", "default": true, "image": { "prefix": "openwhisk", "name": "nodejs6action", "tag": "latest" }, "deprecated": false, "stemCells": [{ "count": 2, "memory": "256 MB" }] },
            { "kind": "nodejs:8", "default": false, "image": { "prefix": "openwhisk", "name": "action-nodejs-v8", "tag": "latest" }, "deprecated": false }
          ],
          "python": [
            { "kind": "python", "image": { "prefix": "openwhisk", "name": "python2action", "tag": "latest" }, "deprecated": false },
            { "kind": "python:2", "default": true, "image": { "prefix": "openwhisk", "name": "python2action", "tag": "latest" }, "deprecated": false },
            { "kind": "python:3", "image": { "prefix": "openwhisk", "name": "python3action", "tag": "latest" }, "deprecated": false }
          ],
          "swift": [
            { "kind": "swift:3.1.1", "image": { "prefix": "openwhisk", "name": "action-swift-v3.1.1", "tag": "latest" }, "deprecated": false },
            { "kind": "swift:4.1", "default": true, "image": { "prefix": "openwhisk", "name": "action-swift-v4.1", "tag": "latest" }, "deprecated": false }
          ],
          "java": [
            { "kind": "java", "default": true, "image": { "prefix": "openwhisk", "name": "java8action", "tag": "latest" }, "deprecated": false, "attached": { "attachmentName": "jarfile", "attachmentType": "application/java-archive" }, "sentinelledLogs": false, "requireMain": true }
          ],
          "php": [
            { "kind": "php:7.1", "default": true, "deprecated": false, "image": { "prefix": "openwhisk", "name": "action-php-v7.1", "tag": "latest" } }
          ],
          "ruby": [
            { "kind": "ruby:2.5", "default": true, "deprecated": false, "image": { "prefix": "openwhisk", "name": "action-ruby-v2.5", "tag": "latest" } }
          ]
        },
        "blackboxes": [
          { "prefix": "openwhisk", "name": "dockerskeleton", "tag": "latest" }
        ]
      }
      CONFIG_whisk_loadbalancer_invokerUserMemory:     2048m
      KAFKA_HOSTS:                                     kafka.openwhisk.svc.cluster.local:9092
      KAFKA_HOST_PORT:                                 9092
      CONFIG_whisk_couchdb_username:                   <set to the key 'db_username' in secret 'db.auth'>        Optional: false
      CONFIG_whisk_couchdb_password:                   <set to the key 'db_password' in secret 'db.auth'>        Optional: false
      CONFIG_whisk_couchdb_port:                       <set to the key 'db_port' of config map 'db.config'>      Optional: false
      CONFIG_whisk_couchdb_protocol:                   <set to the key 'db_protocol' of config map 'db.config'>  Optional: false
      CONFIG_whisk_couchdb_host:                       couchdb.openwhisk.svc.cluster.local
      CONFIG_whisk_couchdb_provider:                   <set to the key 'db_provider' of config map 'db.config'>           Optional: false
      CONFIG_whisk_couchdb_databases_WhiskActivation:  <set to the key 'db_whisk_activations' of config map 'db.config'>  Optional: false
      CONFIG_whisk_couchdb_databases_WhiskEntity:      <set to the key 'db_whisk_actions' of config map 'db.config'>      Optional: false
      CONFIG_whisk_couchdb_databases_WhiskAuth:        <set to the key 'db_whisk_auths' of config map 'db.config'>        Optional: false
      LIMITS_ACTIONS_SEQUENCE_MAXLENGTH:               50
      LIMITS_TRIGGERS_FIRES_PERMINUTE:                 60
      LIMITS_ACTIONS_INVOKES_PERMINUTE:                60
      LIMITS_ACTIONS_INVOKES_CONCURRENT:               30
      CONTROLLER_INSTANCES:                            1
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from ow-core-token-v7kt2 (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  ow-core-token-v7kt2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ow-core-token-v7kt2
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age                From               Message
  Normal   Scheduled              24m                default-scheduler  Successfully assigned controller-0 to minikube
  Normal   SuccessfulMountVolume  24m                kubelet, minikube  MountVolume.SetUp succeeded for volume "ow-core-token-v7kt2"
  Normal   Pulled                 24m                kubelet, minikube  Container image "busybox" already present on machine
  Normal   Created                23m                kubelet, minikube  Created container
  Normal   Started                23m                kubelet, minikube  Started container
  Normal   Pulled                 23m                kubelet, minikube  Container image "busybox" already present on machine
  Normal   Created                23m                kubelet, minikube  Created container
  Normal   Started                23m                kubelet, minikube  Started container
  Warning  Unhealthy              18m (x5 over 19m)  kubelet, minikube  Liveness probe failed: Get http://172.17.0.8:8080/ping: dial tcp 172.17.0.8:8080: getsockopt: connection refused
  Normal   Pulling                18m (x3 over 19m)  kubelet, minikube  pulling image "openwhisk/controller:latest"
  Normal   Pulled                 18m (x3 over 19m)  kubelet, minikube  Successfully pulled image "openwhisk/controller:latest"
  Normal   Killing                18m (x2 over 18m)  kubelet, minikube  Killing container with id docker://controller:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Created                18m (x3 over 19m)  kubelet, minikube  Created container
  Normal   Started                18m (x3 over 19m)  kubelet, minikube  Started container
  Warning  BackOff                3m (x56 over 17m)  kubelet, minikube  Back-off restarting failed containe
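
A note on reading the output above: the controller's last state shows Exit Code: 143, and the events show the kubelet killing the container after the liveness probe failed. By the usual shell convention, an exit code above 128 means the process was terminated by signal (code - 128), so 143 corresponds to signal 15 (SIGTERM). That suggests the controller was stopped from outside rather than exiting on its own. A minimal sketch of that arithmetic follows; the function name describe_exit_code is hypothetical and only for illustration.

```shell
#!/bin/sh
# Decode a container exit code as shown by `kubectl describe pod`.
# Convention: codes above 128 mean "terminated by signal (code - 128)".
# describe_exit_code is a hypothetical helper name, not part of kubectl.
describe_exit_code() {
    code="$1"
    if [ "$code" -gt 128 ]; then
        sig=$((code - 128))
        # `kill -l <number>` prints the name of the signal with that number
        echo "exit code $code: terminated by signal $sig ($(kill -l "$sig"))"
    else
        echo "exit code $code: process exited on its own"
    fi
}

# The value reported for controller-0 in the describe output
describe_exit_code 143
```

If the decoded signal is TERM, the restarts are most likely driven by the failing probe, so the question becomes why nothing is listening on port 8080 in time.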

dgrove-oss commented 6 years ago

One reason for the controller to fail on startup is that it cannot connect to Kafka because Kafka is not yet fully functional. On minikube, you have to run the following command every time you start (or restart) minikube to get Kafka to work:

    minikube ssh -- sudo ip link set docker0 promisc on

Perhaps this is the problem?
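
Because the setting has to be re-applied after every minikube start, it may be convenient to wrap it in a small script. The sketch below assumes a POSIX shell (for example Git Bash on Windows); the function names are hypothetical, while the minikube command is the one quoted above and the kubectl command is the one used earlier in this thread.

```shell
#!/bin/sh
# Sketch: re-apply the docker0 promiscuous-mode setting after a minikube (re)start.
# enable_docker0_promisc is a hypothetical name; the command is the one from the comment above.
enable_docker0_promisc() {
    minikube ssh -- sudo ip link set docker0 promisc on
}

# Sketch: list the pods in the openwhisk namespace to see whether controller-0 recovers.
show_openwhisk_pods() {
    kubectl get pods -n openwhisk
}

# Usage, after every `minikube start`:
#   enable_docker0_promisc && show_openwhisk_pods
```

The functions are only defined here, not called, so sourcing the file has no side effects until one of them is invoked.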

wcorbett36 commented 6 years ago

I had some good luck with that command; however, now I'm facing auth issues... I can do a list command...

thanks,

C:\Users\u390982\openwhisk\kubernetes-openwhisk\incubator-openwhisk-deploy-kube (master)
λ wsk -i property set --auth 23bc46b1-71f6-4ed5-8c54-816aa4f8c502:123zO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP
ok: whisk auth set. Run 'wsk property get --auth' to see the new value.

C:\Users\u390982\openwhisk\kubernetes-openwhisk\incubator-openwhisk-deploy-kube (master)
λ wsk -i action get /whisk.system/samples/greeting --save
error: Unable to get action 'samples/greeting': The supplied authentication is not authorized to access 'whisk.system/samples'. (code hBPMWW16ZcDGx7QZnHu7O5XOstoslNYI)
Run 'wsk --help' for usage.

C:\Users\u390982\openwhisk\kubernetes-openwhisk\incubator-openwhisk-deploy-kube (master)
λ wsk property get --auth
whisk auth 23bc46b1-71f6-4ed5-8c54-816aa4f8c502:123zO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP

dgrove-oss commented 6 years ago

The whisk.system packages get deployed using a different auth. Try using the system auth instead:

    wsk -i property set --auth 789c46b1-71f6-4ed5-8c54-816aa4f8c502:abczO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP

wcorbett36 commented 6 years ago

Thanks; however, I still got a bad auth error after running this command.

dgrove-oss commented 6 years ago

It works for me; not sure why it is not working for you. Try passing in the auth directly in the command.

Daves-MBP:helm dgrove$ wsk -i --auth 789c46b1-71f6-4ed5-8c54-816aa4f8c502:abczO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP action get /whisk.system/samples/greeting
ok: got action samples/greeting
{
    "namespace": "whisk.system/samples",
    "name": "greeting",
    "version": "0.0.1",
    "exec": {
        "kind": "nodejs:6",
        "binary": false
    },
    "annotations": [
        {
            "key": "description",
            "value": "Returns a friendly greeting"
        },
        {
            "key": "sampleLogOutput",
            "value": "2016-03-22T01:07:08.384982272Z stdout: params: { place: 'Narrowsburg', payload: 'Cat' }"
        },
        {
            "key": "sampleOutput",
            "value": {
                "payload": "Hello, Cat from Narrowsburg!"
            }
        },
        {
            "key": "sampleInput",
            "value": {
                "payload": "Cat",
                "place": "Narrowsburg"
            }
        },
        {
            "key": "exec",
            "value": "nodejs:6"
        },
        {
            "key": "parameters",
            "value": [
                {
                    "name": "name",
                    "required": false
                },
                {
                    "description": "The string to be included in the return value",
                    "name": "place",
                    "required": false
                }
            ]
        }
    ],
    "limits": {
        "timeout": 60000,
        "memory": 256,
        "logs": 10
    },
    "publish": false
}
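
When the key passed with --auth works but the stored one does not, it can help to check which key the CLI actually saved. The sketch below rests on an assumption: that wsk keeps its properties as KEY=value lines in ~/.wskprops, or in the file named by the WSK_CONFIG_FILE environment variable when it is set. The function name stored_wsk_auth is hypothetical.

```shell
#!/bin/sh
# Print the AUTH value stored in a wsk properties file.
# Assumption: properties are KEY=value lines in ~/.wskprops, or in the file
# named by WSK_CONFIG_FILE. stored_wsk_auth is a hypothetical helper name.
stored_wsk_auth() {
    # Use an explicit path if one is given, else WSK_CONFIG_FILE, else ~/.wskprops
    props="${1:-${WSK_CONFIG_FILE:-$HOME/.wskprops}}"
    # Print only the value of the AUTH line, without the "AUTH=" prefix
    sed -n 's/^AUTH=//p' "$props"
}

# Usage:
#   stored_wsk_auth                    # default properties file
#   stored_wsk_auth /path/to/wskprops  # explicit file
```

If the printed value is still the 23bc46b1-... key rather than the 789c46b1-... system key, the property set command did not update the file the CLI is reading.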
dgrove-oss commented 5 years ago

Closing as stale