kubeflow / manifests

A repository for Kustomize manifests

authservice-0 is not ready #2220

Closed: jonghyunho closed this issue 10 months ago

jonghyunho commented 2 years ago

I think this issue is related to https://github.com/kubeflow/manifests/issues/2064, but that one was closed without being resolved.

authservice-0 is not ready, with the messages OIDC provider setup failed and Readiness probe failed: HTTP probe failed with statuscode: 503.

Steps to reproduce

  1. kubernetes 1.21.13 installed
  2. kustomize 3.2.0 installed
  3. git clone -b v1.5-branch https://github.com/kubeflow/manifests.git (no changes made to the manifests)
  4. while ! kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done

Problem

$ kubectl get pods authservice-0 -n istio-system
NAME            READY   STATUS    RESTARTS   AGE
authservice-0   0/1     Running   0          29m
$ kubectl logs authservice-0 -n istio-system
time="2022-06-02T20:08:41Z" level=info msg="Starting readiness probe at 8081"
time="2022-06-02T20:08:41Z" level=info msg="No  USERID_TOKEN_HEADER  specified, using 'kubeflow-userid-token' as default."
time="2022-06-02T20:08:41Z" level=info msg="No  SERVER_HOSTNAME  specified, using '' as default."
time="2022-06-02T20:08:41Z" level=info msg="No  SERVER_PORT  specified, using '8080' as default."
time="2022-06-02T20:08:41Z" level=info msg="No  SESSION_MAX_AGE  specified, using '86400' as default."
time="2022-06-02T20:08:41Z" level=info msg="Starting web server at :8080"
time="2022-06-02T20:08:41Z" level=error msg="OIDC provider setup failed, retrying in 10 seconds: Get http://dex.auth.svc.cluster.local:5556/dex/.well-known/openid-configuration: dial tcp 218.38.137.27:5556: connect: connection refused"
time="2022-06-02T20:08:52Z" level=error msg="OIDC provider setup failed, retrying in 10 seconds: Get http://dex.auth.svc.cluster.local:5556/dex/.well-known/openid-configuration: dial tcp 218.38.137.27:5556: connect: connection refused"
time="2022-06-02T20:09:02Z" level=error msg="OIDC provider setup failed, retrying in 10 seconds: Get http://dex.auth.svc.cluster.local:5556/dex/.well-known/openid-configuration: dial tcp 218.38.137.27:5556: connect: connection refused"
$ kubectl describe pod authservice-0 -n istio-system
Name:         authservice-0
Namespace:    istio-system
Priority:     0
Node:         ubuntu/221.143.109.131
Start Time:   Fri, 03 Jun 2022 05:08:35 +0900
Labels:       app=authservice
              controller-revision-hash=authservice-6db8b4db64
              statefulset.kubernetes.io/pod-name=authservice-0
Annotations:  cni.projectcalico.org/containerID: 6a76076ced3cb25345c600e567daeca6189b58b55a13d6b9eaa3c2d60e4e2469
              cni.projectcalico.org/podIP: 192.168.243.255/32
              cni.projectcalico.org/podIPs: 192.168.243.255/32
              sidecar.istio.io/inject: false
Status:       Running
IP:           192.168.243.255
IPs:
  IP:           192.168.243.255
Controlled By:  StatefulSet/authservice
Containers:
  authservice:
    Container ID:   docker://5dcc81b000c1fba0d50c5725c3df3e996d5f6f19acbc840f6acb111871c61aa6
    Image:          gcr.io/arrikto/kubeflow/oidc-authservice:28c59ef
    Image ID:       docker-pullable://gcr.io/arrikto/kubeflow/oidc-authservice@sha256:c9450b805ad5c333f6a0d9491719a1d3fb4449fe017e37d3ad4c7591c763746b
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Fri, 03 Jun 2022 05:08:41 +0900
    Ready:          False
    Restart Count:  0
    Readiness:      http-get http://:8081/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment Variables from:
      oidc-authservice-client      Secret     Optional: false
      oidc-authservice-parameters  ConfigMap  Optional: false
    Environment:                   <none>
    Mounts:
      /var/lib/authservice from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tct4v (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  authservice-pvc
    ReadOnly:   false
  kube-api-access-tct4v:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  33m                   default-scheduler  0/1 nodes are available: 1 persistentvolumeclaim "authservice-pvc" not found.
  Normal   Scheduled         33m                   default-scheduler  Successfully assigned istio-system/authservice-0 to ubuntu
  Normal   Pulling           33m                   kubelet            Pulling image "gcr.io/arrikto/kubeflow/oidc-authservice:28c59ef"
  Normal   Pulled            33m                   kubelet            Successfully pulled image "gcr.io/arrikto/kubeflow/oidc-authservice:28c59ef" in 1.006139432s
  Normal   Created           33m                   kubelet            Created container authservice
  Normal   Started           33m                   kubelet            Started container authservice
  Warning  Unhealthy         3m8s (x182 over 33m)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 503
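
To poke the failing readiness endpoint directly, a minimal sketch (kubectl port-forward does not require the pod to be Ready):

$ kubectl -n istio-system port-forward pod/authservice-0 8081:8081 &
$ curl -i http://localhost:8081/   # returns 503 until authservice can reach Dex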

Additional Information

$ kubectl get pods -A
NAMESPACE                   NAME                                                         READY   STATUS    RESTARTS   AGE
auth                        dex-5ddf47d88d-wrq6j                                         1/1     Running   0          34m
calico-apiserver            calico-apiserver-75db5d987c-gkjm4                            1/1     Running   0          57m
calico-apiserver            calico-apiserver-75db5d987c-kz9p9                            1/1     Running   0          57m
calico-system               calico-kube-controllers-59f859b79d-v772r                     1/1     Running   0          58m
calico-system               calico-node-dcgws                                            1/1     Running   0          58m
calico-system               calico-typha-79fff88df5-45gns                                1/1     Running   0          58m
cert-manager                cert-manager-7b8c77d4bd-f4277                                1/1     Running   0          34m
cert-manager                cert-manager-cainjector-7c744f57b5-bhjfj                     1/1     Running   0          34m
cert-manager                cert-manager-webhook-fcd445bc4-kr7wf                         1/1     Running   0          34m
istio-system                authservice-0                                                0/1     Running   0          34m
istio-system                cluster-local-gateway-64f58f66cb-fwn54                       1/1     Running   0          34m
istio-system                istio-ingressgateway-8577c57fb6-zbxxb                        1/1     Running   0          34m
istio-system                istiod-6c86784695-79mfb                                      1/1     Running   0          34m
knative-eventing            eventing-controller-79895f9c56-9jgqq                         1/1     Running   0          32m
knative-eventing            eventing-webhook-78f897666-dg22g                             1/1     Running   0          32m
knative-eventing            imc-controller-688df5bdb4-nft8p                              1/1     Running   0          32m
knative-eventing            imc-dispatcher-646978d797-tz9jw                              1/1     Running   0          32m
knative-eventing            mt-broker-controller-67c977497-tjnnc                         1/1     Running   0          32m
knative-eventing            mt-broker-filter-66d4d77c8b-lz9jv                            1/1     Running   0          32m
knative-eventing            mt-broker-ingress-5c8dc4b5d7-9xjpp                           1/1     Running   0          32m
knative-serving             activator-7476cc56d4-wx9gm                                   2/2     Running   0          30m
knative-serving             autoscaler-5c648f7465-jth22                                  2/2     Running   0          30m
knative-serving             controller-57c545cbfb-plfzk                                  2/2     Running   0          30m
knative-serving             istio-webhook-578b6b7654-jtbld                               2/2     Running   0          30m
knative-serving             networking-istio-6b88f745c-c8cxg                             2/2     Running   0          30m
knative-serving             webhook-6fffdc4d78-lpjlv                                     2/2     Running   0          30m
kserve                      kserve-controller-manager-0                                  2/2     Running   0          34m
kube-system                 coredns-558bd4d5db-kqjhf                                     1/1     Running   0          61m
kube-system                 coredns-558bd4d5db-tfd76                                     1/1     Running   0          61m
kube-system                 etcd-ubuntu                                                  1/1     Running   1          61m
kube-system                 kube-apiserver-ubuntu                                        1/1     Running   1          61m
kube-system                 kube-controller-manager-ubuntu                               1/1     Running   0          61m
kube-system                 kube-proxy-46glr                                             1/1     Running   0          61m
kube-system                 kube-scheduler-ubuntu                                        1/1     Running   1          61m
kubeflow-user-example-com   ml-pipeline-ui-artifact-d57bd98d7-v5q5z                      2/2     Running   0          41m
kubeflow-user-example-com   ml-pipeline-visualizationserver-65f5bfb4bf-hzj5m             2/2     Running   0          41m
kubeflow                    admission-webhook-deployment-7df7558c67-7sf5b                1/1     Running   0          33m
kubeflow                    cache-deployer-deployment-6f4bcc969-smkk5                    2/2     Running   1          33m
kubeflow                    cache-server-575d97c95-w4n7s                                 2/2     Running   0          33m
kubeflow                    centraldashboard-79f489b55-kjxwv                             2/2     Running   0          33m
kubeflow                    jupyter-web-app-deployment-5886974887-mhtjn                  1/1     Running   0          33m
kubeflow                    katib-controller-58ddb4b856-k8687                            1/1     Running   0          33m
kubeflow                    katib-db-manager-d77c6757f-xfgpk                             1/1     Running   0          33m
kubeflow                    katib-mysql-7894994f88-lnft4                                 1/1     Running   0          33m
kubeflow                    katib-ui-f787b9d88-qgs9w                                     1/1     Running   0          33m
kubeflow                    kfserving-controller-manager-0                               2/2     Running   0          33m
kubeflow                    kfserving-models-web-app-7884f597cf-xs48h                    2/2     Running   0          33m
kubeflow                    kserve-models-web-app-5c64c8d8bb-pqb62                       2/2     Running   0          33m
kubeflow                    kubeflow-pipelines-profile-controller-84bcbdb899-k9sk8       1/1     Running   0          33m
kubeflow                    metacontroller-0                                             1/1     Running   0          33m
kubeflow                    metadata-envoy-deployment-7b847ff6c5-75tvm                   1/1     Running   0          33m
kubeflow                    metadata-grpc-deployment-f8d68f687-fr7st                     2/2     Running   3          33m
kubeflow                    metadata-writer-78fc7d5bb8-58t5s                             2/2     Running   0          33m
kubeflow                    minio-5b65df66c9-qv2hj                                       2/2     Running   0          33m
kubeflow                    ml-pipeline-7bb5966955-jqzqc                                 2/2     Running   1          33m
kubeflow                    ml-pipeline-persistenceagent-87b6888c4-ckd4f                 2/2     Running   0          33m
kubeflow                    ml-pipeline-scheduledworkflow-665847bb9-8scpq                2/2     Running   0          33m
kubeflow                    ml-pipeline-ui-554ffbd6cd-4kfkf                              2/2     Running   0          33m
kubeflow                    ml-pipeline-viewer-crd-68777557fb-qtqnw                      2/2     Running   1          33m
kubeflow                    ml-pipeline-visualizationserver-66c54744c-d4hxn              2/2     Running   0          33m
kubeflow                    mysql-f7b9b7dd4-cf7j5                                        2/2     Running   0          33m
kubeflow                    notebook-controller-deployment-7474fbff66-bq85d              2/2     Running   1          33m
kubeflow                    profiles-deployment-5cc86bc965-dbr9q                         3/3     Running   1          33m
kubeflow                    tensorboard-controller-controller-manager-5cbddb7fb5-v6ksk   3/3     Running   1          33m
kubeflow                    tensorboards-web-app-deployment-7c5db448d7-fr47h             1/1     Running   0          33m
kubeflow                    training-operator-6bfc7b8d86-vkxfh                           1/1     Running   0          33m
kubeflow                    volumes-web-app-deployment-87484c848-jx2ws                   1/1     Running   0          33m
kubeflow                    workflow-controller-5cb67bb9db-4wckt                         2/2     Running   1          33m
tigera-operator             tigera-operator-85cfb9cdf7-scr8p                             1/1     Running   0          58m
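
Since dex itself shows 1/1 Running while authservice gets connection refused, a quick sketch of checking whether the Service has endpoints and Dex is actually serving:

$ kubectl -n auth get svc dex
$ kubectl -n auth get endpoints dex   # should list the dex pod IP on port 5556
$ kubectl -n auth logs deploy/dex --tail=20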
shilf1 commented 2 years ago

Hello @jonghyunho,

I got the same issue on my side. As far as I know, this is related to the Kubernetes version (Dex with Kubernetes v1.21). I tried the fix from https://github.com/dexidp/dex/issues/2082 and it works.

Please try editing common/dex/base/deployment.yaml as in https://github.com/kubeflow/manifests/pull/1883/files:

- name: KUBERNETES_POD_NAMESPACE
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
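
A minimal sketch of rolling that change out, assuming the stock v1.5 layout where Dex is deployed from the common/dex overlay:

# Re-apply the edited manifest and restart Dex so the new env var takes effect.
$ kustomize build common/dex/overlays/istio | kubectl apply -f -
$ kubectl -n auth rollout restart deployment dex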
jonghyunho commented 2 years ago

> please try to edit this file : common/dex/base/deployment.yaml https://github.com/kubeflow/manifests/pull/1883/files

I've been trying both master and the v1.5-branch branch, which already include https://github.com/kubeflow/manifests/pull/1883.

So the problem still occurs even with https://github.com/kubeflow/manifests/pull/1883 applied.

silverlining21 commented 2 years ago

Same problem, fixed by referring to https://kuboard.cn/install/faq/selfLink.html. In Kubernetes > 1.20 the kube-apiserver removed the metadata.selfLink field, but nfs-client-provisioner still requires it. Simply update the kube-apiserver by adding - --feature-gates=RemoveSelfLink=false to the kube-apiserver yaml.
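
On a kubeadm cluster that means editing the apiserver's static pod manifest. A sketch; note that this gate can only be set to false on Kubernetes 1.20 through 1.23 (from 1.24 it is locked to true and the flag no longer works):

# /etc/kubernetes/manifests/kube-apiserver.yaml (the kubelet restarts the apiserver on save)
spec:
  containers:
  - command:
    - kube-apiserver
    - --feature-gates=RemoveSelfLink=false
    # ...keep all existing flags unchanged...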

DeepTalk19 commented 1 year ago

Same problem. Please check whether the StorageClass and PVs are set up properly.
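
A minimal sketch of that check, using the claim name from the pod description above:

$ kubectl get storageclass                          # is a default StorageClass set?
$ kubectl -n istio-system get pvc authservice-pvc   # should be Bound, not Pending
$ kubectl get pv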

Don12138 commented 1 year ago

Look at this: https://github.com/kubeflow/manifests/issues/2064#issuecomment-1475079257

juliusvonkohout commented 10 months ago

/close

There has been no activity for a long time. Please reopen if necessary.

google-oss-prow[bot] commented 10 months ago

@juliusvonkohout: Closing this issue.

In response to [this](https://github.com/kubeflow/manifests/issues/2220#issuecomment-1692037276):

> /close
>
> There has been no activity for a long time. Please reopen if necessary.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
sachingupta771 commented 4 months ago

/reopen

Facing the same issue with Kubeflow 1.7. The pod is crashing without any logs or the issues listed here, and I can't tell what is happening.

google-oss-prow[bot] commented 4 months ago

@sachingupta771: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to [this](https://github.com/kubeflow/manifests/issues/2220#issuecomment-1990932819):

> /reopen
>
> facing same isssue with kubeflow 1.7, the pod is crashing without any logs and issues listed, not getting what is happening

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
juliusvonkohout commented 4 months ago

@sachingupta771 Please use 1.8 or 1.8.1. Version 1.7 is end of life. If you need long term support, there are commercial distributions and freelancers available.