NVIDIA / deepops

Tools for building GPU clusters

Errors in k8s_deploy_kubeflow.sh #566

Closed: syedfahadrizvi closed this issue 4 years ago

syedfahadrizvi commented 4 years ago

We've set up a single-node k8s cluster using the k8s-cluster.yml playbook and can run both the kubectl run command and k8s_verify_gpu.sh to verify that the GPUs are accessible. The specifications are:

After setting up rook-ceph and the monitoring tools with k8s_deploy_rook.sh and k8s_deploy_monitoring.sh respectively, we ran k8s_deploy_kubeflow.sh, which failed. We then tried to set Kubeflow up manually by following the steps in "Kubeflow Deployment with kfctl_k8s_istio", using the config file specified in the script (i.e. kfctl_k8s_istio.yaml). This produced exactly the same error, shown below:

filename="coordinator/coordinator.go:120" INFO[0000] Creating directory /root/kubeflow-install/my-kubeflow/.cache filename="kfconfig/types.go:445" INFO[0000] Fetching https://github.com/kubeflow/manifests/archive/master.tar.gz to /root/kubeflow-install/my-kubeflow/.cache/manifests filename="kfconfig/types.go:493" INFO[0001] updating localPath to /root/kubeflow-install/my-kubeflow/.cache/manifests/manifests-master filename="kfconfig/types.go:540" INFO[0001] Fetch succeeded; LocalPath /root/kubeflow-install/my-kubeflow/.cache/manifests/manifests-master filename="kfconfig/types.go:561" INFO[0001] Processing application: istio-crds filename="kustomize/kustomize.go:408" INFO[0001] Processing application: istio-install filename="kustomize/kustomize.go:408" INFO[0001] Processing application: cluster-local-gateway filename="kustomize/kustomize.go:408" INFO[0001] Processing application: istio filename="kustomize/kustomize.go:408" INFO[0001] Processing application: add-anonymous-user-filter filename="kustomize/kustomize.go:408" INFO[0001] Processing application: application-crds filename="kustomize/kustomize.go:408" INFO[0001] Processing application: application filename="kustomize/kustomize.go:408" INFO[0001] Processing application: cert-manager-crds filename="kustomize/kustomize.go:408" INFO[0001] Processing application: cert-manager-kube-system-resources filename="kustomize/kustomize.go:408" INFO[0001] Processing application: cert-manager filename="kustomize/kustomize.go:408" INFO[0001] Processing application: metacontroller filename="kustomize/kustomize.go:408" INFO[0001] Processing application: argo filename="kustomize/kustomize.go:408" INFO[0001] Processing application: kubeflow-roles filename="kustomize/kustomize.go:408" INFO[0001] Processing application: centraldashboard filename="kustomize/kustomize.go:408" INFO[0001] Processing application: bootstrap filename="kustomize/kustomize.go:408" INFO[0001] Processing application: webhook
filename="kustomize/kustomize.go:408" INFO[0001] Processing application: jupyter-web-app filename="kustomize/kustomize.go:408" INFO[0001] Processing application: spark-operator filename="kustomize/kustomize.go:408" INFO[0001] Processing application: metadata filename="kustomize/kustomize.go:408" INFO[0001] Processing application: notebook-controller filename="kustomize/kustomize.go:408" INFO[0001] Processing application: pytorch-job-crds filename="kustomize/kustomize.go:408" INFO[0001] Processing application: pytorch-operator filename="kustomize/kustomize.go:408" INFO[0001] Processing application: knative-crds filename="kustomize/kustomize.go:408" INFO[0001] Processing application: knative-install filename="kustomize/kustomize.go:408" INFO[0001] Processing application: kfserving-crds filename="kustomize/kustomize.go:408" INFO[0001] Processing application: kfserving-install filename="kustomize/kustomize.go:408" INFO[0001] Processing application: spartakus filename="kustomize/kustomize.go:408" INFO[0001] Processing application: tensorboard filename="kustomize/kustomize.go:408" INFO[0001] Processing application: tf-job-crds filename="kustomize/kustomize.go:408" INFO[0001] Processing application: tf-job-operator filename="kustomize/kustomize.go:408" INFO[0001] Processing application: katib-crds filename="kustomize/kustomize.go:408" INFO[0001] Processing application: katib-controller filename="kustomize/kustomize.go:408" INFO[0001] Processing application: api-service filename="kustomize/kustomize.go:408" INFO[0001] Processing application: minio filename="kustomize/kustomize.go:408" INFO[0001] Processing application: mysql filename="kustomize/kustomize.go:408" INFO[0001] Processing application: persistent-agent filename="kustomize/kustomize.go:408" INFO[0001] Processing application: pipelines-runner filename="kustomize/kustomize.go:408" INFO[0001] Processing application: pipelines-ui filename="kustomize/kustomize.go:408" INFO[0001] Processing application: 
pipelines-viewer filename="kustomize/kustomize.go:408" INFO[0001] Processing application: scheduledworkflow filename="kustomize/kustomize.go:408" INFO[0001] Processing application: pipeline-visualization-service filename="kustomize/kustomize.go:408" INFO[0001] Processing application: profiles filename="kustomize/kustomize.go:408" INFO[0001] Processing application: seldon-core-operator filename="kustomize/kustomize.go:408" INFO[0001] /root/kubeflow-install/my-kubeflow/.cache/manifests exists; not resyncing filename="kfconfig/types.go:468" INFO[0001] namespace: kubeflow filename="utils/k8utils.go:427" INFO[0001] Creating namespace: kubeflow filename="utils/k8utils.go:432" INFO[0001] log cluster name into KfDef: kubernetes filename="kustomize/kustomize.go:165" INFO[0001] Deploying application istio-crds filename="kustomize/kustomize.go:172" customresourcedefinition.apiextensions.k8s.io/adapters.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/apikeys.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/attributemanifests.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/authorizations.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/bypasses.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/certificates.certmanager.k8s.io unchanged customresourcedefinition.apiextensions.k8s.io/challenges.certmanager.k8s.io unchanged customresourcedefinition.apiextensions.k8s.io/checknothings.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/circonuses.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/cloudwatches.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/clusterissuers.certmanager.k8s.io unchanged customresourcedefinition.apiextensions.k8s.io/clusterrbacconfigs.rbac.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/deniers.config.istio.io unchanged 
customresourcedefinition.apiextensions.k8s.io/destinationrules.networking.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/dogstatsds.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/edges.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/envoyfilters.networking.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/fluentds.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/gateways.networking.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/handlers.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/httpapispecbindings.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/httpapispecs.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/instances.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/issuers.certmanager.k8s.io unchanged customresourcedefinition.apiextensions.k8s.io/kubernetesenvs.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/kuberneteses.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/listcheckers.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/listentries.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/logentries.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/memquotas.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/meshpolicies.authentication.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/metrics.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/noops.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/opas.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/orders.certmanager.k8s.io unchanged customresourcedefinition.apiextensions.k8s.io/policies.authentication.istio.io unchanged 
customresourcedefinition.apiextensions.k8s.io/prometheuses.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/quotas.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/quotaspecbindings.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/quotaspecs.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/rbacconfigs.rbac.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/rbacs.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/redisquotas.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/reportnothings.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/rules.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/serviceentries.networking.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/servicerolebindings.rbac.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/serviceroles.rbac.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/sidecars.networking.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/signalfxs.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/solarwindses.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/stackdrivers.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/statsds.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/stdios.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/templates.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/tracespans.config.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/virtualservices.networking.istio.io unchanged customresourcedefinition.apiextensions.k8s.io/zipkins.config.istio.io unchanged INFO[0003] Successfully applied application istio-crds filename="kustomize/kustomize.go:209" INFO[0003] Deploying application istio-install 
filename="kustomize/kustomize.go:172" namespace/istio-system created mutatingwebhookconfiguration.admissionregistration.k8s.io/istio-sidecar-injector configured serviceaccount/istio-citadel-service-account created serviceaccount/istio-cleanup-secrets-service-account created serviceaccount/istio-egressgateway-service-account created serviceaccount/istio-galley-service-account created serviceaccount/istio-grafana-post-install-account created serviceaccount/istio-ingressgateway-service-account created serviceaccount/istio-mixer-service-account created serviceaccount/istio-multi created serviceaccount/istio-pilot-service-account created serviceaccount/istio-security-post-install-account created serviceaccount/istio-sidecar-injector-service-account created serviceaccount/kiali-service-account created serviceaccount/prometheus created role.rbac.authorization.k8s.io/istio-ingressgateway-sds created clusterrole.rbac.authorization.k8s.io/istio-citadel-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-cleanup-secrets-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-egressgateway-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-galley-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-grafana-post-install-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-ingressgateway-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-mixer-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-pilot-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-reader configured clusterrole.rbac.authorization.k8s.io/istio-sidecar-injector-istio-system unchanged clusterrole.rbac.authorization.k8s.io/kiali unchanged clusterrole.rbac.authorization.k8s.io/kiali-viewer unchanged clusterrole.rbac.authorization.k8s.io/prometheus-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-security-post-install-istio-system unchanged 
rolebinding.rbac.authorization.k8s.io/istio-ingressgateway-sds created clusterrolebinding.rbac.authorization.k8s.io/istio-citadel-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-cleanup-secrets-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-egressgateway-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-galley-admin-role-binding-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-grafana-post-install-role-binding-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-ingressgateway-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-kiali-admin-role-binding-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-mixer-admin-role-binding-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-multi unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-pilot-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-sidecar-injector-admin-role-binding-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/prometheus-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-security-post-install-role-binding-istio-system unchanged configmap/istio created configmap/istio-galley-configuration created configmap/istio-grafana created configmap/istio-grafana-configuration-dashboards-galley-dashboard created configmap/istio-grafana-configuration-dashboards-istio-mesh-dashboard created configmap/istio-grafana-configuration-dashboards-istio-performance-dashboard created configmap/istio-grafana-configuration-dashboards-istio-service-dashboard created configmap/istio-grafana-configuration-dashboards-istio-workload-dashboard created configmap/istio-grafana-configuration-dashboards-mixer-dashboard created configmap/istio-grafana-configuration-dashboards-pilot-dashboard created configmap/istio-grafana-custom-resources created 
configmap/istio-security-custom-resources created configmap/istio-sidecar-injector created configmap/kiali created configmap/prometheus created secret/kiali created service/grafana created service/istio-citadel created service/istio-egressgateway created service/istio-galley created service/istio-ingressgateway created service/istio-pilot created service/istio-policy created service/istio-sidecar-injector created service/istio-telemetry created service/jaeger-agent created service/jaeger-collector created service/jaeger-query created service/kiali created service/prometheus created service/tracing created service/zipkin created deployment.apps/grafana created deployment.apps/istio-citadel created deployment.apps/istio-egressgateway created deployment.apps/istio-galley created deployment.apps/istio-ingressgateway created deployment.apps/istio-pilot created deployment.apps/istio-policy created deployment.apps/istio-sidecar-injector created deployment.apps/istio-telemetry created deployment.apps/istio-tracing created deployment.apps/kiali created deployment.apps/prometheus created poddisruptionbudget.policy/istio-egressgateway created poddisruptionbudget.policy/istio-galley created poddisruptionbudget.policy/istio-ingressgateway created poddisruptionbudget.policy/istio-pilot created poddisruptionbudget.policy/istio-policy created poddisruptionbudget.policy/istio-telemetry created horizontalpodautoscaler.autoscaling/istio-egressgateway created horizontalpodautoscaler.autoscaling/istio-ingressgateway created horizontalpodautoscaler.autoscaling/istio-pilot created horizontalpodautoscaler.autoscaling/istio-policy created horizontalpodautoscaler.autoscaling/istio-telemetry created job.batch/istio-cleanup-secrets-1.1.6 created job.batch/istio-grafana-post-install-1.1.6 created job.batch/istio-security-post-install-1.1.6 created handler.config.istio.io/kubernetesenv created handler.config.istio.io/prometheus created handler.config.istio.io/stdio created 
kubernetes.config.istio.io/attributes created rule.config.istio.io/stdio created rule.config.istio.io/stdiotcp created rule.config.istio.io/tcpkubeattrgenrulerule created destinationrule.networking.istio.io/istio-policy created destinationrule.networking.istio.io/istio-telemetry created WARN[0023] Encountered error applying application istio-install: (kubeflow.error): Code 500 with message: Apply.Run Error [error when creating "/tmp/kout349752124": Internal error occurred: failed calling webhook "mixer.validation.istio.io": Post https://istio-galley.istio-system.svc:443/admitmixer?timeout=30s: dial tcp 10.233.54.154:443: connect: connection refused, error when creating "/tmp/kout349752124": Internal error occurred: failed calling webhook "mixer.validation.istio.io": Post https://istio-galley.istio-system.svc:443/admitmixer?timeout=30s: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "cluster.local")] filename="kustomize/kustomize.go:202" WARN[0023] Will retry in 2 seconds. 
filename="kustomize/kustomize.go:203" namespace/istio-system unchanged mutatingwebhookconfiguration.admissionregistration.k8s.io/istio-sidecar-injector configured serviceaccount/istio-citadel-service-account unchanged serviceaccount/istio-cleanup-secrets-service-account unchanged serviceaccount/istio-egressgateway-service-account unchanged serviceaccount/istio-galley-service-account unchanged serviceaccount/istio-grafana-post-install-account unchanged serviceaccount/istio-ingressgateway-service-account unchanged serviceaccount/istio-mixer-service-account unchanged serviceaccount/istio-multi unchanged serviceaccount/istio-pilot-service-account unchanged serviceaccount/istio-security-post-install-account unchanged serviceaccount/istio-sidecar-injector-service-account unchanged serviceaccount/kiali-service-account unchanged serviceaccount/prometheus unchanged role.rbac.authorization.k8s.io/istio-ingressgateway-sds unchanged clusterrole.rbac.authorization.k8s.io/istio-citadel-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-cleanup-secrets-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-egressgateway-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-galley-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-grafana-post-install-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-ingressgateway-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-mixer-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-pilot-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-reader unchanged clusterrole.rbac.authorization.k8s.io/istio-sidecar-injector-istio-system unchanged clusterrole.rbac.authorization.k8s.io/kiali unchanged clusterrole.rbac.authorization.k8s.io/kiali-viewer unchanged clusterrole.rbac.authorization.k8s.io/prometheus-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-security-post-install-istio-system unchanged 
rolebinding.rbac.authorization.k8s.io/istio-ingressgateway-sds unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-citadel-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-cleanup-secrets-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-egressgateway-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-galley-admin-role-binding-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-grafana-post-install-role-binding-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-ingressgateway-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-kiali-admin-role-binding-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-mixer-admin-role-binding-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-multi unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-pilot-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-sidecar-injector-admin-role-binding-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/prometheus-istio-system unchanged clusterrolebinding.rbac.authorization.k8s.io/istio-security-post-install-role-binding-istio-system unchanged configmap/istio unchanged configmap/istio-galley-configuration unchanged configmap/istio-grafana unchanged configmap/istio-grafana-configuration-dashboards-galley-dashboard unchanged configmap/istio-grafana-configuration-dashboards-istio-mesh-dashboard unchanged configmap/istio-grafana-configuration-dashboards-istio-performance-dashboard unchanged configmap/istio-grafana-configuration-dashboards-istio-service-dashboard unchanged configmap/istio-grafana-configuration-dashboards-istio-workload-dashboard unchanged configmap/istio-grafana-configuration-dashboards-mixer-dashboard unchanged configmap/istio-grafana-configuration-dashboards-pilot-dashboard unchanged configmap/istio-grafana-custom-resources unchanged 
configmap/istio-security-custom-resources unchanged configmap/istio-sidecar-injector unchanged configmap/kiali unchanged configmap/prometheus unchanged secret/kiali unchanged service/grafana unchanged service/istio-citadel unchanged service/istio-egressgateway unchanged service/istio-galley unchanged service/istio-ingressgateway unchanged service/istio-pilot unchanged service/istio-policy unchanged service/istio-sidecar-injector unchanged service/istio-telemetry unchanged service/jaeger-agent unchanged service/jaeger-collector unchanged service/jaeger-query unchanged service/kiali unchanged service/prometheus unchanged service/tracing unchanged service/zipkin unchanged deployment.apps/grafana unchanged deployment.apps/istio-citadel configured deployment.apps/istio-egressgateway unchanged deployment.apps/istio-galley configured deployment.apps/istio-ingressgateway unchanged deployment.apps/istio-pilot configured deployment.apps/istio-policy configured deployment.apps/istio-sidecar-injector configured deployment.apps/istio-telemetry configured deployment.apps/istio-tracing unchanged deployment.apps/kiali unchanged deployment.apps/prometheus unchanged poddisruptionbudget.policy/istio-egressgateway unchanged poddisruptionbudget.policy/istio-galley unchanged poddisruptionbudget.policy/istio-ingressgateway unchanged poddisruptionbudget.policy/istio-pilot unchanged poddisruptionbudget.policy/istio-policy unchanged poddisruptionbudget.policy/istio-telemetry unchanged horizontalpodautoscaler.autoscaling/istio-egressgateway unchanged horizontalpodautoscaler.autoscaling/istio-ingressgateway unchanged horizontalpodautoscaler.autoscaling/istio-pilot unchanged horizontalpodautoscaler.autoscaling/istio-policy unchanged horizontalpodautoscaler.autoscaling/istio-telemetry unchanged job.batch/istio-cleanup-secrets-1.1.6 unchanged job.batch/istio-grafana-post-install-1.1.6 unchanged job.batch/istio-security-post-install-1.1.6 unchanged attributemanifest.config.istio.io/istioproxy 
created attributemanifest.config.istio.io/kubernetes created handler.config.istio.io/kubernetesenv unchanged handler.config.istio.io/prometheus unchanged handler.config.istio.io/stdio unchanged kubernetes.config.istio.io/attributes unchanged logentry.config.istio.io/accesslog created logentry.config.istio.io/tcpaccesslog created metric.config.istio.io/requestcount created metric.config.istio.io/requestduration created metric.config.istio.io/requestsize created metric.config.istio.io/responsesize created metric.config.istio.io/tcpbytereceived created metric.config.istio.io/tcpbytesent created metric.config.istio.io/tcpconnectionsclosed created metric.config.istio.io/tcpconnectionsopened created rule.config.istio.io/kubeattrgenrulerule created rule.config.istio.io/promhttp created rule.config.istio.io/promtcp created rule.config.istio.io/promtcpconnectionclosed created rule.config.istio.io/promtcpconnectionopen created rule.config.istio.io/stdio unchanged rule.config.istio.io/stdiotcp unchanged rule.config.istio.io/tcpkubeattrgenrulerule unchanged destinationrule.networking.istio.io/istio-policy unchanged destinationrule.networking.istio.io/istio-telemetry unchanged INFO[0026] Successfully applied application istio-install filename="kustomize/kustomize.go:209" INFO[0026] Deploying application cluster-local-gateway filename="kustomize/kustomize.go:172" namespace/istio-system configured serviceaccount/cluster-local-gateway-service-account created serviceaccount/istio-multi configured clusterrole.rbac.authorization.k8s.io/cluster-local-gateway-istio-system unchanged clusterrole.rbac.authorization.k8s.io/istio-reader configured clusterrolebinding.rbac.authorization.k8s.io/cluster-local-gateway-istio-system unchanged configmap/cluster-local-gateway-parameters-tbbdb2842d created service/cluster-local-gateway created deployment.apps/cluster-local-gateway created poddisruptionbudget.policy/cluster-local-gateway created 
horizontalpodautoscaler.autoscaling/cluster-local-gateway created INFO[0026] Successfully applied application cluster-local-gateway filename="kustomize/kustomize.go:209" INFO[0026] Deploying application istio filename="kustomize/kustomize.go:172" clusterrole.rbac.authorization.k8s.io/kubeflow-istio-admin configured clusterrole.rbac.authorization.k8s.io/kubeflow-istio-edit unchanged clusterrole.rbac.authorization.k8s.io/kubeflow-istio-view unchanged configmap/istio-parameters-t6hhgfg9k2 created gateway.networking.istio.io/kubeflow-gateway created serviceentry.networking.istio.io/google-api-entry created serviceentry.networking.istio.io/google-storage-api-entry created virtualservice.networking.istio.io/google-api-vs created virtualservice.networking.istio.io/google-storage-api-vs created virtualservice.networking.istio.io/grafana-vs created clusterrbacconfig.rbac.istio.io/default unchanged INFO[0026] Successfully applied application istio filename="kustomize/kustomize.go:209" INFO[0026] Deploying application add-anonymous-user-filter filename="kustomize/kustomize.go:172" envoyfilter.networking.istio.io/add-user-filter created INFO[0026] Successfully applied application add-anonymous-user-filter filename="kustomize/kustomize.go:209" INFO[0026] Deploying application application-crds filename="kustomize/kustomize.go:172" customresourcedefinition.apiextensions.k8s.io/applications.app.k8s.io unchanged INFO[0026] Successfully applied application application-crds filename="kustomize/kustomize.go:209" INFO[0026] Deploying application application filename="kustomize/kustomize.go:172" serviceaccount/application-controller-service-account created clusterrole.rbac.authorization.k8s.io/application-controller-cluster-role unchanged clusterrolebinding.rbac.authorization.k8s.io/application-controller-cluster-role-binding unchanged configmap/application-controller-parameters created service/application-controller-service created statefulset.apps/application-controller-stateful-set 
created application.app.k8s.io/kubeflow created INFO[0027] Successfully applied application application filename="kustomize/kustomize.go:209" INFO[0027] Deploying application cert-manager-crds filename="kustomize/kustomize.go:172" customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io unchanged customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io unchanged customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io unchanged customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io unchanged customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io unchanged customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io unchanged INFO[0027] Successfully applied application cert-manager-crds filename="kustomize/kustomize.go:209" INFO[0027] Deploying application cert-manager-kube-system-resources filename="kustomize/kustomize.go:172" role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection unchanged role.rbac.authorization.k8s.io/cert-manager:leaderelection unchanged rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection configured rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:webhook-authentication-reader configured rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection configured configmap/cert-manager-kube-params-parameters unchanged INFO[0027] Successfully applied application cert-manager-kube-system-resources filename="kustomize/kustomize.go:209" INFO[0027] Deploying application cert-manager filename="kustomize/kustomize.go:172" namespace/cert-manager created mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook configured serviceaccount/cert-manager created serviceaccount/cert-manager-cainjector created serviceaccount/cert-manager-webhook created clusterrole.rbac.authorization.k8s.io/cert-manager-edit unchanged clusterrole.rbac.authorization.k8s.io/cert-manager-view 
unchanged clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:webhook-requester unchanged clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector unchanged clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates unchanged clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges unchanged clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers unchanged clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim unchanged clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers unchanged clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders unchanged clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector unchanged clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates unchanged clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges unchanged clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers unchanged clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim unchanged clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers unchanged clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders unchanged clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:auth-delegator configured configmap/cert-manager-parameters created service/cert-manager created service/cert-manager-webhook created deployment.apps/cert-manager created deployment.apps/cert-manager-cainjector created deployment.apps/cert-manager-webhook created apiservice.apiregistration.k8s.io/v1beta1.webhook.cert-manager.io unchanged application.app.k8s.io/cert-manager created clusterissuer.cert-manager.io/kubeflow-self-signing-issuer unchanged validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook configured INFO[0028] Successfully applied application cert-manager filename="kustomize/kustomize.go:209" 
INFO[0028] Deploying application metacontroller filename="kustomize/kustomize.go:172" customresourcedefinition.apiextensions.k8s.io/compositecontrollers.metacontroller.k8s.io unchanged customresourcedefinition.apiextensions.k8s.io/controllerrevisions.metacontroller.k8s.io unchanged customresourcedefinition.apiextensions.k8s.io/decoratorcontrollers.metacontroller.k8s.io unchanged serviceaccount/meta-controller-service created clusterrolebinding.rbac.authorization.k8s.io/meta-controller-cluster-role-binding unchanged statefulset.apps/metacontroller created INFO[0028] Successfully applied application metacontroller filename="kustomize/kustomize.go:209" INFO[0028] Deploying application argo filename="kustomize/kustomize.go:172" customresourcedefinition.apiextensions.k8s.io/workflows.argoproj.io unchanged serviceaccount/argo created serviceaccount/argo-ui created clusterrole.rbac.authorization.k8s.io/argo unchanged clusterrole.rbac.authorization.k8s.io/argo-ui unchanged clusterrolebinding.rbac.authorization.k8s.io/argo unchanged clusterrolebinding.rbac.authorization.k8s.io/argo-ui unchanged configmap/workflow-controller-configmap created configmap/workflow-controller-parameters created service/argo-ui created deployment.apps/argo-ui created deployment.apps/workflow-controller created application.app.k8s.io/argo created virtualservice.networking.istio.io/argo-ui created INFO[0028] Successfully applied application argo filename="kustomize/kustomize.go:209" INFO[0028] Deploying application kubeflow-roles filename="kustomize/kustomize.go:172" clusterrole.rbac.authorization.k8s.io/kubeflow-admin configured clusterrole.rbac.authorization.k8s.io/kubeflow-edit configured clusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-admin unchanged clusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-edit unchanged clusterrole.rbac.authorization.k8s.io/kubeflow-kubernetes-view unchanged clusterrole.rbac.authorization.k8s.io/kubeflow-view configured INFO[0028] Successfully 
applied application kubeflow-roles filename="kustomize/kustomize.go:209" INFO[0028] Deploying application centraldashboard filename="kustomize/kustomize.go:172" serviceaccount/centraldashboard created role.rbac.authorization.k8s.io/centraldashboard created clusterrole.rbac.authorization.k8s.io/centraldashboard unchanged rolebinding.rbac.authorization.k8s.io/centraldashboard created clusterrolebinding.rbac.authorization.k8s.io/centraldashboard unchanged configmap/parameters created service/centraldashboard created deployment.apps/centraldashboard created application.app.k8s.io/centraldashboard created virtualservice.networking.istio.io/centraldashboard created INFO[0029] Successfully applied application centraldashboard filename="kustomize/kustomize.go:209" INFO[0029] Deploying application bootstrap filename="kustomize/kustomize.go:172" serviceaccount/admission-webhook-bootstrap-service-account created clusterrole.rbac.authorization.k8s.io/admission-webhook-bootstrap-cluster-role unchanged clusterrolebinding.rbac.authorization.k8s.io/admission-webhook-bootstrap-cluster-role-binding unchanged configmap/admission-webhook-bootstrap-config-map created statefulset.apps/admission-webhook-bootstrap-stateful-set created application.app.k8s.io/bootstrap created INFO[0029] Successfully applied application bootstrap filename="kustomize/kustomize.go:209" INFO[0029] Deploying application webhook filename="kustomize/kustomize.go:172" customresourcedefinition.apiextensions.k8s.io/poddefaults.kubeflow.org unchanged mutatingwebhookconfiguration.admissionregistration.k8s.io/admission-webhook-mutating-webhook-configuration configured serviceaccount/admission-webhook-service-account created clusterrole.rbac.authorization.k8s.io/admission-webhook-cluster-role unchanged clusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-admin configured clusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-edit configured 
clusterrole.rbac.authorization.k8s.io/admission-webhook-kubeflow-poddefaults-view unchanged clusterrolebinding.rbac.authorization.k8s.io/admission-webhook-cluster-role-binding unchanged configmap/admission-webhook-admission-webhook-parameters created service/admission-webhook-service created deployment.apps/admission-webhook-deployment created application.app.k8s.io/webhook created INFO[0029] Successfully applied application webhook filename="kustomize/kustomize.go:209" INFO[0029] Deploying application jupyter-web-app filename="kustomize/kustomize.go:172" serviceaccount/jupyter-web-app-service-account created role.rbac.authorization.k8s.io/jupyter-web-app-jupyter-notebook-role created clusterrole.rbac.authorization.k8s.io/jupyter-web-app-cluster-role unchanged clusterrole.rbac.authorization.k8s.io/jupyter-web-app-kubeflow-notebook-ui-admin configured clusterrole.rbac.authorization.k8s.io/jupyter-web-app-kubeflow-notebook-ui-edit unchanged clusterrole.rbac.authorization.k8s.io/jupyter-web-app-kubeflow-notebook-ui-view unchanged rolebinding.rbac.authorization.k8s.io/jupyter-web-app-jupyter-notebook-role-binding created clusterrolebinding.rbac.authorization.k8s.io/jupyter-web-app-cluster-role-binding unchanged configmap/jupyter-web-app-jupyter-web-app-config created configmap/jupyter-web-app-parameters created service/jupyter-web-app-service created deployment.apps/jupyter-web-app-deployment created application.app.k8s.io/jupyter-web-app created virtualservice.networking.istio.io/jupyter-web-app created INFO[0030] Successfully applied application jupyter-web-app filename="kustomize/kustomize.go:209" INFO[0030] Deploying application spark-operator filename="kustomize/kustomize.go:172" customresourcedefinition.apiextensions.k8s.io/scheduledsparkapplications.sparkoperator.k8s.io unchanged customresourcedefinition.apiextensions.k8s.io/sparkapplications.sparkoperator.k8s.io unchanged serviceaccount/spark-operatoroperator-sa created serviceaccount/spark-operatorspark 
created
role.rbac.authorization.k8s.io/spark-operatorspark-role created
clusterrole.rbac.authorization.k8s.io/spark-operatoroperator-cr unchanged
rolebinding.rbac.authorization.k8s.io/spark-operatorspark-role-binding created
clusterrolebinding.rbac.authorization.k8s.io/spark-operatorsparkoperator-crb unchanged
deployment.apps/spark-operatorsparkoperator created
application.app.k8s.io/spark-operator created
INFO[0030] Successfully applied application spark-operator filename="kustomize/kustomize.go:209"
INFO[0030] Deploying application metadata filename="kustomize/kustomize.go:172"
ERRO[0030] error evaluating kustomization manifest for metadata Error env source files: [secrets.env]: evalsymlink failure on '/root/kubeflow-install/my-kubeflow/kustomize/metadata/secrets.env' : lstat /root/kubeflow-install/my-kubeflow/kustomize/metadata/secrets.env: no such file or directory filename="kustomize/kustomize.go:175"
Error: failed to apply: (kubeflow.error): Code 500 with message: kfApp Apply failed for kustomize: (kubeflow.error): Code 500 with message: error evaluating kustomization manifest for metadata Error env source files: [secrets.env]: evalsymlink failure on '/root/kubeflow-install/my-kubeflow/kustomize/metadata/secrets.env' : lstat /root/kubeflow-install/my-kubeflow/kustomize/metadata/secrets.env: no such file or directory
Usage:
  kfctl apply -f ${CONFIG} [flags]

Flags:
  -f, --file string   Static config file to use. Can be either a local path:
                          export CONFIG=./kfctl_gcp_iap.yaml
                      or a URL:
                          export CONFIG=https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_gcp_iap.v1.0.0.yaml
                          export CONFIG=https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_istio_dex.v1.0.0.yaml
                          export CONFIG=https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_aws.v1.0.0.yaml
                          export CONFIG=https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.0.yaml
                          kfctl apply -V --file=${CONFIG}
  -h, --help          help for apply
  -V, --verbose       verbose output default is false

failed to apply: (kubeflow.error): Code 500 with message: kfApp Apply failed for kustomize: (kubeflow.error): Code 500 with message: error evaluating kustomization manifest for metadata Error env source files: [secrets.env]: evalsymlink failure on '/root/kubeflow-install/my-kubeflow/kustomize/metadata/secrets.env' : lstat /root/kubeflow-install/my-kubeflow/kustomize/metadata/secrets.env: no such file or directory
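
The error above says kustomize could not find the secrets.env file that the metadata application refers to. As a rough diagnostic sketch (not a fix), the script below lists every .env file that a kustomization.yaml mentions but that is missing on disk. The KF_DIR default is an assumption taken from the paths in the log, find_missing_env_files is a hypothetical helper name, and the grep pattern is only a heuristic, not a real YAML parser.

```shell
#!/usr/bin/env bash
# Report .env files that a kustomization.yaml mentions but that are absent on disk.
# KF_DIR is an assumed default copied from the paths in the log; override as needed.
KF_DIR="${KF_DIR:-/root/kubeflow-install/my-kubeflow/kustomize}"

# Hypothetical helper: scan every kustomization.yaml under the given root.
find_missing_env_files() {
  local root="$1" kfile dir name
  find "$root" -type f -name kustomization.yaml 2>/dev/null | sort | while IFS= read -r kfile; do
    dir="$(dirname "$kfile")"
    # Heuristic: treat any token ending in ".env" as a referenced env file.
    grep -oE '[A-Za-z0-9_./-]+\.env' "$kfile" | sort -u | while IFS= read -r name; do
      [ -e "$dir/$name" ] || echo "missing: $dir/$name"
    done
  done
}

find_missing_env_files "$KF_DIR"
```

If this prints a line for metadata/secrets.env, it confirms that the generated kustomize directory is incomplete rather than pointing to a problem with the cluster itself.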

arnoldas500 commented 4 years ago

I am having the same problem and have not found a solution.

supertetelman commented 4 years ago

This looks like it is related to a bug in the istio manifest. There is currently no working version of that manifest; see https://github.com/kubeflow/manifests/issues/1290.

For now, the dex_istio manifest is working, so you should be able to install and run Kubeflow with ./scripts/k8s_deploy_kubeflow.sh -x.

I'll update this bug and the kubeflow script as soon as a working version of the manifest is available.

arnoldas500 commented 4 years ago

Specifying the -x option leads to a warning:

WARN[0016] Encountered error applying application istio-install: (kubeflow.error): Code 500 with message: Apply.Run Error [error when creating "/tmp/kout316433951": Internal error occurred: failed calling webhook "mixer.validation.istio.io": Post https://istio-galley.istio-system.svc:443/admitmixer?timeout=30s: dial tcp 10.233.28.156:443: connect: connection refused, error when creating "/tmp/kout316433951": Internal error occurred: failed calling webhook "pilot.validation.istio.io": Post https://istio-galley.istio-system.svc:443/admitpilot?timeout=30s: dial tcp 10.233.28.156:443: connect: connection refused] filename="kustomize/kustomize.go:202"
WARN[0016] Will retry in 3 seconds. filename="kustomize/kustomize.go:203"

Has anyone else experienced this?
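
Both webhook calls in the warning are refused by the istio-galley service in the istio-system namespace, so one plausible first check is whether the pods in that namespace are ready yet. The sketch below filters `kubectl get pods --no-headers` output for pods whose READY count is incomplete. The helper name not_ready_pods is hypothetical, and it assumes the default column layout (NAME READY STATUS RESTARTS AGE).

```shell
#!/usr/bin/env bash
# Print pods whose READY column (e.g. 0/1) shows fewer ready containers than expected.
# Reads `kubectl get pods --no-headers` output on stdin; Completed pods are ignored.
not_ready_pods() {
  awk '{ split($2, r, "/"); if ($3 != "Completed" && r[1] != r[2]) print $1, $2, $3 }'
}

# Run the live check only when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl -n istio-system get pods --no-headers | not_ready_pods
else
  echo "kubectl not found on PATH; skipping live check" >&2
fi
```

If an istio-galley pod shows up here, the warning may simply mean the webhook backend was not up yet when kfctl tried to apply istio-install, which would fit the "Will retry in 3 seconds" message.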

supertetelman commented 4 years ago

I have never seen that @arnoldas500. Did you cleanly remove any previous/failed installs of kubeflow before running the installation?

arnoldas500 commented 4 years ago

Yes, I ran the following to remove everything:

kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get -n istio-system
kubectl get apiservice | grep False
kubectl delete apiservice v1beta1.webhook.cert-manager.io
kubectl delete apiservice v1beta1.custom.metrics.k8s.io
kubectl delete validatingwebhookconfigurations --all
kubectl delete mutatingwebhookconfigurations --all
kubectl delete crds --all

./scripts/k8s_deploy_kubeflow.sh -D

supertetelman commented 4 years ago

We've recently pushed a few changes to the script to do a better job cleaning up.

I've seen a few cases where, after deletion, the namespaces get stuck in a Terminating state. Can you check for that?
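
A quick way to run that check might look like the sketch below, which prints the names of namespaces whose STATUS column reads Terminating. The helper name terminating_namespaces is hypothetical, and it assumes the default `kubectl get namespaces --no-headers` column layout (NAME STATUS AGE).

```shell
#!/usr/bin/env bash
# Print the names of namespaces whose STATUS column reads "Terminating".
# Reads `kubectl get namespaces --no-headers` output on stdin.
terminating_namespaces() {
  awk '$2 == "Terminating" { print $1 }'
}

# Run the live check only when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get namespaces --no-headers | terminating_namespaces
else
  echo "kubectl not found on PATH; skipping live check" >&2
fi
```

For any namespace that is listed, `kubectl get namespace <name> -o yaml` shows the status conditions and finalizers that are holding up deletion.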

I believe the initial issue has been resolved and we're now hitting environmental issues, so we are unmarking this as critical.