shikanon / kubeflow-manifests

One-click Kubeflow installation files for users in mainland China
GNU General Public License v3.0

(Non-kind install) Pods fail to come up completely [waiting for a volume to be created, either by external provisioner "rancher.io/local-path"] #60

Closed Corezcy closed 3 years ago

Corezcy commented 3 years ago

After installing Docker and Kubernetes separately, I used install.py to install Kubeflow. The pod status is as follows:

kubectl get pod -A
NAMESPACE          NAME                                       READY   STATUS              RESTARTS   AGE
auth               dex-68b9bb7889-7t5zp                       1/1     Running             0          107m
cert-manager       cert-manager-649f8dfd4b-bp8hh              1/1     Running             0          4h39m
cert-manager       cert-manager-cainjector-75cd8bbf6d-hpnkn   1/1     Running             0          4h39m
cert-manager       cert-manager-webhook-5b5cd9bd6f-dqjkq      1/1     Running             1          4h39m
istio-system       authservice-0                              0/1     Pending             0          4h38m
istio-system       cluster-local-gateway-74d9fd9586-prw4v     0/1     ContainerCreating   0          107m
istio-system       istio-ingressgateway-8bf685655-6s6bn       0/1     ContainerCreating   0          107m
istio-system       istiod-6f99d55d99-bqln5                    0/1     Running             2          2m40s
istio-system       istiod-756554b96b-8l49j                    0/1     CrashLoopBackOff    21         107m
knative-eventing   broker-controller-cfb5ccb77-hrjtf          1/1     Running             0          4h37m
knative-eventing   eventing-controller-8657cd4b8-t65gk        1/1     Running             0          4h37m
knative-eventing   eventing-webhook-67f86f4d4d-zwh9g          0/1     CrashLoopBackOff    45         4h37m
knative-eventing   imc-controller-68bd666784-4qw6d            1/1     Running             0          4h37m
knative-eventing   imc-dispatcher-78ff9dd847-f7gkp            0/1     CrashLoopBackOff    45         4h37m
knative-serving    activator-54b777546f-dp47d                 1/1     Running             0          4h37m
knative-serving    autoscaler-79bbc84d47-jbqzj                1/1     Running             0          4h37m
knative-serving    controller-dd65cb4b7-wrwbd                 1/1     Running             0          4h37m
knative-serving    istio-webhook-5f545fc44b-7j2bg             0/1     CrashLoopBackOff    45         4h37m
knative-serving    networking-istio-6b6df495d6-44wzr          1/1     Running             0          4h37m
knative-serving    webhook-9ff656f95-mzhj6                    1/1     Running             0          4h37m
kube-system        coredns-54d67798b7-9tvk6                   1/1     Running             25         227d
kube-system        coredns-54d67798b7-wqsts                   1/1     Running             27         227d
kube-system        etcd-k8s-master01                          1/1     Running             26         227d
kube-system        kube-apiserver-k8s-master01                1/1     Running             322        227d
kube-system        kube-controller-manager-k8s-master01       1/1     Running             32         226d
kube-system        kube-flannel-ds-jdh7f                      1/1     Running             3562       226d
kube-system        kube-flannel-ds-jssrl                      1/1     Running             1          46h
kube-system        kube-flannel-ds-lkqzd                      1/1     Running             41         225d
kube-system        kube-proxy-76c8r                           1/1     Running             1          46h
kube-system        kube-proxy-9z7jf                           1/1     Running             16         183d
kube-system        kube-proxy-tnzqv                           1/1     Running             26         180d
kube-system        kube-scheduler-k8s-master01                1/1     Running             32         226d
kube-system        metrics-server-v0.3.6-876b95bc8-jfjzj      2/2     Running             49         224d
tigera-operator    tigera-operator-7c5d47c4b5-kq2nm           1/1     Running             3          46h

Events explaining why istio-ingressgateway-8bf685655-6s6bn did not start:

Events:
  Type     Reason       Age                    From     Message
  ----     ------       ----                   ----     -------
  Warning  FailedMount  48m (x5 over 87m)      kubelet  Unable to attach or mount volumes: unmounted volumes=[istiod-ca-cert], unattached volumes=[istio-data podinfo ingressgateway-certs ingressgateway-ca-certs istio-ingressgateway-service-account-token-8zpw8 istio-envoy config-volume istiod-ca-cert]: timed out waiting for the condition
  Warning  FailedMount  19m (x4 over 103m)     kubelet  Unable to attach or mount volumes: unmounted volumes=[istiod-ca-cert], unattached volumes=[config-volume istiod-ca-cert istio-data podinfo ingressgateway-certs ingressgateway-ca-certs istio-ingressgateway-service-account-token-8zpw8 istio-envoy]: timed out waiting for the condition
  Warning  FailedMount  7m54s (x4 over 71m)    kubelet  Unable to attach or mount volumes: unmounted volumes=[istiod-ca-cert], unattached volumes=[istiod-ca-cert istio-data podinfo ingressgateway-certs ingressgateway-ca-certs istio-ingressgateway-service-account-token-8zpw8 istio-envoy config-volume]: timed out waiting for the condition
  Warning  FailedMount  3m46s (x60 over 109m)  kubelet  MountVolume.SetUp failed for volume "istiod-ca-cert" : configmap "istio-ca-root-cert" not found

Other:

kubectl get mutatingwebhookconfigurations -A
NAME                                               WEBHOOKS   AGE
admission-webhook-mutating-webhook-configuration   1          44h
cert-manager-webhook                               1          44h
inferenceservice.serving.kubeflow.org              3          44h
istio-sidecar-injector                             1          44h
katib.kubeflow.org                                 2          44h
sinkbindings.webhook.sources.knative.dev           1          44h
webhook.eventing.knative.dev                       1          44h
webhook.istio.networking.internal.knative.dev      1          44h
webhook.serving.knative.dev                        1          44h

I have spent a whole day searching through posts and really don't know what to change. Any help would be appreciated!

Corezcy commented 3 years ago
$ kubectl get all -n kubeflow
NAME                                                                TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
service/admission-webhook-service                                   ClusterIP   10.1.17.73     <none>        443/TCP             5h9m
service/cache-server                                                ClusterIP   10.1.7.205     <none>        443/TCP             5h10m
service/centraldashboard                                            ClusterIP   10.1.128.167   <none>        80/TCP              5h9m
service/jupyter-web-app-service                                     ClusterIP   10.1.57.31     <none>        80/TCP              5h9m
service/katib-controller                                            ClusterIP   10.1.96.248    <none>        443/TCP,8080/TCP    5h10m
service/katib-db-manager                                            ClusterIP   10.1.175.201   <none>        6789/TCP            5h10m
service/katib-mysql                                                 ClusterIP   10.1.230.240   <none>        3306/TCP            5h10m
service/katib-ui                                                    ClusterIP   10.1.35.241    <none>        80/TCP              5h10m
service/kfserving-controller-manager-metrics-service                ClusterIP   10.1.179.29    <none>        8443/TCP            5h10m
service/kfserving-controller-manager-service                        ClusterIP   10.1.180.172   <none>        443/TCP             5h10m
service/kfserving-webhook-server-service                            ClusterIP   10.1.73.9      <none>        443/TCP             5h10m
service/kubeflow-pipelines-profile-controller                       ClusterIP   10.1.228.247   <none>        80/TCP              5h10m
service/metadata-envoy-service                                      ClusterIP   10.1.169.123   <none>        9090/TCP            5h10m
service/metadata-grpc-service                                       ClusterIP   10.1.21.11     <none>        8080/TCP            5h10m
service/minio-service                                               ClusterIP   10.1.132.61    <none>        9000/TCP            5h10m
service/ml-pipeline                                                 ClusterIP   10.1.35.16     <none>        8888/TCP,8887/TCP   5h10m
service/ml-pipeline-ui                                              ClusterIP   10.1.40.129    <none>        80/TCP              5h10m
service/ml-pipeline-visualizationserver                             ClusterIP   10.1.202.177   <none>        8888/TCP            5h10m
service/mysql                                                       ClusterIP   10.1.196.6     <none>        3306/TCP            5h10m
service/notebook-controller-service                                 ClusterIP   10.1.121.222   <none>        443/TCP             5h9m
service/profiles-kfam                                               ClusterIP   10.1.142.182   <none>        8081/TCP            5h9m
service/pytorch-operator                                            ClusterIP   10.1.255.160   <none>        8443/TCP            5h8m
service/tensorboard-controller-controller-manager-metrics-service   ClusterIP   10.1.125.72    <none>        8443/TCP            5h8m
service/tensorboards-web-app-service                                ClusterIP   10.1.158.25    <none>        80/TCP              5h8m
service/tf-job-operator                                             ClusterIP   10.1.213.139   <none>        8443/TCP            5h8m
service/volumes-web-app-service                                     ClusterIP   10.1.63.73     <none>        80/TCP              5h9m
service/workflow-controller-metrics                                 ClusterIP   10.1.140.252   <none>        9090/TCP            5h10m
service/xgboost-operator-service                                    ClusterIP   10.1.6.7       <none>        443/TCP             5h7m

NAME                                                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/admission-webhook-deployment                0/1     0            0           5h9m
deployment.apps/cache-deployer-deployment                   0/1     0            0           5h10m
deployment.apps/cache-server                                0/1     0            0           141m
deployment.apps/centraldashboard                            0/1     0            0           5h9m
deployment.apps/jupyter-web-app-deployment                  0/1     0            0           141m
deployment.apps/katib-controller                            0/1     0            0           5h10m
deployment.apps/katib-db-manager                            0/1     0            0           5h10m
deployment.apps/katib-mysql                                 0/1     0            0           5h10m
deployment.apps/katib-ui                                    0/1     0            0           5h10m
deployment.apps/kubeflow-pipelines-profile-controller       0/1     0            0           141m
deployment.apps/metadata-envoy-deployment                   0/1     0            0           5h10m
deployment.apps/metadata-grpc-deployment                    0/1     0            0           5h10m
deployment.apps/metadata-writer                             0/1     0            0           5h10m
deployment.apps/minio                                       0/1     0            0           141m
deployment.apps/ml-pipeline                                 0/1     0            0           5h10m
deployment.apps/ml-pipeline-persistenceagent                0/1     0            0           5h10m
deployment.apps/ml-pipeline-scheduledworkflow               0/1     0            0           5h10m
deployment.apps/ml-pipeline-ui                              0/1     0            0           5h10m
deployment.apps/ml-pipeline-viewer-crd                      0/1     0            0           5h10m
deployment.apps/ml-pipeline-visualizationserver             0/1     0            0           5h10m
deployment.apps/mpi-operator                                0/1     0            0           5h8m
deployment.apps/mxnet-operator                              0/1     0            0           5h7m
deployment.apps/mysql                                       0/1     0            0           141m
deployment.apps/notebook-controller-deployment              0/1     0            0           5h9m
deployment.apps/profiles-deployment                         0/1     0            0           5h9m
deployment.apps/pytorch-operator                            0/1     0            0           5h8m
deployment.apps/tensorboard-controller-controller-manager   0/1     0            0           5h8m
deployment.apps/tensorboards-web-app-deployment             0/1     0            0           141m
deployment.apps/tf-job-operator                             0/1     0            0           5h8m
deployment.apps/volumes-web-app-deployment                  0/1     0            0           141m
deployment.apps/workflow-controller                         0/1     0            0           141m
deployment.apps/xgboost-operator-deployment                 0/1     0            0           5h7m

NAME                                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/admission-webhook-deployment-5f5cc7968b               1         0         0       5h9m
replicaset.apps/cache-deployer-deployment-64598b6c87                  1         0         0       5h10m
replicaset.apps/cache-server-59d67c7584                               1         0         0       141m
replicaset.apps/centraldashboard-7b6b6cc7fc                           1         0         0       5h9m
replicaset.apps/jupyter-web-app-deployment-7c6974bb88                 1         0         0       141m
replicaset.apps/katib-controller-7b784c44dd                           1         0         0       5h10m
replicaset.apps/katib-db-manager-6c5757dc64                           1         0         0       5h10m
replicaset.apps/katib-mysql-79d75c7444                                1         0         0       5h10m
replicaset.apps/katib-ui-69f5b6795d                                   1         0         0       5h10m
replicaset.apps/kubeflow-pipelines-profile-controller-76c45c8c6b      1         0         0       141m
replicaset.apps/metadata-envoy-deployment-56f745f7fb                  1         0         0       5h10m
replicaset.apps/metadata-grpc-deployment-6494577fdb                   1         0         0       5h10m
replicaset.apps/metadata-writer-b7ff9787                              1         0         0       5h10m
replicaset.apps/minio-cc8f7c6d                                        1         0         0       141m
replicaset.apps/ml-pipeline-66bcb9d79d                                1         0         0       5h10m
replicaset.apps/ml-pipeline-persistenceagent-7fb8f6dc68               1         0         0       5h10m
replicaset.apps/ml-pipeline-scheduledworkflow-64bcfd6596              1         0         0       5h10m
replicaset.apps/ml-pipeline-ui-8578f6685f                             1         0         0       5h10m
replicaset.apps/ml-pipeline-viewer-crd-565fb9b5c5                     1         0         0       5h10m
replicaset.apps/ml-pipeline-visualizationserver-b7c7d49fb             1         0         0       5h10m
replicaset.apps/mpi-operator-794849c566                               1         0         0       5h8m
replicaset.apps/mxnet-operator-6668d797d4                             1         0         0       5h7m
replicaset.apps/mysql-c8d548489                                       1         0         0       141m
replicaset.apps/notebook-controller-deployment-6795dd887b             1         0         0       5h9m
replicaset.apps/profiles-deployment-84bd4f9bc7                        1         0         0       5h9m
replicaset.apps/pytorch-operator-6887749499                           1         0         0       5h8m
replicaset.apps/tensorboard-controller-controller-manager-dd896c8df   1         0         0       5h8m
replicaset.apps/tensorboards-web-app-deployment-5969cd5b68            1         0         0       141m
replicaset.apps/tf-job-operator-ccb48b77b                             1         0         0       5h8m
replicaset.apps/volumes-web-app-deployment-867dfb5b5c                 1         0         0       141m
replicaset.apps/workflow-controller-6885c56f65                        1         0         0       141m
replicaset.apps/xgboost-operator-deployment-665cf9bf8d                1         0         0       5h7m

NAME                                            READY   AGE
statefulset.apps/kfserving-controller-manager   0/1     5h10m
statefulset.apps/metacontroller                 0/1     5h10m
shikanon commented 3 years ago

@Corezcy In your case Istio was not installed successfully. Try uninstalling first and then installing again:

kubectl delete -f manifest1.3/
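
Before uninstalling, it may help to see exactly which pods are unhealthy. The following is a minimal sketch, not part of the repository: `not_ready_pods` is a hypothetical helper that reads `kubectl get pod -A` output on stdin and assumes the column order shown earlier in this thread (NAMESPACE, NAME, READY, STATUS).

```shell
# Hypothetical helper: list pods whose READY count is incomplete or whose
# STATUS is not "Running". Input is the text output of `kubectl get pod -A`.
not_ready_pods() {
  awk 'NR > 1 {
    # READY looks like "0/1"; split it into ready and total container counts.
    split($3, ready, "/")
    if (ready[1] != ready[2] || $4 != "Running") print $1 "/" $2 "\t" $4
  }'
}
```

Typical use would be `kubectl get pod -A | not_ready_pods`. Pods that finished normally (STATUS `Completed`) are also listed by this sketch, so the result is a starting point rather than a verdict.
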
Corezcy commented 3 years ago

Right. I found that when I ran python install.py directly, many images failed to pull along the way, and I don't know why.

Corezcy commented 3 years ago
$ kubectl get pod -n kubeflow
NAME                                                        READY   STATUS             RESTARTS   AGE
admission-webhook-deployment-5f5cc7968b-67mlf               1/1     Running            0          87m
cache-deployer-deployment-64598b6c87-scsp6                  1/2     CrashLoopBackOff   15         87m
cache-server-59d67c7584-txdc6                               0/2     Init:0/1           0          64m
centraldashboard-7b6b6cc7fc-pf4rc                           1/1     Running            0          85m
jupyter-web-app-deployment-6c698ccb99-nfgxg                 1/1     Running            0          65m
katib-controller-7b784c44dd-nbbcc                           1/1     Running            0          84m
katib-db-manager-6c5757dc64-zr777                           0/1     CrashLoopBackOff   19         84m
katib-mysql-79d75c7444-sgp79                                0/1     Pending            0          84m
katib-ui-69f5b6795d-lgll6                                   0/1     CrashLoopBackOff   19         84m
kfserving-controller-manager-0                              2/2     Running            0          84m
kubeflow-pipelines-profile-controller-76c45c8c6b-xjthl      1/1     Running            0          83m
metacontroller-0                                            1/1     Running            0          87m
metadata-envoy-deployment-56f745f7fb-fhgbz                  1/1     Running            0          87m
metadata-grpc-deployment-6494577fdb-pjqzz                   2/2     Running            3          87m
metadata-writer-b7ff9787-xbwgp                              2/2     Running            0          87m
minio-cc8f7c6d-474xg                                        2/2     Running            0          64m
ml-pipeline-66bcb9d79d-z6rlv                                2/2     Running            1          87m
ml-pipeline-persistenceagent-7fb8f6dc68-pbcg5               2/2     Running            0          87m
ml-pipeline-scheduledworkflow-64bcfd6596-78zsj              2/2     Running            0          87m
ml-pipeline-ui-8578f6685f-jzmmk                             2/2     Running            0          87m
ml-pipeline-viewer-crd-565fb9b5c5-r9thj                     1/2     CrashLoopBackOff   20         87m
ml-pipeline-visualizationserver-b7c7d49fb-j89nn             2/2     Running            0          87m
mpi-operator-794849c566-t64zd                               1/1     Running            0          87m
mxnet-operator-6668d797d4-zq8p4                             1/1     Running            0          87m
mysql-9dfc684cd-ldwl6                                       0/2     Pending            0          64m
notebook-controller-deployment-6795dd887b-jsk5q             1/1     Running            0          84m
profiles-deployment-84bd4f9bc7-rcdrb                        2/2     Running            0          87m
pytorch-operator-6887749499-v9rqw                           2/2     Running            0          84m
tensorboard-controller-controller-manager-dd896c8df-h2qn4   3/3     Running            2          84m
tensorboards-web-app-deployment-6f7f7ffc66-5jmvw            1/1     Running            0          64m
tf-job-operator-ccb48b77b-dvpjc                             1/1     Running            0          87m
volumes-web-app-deployment-65595f5694-9ptwq                 1/1     Running            0          64m
workflow-controller-6885c56f65-dx6cg                        2/2     Running            2          64m
xgboost-operator-deployment-665cf9bf8d-tlfln                2/2     Running            2          84m
$ kubectl describe pod cache-server-59d67c7584-txdc6 -n kubeflow
Events:
  Type     Reason       Age                   From     Message
  ----     ------       ----                  ----     -------
  Warning  FailedMount  50m (x3 over 59m)     kubelet  Unable to attach or mount volumes: unmounted volumes=[webhook-tls-certs], unattached volumes=[webhook-tls-certs istiod-ca-cert istio-data istio-envoy istio-token istio-podinfo kubeflow-pipelines-cache-token-sm86v]: timed out waiting for the condition
  Warning  FailedMount  47m                   kubelet  Unable to attach or mount volumes: unmounted volumes=[webhook-tls-certs], unattached volumes=[istio-podinfo kubeflow-pipelines-cache-token-sm86v webhook-tls-certs istiod-ca-cert istio-data istio-envoy istio-token]: timed out waiting for the condition
  Warning  FailedMount  22m (x3 over 54m)     kubelet  Unable to attach or mount volumes: unmounted volumes=[webhook-tls-certs], unattached volumes=[istio-data istio-envoy istio-token istio-podinfo kubeflow-pipelines-cache-token-sm86v webhook-tls-certs istiod-ca-cert]: timed out waiting for the condition
  Warning  FailedMount  18m                   kubelet  Unable to attach or mount volumes: unmounted volumes=[webhook-tls-certs], unattached volumes=[istiod-ca-cert istio-data istio-envoy istio-token istio-podinfo kubeflow-pipelines-cache-token-sm86v webhook-tls-certs]: timed out waiting for the condition
  Warning  FailedMount  8m21s (x35 over 63m)  kubelet  MountVolume.SetUp failed for volume "webhook-tls-certs" : secret "webhook-server-tls" not found
  Warning  FailedMount  2m27s (x8 over 56m)   kubelet  Unable to attach or mount volumes: unmounted volumes=[webhook-tls-certs], unattached volumes=[istio-token istio-podinfo kubeflow-pipelines-cache-token-sm86v webhook-tls-certs istiod-ca-cert istio-data istio-envoy]: timed out waiting for the condition
$ kubectl describe pod katib-db-manager-6c5757dc64-zr777 -n kubeflow
Events:
  Type     Reason     Age                  From     Message
  ----     ------     ----                 ----     -------
  Warning  Unhealthy  11m (x16 over 84m)   kubelet  Liveness probe failed:
  Normal   Pulled     6m6s (x20 over 86m)  kubelet  Container image "registry.cn-shenzhen.aliyuncs.com/tensorbytes/kubeflowkatib-katib-db-manager:v0.11.0-f54bf" already present on machine
  Warning  BackOff    73s (x309 over 84m)  kubelet  Back-off restarting failed container
$ kubectl describe pod ml-pipeline-viewer-crd-565fb9b5c5-r9thj -n kubeflow
Events:
  Type     Reason   Age                 From     Message
  ----     ------   ----                ----     -------
  Warning  BackOff  8s (x383 over 89m)  kubelet  Back-off restarting failed container
$ kubectl describe pod mysql-9dfc684cd-ldwl6 -n kubeflow
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  117s (x63 over 67m)  default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims.

Is this still because the installation did not succeed? I'm completely lost; it just never works.

shikanon commented 3 years ago

> Right. I found that when I ran python install.py directly, many images failed to pull along the way, and I don't know why.

Which images failed to pull?

shikanon commented 3 years ago

@Corezcy Could it be that your PVCs were not created successfully? Try checking:

kubectl get pvc -A
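
To narrow that listing down to the claims that are stuck, a small filter can help. This is a sketch under the assumption that the output has the usual `kubectl get pvc -A` layout (NAMESPACE, NAME, STATUS as the first three columns); both function names are hypothetical.

```shell
# Hypothetical helper: read `kubectl get pvc -A` output on stdin and print
# "namespace/name" for every claim whose STATUS column is Pending.
pending_pvcs() {
  awk 'NR > 1 && $3 == "Pending" { print $1 "/" $2 }'
}

# Hypothetical helper: run `kubectl describe pvc` on each Pending claim so
# the Events section of every stuck claim can be read in one go.
describe_pending_pvcs() {
  kubectl get pvc -A | pending_pvcs | while IFS=/ read -r ns name; do
    # </dev/null keeps the inner command from consuming the loop's input.
    kubectl describe pvc "$name" -n "$ns" </dev/null
  done
}
```

For example, `kubectl get pvc -A | pending_pvcs` prints one `namespace/name` line per Pending claim, and `describe_pending_pvcs` shows the events for each of them.
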
Corezcy commented 3 years ago
$ kubectl get pvc -A
NAMESPACE      NAME              STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
istio-system   authservice-pvc   Pending                                      local-path     29m
kubeflow       katib-mysql       Pending                                      local-path     9m53s
kubeflow       minio-pvc         Pending                                      local-path     27m
kubeflow       mysql-pv-claim    Pending                                      local-path     27m
$ kubectl get pv -A
No resources found

Running kubectl describe on each of them shows the same events:

Events:
  Type    Reason                Age                    From                         Message
  ----    ------                ----                   ----                         -------
  Normal  WaitForFirstConsumer  28m                    persistentvolume-controller  waiting for first consumer to be created before binding
  Normal  ExternalProvisioning  3m51s (x102 over 28m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator
$ kubectl get storageclass
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  28m

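Do you know what is causing this? (As for the images that failed to pull: I deleted everything and re-ran python install.py, and after about 5 or 6 attempts they were all pulled.)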

shikanon commented 3 years ago

@Corezcy This is a PVC problem. Did you install with kind? If not, you can install the local-path-storage found under /local-path and use it as the default StorageClass.
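
A minimal sketch of what marking a default StorageClass could look like. The class name `local-path` is taken from the `kubectl get storageclass` output posted earlier in this thread, the function names are hypothetical, and `storageclass.kubernetes.io/is-default-class` is the standard Kubernetes annotation for default classes.

```shell
# Hypothetical helper: mark the named StorageClass as the cluster default
# by setting the standard is-default-class annotation.
set_default_storageclass() {
  kubectl patch storageclass "$1" \
    -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
}

# Hypothetical helper: read `kubectl get storageclass` output on stdin and
# print the names that kubectl marks with "(default)".
default_storageclasses() {
  awk 'NR > 1 && $2 == "(default)" { print $1 }'
}
```

For example, `set_default_storageclass local-path` followed by `kubectl get storageclass | default_storageclasses`. If more than one name is printed, several classes are marked default and it is probably worth leaving only one. In the output posted above, local-path already shows `(default)` while the PVCs stay Pending, which suggests the provisioner itself, rather than the annotation, is what needs attention there.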

Corezcy commented 3 years ago

No, I did not use kind; I installed Kubernetes directly. OK, I'll try that first.

Corezcy commented 3 years ago

When running python install.py, the following errors appear:

Error from server (NotFound): error when creating "./manifest1.3/033-user-namespace-user-namespace-base.yaml": the server could not find the requested resource (post profiles.kubeflow.org)
b'configmap/default-install-config-9h2h2b6hbk created\n'
start to patch...
Error from server (NotFound): error when deleting "./patch/auth.yaml": the server could not find the requested resource (delete profiles.kubeflow.org kubeflow-user-example-com)
b'configmap "dex" deleted\ndeployment.apps "dex" deleted\nconfigmap "default-install-config-9h2h2b6hbk" deleted\n'
Error from server (NotFound): error when creating "./patch/auth.yaml": the server could not find the requested resource (post profiles.kubeflow.org)
b'configmap/dex created\ndeployment.apps/dex created\nconfigmap/default-install-config-9h2h2b6hbk created\n'

Does this matter? I have already set local-path-storage as the default StorageClass, then uninstalled and reinstalled, but it still doesn't work. Frustrating.
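
The NotFound errors all mention profiles.kubeflow.org, which may mean the API server was not yet serving the Profile custom resource when those files were applied. That reading is an assumption and is not confirmed anywhere in this thread. A minimal sketch for checking it, with a hypothetical function name:

```shell
# Hypothetical helper: block until the Profile CRD reports the Established
# condition, or fail once the timeout (default 60s) expires.
wait_for_profile_crd() {
  kubectl wait --for=condition=established --timeout="${1:-60s}" \
    crd/profiles.kubeflow.org
}
```

For example, `wait_for_profile_crd 120s`. If it returns successfully, re-applying the two files named in the errors may get past the NotFound messages; if it times out, the CRD itself was probably never created.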

Corezcy commented 3 years ago

[Image: lADPD2sQwe5lsvfNDY_NA-o_1002_3471] I suspect this may be the problem.

starstream commented 3 years ago

> @Corezcy This is a PVC problem. Did you install with kind? If not, you can install the local-path-storage found under /local-path and use it as the default StorageClass.

After running kubectl apply -f local-path-storage.yaml, I ran kubectl get pvc -A and found that all PVCs are still Pending:

NAMESPACE      NAME              STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
istio-system   authservice-pvc   Pending                                      local-path     7m55s
kubeflow       katib-mysql       Pending                                      local-path     5m49s
kubeflow       minio-pvc         Pending                                      local-path     6m10s
kubeflow       mysql-pv-claim    Pending                                      local-path     6m10s

Inspecting one of them with kubectl describe pvc authservice-pvc -nistio-system:

Name:          authservice-pvc
Namespace:     istio-system
StorageClass:  local-path
Status:        Pending
Volume:
Labels:
Annotations:   volume.beta.kubernetes.io/storage-provisioner: rancher.io/local-path
               volume.kubernetes.io/selected-node: iz8vb5qemr6glj4sj3kc0gz
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       authservice-0
Events:
  Type    Reason                Age                    From                         Message
  ----    ------                ----                   ----                         -------
  Normal  WaitForFirstConsumer  8m19s                  persistentvolume-controller  waiting for first consumer to be created before binding
  Normal  ExternalProvisioning  2m30s (x25 over 8m19s) persistentvolume-controller  waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator

Corezcy commented 3 years ago

That is exactly the same error I'm getting, but I still don't know how to fix it. Do you have any ideas?

One more question: did you install your Kubernetes with kind?

starstream commented 3 years ago

No, my Kubernetes was not installed with kind. I'm trying to set up an NFS-backed default StorageClass, though I'm not sure whether that will work.

Corezcy commented 3 years ago

I set up an NFS-backed default StorageClass on my side, and it works now! [Image: Screenshot-20210830163012-3712x1816]

starstream commented 3 years ago

Do you have a tutorial for the setup that worked? I followed tutorials I found online, but after setting it up no PV was created automatically.

Corezcy commented 3 years ago

Configure NFS on another machine (acting as a node), then join it to the master. See https://www.cnblogs.com/dakewei/p/11716270.html

Corezcy commented 3 years ago

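> @Corezcy This is a PVC problem. Did you install with kind? If not, you can install the local-path-storage found under /local-path and use it as the default StorageClass.

Thank you very much for your answer!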

mhh12121 commented 2 years ago

> When running python install.py, the following errors appear: [...]
>
> Does this matter? I have already set local-path-storage as the default StorageClass, then uninstalled and reinstalled, but it still doesn't work.

This is probably a PVC problem. The author's version of local-path-storage.yaml seems to have removed some of the helperPod parts, so it may not be compatible with your environment (Kubernetes not installed via kind). You could download a newer one from the rancher repository and try that. I ran into the same problem on a single-machine Kubernetes installed with kubeadm, and I'm using this as a workaround for now.
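
A minimal sketch of that workaround. The manifest URL and the `local-path-storage` namespace and `app=local-path-provisioner` label are assumptions based on the upstream rancher/local-path-provisioner repository, not something stated in this thread, and the function names are hypothetical. Pinning a release tag instead of `master` may be preferable.

```shell
# Assumed upstream location of the manifest (not given in this thread).
LOCAL_PATH_MANIFEST_URL="https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml"

# Hypothetical helper: apply the given manifest (default: the assumed
# upstream URL) and list the provisioner pods afterwards.
reinstall_local_path_provisioner() {
  kubectl apply -f "${1:-$LOCAL_PATH_MANIFEST_URL}" &&
    kubectl -n local-path-storage get pods
}

# Hypothetical helper: show recent provisioner logs, which is where errors
# from creating volumes would be expected to appear.
local_path_provisioner_logs() {
  kubectl -n local-path-storage logs -l app=local-path-provisioner --tail="${1:-50}"
}
```

After `reinstall_local_path_provisioner`, running `kubectl get pvc -A` again should show whether the claims move from Pending to Bound; if they do not, `local_path_provisioner_logs` is a reasonable next place to look.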

wenyangzz commented 2 years ago

Hello!

> Configure NFS on another machine (acting as a node), then join it to the master. See https://www.cnblogs.com/dakewei/p/11716270.html

After configuring NFS, do you still need to run apply -f local-path-storage.yaml?

YeeHen commented 2 years ago

> This is probably a PVC problem. The author's version of local-path-storage.yaml seems to have removed some of the helperPod parts, so it may not be compatible with your environment (Kubernetes not installed via kind). You could download a newer one from the rancher repository and try that.

This method worked for me. I updated local-path-storage.yaml.