gluster / gluster-kubernetes

GlusterFS Native Storage Service for Kubernetes
Apache License 2.0

Unable to create node: New Node doesn't have glusterd running #620

Open rysyd opened 4 years ago

rysyd commented 4 years ago

Hi, I don't know what went wrong; loading the topology fails:

[root@k8sm90 heketi]# heketi-cli -s $HEKETI_CLI_SERVER --user 'admin' --secret 'My Secret' topology load --json=topology-sample.json
Creating cluster ... ID: a7e089fea96306ab96e9b8aa02bed584
        Allowing file volumes on cluster.
        Allowing block volumes on cluster.
        Creating node 192.168.1.90 ... Unable to create node: New Node doesn't have glusterd running
        Creating node 192.168.1.91 ... Unable to create node: New Node doesn't have glusterd running
        Creating node 192.168.1.92 ... Unable to create node: New Node doesn't have glusterd running

topology-sample.json:

{
    "clusters": [
        {
            "nodes": [
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "192.168.1.90"
                            ],
                            "storage": [
                                "192.168.1.90"
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        {
                            "name": "/dev/sdb",
                            "destroydata": false
                        }
                    ]
                },
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "192.168.1.91"
                            ],
                            "storage": [
                                "192.168.1.91"
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        {
                            "name": "/dev/sdb",
                            "destroydata": false
                        }
                    ]
                },
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "192.168.1.92"
                            ],
                            "storage": [
                                "192.168.1.92"
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        {
                            "name": "/dev/sdb",
                            "destroydata": false
                        }
                    ]
                }
            ]
        }
    ]
}
[root@k8sm90 heketi]# kubectl get pods
NAME                                                  READY   STATUS      RESTARTS   AGE
dapi-test-pod                                         0/1     Completed   0          4d19h
deploy-heketi-6c687b4b84-xjncs                        1/1     Running     0          79m
glusterfs-5tk8m                                       1/1     Running     0          89m
glusterfs-7l2jk                                       1/1     Running     0          89m
glusterfs-r9bgz                                       1/1     Running     0          89m
nginx-deployment-9f5dd848b-28sxl                      1/1     Running     0          20h
pi-xwhcd                                              0/1     Completed   0          26h
pod-flag                                              1/1     Running     0          45h
redis-master-30a30f972c365e831700ecde3264dd55-ck754   1/1     Running     0          18h
[root@k8sm90 heketi]# kubectl get svc
NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
deploy-heketi   ClusterIP   10.244.80.108   <none>        8080/TCP       79m
kubernetes      ClusterIP   10.244.0.1      <none>        443/TCP        7d
nginx-service   NodePort    10.244.162.23   <none>        80:32600/TCP   7d
[root@k8sm90 heketi]# curl http://10.244.80.108:8080/hello
Hello from Heketi
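
All three glusterfs pods report Running, so it is worth confirming that glusterd itself is up inside them before trusting the error message. A quick check, assuming the stock gluster-centos image (which runs systemd) and a pod name from the listing above:

# should print "active" if glusterd is running inside the pod
kubectl exec glusterfs-5tk8m -- systemctl is-active glusterd

If it is active, the "New Node doesn't have glusterd running" message is misleading and the real failure is elsewhere; the heketi log below shows what actually goes wrong.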

deploy-heketi log:

[root@k8sm90 heketi]# kubectl logs deploy-heketi-6c687b4b84-xjncs
……
[negroni] 2019-11-13T04:07:39Z | 200 |   297.726µs | 10.244.80.108:8080 | GET /clusters
[negroni] 2019-11-13T04:07:39Z | 201 |   1.797175ms | 10.244.80.108:8080 | POST /clusters
[cmdexec] INFO 2019/11/13 04:07:39 Check Glusterd service status in node 192.168.1.90
[kubeexec] ERROR 2019/11/13 04:07:39 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: Get https://10.244.0.1:443/api/v1/namespaces/default/pods?labelSelector=glusterfs-node: x509: certificate is valid for 127.0.0.1, 192.168.1.1, 192.168.1.90, not 10.244.0.1
[kubeexec] ERROR 2019/11/13 04:07:39 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods
[cmdexec] ERROR 2019/11/13 04:07:39 heketi/executors/cmdexec/peer.go:80:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods
[heketi] ERROR 2019/11/13 04:07:39 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list of pods
[heketi] ERROR 2019/11/13 04:07:39 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[negroni] 2019-11-13T04:07:39Z | 400 |   9.029799ms | 10.244.80.108:8080 | POST /nodes
[cmdexec] INFO 2019/11/13 04:07:39 Check Glusterd service status in node 192.168.1.91
[negroni] 2019-11-13T04:07:39Z | 400 |   10.046541ms | 10.244.80.108:8080 | POST /nodes
[kubeexec] ERROR 2019/11/13 04:07:39 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: Get https://10.244.0.1:443/api/v1/namespaces/default/pods?labelSelector=glusterfs-node: x509: certificate is valid for 127.0.0.1, 192.168.1.1, 192.168.1.90, not 10.244.0.1
[kubeexec] ERROR 2019/11/13 04:07:39 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods
[cmdexec] ERROR 2019/11/13 04:07:39 heketi/executors/cmdexec/peer.go:80:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods
[heketi] ERROR 2019/11/13 04:07:39 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list of pods
[heketi] ERROR 2019/11/13 04:07:39 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[cmdexec] INFO 2019/11/13 04:07:39 Check Glusterd service status in node 192.168.1.92
[kubeexec] ERROR 2019/11/13 04:07:39 heketi/pkg/remoteexec/kube/target.go:134:kube.TargetDaemonSet.GetTargetPod: Get https://10.244.0.1:443/api/v1/namespaces/default/pods?labelSelector=glusterfs-node: x509: certificate is valid for 127.0.0.1, 192.168.1.1, 192.168.1.90, not 10.244.0.1
[kubeexec] ERROR 2019/11/13 04:07:39 heketi/pkg/remoteexec/kube/target.go:135:kube.TargetDaemonSet.GetTargetPod: Failed to get list of pods
[cmdexec] ERROR 2019/11/13 04:07:39 heketi/executors/cmdexec/peer.go:80:cmdexec.(*CmdExecutor).GlusterdCheck: Failed to get list of pods
[heketi] ERROR 2019/11/13 04:07:39 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Failed to get list of pods
[heketi] ERROR 2019/11/13 04:07:39 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[negroni] 2019-11-13T04:07:39Z | 400 |   8.675361ms | 10.244.80.108:8080 | POST /nodes
[negroni] 2019-11-13T04:07:39Z | 200 |   358.12µs | 10.244.80.108:8080 | GET /clusters/a7e089fea96306ab96e9b8aa02bed584
[heketi] INFO 2019/11/13 04:07:39 Deleted cluster [a7e089fea96306ab96e9b8aa02bed584]
[negroni] 2019-11-13T04:07:39Z | 200 |   1.589577ms | 10.244.80.108:8080 | DELETE /clusters/a7e089fea96306ab96e9b8aa02bed584
[heketi] INFO 2019/11/13 04:08:56 Starting Node Health Status refresh
[heketi] INFO 2019/11/13 04:08:56 Cleaned 0 nodes from health cache
[heketi] INFO 2019/11/13 04:10:56 Starting Node Health Status refresh
[heketi] INFO 2019/11/13 04:10:56 Cleaned 0 nodes from health cache

How can I fix it?
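
The repeated x509 error in the log is the actual failure: heketi's kube executor cannot list the glusterfs pods because the kube-apiserver's serving certificate was issued for 127.0.0.1, 192.168.1.1, and 192.168.1.90, but not for the in-cluster service IP 10.244.0.1 that heketi uses to reach the API. That mismatch suggests the certificate was generated for a different service CIDR than the one the cluster is running with now. On a kubeadm cluster, the SANs on that certificate can be inspected like this (paths are the kubeadm defaults; adjust if your cluster was set up differently):

# run on the master node; lists the IPs and DNS names the apiserver cert is valid for
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep -A1 'Subject Alternative Name'

If the ClusterIP of the kubernetes service (10.244.0.1 here) is missing, the certificate needs to be regenerated to include it, e.g. on kubeadm by removing apiserver.crt and apiserver.key under /etc/kubernetes/pki, re-running "kubeadm init phase certs apiserver", and restarting the apiserver.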

mart3051 commented 4 years ago

It is the same for me. I am trying to run it on Amazon EKS with nodes running CentOS.

freeeflyer commented 4 years ago

I had the same issue. An "abort" followed by a fresh "deploy" did the trick (I don't know why).

Be warned, though: the "Creating node" steps are very slow.
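
For reference, assuming the cluster was deployed with the gk-deploy script from this repo, that abort-and-redeploy sequence is roughly (flag names as in the project README; use your own topology file):

# tear down the partially created heketi/glusterfs resources, then start over
./gk-deploy --abort
./gk-deploy -g topology.json

The abort removes the resources the previous run created, so the next deploy starts from a clean slate, which may explain why it helps.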