gluster / gluster-kubernetes

GlusterFS Native Storage Service for Kubernetes
Apache License 2.0

Unable to create node: New Node doesn't have glusterd running #527

Open guidoilbaldo opened 5 years ago

guidoilbaldo commented 5 years ago

Hi,

I'm trying to deploy heketi on our staging kube cluster (installed on CentOS VMs with rke) using the gk-deploy script. We already have a GlusterFS cluster deployed on 4 separate nodes (with glusterd running and peer status fine), and each of those nodes has a dedicated disk for GlusterFS (/dev/sdb or /dev/vdb, depending on the hypervisor). Below is our topology.json file:

  {
    "clusters": [
      {
        "nodes": [
          {
            "node": {
              "hostnames": {
                "manage": [
                  "10.32.40.202"
                ],
                "storage": [
                  "10.32.40.202"
                ]
              },
              "zone": 1
            },
            "devices": [
              "/dev/sdb"
            ]
          },
          {
            "node": {
              "hostnames": {
                "manage": [
                  "10.32.40.203"
                ],
                "storage": [
                  "10.32.40.203"
                ]
              },
              "zone": 1
            },
            "devices": [
              "/dev/sdb"
            ]
          },
          {
            "node": {
              "hostnames": {
                "manage": [
                  "10.36.0.202"
                ],
                "storage": [
                  "10.36.0.202"
                ]
              },
              "zone": 1
            },
            "devices": [
              "/dev/vdb"
            ]
          },
          {
            "node": {
              "hostnames": {
                "manage": [
                  "10.36.0.203"
                ],
                "storage": [
                  "10.36.0.203"
                ]
              },
              "zone": 1
            },
            "devices": [
              "/dev/vdb"
            ]
          }
        ]
      }
    ]
  }
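
A quick way to sanity-check a topology file like the one above before handing it to gk-deploy is to parse it and list what each node declares. This is only a minimal sketch: the helper names `load_topology` and `summarize_topology` are hypothetical, and it assumes the file follows the same `clusters` / `nodes` / `hostnames` / `devices` layout shown here, with one manage and one storage entry per node.

```python
import json
from pathlib import Path


def load_topology(path):
    """Parse a topology file; json.loads raises JSONDecodeError on malformed JSON."""
    return json.loads(Path(path).read_text())


def summarize_topology(topology):
    """Return one (manage host, storage host, devices) tuple per node."""
    rows = []
    for cluster in topology["clusters"]:
        for entry in cluster["nodes"]:
            hostnames = entry["node"]["hostnames"]
            # Assumption: each node lists exactly one manage and one storage address.
            rows.append(
                (hostnames["manage"][0], hostnames["storage"][0], list(entry["devices"]))
            )
    return rows
```

Calling `summarize_topology(load_topology("topology.json"))` returns the rows for inspection; a parse error at this point indicates a problem in the file itself rather than in heketi.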

We used IPs for both the manage and storage fields because the nodes are external to the Kubernetes cluster. We haven't prepared the block devices on the gluster servers in any way, since the heketi docs say they should be left untouched. Here's the output of glusterd status on one of the above nodes:

[root@gluster-1 ~]# systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2018-10-15 09:15:29 UTC; 1h 17min ago
  Process: 57350 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 57351 (glusterd)
   CGroup: /system.slice/glusterd.service
           └─57351 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

Oct 15 09:15:29 gluster-1 systemd[1]: Starting GlusterFS, a clustered file-system server...
Oct 15 09:15:29 gluster-1 systemd[1]: Started GlusterFS, a clustered file-system server.

All kube nodes can reach the gluster nodes on ports 22, 24007 and 49152-49251, as suggested by the gk-deploy script, and so can my local machine, which runs gk-deploy (we manage the kube cluster locally with kubectl). Below is the output of the gk-deploy run:

⋊> ~ ◦ ./gk-deploy --ssh-keyfile ~/.ssh/id_rsa --ssh-user root --ssh-port 22 topology.json
Using Kubernetes CLI.
Using namespace "default".
Checking for pre-existing resources...
  GlusterFS pods ... not found.
  deploy-heketi pod ... found.
  heketi pod ... not found.
  gluster-s3 pod ... not found.
Creating initial resources ... Error from server (AlreadyExists): error when creating "/Users/sguido/Work/git/gluster-kubernetes/deploy/kube-templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
Error from server (AlreadyExists): clusterrolebindings.rbac.authorization.k8s.io "heketi-sa-view" already exists
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view not labeled
OK
Creating cluster ... ID: 129eb59a609ec2e740ac832698721e3a
Allowing file volumes on cluster.
Allowing block volumes on cluster.
Creating node 10.32.40.202 ... Unable to create node: New Node doesn't have glusterd running
Creating node 10.32.40.203 ... Unable to create node: New Node doesn't have glusterd running
Creating node 10.36.0.202 ... Unable to create node: New Node doesn't have glusterd running
Creating node 10.36.0.203 ... Unable to create node: New Node doesn't have glusterd running
Error loading the cluster topology.
Please check the failed node or device and rerun this script.

And also heketi logs from kube:

⋊> ~ ◦ kubectl logs deploy-heketi-559446b649-2tt4q
[negroni] Started GET /clusters
[negroni] Completed 200 OK in 91.263µs
[negroni] Started POST /clusters
[negroni] Completed 201 Created in 2.432301ms
[negroni] Started POST /nodes
[cmdexec] INFO 2018/10/15 11:43:38 Check Glusterd service status in node 10.32.40.202
[kubeexec] ERROR 2018/10/15 11:43:38 heketi/executors/kubeexec/kubeexec.go:310:kubeexec.(*KubeExecutor).getPodNameFromDaemonSet: Unable to find a GlusterFS pod on host 10.32.40.202 with a label key glusterfs-node
[cmdexec] ERROR 2018/10/15 11:43:38 heketi/executors/cmdexec/peer.go:76:cmdexec.(*CmdExecutor).GlusterdCheck: Unable to find a GlusterFS pod on host 10.32.40.202 with a label key glusterfs-node
[heketi] ERROR 2018/10/15 11:43:38 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Unable to find a GlusterFS pod on host 10.32.40.202 with a label key glusterfs-node
[heketi] ERROR 2018/10/15 11:43:38 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[negroni] Completed 400 Bad Request in 9.656696ms
[negroni] Started POST /nodes
[cmdexec] INFO 2018/10/15 11:43:38 Check Glusterd service status in node 10.32.40.203
[kubeexec] ERROR 2018/10/15 11:43:38 heketi/executors/kubeexec/kubeexec.go:310:kubeexec.(*KubeExecutor).getPodNameFromDaemonSet: Unable to find a GlusterFS pod on host 10.32.40.203 with a label key glusterfs-node
[cmdexec] ERROR 2018/10/15 11:43:38 heketi/executors/cmdexec/peer.go:76:cmdexec.(*CmdExecutor).GlusterdCheck: Unable to find a GlusterFS pod on host 10.32.40.203 with a label key glusterfs-node
[heketi] ERROR 2018/10/15 11:43:38 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Unable to find a GlusterFS pod on host 10.32.40.203 with a label key glusterfs-node
[heketi] ERROR 2018/10/15 11:43:38 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[negroni] Completed 400 Bad Request in 5.015465ms
[negroni] Started POST /nodes
[cmdexec] INFO 2018/10/15 11:43:38 Check Glusterd service status in node 10.36.0.202
[negroni] Completed 400 Bad Request in 5.261841ms
[kubeexec] ERROR 2018/10/15 11:43:38 heketi/executors/kubeexec/kubeexec.go:310:kubeexec.(*KubeExecutor).getPodNameFromDaemonSet: Unable to find a GlusterFS pod on host 10.36.0.202 with a label key glusterfs-node
[cmdexec] ERROR 2018/10/15 11:43:38 heketi/executors/cmdexec/peer.go:76:cmdexec.(*CmdExecutor).GlusterdCheck: Unable to find a GlusterFS pod on host 10.36.0.202 with a label key glusterfs-node
[heketi] ERROR 2018/10/15 11:43:38 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Unable to find a GlusterFS pod on host 10.36.0.202 with a label key glusterfs-node
[heketi] ERROR 2018/10/15 11:43:38 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[negroni] Started POST /nodes
[cmdexec] INFO 2018/10/15 11:43:38 Check Glusterd service status in node 10.36.0.203
[kubeexec] ERROR 2018/10/15 11:43:38 heketi/executors/kubeexec/kubeexec.go:310:kubeexec.(*KubeExecutor).getPodNameFromDaemonSet: Unable to find a GlusterFS pod on host 10.36.0.203 with a label key glusterfs-node
[cmdexec] ERROR 2018/10/15 11:43:38 heketi/executors/cmdexec/peer.go:76:cmdexec.(*CmdExecutor).GlusterdCheck: Unable to find a GlusterFS pod on host 10.36.0.203 with a label key glusterfs-node
[heketi] ERROR 2018/10/15 11:43:38 heketi/apps/glusterfs/app_node.go:107:glusterfs.(*App).NodeAdd: Unable to find a GlusterFS pod on host 10.36.0.203 with a label key glusterfs-node
[heketi] ERROR 2018/10/15 11:43:38 heketi/apps/glusterfs/app_node.go:108:glusterfs.(*App).NodeAdd: New Node doesn't have glusterd running
[negroni] Completed 400 Bad Request in 6.230574ms
[negroni] Started GET /clusters/129eb59a609ec2e740ac832698721e3a
[negroni] Completed 200 OK in 225.281µs
[negroni] Started DELETE /clusters/129eb59a609ec2e740ac832698721e3a
[heketi] INFO 2018/10/15 11:43:38 Deleted cluster [129eb59a609ec2e740ac832698721e3a]
[negroni] Completed 200 OK in 2.149652ms

I don't understand whether gk-deploy is looking for gluster pods on those nodes and failing to find them, or whether it couldn't connect to the glusterd daemon running on port 24007, and if so, why. Help would be much appreciated.
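
To rule out the network side of that question, a plain TCP connection test against the ports mentioned above (22 and 24007) can be run from any machine that is supposed to reach the gluster nodes. This is a minimal sketch using only Python's standard `socket` module; the helper names `port_open` and `check_hosts` are hypothetical. A successful connection shows only that the port is reachable, not that heketi is configured to use it.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_hosts(hosts, ports, timeout=3.0):
    """Map every (host, port) pair to whether it accepted a TCP connection."""
    return {(h, p): port_open(h, p, timeout) for h in hosts for p in ports}
```

For example, `check_hosts(["10.32.40.202", "10.36.0.202"], [22, 24007])` returns a dictionary keyed by (host, port). If every entry is True, the "glusterd not running" message is unlikely to be a connectivity problem.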

drake7707 commented 5 years ago

getPodNameFromDaemonSet: Unable to find a GlusterFS pod on host

You need to change the executor in the gk-deploy script. The default executor is kubernetes, which uses the manage hostnames to look up the Kubernetes nodes and determine where the GlusterFS pods are running. Your gluster nodes are outside the cluster, so that lookup finds no pods.

guidoilbaldo commented 5 years ago

Yep, that solved it! Thank you @drake7707 👍

lingyuguo commented 5 years ago

Hello! Can I have your modified gk-deploy script? @guidoilbaldo

rakou commented 5 years ago

Yep, that solved it! Thank you @drake7707 👍

May I know what you changed in the gk-deploy script?

phlogistonjohn commented 5 years ago

If you pass the gk-deploy script a --ssh-keyfile, it is supposed to switch from the kubernetes executor to the ssh executor. It may not be obvious, but I don't think you need to modify the script.
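
One way to confirm which executor a running heketi actually picked up is to read it from the heketi.json that the pod uses. This is a minimal sketch with two assumptions: that the executor is stored under `glusterfs` → `executor`, as in heketi's sample configuration, and that you have already copied the file's contents out of the pod. The helper name `heketi_executor` is hypothetical. In the run posted above, gk-deploy reported `deploy-heketi pod ... found`, so the pod may have been created by an earlier run and may still carry that run's settings.

```python
import json


def heketi_executor(config_text):
    """Return the executor named in a heketi.json document, or None if absent."""
    config = json.loads(config_text)
    # Assumption: the executor lives at glusterfs.executor, as in heketi's sample config.
    return config.get("glusterfs", {}).get("executor")
```

If this returns `"kubernetes"` for a setup where the gluster nodes live outside the cluster, heketi will keep searching for GlusterFS pods, which matches the errors in the log above.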