gluster / gluster-kubernetes

GlusterFS Native Storage Service for Kubernetes
Apache License 2.0
875 stars 389 forks

Heketi pod crash #579

Open malhammadi93 opened 5 years ago

malhammadi93 commented 5 years ago

I first tried the Gluster cluster on a network with internet access and it came up and ran fine. I then moved it to an intranet without an internet connection and restarted the installation. While the devices are being added, the heketi pod crashes; the script reports return 0 and then fails. I also tried to load the topology from the heketi pod directly, but it crashes again and the pod stops. The logs show it exits during "pvcreate". Is there an issue if there is no internet connection?
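One way to narrow this down (a rough sketch; the namespace gfs is the one used in this report, but the pod names are placeholders for whatever kubectl get pods shows in your cluster) is to pull heketi's own log around the crash and dry-run the LVM step inside the glusterfs pod on the failing node:

kubectl -n gfs get pods -o wide
kubectl -n gfs logs <deploy-heketi-pod>                    # heketi's log just before it exits
kubectl -n gfs exec -it <glusterfs-pod-on-failing-node> -- pvcreate --test /dev/sda   # dry-run LVM init, writes nothing

If pvcreate --test already fails or hangs there, the problem is on the node/device side rather than anything that needs internet access.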

malhammadi93 commented 5 years ago

root@manager01:~/gluster-kubernetes/deploy# ./gk-deploy -gy -n gfs -c kubectl
Using Kubernetes CLI.
Using namespace "gfs".
Checking for pre-existing resources...
  GlusterFS pods ... found.
  deploy-heketi pod ... found.
  heketi pod ... not found.
  gluster-s3 pod ... not found.
Creating initial resources ... Error from server (AlreadyExists): error when creating "/home/ubuntu/gluster-kubernetes/deploy/kube-templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
Error from server (AlreadyExists): clusterrolebindings.rbac.authorization.k8s.io "heketi-sa-view" already exists
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view not labeled
OK
Found node worker07 on cluster b523ae79f2c40d9df5dfe562d90bd716
Found device /dev/sda
Found device /dev/sdb
Found device /dev/sdc
Found node worker08 on cluster b523ae79f2c40d9df5dfe562d90bd716
Found device /dev/sda
Found device /dev/sdb
Found device /dev/sdc
Found node worker09 on cluster b523ae79f2c40d9df5dfe562d90bd716
Found device /dev/sda
Found device /dev/sdb
Found device /dev/sdc
Found node worker10 on cluster b523ae79f2c40d9df5dfe562d90bd716
Adding device /dev/sda ... return 0
heketi topology loaded.
cat: /tmp/heketi-storage.json: No such file or directory
command terminated with exit code 1
error: no objects passed to create
Failed on creating heketi storage resources.
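The cat: /tmp/heketi-storage.json error is a downstream symptom: that file only gets written once all devices are added, and the run above died on worker10's /dev/sda before reaching that point. If you want to retry from a clean state, a possible sequence (assuming the disks carry no other data, and reusing the same flags as the run above) would be:

# on the deploy host: tear down the partial deployment
./gk-deploy --abort -gy -n gfs -c kubectl
# on each storage node, ONLY if the disks hold nothing else: clear stale LVM/filesystem signatures
wipefs -a /dev/sda /dev/sdb /dev/sdc
# then rerun ./gk-deploy as before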

malhammadi93 commented 5 years ago

heketi-cli topology load --json=/home/topology.json
Found node worker07 on cluster b523ae79f2c40d9df5dfe562d90bd716
Found device /dev/sda
Found device /dev/sdb
Found device /dev/sdc
Found node worker08 on cluster b523ae79f2c40d9df5dfe562d90bd716
Found device /dev/sda
Found device /dev/sdb
Found device /dev/sdc
Found node worker09 on cluster b523ae79f2c40d9df5dfe562d90bd716
Found device /dev/sda
Found device /dev/sdb
Found device /dev/sdc
Found node worker10 on cluster b523ae79f2c40d9df5dfe562d90bd716
Adding device /dev/sda ...
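Before re-loading the topology it may also help to see what this heketi instance already has registered, since earlier partial runs can leave node and device entries behind. heketi-cli can show that (run from inside the deploy-heketi pod, or anywhere with HEKETI_CLI_SERVER pointed at it):

heketi-cli topology info     # nodes and devices heketi thinks exist, with their state
heketi-cli node list         # node IDs on cluster b523ae79f2c40d9df5dfe562d90bd716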

malhammadi93 commented 5 years ago

[kubeexec] ERROR 2019/04/15 07:22:57 heketi/pkg/remoteexec/kube/exec.go:85:kube.ExecCommands: Failed to run command [gluster --mode=script --timeout=600 volume stop heketidbstorage force] on [pod:glusterfs-kl78l c:glusterfs ns:gfs (from host:worker09 selector:glusterfs-node)]: Err[command terminated with exit code 1]: Stdout []: Stderr [volume stop: heketidbstorage: failed: Volume heketidbstorage does not exist
[cmdexec] ERROR 2019/04/15 07:22:57 heketi/executors/cmdexec/volume.go:150:cmdexec.(CmdExecutor).VolumeDestroy: Unable to stop volume heketidbstorage: volume stop: heketidbstorage: failed: Volume heketidbstorage does not exist
[kubeexec] ERROR 2019/04/15 07:22:58 heketi/pkg/remoteexec/kube/exec.go:85:kube.ExecCommands: Failed to run command [gluster --mode=script --timeout=600 volume delete heketidbstorage] on [pod:glusterfs-kl78l c:glusterfs ns:gfs (from host:worker09 selector:glusterfs-node)]: Err[command terminated with exit code 1]: Stdout []: Stderr [volume delete: heketidbstorage: failed: Volume heketidbstorage does not exist
[cmdexec] ERROR 2019/04/15 07:22:58 heketi/executors/cmdexec/volume.go:160:cmdexec.(CmdExecutor).VolumeDestroy: Unable to delete volume heketidbstorage: volume delete: heketidbstorage: failed: Volume heketidbstorage does not exist
[kubeexec] ERROR 2019/04/15 07:27:12 heketi/pkg/remoteexec/kube/exec.go:85:kube.ExecCommands: Failed to run command [pvcreate -qq --metadatasize=128M --dataalignment=256K '/dev/sda'] on [pod:glusterfs-ttvp5 c:glusterfs ns:gfs (from host:worker10 selector:glusterfs-node)]: Err[command terminated with exit code 5]: Stdout []: Stderr [ WARNING: Device /dev/sda not initialized in udev database even after waiting 10000000 microseconds.
[kubeexec] ERROR 2019/04/15 07:31:33 heketi/pkg/remoteexec/kube/exec.go:85:kube.ExecCommands: Failed to run command [pvcreate -qq --metadatasize=128M --dataalignment=256K '/dev/sda'] on [pod:glusterfs-ttvp5 c:glusterfs ns:gfs (from host:worker10 selector:glusterfs-node)]: Err[command terminated with exit code 5]: Stdout []: Stderr [ WARNING: Device /dev/sda not initialized in udev database even after waiting 10000000 microseconds.
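The "not initialized in udev database" warning comes from LVM waiting for udev to report the device, which usually does not work from inside a container where no udev daemon is running; whether that is the actual cause of the exit code 5 here is not certain. Two things that might be worth trying (hedged guesses, not a confirmed fix for this setup):

# on the affected host (worker10): refresh udev's record of the disk
udevadm trigger --action=add --name-match=/dev/sda && udevadm settle
# or, inside the glusterfs pod on that node: retry the same LVM call without consulting udev
pvcreate --test --config 'devices { obtain_device_list_from_udev = 0 }' /dev/sda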

JannikZed commented 5 years ago

I'm getting the same error when trying to set up my GlusterFS system. The initial device creation doesn't seem to work:

Failed to run command [pvcreate -qq --metadatasize=128M --dataalignment=256K '/dev/disk/by-id/scsi-0HC_Volume_2662986'] on glusterfs-g98sw: Err[command terminated with exit code 5]: Stdout []: Stderr [ WARNING: Device /dev/sda1 not initialized in udev database even after waiting 10000000 microseconds.
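Exit code 5 from pvcreate together with that warning can show up when the device either isn't visible to udev inside the container or still carries old signatures. A quick, non-destructive check on the node running glusterfs-g98sw (device path taken from the error above) might be:

lsblk -f /dev/disk/by-id/scsi-0HC_Volume_2662986   # shows any existing filesystem or LVM signature
wipefs /dev/disk/by-id/scsi-0HC_Volume_2662986     # lists signatures only; adding -a would erase them (destroys data)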