gluster / gluster-kubernetes

GlusterFS Native Storage Service for Kubernetes
Apache License 2.0

Error waiting for job 'heketi-storage-copy-job' to complete #648

Open mewais opened 4 years ago

mewais commented 4 years ago

As the title says, the gk-deploy script fails with the following error: Error waiting for job 'heketi-storage-copy-job' to complete.

Here's the command: ./gk-deploy -g ../../storage_topology.json --admin-key admin --user-key user

And here's the output:

Do you wish to proceed with deployment?

[Y]es, [N]o? [Default: Y]: Y
Using Kubernetes CLI.
Using namespace "default".
Checking for pre-existing resources...
  GlusterFS pods ... not found.
  deploy-heketi pod ... not found.
  heketi pod ... not found.
  gluster-s3 pod ... not found.
Creating initial resources ... serviceaccount/heketi-service-account created
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view created
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view labeled
OK
node/storage-worker-01 labeled
node/storage-worker-02 labeled
node/storage-worker-03 labeled
daemonset.apps/glusterfs created
Waiting for GlusterFS pods to start ... OK
secret/heketi-config-secret created
secret/heketi-config-secret labeled
service/deploy-heketi created
deployment.apps/deploy-heketi created
Waiting for deploy-heketi pod to start ... OK
Creating cluster ... ID: 2f6eba4d13572a03c0b0af76cd519b79
Allowing file volumes on cluster.
Allowing block volumes on cluster.
Creating node storage-worker-01 ... ID: ed6fea5e1c5f9a7ccb1784e3f5126d4d
Adding device /dev/sdb ... OK
Creating node storage-worker-02 ... ID: 6f1db063cf2853b790156d749dcf1e1c
Adding device /dev/sdb ... OK
Creating node storage-worker-03 ... ID: 4d6ed79b2395317bf305ebb8f20babd2
Adding device /dev/sdb ... OK
heketi topology loaded.
Saving /tmp/heketi-storage.json
secret/heketi-storage-secret created
endpoints/heketi-storage-endpoints created
service/heketi-storage-endpoints created
job.batch/heketi-storage-copy-job created
Error waiting for job 'heketi-storage-copy-job' to complete.

I ran kubectl get ev against the job's pod to look for errors: kubectl get ev --field-selector involvedObject.name=heketi-storage-copy-job-hbvv4

And the output was:

LAST SEEN   TYPE      REASON        OBJECT                              MESSAGE
<unknown>   Normal    Scheduled     pod/heketi-storage-copy-job-hbvv4   Successfully assigned default/heketi-storage-copy-job-hbvv4 to storage-worker-03
7m49s       Warning   FailedMount   pod/heketi-storage-copy-job-hbvv4   MountVolume.SetUp failed for volume "heketi-storage" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.10.2.1:10.10.2.2:10.10.2.3,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/heketi-storage/heketi-storage-copy-job-hbvv4-glusterfs.log,log-level=ERROR 10.10.2.3:heketidbstorage /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage
Output: Running scope as unit: run-r174fd3d54bd2476ab9e056279ea42029.scope
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2020-04-22 06:34:28.998556] E [MSGID: 100026] [glusterfsd.c:2403:glusterfs_process_volfp] 0-: failed to construct the graph
[2020-04-22 06:34:28.998701] E [graph.c:1102:glusterfs_graph_destroy] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x537) [0x5556177d7e77] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x180) [0x5556177d1ea0] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(glusterfs_graph_destroy+0x6b) [0x7fb786f0eeab] ) 0-graph: invalid argument: graph [Invalid argument]
7m49s       Warning   FailedMount   pod/heketi-storage-copy-job-hbvv4   MountVolume.SetUp failed for volume "heketi-storage" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.10.2.1:10.10.2.2:10.10.2.3,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/heketi-storage/heketi-storage-copy-job-hbvv4-glusterfs.log,log-level=ERROR 10.10.2.3:heketidbstorage /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage
Output: Running scope as unit: run-r9787f39108aa4c99a177e556522e9f17.scope
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2020-04-22 06:34:29.663496] E [MSGID: 100026] [glusterfsd.c:2403:glusterfs_process_volfp] 0-: failed to construct the graph
[2020-04-22 06:34:29.663644] E [graph.c:1102:glusterfs_graph_destroy] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x537) [0x55dc332fce77] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x180) [0x55dc332f6ea0] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(glusterfs_graph_destroy+0x6b) [0x7f056a420eab] ) 0-graph: invalid argument: graph [Invalid argument]
7m48s       Warning   FailedMount   pod/heketi-storage-copy-job-hbvv4   MountVolume.SetUp failed for volume "heketi-storage" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.10.2.1:10.10.2.2:10.10.2.3,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/heketi-storage/heketi-storage-copy-job-hbvv4-glusterfs.log,log-level=ERROR 10.10.2.2:heketidbstorage /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage
Output: Running scope as unit: run-r80514c65cb6647f6ac2697b5114f469b.scope
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2020-04-22 06:34:30.779343] E [MSGID: 100026] [glusterfsd.c:2403:glusterfs_process_volfp] 0-: failed to construct the graph
[2020-04-22 06:34:30.779493] E [graph.c:1102:glusterfs_graph_destroy] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x537) [0x55f294288e77] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x180) [0x55f294282ea0] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(glusterfs_graph_destroy+0x6b) [0x7f070e30beab] ) 0-graph: invalid argument: graph [Invalid argument]
7m45s       Warning   FailedMount   pod/heketi-storage-copy-job-hbvv4   MountVolume.SetUp failed for volume "heketi-storage" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.10.2.1:10.10.2.2:10.10.2.3,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/heketi-storage/heketi-storage-copy-job-hbvv4-glusterfs.log,log-level=ERROR 10.10.2.3:heketidbstorage /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage
Output: Running scope as unit: run-r976c7b489f754d9caec29e1da0d1884d.scope
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2020-04-22 06:34:33.011549] E [MSGID: 100026] [glusterfsd.c:2403:glusterfs_process_volfp] 0-: failed to construct the graph
[2020-04-22 06:34:33.011731] E [graph.c:1102:glusterfs_graph_destroy] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x537) [0x560cd9c99e77] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x180) [0x560cd9c93ea0] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(glusterfs_graph_destroy+0x6b) [0x7f7dcd745eab] ) 0-graph: invalid argument: graph [Invalid argument]
7m41s       Warning   FailedMount   pod/heketi-storage-copy-job-hbvv4   MountVolume.SetUp failed for volume "heketi-storage" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.10.2.1:10.10.2.2:10.10.2.3,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/heketi-storage/heketi-storage-copy-job-hbvv4-glusterfs.log,log-level=ERROR 10.10.2.3:heketidbstorage /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage
Output: Running scope as unit: run-rdbd0b7e6802c414b813bd3b188076e4d.scope
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2020-04-22 06:34:37.143898] E [MSGID: 100026] [glusterfsd.c:2403:glusterfs_process_volfp] 0-: failed to construct the graph
[2020-04-22 06:34:37.144040] E [graph.c:1102:glusterfs_graph_destroy] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x537) [0x55f51bd1ae77] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x180) [0x55f51bd14ea0] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(glusterfs_graph_destroy+0x6b) [0x7febafc8beab] ) 0-graph: invalid argument: graph [Invalid argument]
7m33s       Warning   FailedMount   pod/heketi-storage-copy-job-hbvv4   MountVolume.SetUp failed for volume "heketi-storage" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.10.2.1:10.10.2.2:10.10.2.3,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/heketi-storage/heketi-storage-copy-job-hbvv4-glusterfs.log,log-level=ERROR 10.10.2.3:heketidbstorage /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage
Output: Running scope as unit: run-r867c9cb2f25a4f6ba066719ce4ca483d.scope
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2020-04-22 06:34:45.293488] E [MSGID: 100026] [glusterfsd.c:2403:glusterfs_process_volfp] 0-: failed to construct the graph
[2020-04-22 06:34:45.293683] E [graph.c:1102:glusterfs_graph_destroy] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x537) [0x55bb171bfe77] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x180) [0x55bb171b9ea0] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(glusterfs_graph_destroy+0x6b) [0x7f5d47a95eab] ) 0-graph: invalid argument: graph [Invalid argument]
7m17s       Warning   FailedMount   pod/heketi-storage-copy-job-hbvv4   MountVolume.SetUp failed for volume "heketi-storage" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.10.2.1:10.10.2.2:10.10.2.3,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/heketi-storage/heketi-storage-copy-job-hbvv4-glusterfs.log,log-level=ERROR 10.10.2.2:heketidbstorage /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage
Output: Running scope as unit: run-re7d2348c9f334f6eaed9a6216ef332bf.scope
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2020-04-22 06:35:01.489830] E [MSGID: 100026] [glusterfsd.c:2403:glusterfs_process_volfp] 0-: failed to construct the graph
[2020-04-22 06:35:01.489995] E [graph.c:1102:glusterfs_graph_destroy] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x537) [0x55b7595fbe77] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x180) [0x55b7595f5ea0] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(glusterfs_graph_destroy+0x6b) [0x7f50b6965eab] ) 0-graph: invalid argument: graph [Invalid argument]
6m45s       Warning   FailedMount   pod/heketi-storage-copy-job-hbvv4   MountVolume.SetUp failed for volume "heketi-storage" : mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage --scope -- mount -t glusterfs -o auto_unmount,backup-volfile-servers=10.10.2.1:10.10.2.2:10.10.2.3,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/heketi-storage/heketi-storage-copy-job-hbvv4-glusterfs.log,log-level=ERROR 10.10.2.2:heketidbstorage /var/lib/kubelet/pods/e449dc88-4642-445b-9994-d183ef3f9b9f/volumes/kubernetes.io~glusterfs/heketi-storage
Output: Running scope as unit: run-r48261be86e40451bb4210c19286f31d8.scope
Mount failed. Please check the log file for more details.
, the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2020-04-22 06:35:33.693499] E [MSGID: 100026] [glusterfsd.c:2403:glusterfs_process_volfp] 0-: failed to construct the graph
[2020-04-22 06:35:33.693662] E [graph.c:1102:glusterfs_graph_destroy] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x537) [0x55be60e85e77] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x180) [0x55be60e7fea0] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(glusterfs_graph_destroy+0x6b) [0x7f20b8b7eeab] ) 0-graph: invalid argument: graph [Invalid argument]
3m30s       Warning   FailedMount   pod/heketi-storage-copy-job-hbvv4   Unable to attach or mount volumes: unmounted volumes=[heketi-storage], unattached volumes=[heketi-storage heketi-storage-secret default-token-t7qfx]: timed out waiting for the condition
76s         Warning   FailedMount   pod/heketi-storage-copy-job-hbvv4   (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[heketi-storage heketi-storage-secret default-token-t7qfx], unattached volumes=[heketi-storage heketi-storage-secret default-token-t7qfx]: timed out waiting for the condition
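
The event output above repeats the same two GlusterFS client errors on every mount retry; only the timestamps and memory addresses change. As a rough sketch for collapsing that noise, the helper below reads the event text on stdin and prints each distinct error-level GlusterFS log line with a count. The function name `summarize_gluster_errors` and the `[ADDR]` placeholder are made up for illustration and are not part of gk-deploy or kubectl. It assumes the log lines start at the beginning of a line in the form `[timestamp] E ...`, as they do in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of gk-deploy or kubectl): reads event text on
# stdin and prints each distinct GlusterFS error-level log line with a count.
summarize_gluster_errors() {
    # Keep only GlusterFS log lines at error level: "[timestamp] E ..."
    grep -E '^\[[0-9-]+ [0-9:.]+\] E ' |
        # Drop the leading timestamp, which differs on every retry.
        sed -E 's/^\[[0-9-]+ [0-9:.]+\] //' |
        # Mask runtime addresses such as [0x5556177d7e77] so retries compare equal.
        sed -E 's/\[0x[0-9a-f]+\]/[ADDR]/g' |
        # Count identical lines, most frequent first.
        sort | uniq -c | sort -rn
}
```

It could be used by piping the same command into it: kubectl get ev --field-selector involvedObject.name=heketi-storage-copy-job-hbvv4 | summarize_gluster_errors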

I'm using Ubuntu Server 18.04. Running kubectl version yields this:

Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-14T16:45:05Z", GoVersion:"go1.13.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:48:36Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

Any help would be appreciated.

grzegorzgg commented 4 years ago

I have the same problem. I also use Ubuntu 18.04.

mewais commented 4 years ago

I moved on; I'm now using Rook to install a Ceph storage system. You can find a nice guide here

aravindavk commented 4 years ago

> I moved on; I'm now using Rook to install a Ceph storage system. You can find a nice guide here

If you are still interested in a GlusterFS-based solution, please try https://kadalu.io (Project page: https://github.com/kadalu/kadalu)

mewais commented 4 years ago

@aravindavk I will, thanks for bringing this up.