gluster / gluster-kubernetes

GlusterFS Native Storage Service for Kubernetes
Apache License 2.0

Load heketi topology: Unable to create node: Unable to execute command on glusterfs-storage #472

Open tusharsanas opened 6 years ago

tusharsanas commented 6 years ago

Hi Team,

Can anyone please help me resolve the issue below, which occurred during the deployment of OpenShift 3.9 via RPM? The error details follow.

TASK [openshift_storage_glusterfs : Load heketi topology] *** fatal: [amos002-master1.amosdemo.io]: FAILED! => {"changed": true, "cmd": ["oc", "rsh", "--namespace=glusterfs", "deploy-heketi-storage-1-2xhcx", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--secret", "s/MjovFj4JqjT7e7w1PU8ANWObI+QpaYK8+/db7Oj8w=", "topology", "load", "--json=/tmp/openshift-glusterfs-ansible-wNuhyg/topology.json", "2>&1"], "delta": "0:00:05.917287", "end": "2018-05-08 12:25:32.210006", "failed": true, "failed_when_result": true, "rc": 0, "start": "2018-05-08 12:25:26.292719", "stderr": "", "stderr_lines": [], "stdout": "Creating cluster ... ID: e8344982fcb3cd0456d1d743f76a895f\n\tAllowing file volumes on cluster.\n\tAllowing block volumes on cluster.\n\tCreating node ip-172-31-40-34.eu-west-1.compute.internal ... ID: d9b7d7a13947630f48c222e5e8b3ecb7\n\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-vbrp2: Can't initialize physical volume \"/dev/xvde\" of volume group \"vg_bf68be87d4cbe356048c5b597e8419db\" without -ff\n /dev/xvde: physical volume not initialized.\n\tCreating node ip-172-31-9-10.eu-west-1.compute.internal ... ID: af818b6b502fe5622e5cf513d88fb693\n\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-m9mh7: Can't initialize physical volume \"/dev/xvde\" of volume group \"vg_0be1184a536298b24e7d5100a3a80259\" without -ff\n /dev/xvde: physical volume not initialized.\n\tCreating node ip-172-31-22-228.eu-west-1.compute.internal ... ID: 4fa1bddc2838b8d0d919681fa6244453\n\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-6ccsr: Can't initialize physical volume \"/dev/xvde\" of volume group \"vg_750a39dc99c1f8f784b72c617882bef6\" without -ff\n /dev/xvde: physical volume not initialized.", "stdout_lines": ["Creating cluster ... ID: e8344982fcb3cd0456d1d743f76a895f", "\tAllowing file volumes on cluster.", "\tAllowing block volumes on cluster.", "\tCreating node ip-172-31-40-34.eu-west-1.compute.internal ... ID: d9b7d7a13947630f48c222e5e8b3ecb7", "\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-vbrp2: Can't initialize physical volume \"/dev/xvde\" of volume group \"vg_bf68be87d4cbe356048c5b597e8419db\" without -ff", " /dev/xvde: physical volume not initialized.", "\tCreating node ip-172-31-9-10.eu-west-1.compute.internal ... ID: af818b6b502fe5622e5cf513d88fb693", "\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-m9mh7: Can't initialize physical volume \"/dev/xvde\" of volume group \"vg_0be1184a536298b24e7d5100a3a80259\" without -ff", " /dev/xvde: physical volume not initialized.", "\tCreating node ip-172-31-22-228.eu-west-1.compute.internal ... ID: 4fa1bddc2838b8d0d919681fa6244453", "\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-6ccsr: Can't initialize physical volume \"/dev/xvde\" of volume group \"vg_750a39dc99c1f8f784b72c617882bef6\" without -ff", " /dev/xvde: physical volume not initialized."]} to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.retry

phlogistonjohn commented 6 years ago

Hello. Heketi requires empty block devices and will not work if you've already created LVM PVs, filesystems, partitions, etc. on those block devices. If you know the devices are unused, you can run a command like wipefs -a to clear them before letting heketi initialize them.
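A minimal sketch of that cleanup, assuming the device is /dev/xvde as shown in your topology output and that nothing on it is still needed:

```
# Run as root on each GlusterFS node. /dev/xvde is taken from the error output
# above -- double-check the device name first; this destroys all signatures on it.
wipefs -a /dev/xvde   # clear filesystem, LVM PV, and partition-table signatures
lsblk /dev/xvde       # confirm the disk now shows up bare, with no partitions or LVM children
```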

tusharsanas commented 6 years ago

Hi @phlogistonjohn ,

Thanks for your reply. We ran the wipefs -a command, and at first the issue was resolved, but deploy_cluster.yml then failed due to an invalid entry in the inventory. We corrected that with proper inputs in the inventory file.

We then deleted the existing volumes, created new ones, and attached them to the instances, so we currently have blank volumes. We triggered deploy_cluster.yml again, but it fails at the same task as before:

TASK [openshift_storage_glusterfs : Load heketi topology] *** fatal: [amos002-master1.amosdemo.io]: FAILED! => {"changed": true, "cmd": ["oc", "rsh", "--namespace=glusterfs", "deploy-heketi-storage-1-2pwjs", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--secret", "9rUSKlsCAZ3lnNTWYUeiLBT8+mlio8P7uWE5cyuO8uc=", "topology", "load", "--json=/tmp/openshift-glusterfs-ansible-OerXNy/topology.json", "2>&1"], "delta": "0:00:05.892155", "end": "2018-05-09 11:53:42.168630", "failed": true, "failed_when_result": true, "rc": 0, "start": "2018-05-09 11:53:36.276475", "stderr": "", "stderr_lines": [], "stdout": "Creating cluster ... ID: 0d5cccc47838961711be10540a8b6f17\n\tAllowing file volumes on cluster.\n\tAllowing block volumes on cluster.\n\tCreating node ip-172-31-40-34.eu-west-1.compute.internal ... ID: f3f5042020ad2e7e2150ff15e75b368f\n\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-mf67w: Can't open /dev/xvde exclusively. Mounted filesystem?\n\tCreating node ip-172-31-9-10.eu-west-1.compute.internal ... ID: 6609cdb15bb772cf6b2f59c13334d2f6\n\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-5q7k7: Can't open /dev/xvde exclusively. Mounted filesystem?\n\tCreating node ip-172-31-22-228.eu-west-1.compute.internal ... ID: d7889ac31efb3d48656cf3d8139d5632\n\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-zpfxv: Can't open /dev/xvde exclusively. Mounted filesystem?", "stdout_lines": ["Creating cluster ... ID: 0d5cccc47838961711be10540a8b6f17", "\tAllowing file volumes on cluster.", "\tAllowing block volumes on cluster.", "\tCreating node ip-172-31-40-34.eu-west-1.compute.internal ... ID: f3f5042020ad2e7e2150ff15e75b368f", "\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-mf67w: Can't open /dev/xvde exclusively. Mounted filesystem?", "\tCreating node ip-172-31-9-10.eu-west-1.compute.internal ... ID: 6609cdb15bb772cf6b2f59c13334d2f6", "\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-5q7k7: Can't open /dev/xvde exclusively. Mounted filesystem?", "\tCreating node ip-172-31-22-228.eu-west-1.compute.internal ... ID: d7889ac31efb3d48656cf3d8139d5632", "\t\tAdding device /dev/xvde ... Unable to add device: Unable to execute command on glusterfs-storage-zpfxv: Can't open /dev/xvde exclusively. Mounted filesystem?"]} to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.retry

Please let me know if you need any more details from our side.

jarrpa commented 6 years ago

@tusharsanas Can you verify that the new storage devices don't show as having partitions in either lsblk or pvs?
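For example, something like this on each storage node (a sketch, run as root; /dev/xvde is assumed from the earlier output):

```
lsblk /dev/xvde   # should list only the bare disk, with no child partitions or LVM volumes
pvs               # /dev/xvde should not appear as an LVM physical volume
vgs               # and no leftover vg_* volume group from a previous deploy should remain
```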

tusharsanas commented 6 years ago

Hi @jarrpa ,

Please find below the output after running the commands you mentioned.

[ec2-user@ip-172-31-10-110 ~]$ lsblk
NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda    202:0    0  10G  0 disk
├─xvda1 202:1    0   1M  0 part
└─xvda2 202:2    0  10G  0 part /
xvde    202:64   0  10G  0 disk
[ec2-user@ip-172-31-10-110 ~]$ pvs
-bash: pvs: command not found
[ec2-user@ip-172-31-10-110 ~]$

jarrpa commented 6 years ago

Hi! Sorry for the delay, I was traveling last week.

Somehow I only just now realized you were using openshift-ansible. While I'll help you out here for now, in the future please open any further issues at https://github.com/openshift/openshift-ansible . :)

If you can still reproduce the problem, could you have a look at the output of oc logs <heketi_pod> and see what you can find? We're looking to see what commands actually failed. Also, please enclose any log output in ``` marks so I can get some better formatting.
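For example (a sketch; substitute the actual deploy-heketi pod name reported by oc get pods):

```
oc get pods -n glusterfs | grep heketi               # find the deploy-heketi pod name
oc logs -n glusterfs deploy-heketi-storage-1-2pwjs   # dump its logs; use your own pod name here
```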

netoralves commented 5 years ago

Hello, I had the same problem. I solved it by cleaning up all the disks, then verifying the block devices with pvs and vgs.

My problem was in a block device: when I listed PVs, no physical volume existed for my sdc disk, but when I listed the block devices:

lsblk /dev/sdc
NAME                                                                              MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdc                                                                                 8:32   0  80G  0 disk
├─vg_203beec1f50d7ada10751d0718f99ccd-tp_80bdfdfe7bbe7300ad466e87fd9f828f_tmeta   253:4    0  12M  0 lvm
│ └─vg_203beec1f50d7ada10751d0718f99ccd-tp_80bdfdfe7bbe7300ad466e87fd9f828f-tpool 253:6    0   2G  0 lvm
│   ├─vg_203beec1f50d7ada10751d0718f99ccd-tp_80bdfdfe7bbe7300ad466e87fd9f828f     253:7    0   2G  0 lvm
│   └─vg_203beec1f50d7ada10751d0718f99ccd-brick_80bdfdfe7bbe7300ad466e87fd9f828f  253:8    0   2G  0 lvm
└─vg_203beec1f50d7ada10751d0718f99ccd-tp_80bdfdfe7bbe7300ad466e87fd9f828f_tdata   253:5    0   2G  0 lvm
  └─vg_203beec1f50d7ada10751d0718f99ccd-tp_80bdfdfe7bbe7300ad466e87fd9f828f-tpool 253:6    0   2G  0 lvm
    ├─vg_203beec1f50d7ada10751d0718f99ccd-tp_80bdfdfe7bbe7300ad466e87fd9f828f     253:7    0   2G  0 lvm
    └─vg_203beec1f50d7ada10751d0718f99ccd-brick_80bdfdfe7bbe7300ad466e87fd9f828f  253:8    0   2G  0 lvm

P.S.: Verify all GlusterFS disks.

My problem was solved by rebooting the VM (vSphere environment), cleaning up the disk (wipefs -a DEVICE), and executing the deploy_cluster.yml playbook again.
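For anyone hitting the same thing, a rough sketch of that cleanup, assuming the stale volume group is the vg_203beec1f50d7ada10751d0718f99ccd shown above and that nothing on it is still needed:

```
# Run as root on the affected node. The VG name comes from the lsblk output above;
# adapt it to whatever vgs reports on your node. This destroys the leftover bricks.
vgs                                               # list stale vg_* volume groups left by a previous heketi run
vgremove -f vg_203beec1f50d7ada10751d0718f99ccd   # remove the VG and its thin-pool/brick LVs
pvremove -ff /dev/sdc                             # drop the LVM PV label from the disk
wipefs -a /dev/sdc                                # clear any remaining signatures
# reboot if device-mapper entries are still held open, then re-run the playbook, e.g.:
# ansible-playbook -i <inventory> /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
```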