openshift / openshift-ansible

Install and config an OpenShift 3.x cluster
https://try.openshift.com
Apache License 2.0

GlusterFS config.yml deploy produces: Wait for heketi pod, Message: Failed without returning a message #12083

Closed am-titan closed 4 years ago

am-titan commented 4 years ago

Hi everyone, I've been trying to overcome this issue in my deployment for around a week. I ran the deployment scripts prerequisites.yml, deploy_cluster.yml, and uninstall.yml again and again, and I still get the same issue.

* ansible 2.9.2
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Aug  7 2019, 00:51:29) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]

* The output of `git describe`
`git version 1.8.3.1`
If you're running from playbooks installed via RPM

* The output of `rpm -q openshift-ansible`
package openshift-ansible is not installed (but I'm sure it is!)
[root@os-master openshift-ansible]# ls
ansible.cfg  CONTRIBUTING.md      examples  images     meta                    playbooks                  README_CONTAINERIZED_INSTALLATION.md  roles      test
BUILD.md     DEPLOYMENT_TYPES.md  hack      inventory  openshift-ansible.spec  pytest.ini                 README.md                             setup.cfg  test-requirements.txt
conftest.py  docs                 HOOKS.md  LICENSE    OWNERS                  README_CONTAINER_IMAGE.md  requirements.txt                      setup.py   tox.ini
Steps To Reproduce
  1. Run ansible-playbook /root/openshift-ansible/playbooks/prerequisites.yml
  2. Deploy new single master cluster using ansible-playbook /root/openshift-ansible/playbooks/openshift-glusterfs/config.yml
  3. The same happens with ansible-playbook /root/openshift-ansible/playbooks/deploy_cluster.yml. It is 100% reproducible, even with a minimal inventory file and new VMs.
Expected Results

GlusterFS is deployed and ready for PV and PVC

Observed Results


PLAY RECAP ****
localhost               : ok=12  changed=0   unreachable=0 failed=0 skipped=4   rescued=0 ignored=0
os-infra.mydomain.com   : ok=144 changed=36  unreachable=0 failed=0 skipped=163 rescued=0 ignored=0
os-master.mydomain.com  : ok=471 changed=193 unreachable=0 failed=1 skipped=589 rescued=0 ignored=0
os-node.mydomain.com    : ok=129 changed=36  unreachable=0 failed=0 skipped=159 rescued=0 ignored=0
os-storage.mydomain.com : ok=129 changed=36  unreachable=0 failed=0 skipped=159 rescued=0 ignored=0

INSTALLER STATUS **
Initialization             : Complete (0:00:26)
Health Check               : Complete (0:00:07)
Node Bootstrap Preparation : Complete (0:03:14)
etcd Install               : Complete (0:00:41)
Master Install             : Complete (0:04:26)
Master Additional Install  : Complete (0:00:40)
Node Join                  : Complete (0:00:43)
GlusterFS Install          : In Progress (0:13:46)
        This phase can be restarted by running: playbooks/openshift-glusterfs/new_install.yml

Failure summary:

Hosts:   os-master.mydomain.com
Play:    Configure GlusterFS
Task:    Wait for heketi pod
Message: Failed without returning a message.

Deploying GlusterFS always fails with "Failed without returning a message" after the "Wait for heketi pod" task exhausts its retries (FAILED - RETRYING: Wait for heketi pod (1 retries left)).

fatal: [os-master.example.comthedawn.com]: FAILED! => {"attempts": 30, "changed": false, "module_results": {"cmd": "/usr/bin/oc get pod --selector=glusterfs=heketi-storage-pod -o json -n glusterfs", "results": [{"apiVersion": "v1", "items": [{"apiVersion": "v1", "kind": "Pod", "metadata": {"annotations": {"openshift.io/deployment-config.latest-version": "1", "openshift.io/deployment-config.name": "heketi-storage", "openshift.io/deployment.name": "heketi-storage-1", "openshift.io/scc": "privileged"}, "creationTimestamp": "2020-01-24T11:26:08Z", "generateName": "heketi-storage-1-", "labels": {"deployment": "heketi-storage-1", "deploymentconfig": "heketi-storage", "glusterfs": "heketi-storage-pod", "heketi": "storage-pod"}, "name": "heketi-storage-1-bvn52", "namespace": "glusterfs", "ownerReferences": [{"apiVersion": "v1", "blockOwnerDeletion": true, "controller": true, "kind": "ReplicationController", "name": "heketi-storage-1", "uid": "4fa4ac45-3e9c-11ea-a06a-000c29fad897"}], "resourceVersion": "8520", "selfLink": "/api/v1/namespaces/glusterfs/pods/heketi-storage-1-bvn52", "uid": "51645ba4-3e9c-11ea-a06a-000c29fad897"}, "spec": {"containers": [{"env": [{"name": "HEKETI_USER_KEY", "value": "kumZhoQMqSxUcCGAeIXiiZfYSnxOIQrQkGp2T1ev6AM="}, {"name": "HEKETI_ADMIN_KEY", "value": "Wcct2uvr6AI8bXUtFIJ9IdgJxyqdW+P1qKWSntk9MCg="}, {"name": "HEKETI_CLI_USER", "value": "admin"}, {"name": "HEKETI_CLI_KEY", "value": "Wcct2uvr6AI8bXUtFIJ9IdgJxyqdW+P1qKWSntk9MCg="}, {"name": "HEKETI_EXECUTOR", "value": "kubernetes"}, {"name": "HEKETI_FSTAB", "value": "/var/lib/heketi/fstab"}, {"name": "HEKETI_SNAPSHOT_LIMIT", "value": "14"}, {"name": "HEKETI_KUBE_GLUSTER_DAEMONSET", "value": "1"}, {"name": "HEKETI_IGNORE_STALE_OPERATIONS", "value": "true"}, {"name": "HEKETI_DEBUG_UMOUNT_FAILURES", "value": "true"}], "image": "docker.io/heketi/heketi:latest", "imagePullPolicy": "IfNotPresent", "livenessProbe": {"failureThreshold": 3, "httpGet": {"path": "/hello", "port": 8080, "scheme": "HTTP"}, "initialDelaySeconds": 30, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 3}, "name": "heketi", "ports": [{"containerPort": 8080, "protocol": "TCP"}], "readinessProbe": {"failureThreshold": 3, "httpGet": {"path": "/hello", "port": 8080, "scheme": "HTTP"}, "initialDelaySeconds": 3, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 3}, "resources": {}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/var/lib/heketi", "name": "db"}, {"mountPath": "/etc/heketi", "name": "config"}, {"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount", "name": "heketi-storage-service-account-token-kmvc8", "readOnly": true}]}], "dnsPolicy": "ClusterFirst", "imagePullSecrets": [{"name": "heketi-storage-service-account-dockercfg-xf57n"}], "nodeName": "os-master.example.comthedawn.com", "priority": 0, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {}, "serviceAccount": "heketi-storage-service-account", "serviceAccountName": "heketi-storage-service-account", "terminationGracePeriodSeconds": 30, "volumes": [{"glusterfs": {"endpoints": "heketi-db-storage-endpoints", "path": "heketidbstorage"}, "name": "db"}, {"name": "config", "secret": {"defaultMode": 420, "secretName": "heketi-storage-config-secret"}}, {"name": "heketi-storage-service-account-token-kmvc8", "secret": {"defaultMode": 420, "secretName": "heketi-storage-service-account-token-kmvc8"}}]}, "status": {"conditions": [{"lastProbeTime": null, 
"lastTransitionTime": "2020-01-24T11:26:08Z", "status": "True", "type": "Initialized"}, {"lastProbeTime": null, "lastTransitionTime": "2020-01-24T11:26:08Z", "message": "containers with unready status: [heketi]", "reason": "ContainersNotReady", "status": "False", "type": "Ready"}, {"lastProbeTime": null, "lastTransitionTime": null, "message": "containers with unready status: [heketi]", "reason": "ContainersNotReady", "status": "False", "type": "ContainersReady"}, {"lastProbeTime": null, "lastTransitionTime": "2020-01-24T11:26:08Z", "status": "True", "type": "PodScheduled"}], "containerStatuses": [{"image": "docker.io/heketi/heketi:latest", "imageID": "", "lastState": {}, "name": "heketi", "ready": false, "restartCount": 0, "state": {"waiting": {"reason": "ContainerCreating"}}}], "hostIP": "192.168.1.212", "phase": "Pending", "qosClass": "BestEffort", "startTime": "2020-01-24T11:26:08Z"}}], "kind": "List", "metadata": {"resourceVersion": "", "selfLink": ""}}], "returncode": 0}, "state": "list"}


Additional Information

I believe it's related to this bug, but maybe I'm missing the workaround?


I'm using standalone VMware ESXi as the hypervisor, and an RPM install of Origin.
ansible 2.9.2
Origin 3.11
Centos 7 as the OS for the nodes

`[root@os-master ~]# docker version
Client:
Version: 1.13.1
API version: 1.26
Package version: docker-1.13.1-103.git7f2769b.el7.centos.x86_64
Go version: go1.10.3
Git commit: 7f2769b/1.13.1
Built: Sun Sep 15 14:06:47 2019
OS/Arch: linux/amd64

Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Package version: docker-1.13.1-103.git7f2769b.el7.centos.x86_64
Go version: go1.10.3
Git commit: 7f2769b/1.13.1
Built: Sun Sep 15 14:06:47 2019
OS/Arch: linux/amd64
Experimental: false`

Here is my inventory:
`[OSEv3:children]
masters
etcd
nodes
glusterfs

[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin
openshift_release="3.11"
openshift_image_tag="v3.11"
openshift_master_default_subdomain=apps.mydomain.com
openshift_docker_selinux_enabled=True
openshift_check_min_host_memory_gb=16
openshift_check_min_host_disk_gb=50
openshift_disable_check=docker_image_availability
openshift_master_dynamic_provisioning_enabled=true
openshift_registry_selector="role=infra"
openshift_hosted_registry_storage_kind=glusterfs

openshift_metrics_install_metrics=true
openshift_metrics_cassandra_storage_type=pv
openshift_metrics_hawkular_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_cassandra_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_heapster_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_storage_volume_size=20Gi
openshift_metrics_cassandra_pvc_storage_class_name="glusterfs-registry-block"

openshift_logging_install_logging=true
openshift_logging_es_pvc_dynamic=true
openshift_logging_storage_kind=dynamic
openshift_logging_kibana_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_curator_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_pvc_size=20Gi
openshift_logging_es_pvc_storage_class_name="glusterfs-registry-block"

openshift_storage_glusterfs_registry_namespace=infra-storage
openshift_storage_glusterfs_registry_storageclass=false
openshift_storage_glusterfs_registry_storageclass_default=false
openshift_storage_glusterfs_registry_block_deploy=true
openshift_storage_glusterfs_registry_block_host_vol_create=true
openshift_storage_glusterfs_registry_block_host_vol_size=100
openshift_storage_glusterfs_registry_block_storageclass=true
openshift_storage_glusterfs_registry_block_storageclass_default=false

[masters]
os-master.mydomain.com

[etcd]
os-master.mydomain.com

[nodes]
os-master.mydomain.com openshift_node_group_name="node-config-master"
os-infra.mydomain.com openshift_node_group_name="node-config-infra"
os-storage.mydomain.com openshift_node_group_name="node-config-compute"
os-node.mydomain.com openshift_node_group_name="node-config-compute"

[glusterfs]
os-infra.mydomain.com glusterfs_ip='192.168.1.213' glusterfs_devices='["/dev/sdb"]'
os-node.mydomain.com glusterfs_ip='192.168.1.214' glusterfs_devices='["/dev/sdb"]'
os-storage.mydomain.com glusterfs_ip='192.168.1.215' glusterfs_devices='["/dev/sdb"]'`

Can someone please advise what I should do to be able to deploy successfully?
Many thanks in advance.
am-titan commented 4 years ago

If someone could help, here is some additional info:

[root@os-master ~]# oc describe pod heketi-storage-1-deploy --namespace=glusterfs
Name:               heketi-storage-1-deploy
Namespace:          glusterfs
Priority:           0
PriorityClassName:
Node:               os-master.example.com/192.168.1.212
Start Time:         Fri, 24 Jan 2020 12:26:06 +0100
Labels:             openshift.io/deployer-pod-for.name=heketi-storage-1
Annotations:        openshift.io/deployment-config.name=heketi-storage
                    openshift.io/deployment.name=heketi-storage-1
                    openshift.io/scc=restricted
Status:             Failed
IP:                 10.128.0.44
Containers:
  deployment:
    Container ID:   docker://e5bbcf7a8750cc69d1c398259515f92dcde199ba0ffa431d6bdda1c706e3f17b
    Image:          docker.io/openshift/origin-deployer:v3.11.0
    Image ID:       docker-pullable://docker.io/openshift/origin-deployer@sha256:b88bd9c072b78a903dc087e1beec3cec9969edb6552f3e73ed7f3108ff6205c0
    Port:
    Host Port:
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 24 Jan 2020 12:26:08 +0100
      Finished:     Fri, 24 Jan 2020 12:36:10 +0100
    Ready:          False
    Restart Count:  0
    Environment:
      OPENSHIFT_DEPLOYMENT_NAME:       heketi-storage-1
      OPENSHIFT_DEPLOYMENT_NAMESPACE:  glusterfs
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from deployer-token-vwdkh (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  deployer-token-vwdkh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  deployer-token-vwdkh
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:
Tolerations:
Events:
  Type    Reason     Age  From                            Message
  Normal  Scheduled  6h   default-scheduler               Successfully assigned glusterfs/heketi-storage-1-deploy to os-master.example.com
  Normal  Pulled     6h   kubelet, os-master.example.com  Container image "docker.io/openshift/origin-deployer:v3.11.0" already present on machine
  Normal  Created    6h   kubelet, os-master.example.com  Created container
  Normal  Started    6h   kubelet, os-master.example.com  Started container

Thanks

am-titan commented 4 years ago

Someone, please help? I have tried the same on top of CentOS 7 with VirtualBox as the hypervisor. I'm getting the same result, so it looks like a pure Origin case.

FAILED! => {"attempts": 90, "changed": false, "module_results": {"cmd": "/usr/bin/oc get pod --selector=glusterfs=heketi-storage-pod -o json -n glusterfs", "results": [{"apiVersion": "v1", "items": [], "kind": "List", "metadata": {"resourceVersion": "", "selfLink": ""}}], "returncode": 0}, "state": "list"}

NAMESPACE   NAME                      READY   STATUS   RESTARTS   AGE
glusterfs   heketi-storage-1-deploy   0/1     Error    0          3h

The deploy pod stays in Error status.

imranrazakhan commented 4 years ago

@am-titan Have you tried providing the following section for heketi in the inventory?

openshift_storage_glusterfs_heketi_is_native=true
openshift_storage_glusterfs_heketi_executor=ssh
openshift_storage_glusterfs_heketi_ssh_port=22
openshift_storage_glusterfs_heketi_ssh_user=root
openshift_storage_glusterfs_heketi_ssh_sudo=false
openshift_storage_glusterfs_heketi_ssh_keyfile="/root/.ssh/id_rsa"
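
Worth noting: openshift_storage_glusterfs_heketi_executor=ssh makes heketi manage the bricks over SSH instead of through pod exec, so it assumes password-less root SSH with that keyfile from the host running heketi to every node in [glusterfs]. A quick sanity-check sketch along those lines (hostnames taken from the inventory above; adjust to yours):

for h in os-infra.mydomain.com os-node.mydomain.com os-storage.mydomain.com; do
  ssh -i /root/.ssh/id_rsa -o BatchMode=yes root@"$h" hostname   # must succeed without a password prompt
done
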
am-titan commented 4 years ago

Hi imranrazakhan,

Thanks for your answer. AFAIK, I did try, but I'm not sure anymore. I will give it another try and let you know.

Thanks

am-titan commented 4 years ago

openshift_storage_glusterfs_heketi_is_native=true
openshift_storage_glusterfs_heketi_executor=ssh
openshift_storage_glusterfs_heketi_ssh_port=22
openshift_storage_glusterfs_heketi_ssh_user=root
openshift_storage_glusterfs_heketi_ssh_sudo=false
openshift_storage_glusterfs_heketi_ssh_keyfile="/root/.ssh/id_rsa"

Thanks imranrazakhan,

it seems that applying your advice had a very positive effect. Unfortunately, now I get "New Node doesn't have glusterd running". But glusterd is installed and running; I've checked again on all nodes.

fatal: [os-master.example.com]: FAILED! => {"changed": true, "cmd": ["oc", "--config=/tmp/openshift-glusterfs-ansible-tidGZg/admin.kubeconfig", "rsh", "--namespace=glusterfs", "deploy-heketi-storage-1-r6hhd", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--secret", "=", "topology", "load", "--json=/tmp/openshift-glusterfs-ansible-tidGZg/topology.json", "2>&1"], "delta": "0:00:01.103858", "end": "2020-01-27 15:41:53.157740", "failed_when_result": true, "rc": 0, "start": "2020-01-27 15:41:52.053882", "stderr": "", "stderr_lines": [], "stdout": "Creating cluster ... ID: e\r\n\tAllowing file volumes on cluster.\r\n\tAllowing block volumes on cluster.\r\n\tCreating node os-master.example.com ... Unable to create node: New Node doesn't have glusterd running\r\n\tCreating node os-infra.example.com ... Unable to create node: New Node doesn't have glusterd running\r\n\tCreating node os-node.example.com ... Unable to create node: New Node doesn't have glusterd running\r\n\tCreating node os-storage.example.com ... Unable to create node: New Node doesn't have glusterd running", "stdout_lines": ["Creating cluster ... ID: ", "\tAllowing file volumes on cluster.", "\tAllowing block volumes on cluster.", "\tCreating node os-master.example.com ... Unable to create node: New Node doesn't have glusterd running", "\tCreating node os-infra.example.com ... Unable to create node: New Node doesn't have glusterd running", "\tCreating node os-node.example.com ... Unable to create node: New Node doesn't have glusterd running", "\tCreating node os-storage.example.com ... Unable to create node: New Node doesn't have glusterd running"]}

It may be worth mentioning that I ran the following commands according to the old official guide:

setsebool -P virt_sandbox_use_fusefs on 
setsebool -P virt_use_fusefs on

and I installed:
yum install glusterfs-fuse
yum update glusterfs-fuse
and I'm running a containerized GlusterFS.

Thanks

imranrazakhan commented 4 years ago

Which GlusterFS version do you have installed? I think the default version in CentOS is 6 and the latest heketi expects GlusterFS 7.

am-titan commented 4 years ago

I am using CentOS 7: CentOS Linux release 7.7.1908 (Core)

and the version is:

# glusterfs --version
glusterfs 3.12.2

imranrazakhan commented 4 years ago

I have the below version and it's working fine

# glusterfs --version
glusterfs 6.1
am-titan commented 4 years ago

Now I'm a bit confused: I have the latest updates on CentOS 7, but I'm still on version 3?

# glusterfs --version
glusterfs 3.12.2

Maybe I missed something in the guide?

Thanks :)

am-titan commented 4 years ago

OK, so I assume I need glusterfs-server on the master at least (maybe it's not in the official OpenShift guide, or I missed it?)

  1. I have glusterfs-fuse installed on all nodes also on the master.
  2. Now I installed glusterfs-server; output:

# glusterfs --version
glusterfs 6.7
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
GlusterFS comes with ABSOLUTELY NO WARRANTY. It is licensed to you under your choice of the GNU Lesser General Public License, version 3 or any later version (LGPLv3 or later), or the GNU General Public License, version 2 (GPLv2), in all cases as published by the Free Software Foundation.

I hope It's Ok?

am-titan commented 4 years ago

I did the following but still no luck.

Removed on all nodes + master:

  1. yum remove glusterfs-fuse (3.12.2)

Installed on all nodes + master:

  1. yum search centos-release-gluster
    yum install centos-release-gluster6
    yum install glusterfs gluster-cli glusterfs-libs glusterfs-server

    Now I'm getting "containers with unready status: [glusterfs]", "reason": "ContainersNotReady", "status": "False", "type": "ContainersReady"}, {"lastProbeTime": null, "lastTransitionTime":

am-titan commented 4 years ago

I have the below version and it's working fine

# glusterfs --version
glusterfs 6.1

Still getting Unable to create node: New Node doesn't have glusterd running.

Looking into the logs and listening ports on all nodes, glusterd works:

[root@os-storage ~]# ss -tlpn | grep 24007
LISTEN 0 128 *:24007 *:* users:(("glusterd",pid=61472,fd=10))
[root@os-storage ~]# tailf /var/log/glusterfs/glusterd.log
12: option transport.socket.listen-port 24007
13: option transport.socket.read-fail-log off
14: option transport.socket.keepalive-interval 2
15: option transport.socket.keepalive-time 10
16: option transport-type rdma
17: option working-directory /var/lib/glusterd
18: end-volume
19:
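
One more thing worth ruling out: glusterd listening locally on 24007 doesn't guarantee the peers or the heketi pod can actually reach it. A rough check, assuming firewalld and the usual GlusterFS ports (24007-24008 for the daemons, 49152+ for bricks); nc must be installed for the last step:

firewall-cmd --list-ports
firewall-cmd --permanent --add-port=24007-24008/tcp --add-port=49152-49251/tcp
firewall-cmd --reload
nc -zv 192.168.1.215 24007     # from another node, confirm the port is reachable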

Could you please help by directing me on how to update the GlusterFS version?

I did try the following without success:

yum search centos-release-gluster
yum install centos-release-gluster6
yum install glusterfs gluster-cli glusterfs-libs glusterfs-server

I have seen this thread and it seems to be related; it also looks like there is an inconsistency between the OKD guide and the GlusterFS version: https://github.com/openshift/openshift-ansible/issues/12087

imranrazakhan commented 4 years ago

Please share your complete inventory file; I will compare it with mine.

am-titan commented 4 years ago

Thanks :), here it is

# Backup of all inventory data
[OSEv3:children]
masters
etcd
nodes
glusterfs

[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin
openshift_release="3.11"
openshift_image_tag="v3.11"
openshift_master_default_subdomain=apps.example.com
openshift_docker_selinux_enabled=false
openshift_check_min_host_memory_gb=16
openshift_check_min_host_disk_gb=50
openshift_disable_check=docker_image_availability
openshift_master_dynamic_provisioning_enabled=true
openshift_registry_selector="role=infra"
openshift_hosted_registry_storage_kind=glusterfs

openshift_metrics_install_metrics=true
openshift_metrics_cassandra_storage_type=pv
openshift_logging_elasticsearch_storage_type=pv
openshift_metrics_hawkular_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_cassandra_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_heapster_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_storage_volume_size=20Gi
openshift_metrics_cassandra_pvc_storage_class_name="gluster-infra-storage"

openshift_logging_install_logging=true
openshift_logging_es_pvc_dynamic=true
openshift_logging_storage_kind=dynamic
openshift_logging_kibana_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_curator_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_pvc_size=20Gi
openshift_logging_es_pvc_storage_class_name="gluster-infra-storage"

openshift_storage_glusterfs_heketi_is_native=true
openshift_storage_glusterfs_heketi_executor=ssh
openshift_storage_glusterfs_heketi_ssh_port=22
openshift_storage_glusterfs_heketi_ssh_user=root
openshift_storage_glusterfs_heketi_ssh_sudo=false
openshift_storage_glusterfs_heketi_ssh_keyfile="/root/.ssh/id_rsa"

openshift_storage_glusterfs_timeout=900
openshift_storage_glusterfs_registry_namespace=infra-storage
openshift_storage_glusterfs_registry_storageclass=false
openshift_storage_glusterfs_registry_storageclass_default=false

[masters]
os-master.example.com

[etcd]
os-master.example.com

[nodes]
os-master.example.com openshift_node_group_name="node-config-master"
os-infra.example.com openshift_node_group_name="node-config-infra"
os-storage.example.com openshift_node_group_name="node-config-compute"
os-node.example.com openshift_node_group_name="node-config-compute"

[glusterfs]
os-master.example.com glusterfs_ip='192.168.1.212' glusterfs_devices='["/dev/sdb"]'
os-infra.example.com glusterfs_ip='192.168.1.213' glusterfs_devices='["/dev/sdb"]'
os-node.example.com glusterfs_ip='192.168.1.214' glusterfs_devices='["/dev/sdb"]'
os-storage.example.com glusterfs_ip='192.168.1.215' glusterfs_devices='["/dev/sdb"]'

When I add [glusterfs_registry], GlusterFS will not even be created.

imranrazakhan commented 4 years ago

Retry by adding the line below:

openshift_storage_glusterfs_is_native=false

am-titan commented 4 years ago

Retry by adding the line below:

openshift_storage_glusterfs_is_native=false

After adding this line it's not deploying; this is the error I'm getting:

failed: [os-master.example.com -> os-master.example.com] (item=os-master.example.com) => {"ansible_loop_var": "item", "changed": false, "item": "os-master.example.com", "msg": "Could not find the requested service glusterd: host"}
failed: [os-master.example.com -> os-infra.example.com] (item=os-infra.example.com) => {"ansible_loop_var": "item", "changed": false, "item": "os-infra.example.com", "msg": "Could not find the requested service glusterd: host"}
failed: [os-master.example.com -> os-node.example.com] (item=os-node.example.com) => {"ansible_loop_var": "item", "changed": false, "item": "os-node.example.com", "msg": "Could not find the requested service glusterd: host"}
failed: [os-master.example.com -> os-storage.example.com] (item=os-storage.example.com) => {"ansible_loop_var": "item", "changed": false, "item": "os-storage.example.com", "msg": "Could not find the requested service glusterd: host"}
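
For what it's worth, that message usually just means there is no glusterd systemd unit on the hosts: with openshift_storage_glusterfs_is_native=false the playbook expects an external, non-containerized GlusterFS cluster already running on the node OS. A rough sketch of the host-side setup that implies (packages from the CentOS Storage SIG, as discussed further down in this thread):

# on every node in [glusterfs]
yum install -y centos-release-gluster6
yum install -y glusterfs-server
systemctl enable glusterd
systemctl start glusterd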

And I get the following with openshift_storage_glusterfs_is_native=true:

fatal: [os-master.example.com]: FAILED! => {"changed": true, "cmd": ["oc", "--config=/tmp/openshift-glusterfs-ansible-FRM2my/admin.kubeconfig", "rsh", "--namespace=glusterfs", "deploy-heketi-storage-1-rfdfl", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--example.com", "xxxx=", "topology", "load", "--json=/tmp/openshift-glusterfs-ansible-FRM2my/topology.json", "2>&1"], "delta": "0:00:01.147419", "end": "2020-01-30 10:08:35.075266", "failed_when_result": true, "rc": 0, "start": "2020-01-30 10:08:33.927847", "stderr": "", "stderr_lines": [], "stdout": "Creating cluster ... ID: xxx\r\n\tAllowing file volumes on cluster.\r\n\tAllowing block volumes on cluster.\r\n\tCreating node os-master.example.com ... Unable to create node: New Node doesn't have glusterd running\r\n\tCreating node os-infra.example.com ... Unable to create node: New Node doesn't have glusterd running\r\n\tCreating node os-node.example.com ... Unable to create node: New Node doesn't have glusterd running\r\n\tCreating node os-storage.example.com ... Unable to create node: New Node doesn't have glusterd running", "stdout_lines": ["Creating cluster ... ID: xxxxxx", "\tAllowing file volumes on cluster.", "\tAllowing block volumes on cluster.", "\tCreating node os-master.example.com ... Unable to create node: New Node doesn't have glusterd running", "\tCreating node os-infra.example.com ... Unable to create node: New Node doesn't have glusterd running", "\tCreating node os-node.example.com ... Unable to create node: New Node doesn't have glusterd running", "\tCreating node os-storage.example.com ... Unable to create node: New Node doesn't have glusterd running"]} And in this case it will also deploy all nodes correctly, but not glusterfs:

[root@os-master ~]# oc get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default docker-registry-1-d9ghv 1/1 Running 2 5d 10.130.0.38 os-infra.example.com <none>
default registry-console-1-4z76t 1/1 Running 2 5d 10.128.0.37 os-master.example.com <none>
default router-1-n8mwz 1/1 Running 2 5d 192.168.1.213 os-infra.example.com <none>
glusterfs deploy-heketi-storage-1-rfdfl 1/1 Running 0 15m 10.128.0.43 os-master.example.com <none>
glusterfs glusterfs-storage-7lkrb 1/1 Running 0 17m 192.168.1.215 os-storage.example.com <none>
glusterfs glusterfs-storage-gg7wp 1/1 Running 0 17m 192.168.1.212 os-master.example.com <none>
glusterfs glusterfs-storage-kjdbx 1/1 Running 0 17m 192.168.1.214 os-node.example.com <none>
glusterfs glusterfs-storage-pcnh8 1/1 Running 0 17m 192.168.1.213 os-infra.example.com <none>
kube-system master-api-os-master.example.com 1/1 Running 2 5d 192.168.1.212 os-master.example.com <none>
kube-system master-controllers-os-master.example.com 1/1 Running 2 5d 192.168.1.212 os-master.example.com <none>
kube-system master-etcd-os-master.example.com 1/1 Running 2 5d 192.168.1.212 os-master.example.com <none>
openshift-console console-5dbffd9df-rrt4v 1/1 Running 2 5d 10.128.0.38 os-master.example.com <none>
openshift-infra hawkular-cassandra-1-n5k86 0/1 Pending 0 5d <none> <none> <none>
openshift-infra hawkular-metrics-p655z 0/1 Running 5 5d 10.130.0.37 os-infra.example.com <none>
openshift-infra hawkular-metrics-schema-qh9z4 1/1 Running 2 5d 10.128.0.39 os-master.example.com <none>
openshift-infra heapster-42nr8 0/1 Running 5 5d 10.130.0.46 os-infra.example.com <none>
openshift-logging logging-curator-1580351400-zclct 0/1 Error 0 34m 10.130.0.39 os-infra.example.com <none>
openshift-logging logging-fluentd-5wr5d 1/1 Running 2 5d 10.130.0.45 os-infra.example.com <none>
openshift-logging logging-fluentd-89tn7 1/1 Running 2 5d 10.131.0.4 os-storage.example.com <none>
openshift-logging logging-fluentd-g2dcg 1/1 Running 2 5d 10.128.0.40 os-master.example.com <none>
openshift-logging logging-fluentd-vn9fh 1/1 Running 2 5d 10.129.0.4 os-node.example.com <none>
openshift-logging logging-kibana-1-c4l8m 2/2 Running 4 5d 10.130.0.51 os-infra.example.com <none>
openshift-metrics-server metrics-server-56cd9bfcf-x42s2 1/1 Running 2 5d 10.130.0.49 os-infra.example.com <none>
openshift-monitoring alertmanager-main-0 3/3 Running 6 5d 10.130.0.40 os-infra.example.com <none>
openshift-monitoring alertmanager-main-1 3/3 Running 6 5d 10.130.0.42 os-infra.example.com <none>
openshift-monitoring alertmanager-main-2 3/3 Running 6 5d 10.130.0.43 os-infra.example.com <none>
openshift-monitoring cluster-monitoring-operator-8578656f6f-c5qt5 1/1 Running 2 5d 10.130.0.47 os-infra.example.com <none>
openshift-monitoring grafana-6b9f85786f-fmq4g 2/2 Running 4 5d 10.130.0.52 os-infra.example.com <none>
openshift-monitoring kube-state-metrics-c4f86b5f8-ksjrv 3/3 Running 6 5d 10.130.0.41 os-infra.example.com <none>
openshift-monitoring node-exporter-kgg7t 2/2 Running 4 5d 192.168.1.213 os-infra.example.com <none>
openshift-monitoring node-exporter-p7psp 2/2 Running 4 5d 192.168.1.215 os-storage.example.com <none>
openshift-monitoring node-exporter-p9cfw 2/2 Running 4 5d 192.168.1.212 os-master.example.com <none>
openshift-monitoring node-exporter-skzx4 2/2 Running 4 5d 192.168.1.214 os-node.example.com <none>
openshift-monitoring prometheus-k8s-0 4/4 Running 9 5d 10.130.0.44 os-infra.example.com <none>
openshift-monitoring prometheus-k8s-1 4/4 Running 9 5d 10.130.0.48 os-infra.example.com <none>
openshift-monitoring prometheus-operator-6644b8cd54-98qxm 1/1 Running 2 5d 10.130.0.50 os-infra.example.com <none>
openshift-node sync-mv2rk 1/1 Running 2 5d 192.168.1.212 os-master.example.com <none>
openshift-node sync-qfdrb 1/1 Running 2 5d 192.168.1.213 os-infra.example.com <none>
openshift-node sync-rp2jd 1/1 Running 2 5d 192.168.1.214 os-node.example.com <none>
openshift-node sync-z9nc9 1/1 Running 2 5d 192.168.1.215 os-storage.example.com <none>
openshift-sdn ovs-bfvpl 1/1 Running 2 5d 192.168.1.214 os-node.example.com <none>
openshift-sdn ovs-f9mmg 1/1 Running 2 5d 192.168.1.213 os-infra.example.com <none>
openshift-sdn ovs-gv55s 1/1 Running 2 5d 192.168.1.215 os-storage.example.com <none>
openshift-sdn ovs-ms8xs 1/1 Running 2 5d 192.168.1.212 os-master.example.com <none>
openshift-sdn sdn-64q7d 1/1 Running 2 5d 192.168.1.213 os-infra.example.com <none>
openshift-sdn sdn-79bjn 1/1 Running 2 5d 192.168.1.214 os-node.example.com <none>
openshift-sdn sdn-jxj4n 1/1 Running 2 5d 192.168.1.212 os-master.example.com <none>
openshift-sdn sdn-pl65n 1/1 Running 2 5d 192.168.1.215 os-storage.example.com <none>
openshift-web-console webconsole-7fc8759f7b-tm9rl 1/1 Running 4 5d 10.128.0.41 os-master.example.com <none>
[root@os-master ~]#

thx

zaheer965 commented 4 years ago

Do you want to use all nodes as storage??

[glusterfs]
os-master.example.com glusterfs_ip='192.168.1.212' glusterfs_devices='["/dev/sdb"]'
os-infra.example.com glusterfs_ip='192.168.1.213' glusterfs_devices='["/dev/sdb"]'
os-node.example.com glusterfs_ip='192.168.1.214' glusterfs_devices='["/dev/sdb"]'
os-storage.example.com glusterfs_ip='192.168.1.215' glusterfs_devices='["/dev/sdb"]'

If yes, do all of them have Gluster installed and /dev/sdb available? Mostly it expects 3 nodes for storage.

am-titan commented 4 years ago

Thanks zaheer965,

I don't really need GlusterFS on all nodes. Is using the master as a Gluster node too recommended, or best practice? If not, I can remove glusterfs from the master, no problem. But I get the "glusterd not found" error on all nodes, and glusterfs 3.12.2 is installed according to the official docs.

Yes, /dev/sdb is wiped, available and ready.

[root@os-master ~]# lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0   120G  0 disk
├─sda1            8:1    0   200M  0 part /boot/efi
├─sda2            8:2    0     1G  0 part /boot
└─sda3            8:3    0 111.1G  0 part
  ├─centos-root 253:0    0    20G  0 lvm  /
  ├─centos-swap 253:1    0    20G  0 lvm
  ├─centos-home 253:2    0  30.1G  0 lvm  /home
  └─centos-var  253:3    0    41G  0 lvm  /var
sdb               8:16   0   200G  0 disk
sr0              11:0    1   942M  0 rom

am-titan commented 4 years ago

I have the below version and it's working fine

# glusterfs --version
glusterfs 6.1

Got some information about how to get version 6 from @kanadaj in #12087; I will try it tomorrow.

am-titan commented 4 years ago

Thanks guys but still no luck.

Looks like the error is the same as before, also with glusterfs version 6.0. Getting this: "\tCreating node os-storage.example.com ... Unable to create node: New Node doesn't have glusterd

Log output:

[root@os-master ~]# tailf /var/log/glusterfs/glustersd.log
tailf: stat failed /var/log/glusterfs/glustersd.log: No such file or directory
[root@os-master ~]# tailf /var/log/glusterfs/
bricks/  cmd_history.log  container/  geo-replication/  geo-replication-slaves/  gluster-block/  glusterd.log
[root@os-master ~]# tailf /var/log/glusterfs/glusterd.log
12: option transport.socket.listen-port 24007
13: option transport.socket.read-fail-log off
14: option transport.socket.keepalive-interval 2
15: option transport.socket.keepalive-time 10
16: option transport-type rdma
17: option working-directory /var/lib/glusterd
18: end-volume
19:
+------------------------------------------------------------------------------+
[2020-01-31 12:34:57.108079] I [MSGID: 101190] [event-epoll.c:682:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0

Thx

kanadaj commented 4 years ago

You need GlusterFS installed manually on ALL hosts I believe.

am-titan commented 4 years ago

You need GlusterFS installed manually on ALL hosts I believe.

I did this according to your recommendation, with both version 6 and 7, on all hosts.

Do you mean that GlusterFS should be installed on the host OS itself (in my case CentOS 7)?

When I execute:

#systemctl status glusterd
Unit glusterd.service could not be found.

But I have this:

# glusterfs --version
glusterfs 6.0

And grep gives me this:

# rpm -qa |grep gluster
glusterfs-libs-6.0-1.el6.x86_64
glusterfs-6.0-1.el6.x86_64
glusterfs-fuse-6.0-1.el6.x86_64
glusterfs-client-xlators-6.0-1.el6.x86_64

wget https://www.mirrorservice.org/sites/mirror.centos.org/6/storage/x86_64/gluster-6/glusterfs-6.0-1.el6.x86_64.rpm
wget https://www.mirrorservice.org/sites/mirror.centos.org/6/storage/x86_64/gluster-6/glusterfs-libs-6.0-1.el6.x86_64.rpm
wget https://www.mirrorservice.org/sites/mirror.centos.org/6/storage/x86_64/gluster-6/glusterfs-client-xlators-6.0-1.el6.x86_64.rpm
wget https://www.mirrorservice.org/sites/mirror.centos.org/6/storage/x86_64/gluster-6/glusterfs-fuse-6.0-1.el6.x86_64.rpm

rpm -i glusterfs-libs-6.0-1.el6.x86_64.rpm
rpm -i glusterfs-client-xlators-6.0-1.el6.x86_64.rpm
rpm -i glusterfs-6.0-1.el6.x86_64.rpm
rpm -i glusterfs-fuse-6.0-1.el6.x86_64.rpm

Should I include the master in the inventory as a glusterfs node? Maybe that's what causes the problem?

[glusterfs]
os-master.example.com glusterfs_ip='192.168.1.212' glusterfs_devices='["/dev/sdb"]'
os-infra.example.com glusterfs_ip='192.168.1.213' glusterfs_devices='["/dev/sdb"]'
os-node.example.com glusterfs_ip='192.168.1.214' glusterfs_devices='["/dev/sdb"]'
os-storage.example.com glusterfs_ip='192.168.1.215' glusterfs_devices='["/dev/sdb"]'

kanadaj commented 4 years ago

It shouldn't matter whether the master is included or not in the gluster list.

Do keep in mind that with the upgrade to glusterfs-fuse, the gluster pod might fail to launch altogether due to mismatched file versions. To fix this, you need to wipefs -a all gluster drives and delete all GlusterFS data due to compatibility issues:

rm -rf /var/lib/glusterd

Then make sure all the gluster related pods are removed, then try again. I've had an issue like that after fixing the glusterfs versions where the /var/lib/glusterd/glusterd.info contained a different version and thus glusterd refused to start.

It's honestly not a straightforward thing to install.
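
A rough sketch of that cleanup, assuming /dev/sdb is the Gluster device from the inventory above and that the existing storage can be thrown away (this is destructive):

# on every gluster node
systemctl stop glusterd 2>/dev/null      # only relevant if glusterd runs on the host
rm -rf /var/lib/glusterd /var/lib/heketi
# if heketi already created LVM volumes on the disk, remove them first (vgremove/pvremove)
wipefs -a /dev/sdb

Then delete the glusterfs/heketi pods (or the whole glusterfs project) before re-running the playbook.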

am-titan commented 4 years ago

Thanks for the tips, I’ll try these later on.

The good thing is that I made two snapshots: one of a fresh install of CentOS 7, and another after the cluster deploy with a minimal host file (without logging, monitoring, or glusterfs), just the basic web interface, etc.

So I can revert everything instead of uninstalling. In the second snapshot glusterfs 3 is already installed, so I need to try this.

am-titan commented 4 years ago

I think I found the root cause: even after a full removal, /var/lib/glusterd goes away but version 3 stays!

[root@os-node ~]# yum install glusterfs-server glusterfs-fuse glusterfs-libs
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.alpix.eu
 * extras: mirror.alpix.eu
 * updates: mirror.alpix.eu
base                                                                                                                                                                                                                      | 3.6 kB  00:00:00     
centos-openshift-origin311                                                                                                                                                                                                | 2.9 kB  00:00:00     
extras                                                                                                                                                                                                                    | 2.9 kB  00:00:00     
updates                                                                                                                                                                                                                   | 2.9 kB  00:00:00     
No package glusterfs-server available.
Package matching **glusterfs-fuse-3.12.2-47.2.el7.x86_64** already installed. Checking for update.
Package matching **glusterfs-libs-3.12.2-47.2.el7.x86_64** already installed. Checking for update.
Nothing to do
[root@os-node ~]# glusterfs --version
glusterfs 6.0
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.

Dependencies Resolved

=================================================================================================================================================================================================================================================
 Package                                                               Arch                                                Version                                                       Repository                                         Size
=================================================================================================================================================================================================================================================
Installing:
 glusterfs-fuse                                                        x86_64                                              3.12.2-47.2.el7                                               base                                              126 k
 glusterfs-libs                                                        x86_64                                              3.12.2-47.2.el7                                               base                                              387 k
Installing for dependencies:
 glusterfs                                                             x86_64                                              3.12.2-47.2.el7                                               base                                              512 k
 glusterfs-client-xlators                                              x86_64                                              3.12.2-47.2.el7                                               base                                              883 k

Transaction Summary
=================================================================================================================================================================================================================================================
Install  2 Packages (+2 Dependent packages)

Total download size: 1.9 M
Installed size: 8.3 M
Is this ok [y/d/N]: y
Downloading packages:
(1/4): glusterfs-fuse-3.12.2-47.2.el7.x86_64.rpm                                                                                                                                                                          | 126 kB  00:00:00     
(2/4): glusterfs-client-xlators-3.12.2-47.2.el7.x86_64.rpm                                                                                                                                                                | 883 kB  00:00:00     
(3/4): glusterfs-libs-3.12.2-47.2.el7.x86_64.rpm                                                                                                                                                                          | 387 kB  00:00:00     
(4/4): glusterfs-3.12.2-47.2.el7.x86_64.rpm     

Even after that:

# glusterfs --version
glusterfs 6.0
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
GlusterFS comes with ABSOLUTELY NO WARRANTY. It is licensed to you under your choice of the GNU Lesser General Public License, version 3 or any later version (LGPLv3 or later), or the GNU General Public License, version 2 (GPLv2), in all cases as published by the Free Software Foundation.

Now reinstalling the hosts+master from scratch to be sure.

am-titan commented 4 years ago

With a fresh new install it seems that glusterfs 3 still gets pulled in, even with the most minimal core install.


Packages skipped because of dependency problems:
    attr-2.4.46-13.el7.x86_64 from base
    glusterfs-fuse-3.12.2-47.2.el7.x86_64 from base
    psmisc-22.20-16.el7.x86_64 from base
[root@os-master ~]# rpm -i glusterfs-fuse-6.0-1.el6.x86_64.rpm
warning: glusterfs-fuse-6.0-1.el6.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID e451e5b5: NOKEY
error: Failed dependencies:
        attr is needed by glusterfs-fuse-6.0-1.el6.x86_64
        psmisc is needed by glusterfs-fuse-6.0-1.el6.x86_64

Error: Package: glusterfs-fuse-3.12.2-47.2.el7.x86_64 (base)
           Requires: glusterfs(x86-64) = 3.12.2-47.2.el7
           Installed: glusterfs-6.0-1.el6.x86_64 (installed)
               glusterfs(x86-64) = 6.0-1.el6
           Available: glusterfs-3.12.2-47.2.el7.x86_64 (base)
               glusterfs(x86-64) = 3.12.2-47.2.el7
Error: Package: glusterfs-fuse-3.12.2-47.2.el7.x86_64 (base)
           Requires: glusterfs-client-xlators(x86-64) = 3.12.2-47.2.el7
           Installed: glusterfs-client-xlators-6.0-1.el6.x86_64 (installed)
               glusterfs-client-xlators(x86-64) = 6.0-1.el6
           Available: glusterfs-client-xlators-3.12.2-47.2.el7.x86_64 (base)
               glusterfs-client-xlators(x86-64) = 3.12.2-47.2.el7
Error: Package: glusterfs-fuse-3.12.2-47.2.el7.x86_64 (base)
           Requires: glusterfs-libs(x86-64) = 3.12.2-47.2.el7
           Installed: glusterfs-libs-6.0-1.el6.x86_64 (installed)
               glusterfs-libs(x86-64) = 6.0-1.el6
           Available: glusterfs-libs-3.12.2-47.2.el7.x86_64 (base)
               glusterfs-libs(x86-64) = 3.12.2-47.2.el7
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
kanadaj commented 4 years ago

Before installing the glusterfs 6 or 7 libs you want to uninstall glusterfs 3:

yum remove glusterfs-fuse glusterfs-libs
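
Since that has to happen on every node, an ad-hoc Ansible run from the machine holding the inventory can save some typing (the inventory path is a placeholder; nodes is the group used throughout this thread):

ansible nodes -i /path/to/inventory -m yum -a "name=glusterfs-fuse,glusterfs-libs state=absent"
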
am-titan commented 4 years ago

Thanks again.

With a fresh CentOS 7 core minimal install, here are the details:
——————————————installing ———————————————
1.
# yum remove glusterfs-fuse glusterfs-libs

Loaded plugins: fastestmirror
No Match for argument: glusterfs-fuse
No Match for argument: glusterfs-libs
No Packages marked for removal

2.
# yum install epel-release -y
Loaded plugins: fastestmirror
Determining fastest mirrors
 * base: linux.darkpenguin.net
 * extras: mirror.ratiokontakt.de
 * updates: centos.mirror.net-d-sign.de
base                                                                                                                                                                                                                      | 3.6 kB  00:00:00     
extras                                                                                                                                                                                                                    | 2.9 kB  00:00:00     
updates                                                                                                                                                                                                                   | 2.9 kB  00:00:00     
(1/4): base/7/x86_64/group_gz                                                                                                                                                                                             | 165 kB  00:00:00     
(2/4): extras/7/x86_64/primary_db                                                                                                                                                                                         | 159 kB  00:00:00     
(3/4): base/7/x86_64/primary_db                                                                                                                                                                                           | 6.0 MB  00:00:02     
(4/4): updates/7/x86_64/primary_db                                                                                                                                                                                        | 5.9 MB  00:00:02     
Resolving Dependencies
--> Running transaction check
---> Package epel-release.noarch 0:7-11 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=================================================================================================================================================================================================================================================
 Package                                                        Arch                                                     Version                                                  Repository                                                Size
=================================================================================================================================================================================================================================================
Installing:
 epel-release                                                   noarch                                                   7-11                                                     extras                                                    15 k

Transaction Summary
=================================================================================================================================================================================================================================================
Install  1 Package

Total download size: 15 k
Installed size: 24 k
Downloading packages:
warning: /var/cache/yum/x86_64/7/extras/packages/epel-release-7-11.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID f4a80eb5: NOKEY
Public key for epel-release-7-11.noarch.rpm is not installed
epel-release-7-11.noarch.rpm                                                                                                                                                                                              |  15 kB  00:00:00     
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
Importing GPG key 0xF4A80EB5:
 Userid     : "CentOS-7 Key (CentOS 7 Official Signing Key) <security@centos.org>"
 Fingerprint: 6341 ab27 53d7 8a78 a7c2 7bb1 24c6 a8a7 f4a8 0eb5
 Package    : centos-release-7-7.1908.0.el7.centos.x86_64 (@anaconda)
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : epel-release-7-11.noarch                                                                                                                                                                                                      1/1 
  Verifying  : epel-release-7-11.noarch                                                                                                                                                                                                      1/1 

Installed:
  epel-release.noarch 0:7-11                                                                                                                                                                                                                     

Complete!

3.
# yum install wget

# yum install attr       (to overcome dependencies error see notes at the end of thread) 

# yum install psmisc (to overcome dependencies error see notes at the end of thread) 

4. 
# wget http://mirror.centos.org/centos/7/storage/x86_64/gluster-6/glusterfs-6.7-1.el7.x86_64.rpm ;
# wget http://mirror.centos.org/centos/7/storage/x86_64/gluster-6/glusterfs-libs-6.7-1.el7.x86_64.rpm ;
# wget http://mirror.centos.org/centos/7/storage/x86_64/gluster-6/glusterfs-client-xlators-6.7-1.el7.x86_64.rpm ;
# wget http://mirror.centos.org/centos/7/storage/x86_64/gluster-6/glusterfs-fuse-6.7-1.el7.x86_64.rpm ;

rpm -i glusterfs-libs-6.7-1.el7.x86_64.rpm ;

rpm -i glusterfs-client-xlators-6.7-1.el7.x86_64.rpm ;

rpm -i glusterfs-6.7-1.el7.x86_64.rpm ;

rpm -i glusterfs-fuse-6.7-1.el7.x86_64.rpm ;

————————Testing————————

5.
# rpm -qa |grep glusterfs
**glusterfs-libs-6.7-1.el7.x86_64
glusterfs-6.7-1.el7.x86_64
glusterfs-client-xlators-6.7-1.el7.x86_64
glusterfs-fuse-6.7-1.el7.x86_64**

6. 
# glusterfs --version
glusterfs 6.7
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.

7.——————————the most annoying part: after all this, version 3 is still there; I can't understand why———————————————
# yum install glusterfs-fuse glusterfs-libs
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.imt-systems.com
 * epel: ftp.uni-stuttgart.de
 * extras: ftp.hosteurope.de
 * updates: mirror.imt-systems.com
**Package matching glusterfs-fuse-3.12.2-47.2.el7.x86_64 already installed. Checking for update.
Package matching glusterfs-libs-3.12.2-47.2.el7.x86_64 already installed. Checking for update.**
Nothing to do
[root@os-storage ~]# yum remove glusterfs-fuse glusterfs-libs
Loaded plugins: fastestmirror
Resolving Dependencies
--> Running transaction check
---> Package glusterfs-fuse.x86_64 0:6.7-1.el7 will be erased
---> Package glusterfs-libs.x86_64 0:6.7-1.el7 will be erased
--> Processing Dependency: glusterfs-libs = 6.7-1.el7 for package: glusterfs-6.7-1.el7.x86_64
--> Processing Dependency: libgfrpc.so.0()(64bit) for package: glusterfs-6.7-1.el7.x86_64
--> Processing Dependency: libgfrpc.so.0()(64bit) for package: glusterfs-client-xlators-6.7-1.el7.x86_64
--> Processing Dependency: libgfxdr.so.0()(64bit) for package: glusterfs-6.7-1.el7.x86_64
--> Processing Dependency: libgfxdr.so.0()(64bit) for package: glusterfs-client-xlators-6.7-1.el7.x86_64
--> Processing Dependency: libglusterfs.so.0()(64bit) for package: glusterfs-6.7-1.el7.x86_64
--> Processing Dependency: libglusterfs.so.0()(64bit) for package: glusterfs-client-xlators-6.7-1.el7.x86_64
--> Running transaction check
---> Package glusterfs.x86_64 0:6.7-1.el7 will be erased
---> Package glusterfs-client-xlators.x86_64 0:6.7-1.el7 will be erased
--> Finished Dependency Resolution

Dependencies Resolved

=================================================================================================================================================================================================================================================
 Package                                                               Arch                                                Version                                                  Repository                                              Size
=================================================================================================================================================================================================================================================
Removing:
 glusterfs-fuse                                                        x86_64                                              6.7-1.el7                                                installed                                              530 k
 glusterfs-libs                                                        x86_64                                              6.7-1.el7                                                installed                                              1.6 M
Removing for dependencies:
 glusterfs                                                             x86_64                                              6.7-1.el7                                                installed                                              2.6 M
 glusterfs-client-xlators                                              x86_64                                              6.7-1.el7                                                installed                                              4.0 M

Transaction Summary
=================================================================================================================================================================================================================================================
Remove  2 Packages (+2 Dependent packages)

——————————Notes————————————————————————

Notes on errors I ran into all the time:

# rpm -i glusterfs-fuse-6.0-1.el6.x86_64.rpm
warning: glusterfs-fuse-6.0-1.el6.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID e451e5b5: NOKEY
error: Failed dependencies:
        attr is needed by glusterfs-fuse-6.0-1.el6.x86_64
        psmisc is needed by glusterfs-fuse-6.0-1.el6.x86_64

Now I really cannot understand why version 3 is so stubborn!

Thanks again

am-titan commented 4 years ago

After all that, with the new version I'm getting:

containers with unready status: [glusterfs]", "reason": "ContainersNotReady",

GlusterFS on Origin 3.11 is not simple to get installed.

kanadaj commented 4 years ago

To debug that one you need the glusterfs and glustershd logs from /var/log/glusterfs. Yes I agree with the sentiment. Consider installing CephFS instead?
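
A small sketch of how to pull those logs when GlusterFS itself runs containerized (pod names are placeholders; the namespace is the one used earlier in this thread):

oc get pods -n glusterfs -o wide
oc logs -n glusterfs <glusterfs-storage-pod>
oc rsh -n glusterfs <glusterfs-storage-pod> tail -n 100 /var/log/glusterfs/glusterd.log
oc rsh -n glusterfs <glusterfs-storage-pod> tail -n 100 /var/log/glusterfs/glustershd.log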

am-titan commented 4 years ago

To debug that one you need the glusterfs and glustershd logs from /var/log/glusterfs. Yes I agree with the sentiment. Consider installing CephFS instead?

Thanks, I looked into the logs before but found nothing that pointed me anywhere; the playbook breaks in the install phase while everything else completes well. I wanted to get a feel for working with GlusterFS, having worked with central storage solutions before instead of SDS. The one advantage I wanted to see in SDS is being able to bind a few servers running local disks into one cluster.

I guess you're right; I'll use Ceph, as I understand Ceph is the more robust solution.

am-titan commented 4 years ago

Thanks guys, you were right: after upgrading the Gluster version and choosing a minimal inventory (step by step), it's working :)


INSTALLER STATUS ********************************************************************************************************************************************************************************************************************************
Initialization               : Complete (0:00:25)
Health Check                 : Complete (0:00:28)
Node Bootstrap Preparation   : Complete (0:03:43)
etcd Install                 : Complete (0:00:39)
Master Install               : Complete (0:04:06)
Master Additional Install    : Complete (0:00:38)
Node Join                    : Complete (0:00:38)
GlusterFS Install            : Complete (0:04:36)
Hosted Install               : Complete (0:00:57)
Cluster Monitoring Operator  : Complete (0:00:46)
Web Console Install          : Complete (0:00:43)
Console Install              : Complete (0:00:32)
Service Catalog Install      : Complete (0:03:26)

I'll sum it all up and post the details for whoever runs into a similar situation.

Many Thanks, 👍💯

am-titan commented 4 years ago

So here is the solution:

  1. First remove GlusterFS 3 as suggested by kanadaj: yum remove glusterfs-fuse glusterfs-libs

  2. Install GlusterFS 6 as suggested by kanadaj and imranrazakhan:

wget http://mirror.centos.org/centos/7/storage/x86_64/gluster-6/glusterfs-6.7-1.el7.x86_64.rpm ;
wget http://mirror.centos.org/centos/7/storage/x86_64/gluster-6/glusterfs-libs-6.7-1.el7.x86_64.rpm ;
wget http://mirror.centos.org/centos/7/storage/x86_64/gluster-6/glusterfs-client-xlators-6.7-1.el7.x86_64.rpm ;
wget http://mirror.centos.org/centos/7/storage/x86_64/gluster-6/glusterfs-fuse-6.7-1.el7.x86_64.rpm ;

rpm -i glusterfs-libs-6.7-1.el7.x86_64.rpm ;
rpm -i glusterfs-client-xlators-6.7-1.el7.x86_64.rpm ;
rpm -i glusterfs-6.7-1.el7.x86_64.rpm ;
rpm -i glusterfs-fuse-6.7-1.el7.x86_64.rpm ;
  3. Uninstall any previous deployment using uninstall.yml.

  4. Deploy prerequisites.yml and then deploy_cluster.yml with the RIGHT INVENTORY FILE (one for the containerized deploy and one for the external deploy; this is important to remember).

  5. My working containerized inventory:

# Containerized GlusterFS inventory
[OSEv3:children]
masters
nodes
glusterfs

[OSEv3:vars]
install_method=rpm
os_update=false
install_update_docker=true
ansible_ssh_user=root
openshift_deployment_type=origin
openshift_release="3.11"

openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_storageclass_default=false
openshift_storage_glusterfs_block_deploy=true
openshift_storage_glusterfs_block_host_vol_size=100
openshift_storage_glusterfs_block_storageclass=true
openshift_storage_glusterfs_block_storageclass_default=false

openshift_storage_glusterfs_heketi_admin_key='xxxxxxx'
openshift_storage_glusterfs_heketi_user_key='xxxxxxx'

[masters]
os-master.example.com

[etcd]
os-master.example.com

[nodes]
os-master.example.com openshift_node_group_name="node-config-master"
os-infra.example.com openshift_node_group_name="node-config-infra"
os-storage.example.com openshift_node_group_name="node-config-compute"
os-node.example.com openshift_node_group_name="node-config-compute"

[glusterfs]
os-master.example.com glusterfs_ip='192.168.1.212' glusterfs_devices='["/dev/sdb"]'
os-infra.example.com glusterfs_ip='192.168.1.213' glusterfs_devices='["/dev/sdb"]'
os-node.example.com glusterfs_ip='192.168.1.214' glusterfs_devices='["/dev/sdb"]'
os-storage.example.com glusterfs_ip='192.168.1.215' glusterfs_devices='["/dev/sdb"]'
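
A quick post-install sanity check I'd suggest (the storage class names can differ slightly depending on what the playbook derives from the GlusterFS cluster name; the heketi secret is the one set in the inventory above):

oc get pods -n app-storage -o wide      # glusterfs-* and heketi-* pods should all be Running
oc get storageclass                     # expect something like glusterfs-storage and glusterfs-storage-block
oc rsh -n app-storage dc/heketi-storage heketi-cli -s http://localhost:8080 --user admin --secret 'xxxxxxx' cluster list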

Many Thanks kanadaj and imranrazakhan

openshift-bot commented 4 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot commented 4 years ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten /remove-lifecycle stale