openshift / openshift-ansible

Install and config an OpenShift 3.x cluster
https://try.openshift.com

OCP 3.7, GlusterFS, wiping fails during "Unlabel any existing GlusterFS nodes" #6661

Closed · smossber closed this issue 6 years ago

smossber commented 6 years ago

Description

After a failed GlusterFS deployment I wanted to wipe everything, since I understand the config playbook is not idempotent.

But when running /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-glusterfs/config.yml with openshift_storage_glusterfs_wipe set to true, I get an error similar to https://github.com/openshift/openshift-ansible/issues/5548 on the task "Unlabel any existing GlusterFS nodes": it fails with "'dict object' has no attribute 'openshift'".
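(For illustration only, not taken from the role itself: the task apparently dereferences hostvars[<host>].openshift for the hosts it loops over, so any inventory host without gathered OpenShift facts, the [lb] host being a likely candidate here, makes that lookup undefined. A minimal diagnostic sketch, assuming a masters[0] host pattern and a loop over groups['all']:)

```yaml
# Hypothetical diagnostic playbook (not part of openshift-ansible): list the
# inventory hosts that have no "openshift" fact, since a hostvars[item].openshift
# lookup on any of them would raise exactly the error quoted below.
- hosts: masters[0]
  gather_facts: false
  tasks:
    - name: List inventory hosts that currently have no "openshift" fact
      debug:
        msg: "{{ item }} has no 'openshift' fact; a hostvars lookup on it would fail"
      when: hostvars[item].openshift is not defined
      with_items: "{{ groups['all'] }}"
```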

Version


# ansible --version
ansible 2.4.1.0

# rpm -q atomic-openshift-utils openshift-ansible
atomic-openshift-utils-3.7.14-1.git.0.4b35b2d.el7.noarch
openshift-ansible-3.7.14-1.git.0.4b35b2d.el7.noarch
Steps To Reproduce
  1. ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-glusterfs/config.yml -e openshift_storage_glusterfs_wipe=true

Expected Results

The playbook should complete the wipe, including the "Unlabel any existing GlusterFS nodes" step, without errors.

Observed Results


TASK [openshift_storage_glusterfs : Delete pre-existing heketi resources] *****************************************************************************************************************************************
ok: [master1.openshift.mitzicom.int.m0sslab.org] => (item={u'kind': u'template,route,service,dc,jobs,secret', u'selector': u'deploy-heketi'})
changed: [master1.openshift.mitzicom.int.m0sslab.org] => (item={u'kind': u'svc', u'name': u'heketi-storage-endpoints'})
changed: [master1.openshift.mitzicom.int.m0sslab.org] => (item={u'kind': u'secret', u'name': u'heketi-storage-topology-secret'})
changed: [master1.openshift.mitzicom.int.m0sslab.org] => (item={u'kind': u'secret', u'name': u'heketi-storage-config-secret'})
ok: [master1.openshift.mitzicom.int.m0sslab.org] => (item={u'kind': u'template,route,service,dc', u'name': u'heketi-storage'})
changed: [master1.openshift.mitzicom.int.m0sslab.org] => (item={u'kind': u'svc', u'name': u'heketi-db-storage-endpoints'})
changed: [master1.openshift.mitzicom.int.m0sslab.org] => (item={u'kind': u'sa', u'name': u'heketi-storage-service-account'})
changed: [master1.openshift.mitzicom.int.m0sslab.org] => (item={u'kind': u'secret', u'name': u'heketi-storage-admin-secret'})

TASK [openshift_storage_glusterfs : Wait for deploy-heketi pods to terminate] *************************************************************************************************************************************
ok: [master1.openshift.mitzicom.int.m0sslab.org]

TASK [openshift_storage_glusterfs : Wait for heketi pods to terminate] ********************************************************************************************************************************************
ok: [master1.openshift.mitzicom.int.m0sslab.org]

TASK [openshift_storage_glusterfs : assert] ***********************************************************************************************************************************************************************
ok: [master1.openshift.mitzicom.int.m0sslab.org] => {
    "changed": false, 
    "failed": false, 
    "msg": "All assertions passed"
}

TASK [openshift_storage_glusterfs : Delete pre-existing GlusterFS resources] **************************************************************************************************************************************
changed: [master1.openshift.mitzicom.int.m0sslab.org] => (item={u'kind': u'template', u'name': u'glusterfs'})
changed: [master1.openshift.mitzicom.int.m0sslab.org] => (item={u'kind': u'daemonset', u'name': u'glusterfs-storage'})

TASK [openshift_storage_glusterfs : Unlabel any existing GlusterFS nodes] *****************************************************************************************************************************************
fatal: [master1.openshift.mitzicom.int.m0sslab.org]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'openshift'\n\nThe error appears to have been in '/usr/share/ansible/openshift-ansible/roles/openshift_storage_glusterfs/tasks/glusterfs_deploy.yml': line 19, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Unlabel any existing GlusterFS nodes\n  ^ here\n\nexception type: <class 'ansible.errors.AnsibleUndefinedVariable'>\nexception: 'dict object' has no attribute 'openshift'"}
        to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-glusterfs/config.retry

PLAY RECAP ********************************************************************************************************************************************************************************************************
infra1.openshift.mitzicom.int.m0sslab.org : ok=50   changed=2    unreachable=0    failed=0
infra2.openshift.mitzicom.int.m0sslab.org : ok=50   changed=2    unreachable=0    failed=0
localhost                  : ok=12   changed=0    unreachable=0    failed=0
master1.openshift.mitzicom.int.m0sslab.org : ok=57   changed=4    unreachable=0    failed=1   
master2.openshift.mitzicom.int.m0sslab.org : ok=46   changed=2    unreachable=0    failed=0
master3.openshift.mitzicom.int.m0sslab.org : ok=46   changed=2    unreachable=0    failed=0
node1.openshift.mitzicom.int.m0sslab.org : ok=50   changed=2    unreachable=0    failed=0
node2.openshift.mitzicom.int.m0sslab.org : ok=45   changed=2    unreachable=0    failed=0

INSTALLER STATUS **************************************************************************************************************************************************************************************************
Initialization             : Complete
GlusterFS Install          : In Progress
        This phase can be restarted by running: playbooks/byo/openshift-glusterfs/config.yml


Additional Information


openshift_deployment_type=openshift-enterprise
openshift_release=v3.7

openshift_master_dynamic_provisioning_enabled=true
...

# host group for masters
[masters]
master1.openshift.mitzicom.int.m0sslab.org
master2.openshift.mitzicom.int.m0sslab.org
master3.openshift.mitzicom.int.m0sslab.org

# host group for etcd
[etcd]
master1.openshift.mitzicom.int.m0sslab.org
master2.openshift.mitzicom.int.m0sslab.org
master3.openshift.mitzicom.int.m0sslab.org

[lb]
lb.openshift.mitzicom.int.m0sslab.org

# host group for nodes, includes region info
[nodes]
master[1:3].openshift.mitzicom.int.m0sslab.org openshift_node_labels="{'region': 'primary', 'zone': 'management'}" openshift_schedulable=false
infra1.openshift.mitzicom.int.m0sslab.org openshift_node_labels="{'region':'infra', 'zone': 'management'}"
infra2.openshift.mitzicom.int.m0sslab.org openshift_node_labels="{'region':'infra', 'zone': 'management'}"
node1.openshift.mitzicom.int.m0sslab.org openshift_node_labels="{'region': 'primary', 'zone': 'app'}"
node2.openshift.mitzicom.int.m0sslab.org openshift_node_labels="{'region': 'primary', 'zone': 'app'}"

[glusterfs]
infra2.openshift.mitzicom.int.m0sslab.org glusterfs_devices="[ '/dev/vdc' ]"
infra1.openshift.mitzicom.int.m0sslab.org glusterfs_devices="[ '/dev/vdc' ]"
node1.openshift.mitzicom.int.m0sslab.org glusterfs_devices="[ '/dev/vdc' ]"



Rerunning the playbook results in the same error.
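(A possible manual workaround, purely as a hedged sketch and not something from this thread: strip the GlusterFS node label by hand so the wipe can continue. The label key "glusterfs" is an assumption based on the role's usual default nodeselector; verify the real key with `oc get nodes --show-labels` before running anything like this.)

```yaml
# Hypothetical cleanup playbook (an assumption, not from the issue): remove the
# "glusterfs" label from the hosts in the [glusterfs] group. A trailing "-" on
# "oc label" deletes the named label; adjust the key if your nodes use another one.
- hosts: masters[0]
  gather_facts: false
  tasks:
    - name: Remove the glusterfs label from former GlusterFS nodes
      command: oc label node {{ item }} glusterfs-
      with_items: "{{ groups['glusterfs'] }}"
```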
DanyC97 commented 6 years ago

@smossber can you try with a newer openshift-ansible tag version?

I tried with the latest openshift-ansible-3.7.29-1 and no longer hit that error; however, I now get a new one:

2018-02-11 11:14:57,659 p=29824 u=root |  TASK [openshift_storage_glusterfs : Load heketi topology] ************************************************************************************************************************************************
2018-02-11 11:15:02,789 p=29824 u=root |  fatal: [370-master1]: FAILED! => {"changed": true, "cmd": ["oc", "rsh", "--namespace=glusterfs", "deploy-heketi-storage-1-2pq9l", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--secret", "hGg3nkBKHQAURiCnOlto2VjZBun/lSYKixb+TT5LEoE=", "topology", "load", "--json=/tmp/openshift-glusterfs-ansible-ls3tKn/topology.json", "2>&1"], "delta": "0:00:04.809425", "end": "2018-02-11 11:14:32.737780", "failed_when_result": true, "rc": 0, "start": "2018-02-11 11:14:27.928355", "stderr": "", "stderr_lines": [], "stdout": "Creating cluster ... ID: f197eab8f8508e7bd392398674229539\n\tCreating node 370-gluster1 ... ID: a540ed1d70f0a92991edea422007f1a5\n\t\tAdding device /dev/sdd ... OK\n\tCreating node 370-gluster2 ... Unable to create node: Unable to execute command on glusterfs-storage-chwv4:\n\tCreating node 370-gluster3 ... Unable to create node: Unable to execute command on glusterfs-storage-chwv4:", "stdout_lines": ["Creating cluster ... ID: f197eab8f8508e7bd392398674229539", "\tCreating node 370-gluster1 ... ID: a540ed1d70f0a92991edea422007f1a5", "\t\tAdding device /dev/sdd ... OK", "\tCreating node 370-gluster2 ... Unable to create node: Unable to execute command on glusterfs-storage-chwv4:", "\tCreating node 370-gluster3 ... Unable to create node: Unable to execute command on glusterfs-storage-chwv4:"]}

I'll open a new issue for that.
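(A hedged debugging idea, not from the thread: "Unable to execute command on glusterfs-storage-chwv4" usually means heketi could not run gluster commands inside that pod, so checking the pod and glusterd health before re-running the topology load may narrow it down. The namespace and pod name below are taken from the error output above; the checks themselves are an assumption.)

```yaml
# Hypothetical ad-hoc checks, assuming the "glusterfs" namespace and the pod
# name glusterfs-storage-chwv4 reported in the failed "Load heketi topology" task.
- hosts: masters[0]
  gather_facts: false
  tasks:
    - name: Show GlusterFS pod status and placement
      command: oc get pods --namespace=glusterfs -o wide
    - name: Check glusterd peer status inside the pod named in the heketi error
      command: oc rsh --namespace=glusterfs glusterfs-storage-chwv4 gluster peer status
```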

DanyC97 commented 6 years ago

@smossber I assume this is all fixed now; if so, could you please close the issue?