redhat-cop / openshift-applier

Used to apply OpenShift objects to an OpenShift Cluster
Apache License 2.0

oc client for 3.11 changes handling of URLs in oc process #91

Closed · etsauer closed 5 years ago

etsauer commented 5 years ago

After upgrading to oc version 3.11, I'm getting lots of failures in existing applier inventories like:

TASK [openshift-applier : Create OpenShift objects based on template with params for 'Environment Setup : Create Projects'] ****************
failed: [localhost] (item={'oc_path': ''}) => {"changed": true, "cmd": "oc process   https://raw.githubusercontent.com/redhat-cop/cluster-lifecycle/v3.9.0/files/projectrequest/template.yml   --param='NAMESPACE=tool-box' --param='NAMESPACE_DISPLAY_NAME=tool-box'  --ignore-unknown-parameters | oc create  -f - ", "delta": "0:00:00.135276", "end": "2019-01-14 16:46:19.676869", "failed_when_result": true, "msg": "non-zero return code", "oc_param_file_item": {"oc_path": ""}, "rc": 1, "start": "2019-01-14 16:46:19.541593", "stderr": "error: invalid argument \"https://raw.githubusercontent.com/redhat-cop/cluster-lifecycle/v3.9.0/files/projectrequest/template.yml\"\nerror: no objects passed to create", "stderr_lines": ["error: invalid argument \"https://raw.githubusercontent.com/redhat-cop/cluster-lifecycle/v3.9.0/files/projectrequest/template.yml\"", "error: no objects passed to create"], "stdout": "", "stdout_lines": []}
    to retry, use: --limit @/home/esauer/src/containers-quickstarts/tool-box/galaxy/openshift-applier/playbooks/openshift-cluster-seed.retry

PLAY RECAP *********************************************************************************************************************************
localhost                  : ok=21   changed=1    unreachable=0    failed=1   

This appears to be because oc now considers a URL (a string starting with http:// or https://) to be a file, meaning that we now have to pass -f with URLs.
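For example, the failing command from the log above works once the URL is passed via -f (same template and parameters, shown here purely for illustration):

oc process -f https://raw.githubusercontent.com/redhat-cop/cluster-lifecycle/v3.9.0/files/projectrequest/template.yml --param='NAMESPACE=tool-box' --param='NAMESPACE_DISPLAY_NAME=tool-box' --ignore-unknown-parameters | oc create -f -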

oybed commented 5 years ago

I don't think this is due to a v3.11 change, as the openshift-applier has always added the -f for URLs. However, this is done conditionally: a check detects what kind of "source" the entry is, and as long as the URL returns a 200 OK, the -f is added to the execution. So this issue is most likely due to the URL not resolving correctly in the execution environment, or not returning a 200 OK.

Here's the filter plugin that does the detection and sets the -f option: https://github.com/redhat-cop/openshift-applier/blob/master/roles/openshift-applier/filter_plugins/applier-filters.py#L70-L72
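For readers following along, here is a minimal sketch of that kind of check; the names are illustrative only, and the real logic lives in the applier-filters.py lines linked above:

# Illustrative sketch, not the actual plugin code: add '-f' only when the
# source is a URL that answers with HTTP 200.
import urllib2  # Python 2, which the plugin targeted at the time

def url_returns_200(path):
    if not path.startswith(('http://', 'https://')):
        return False
    try:
        return urllib2.urlopen(path).getcode() == 200
    except Exception:
        return False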

etsauer commented 5 years ago

@oybed yeah, you're right. It looks like the difference could be the ansible version. When I run the above using the applier container image, it works fine.

Local versions:

$ python --version
Python 2.7.15
$ ansible --version
ansible 2.7.5
  config file = /home/esauer/src/openshift-applier/ansible.cfg
  configured module search path = ['/home/esauer/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 3.7.2 (default, Jan  3 2019, 09:14:01) [GCC 8.2.1 20181215 (Red Hat 8.2.1-6)]
$ oc version
oc v3.11.0+0cbc58b
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://console.d2.casl.rht-labs.com:443
openshift v3.11.51
kubernetes v1.11.0+d4cacc0

Container versions:

bash-4.2$ python --version
Python 2.7.5
bash-4.2$ ansible --version
ansible 2.6.8
  config file = /openshift-applier/ansible.cfg
  configured module search path = [u'/openshift-applier/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Oct 30 2018, 23:45:53) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]
bash-4.2$ oc version
oc v3.11.16
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Do we want to try to make ansible 2.7.x work?

roridedi commented 5 years ago

I was able to get around this issue by downgrading to ansible 2.6.5

sherl0cks commented 5 years ago

Just tried downgrading ansible to 2.6.5, which failed. Python 2 vs. 3 may be at play here, especially if this is a check that goes over the network.

$ ansible --version
ansible 2.6.5
  config file = /tmp/labs-ci-cd/ansible.cfg
  configured module search path = ['/home/jholmes/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/jholmes/.local/lib/python3.7/site-packages/ansible
  executable location = /home/jholmes/.local/bin/ansible
  python version = 3.7.2 (default, Jan 16 2019, 19:49:22) [GCC 8.2.1 20181215 (Red Hat 8.2.1-6)]

oybed commented 5 years ago

@sherl0cks which version of the openshift-applier did your galaxy run pull in? It should be v2.0.6 (and I believe the issue you are seeing is related to not having the latest).

sherl0cks commented 5 years ago

[jholmes@fedora29 labs-ci-cd]$ ansible-galaxy install -r requirements.yml --roles-path=roles
- extracting openshift-applier to /tmp/labs-ci-cd/roles/openshift-applier
- openshift-applier (v2.0.6) was installed successfully
- extracting infra-ansible to /tmp/labs-ci-cd/roles/infra-ansible
- infra-ansible (v1.0.6) was installed successfully

etsauer commented 5 years ago

@oybed @sherl0cks @roridedi turns out the crux of the issue is that ansible 2.7.5 defaults to Python 3, and we don't import the modules for that. I've opened a PR to fix it.
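The usual fix for that class of problem is a guarded import. A minimal sketch of the pattern (the PR has the actual change; the urllib example here is only illustrative):

# Try the Python 3 module first, fall back to the Python 2 name.
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen         # Python 2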

etsauer commented 5 years ago

@sherl0cks @roridedi we think we've fixed the issue. Can you guys confirm?

sherl0cks commented 5 years ago

@etsauer do I need to test against master, or is there a branch/tag/release?

oybed commented 5 years ago

@sherl0cks I just cut a new release, please give v2.0.7 a try

rdebeasi commented 5 years ago

I'm experiencing this problem as well, and v2.0.7 fixes it. :muscle: :rocket: :pizza:

sherl0cks commented 5 years ago

lgtm as well. see https://github.com/rht-labs/labs-ci-cd/pull/252.

sherl0cks commented 5 years ago

Appears this may introduce another regression: CI for Labs CI/CD failed with

TASK [openshift-applier/roles/openshift-applier : Determine location for the file] ***

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: IOError: [Errno 2] No such file or directory: 'openshift//postgresql-persistent'

fatal: [ci-cd-tooling]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""}

    to retry, use: --limit @/tmp/workspace/labs-ci-cd/labs-ci-cd-ci-for-labs-ci-cd-pipeline/site.retry

oybed commented 5 years ago

Ouch - that's the 3rd option that's no longer working ... :-( ... alright, need another fix for that.
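For context, the applier distinguishes three kinds of sources, and the regression hit the third one. A hedged sketch of the distinction (the function name and the namespace//template convention are illustrative, not the plugin's actual code):

import os

def classify_source(path):
    if path.startswith(('http://', 'https://')):
        return 'url'       # handled via: oc process -f <url>
    if os.path.exists(path):
        return 'file'      # handled via: oc process -f <path>
    return 'template'      # e.g. 'openshift//postgresql-persistent' from the traceback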

etsauer commented 5 years ago

And a test

oybed commented 5 years ago

@etsauer tests added: https://github.com/redhat-cop/openshift-applier/pull/101

... also, please validate that I captured the tests run for the patch correctly in the README.

@sherl0cks I was not able to reproduce the issue with these tests (which mimic the Labs CI/CD functionality). I'll have to give it a go with Labs CI/CD next.

sherl0cks commented 5 years ago

Can we close this now with #106 merged?

oybed commented 5 years ago

Yes, with the many validations, I think we are good to close. Please open a new issue for anything new.