ci-for-labs-ci-cd-pipeline is currently broken

jacobsee commented 6 years ago

The pipeline is failing for all PRs attempted with

TASK [openshift-applier/roles/openshift-applier : Include the pre/post step role] ***
ERROR! 'tmp_dep_dir' is undefined

while trying to apply ci-cd-deployments : nexus. @oybed and I looked at it for a while and it doesn't appear to be grabbing the code in the PR, rather just running on master... Potential issue with the stash/unstash steps (lines 144, 162) in the Jenkinsfile. Build 8 in Jenkins on S10 is an example of this occurring.

jacobsee commented 6 years ago

Justin asked me to put this in - doesn't look like I can make assignments but this should be for @springdo. Let me know if there are any questions about the issue!

oybed commented 6 years ago

@jacobsee I think the tmp_dep_dir result is just a side-effect, but won't know for sure till this issue has been fixed. Attempting another test with PR #203 also shows that it doesn't pick up the source/changes from the PR.

My suspicion is that the stash/unstash steps that Jacob mentioned just aren't "aware" of the many git commands run in the sh steps and hence stash the wrong source (i.e., the source from the Jenkins job rather than the labs robot source prepared as part of the sh/git steps).

springdo commented 6 years ago

Yo guys - just looking at the output of the latest CI build now..... The last one was run for #203 and there seems to be some issue with an ansible.cfg

[labs-ci-cd-ci-for-labs-ci-cd-pipeline] Running shell script
+ ansible-galaxy install -r requirements.yml --roles-path=roles
- extracting openshift-applier to /tmp/workspace/labs-ci-cd/labs-ci-cd-ci-for-labs-ci-cd-pipeline/roles/openshift-applier
- openshift-applier (v2.0.0) was installed successfully
[Pipeline] sh
[labs-ci-cd-ci-for-labs-ci-cd-pipeline] Running shell script
+ ansible-playbook ci-playbook.yml -vvv -i inventory/ -e 'target=bootstrap project_name_postfix=-pr-203 scm_ref=pr-203'
Post stage
ansible-playbook 2.4.2.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/jenkins/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible-playbook
  python version = 2.7.5 (default, May 31 2018, 09:41:32) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]
Using /etc/ansible/ansible.cfg as config file
Parsed /tmp/workspace/labs-ci-cd/labs-ci-cd-ci-for-labs-ci-cd-pipeline/inventory/hosts inventory source with ini plugin
ERROR! Unexpected Exception, this is probably a bug: No module named requests
the full traceback was:

Traceback (most recent call last):
  File "/usr/bin/ansible-playbook", line 106, in <module>
    exit_code = cli.run()
  File "/usr/lib/python2.7/site-packages/ansible/cli/playbook.py", line 130, in run
....stack trace here....
  File "/tmp/workspace/labs-ci-cd/labs-ci-cd-ci-for-labs-ci-cd-pipeline/roles/openshift-applier/roles/openshift-applier/filter_plugins/applier-filters.py", line 2, in <module>
    import requests
ImportError: No module named requests

Seems like ImportError: No module named requests is related to some other problem from the applier?
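A pre-flight check along these lines could surface a missing dependency before ansible-playbook starts, rather than deep inside a traceback. This is only a minimal sketch: the function name `missing_modules` is hypothetical and not part of the pipeline or the applier, and it would have to run under the same Python interpreter that Ansible uses (2.7 in the log above).

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Raised when the module is not installed for this interpreter.
            missing.append(name)
    return missing
```

Calling `missing_modules(["requests"])` on the build agent would return `["requests"]` if the module that applier-filters.py imports is not installed, and an empty list otherwise.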

oybed commented 6 years ago

@springdo that last run was me manually pointing to the correct source to validate the issue, so please ignore that build.

oybed commented 6 years ago

BTW: just as an FYI, and not meant to distract from this issue - the output you are seeing above re: requests will be fixed with this openshift-applier issue: https://github.com/redhat-cop/openshift-applier/issues/47

springdo commented 6 years ago

From my quick scan, this isn't to do with the code being grabbed from the PR. I just re-ran the pipeline with the previously passing PR (#197) and it fails for the same reason as originally stated.

TASK [openshift-applier/roles/openshift-applier : Include the pre/post step role] ***
ERROR! 'tmp_dep_dir' is undefined

Is this some new thing that's needed in the applier or should this tmp_dep_dir just be included in here somewhere? https://github.com/rht-labs/labs-ci-cd/blob/master/roles/configure-nexus/defaults/main.yml

oybed commented 6 years ago

@springdo yes, as mentioned earlier, there may still be an issue with that (and it needs to be investigated), but before we go too deep on that, let's get the CI fixed to where it actually picks up the correct source - right now it doesn't.

oybed commented 6 years ago

Here's more evidence that it doesn't pick up the correct source. My PR #203 changes the openshift-applier to v2.0.0, but as seen from this output, it is still using v3.9.0 when testing that PR:

[labs-ci-cd-ci-for-labs-ci-cd-pipeline] Running shell script
+ ansible-galaxy install -r requirements.yml --roles-path=roles
- extracting openshift-applier to /tmp/workspace/labs-ci-cd/labs-ci-cd-ci-for-labs-ci-cd-pipeline/roles/openshift-applier
- openshift-applier (v3.9.0) was installed successfully
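The comparison above (version requested by the PR versus version actually installed) can be sketched as a small log check. This is an illustration only: the function names are hypothetical, and the regular expression assumes the ansible-galaxy output format seen in this build log.

```python
import re

# Matches lines such as "- openshift-applier (v3.9.0) was installed successfully".
INSTALLED_RE = re.compile(
    r"^- (?P<role>\S+) \((?P<version>[^)]+)\) was installed successfully$"
)


def installed_role_versions(log_text):
    """Map each role name to the version ansible-galaxy reported installing."""
    versions = {}
    for line in log_text.splitlines():
        match = INSTALLED_RE.match(line.strip())
        if match:
            versions[match.group("role")] = match.group("version")
    return versions


def uses_expected_version(log_text, role, expected):
    """True if the log shows `role` installed at exactly `expected`."""
    return installed_role_versions(log_text).get(role) == expected
```

Fed the output above with `role="openshift-applier"` and `expected="v2.0.0"`, `uses_expected_version` returns False, which is the mismatch pointed out here.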

oybed commented 6 years ago

Investigating the tmp_dep_dir error further, I found that the jenkins-slave-ansible has the wrong Ansible version. It is using 2.4.x, while anything using the pre/post steps of the openshift-applier requires >=2.5.x. A fix should be submitted to the containers-quickstarts repo (Ansible 2.6 is now GA, so maybe go straight to that release). I opened a new issue for this: https://github.com/redhat-cop/containers-quickstarts/issues/123

However, the main problem of this issue as outlined before is that the incorrect source is used for the CI build, so that still needs to be fixed.
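The version requirement mentioned above could be checked up front with something like the following. It is a rough sketch under assumptions: the function names are hypothetical, and the parser only understands the first-line format shown in the earlier log ("ansible-playbook 2.4.2.0"); other output formats raise a ValueError.

```python
import re


def parse_ansible_version(version_output):
    """Extract the version tuple from the first line of `ansible-playbook --version`."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"ansible(?:-playbook)? (\d+(?:\.\d+)*)", first_line)
    if match is None:
        raise ValueError("unrecognised version line: %r" % first_line)
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version, minimum=(2, 5)):
    """Element-wise tuple comparison against the minimum required version."""
    return version >= minimum
```

With the version line from the failing build, `parse_ansible_version` yields `(2, 4, 2, 0)`, and `meets_minimum` reports that it falls short of 2.5.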

oybed commented 6 years ago

Fix for containers-quickstarts: https://github.com/redhat-cop/containers-quickstarts/pull/124

Still need a fix for this repo (Labs CI/CD) for the CI aspect.

sherl0cks commented 6 years ago

Thanks @oybed @jacobsee for looking into this. It seems we moved to declarative before declarative supported some important features. Specifically, early versions of declarative required that each stage run on a unique slave, hence the requirement for stash. It looks like 1.3 of declarative has resolved this issue; we are currently running 1.2.8. I'm going to do a plugin update, move to the new sequential stage syntax, and see what happens.

https://jenkins.io/blog/2018/07/02/whats-new-declarative-piepline-13x-sequential-stages/

sherl0cks commented 6 years ago

Just so it's documented (we should get this in a README): the best way to do these sorts of tests is to leverage OCP's ability to put the Jenkinsfile in the BuildConfig directly. Go into the project page, navigate to Builds->Pipelines->ci-for-labs-ci-cd, then Actions->Edit, and switch Jenkinsfile Type from SCM to inline. This gives you an editor window in which you can test pipeline changes without pushing to the git repo. I would suggest using your normal editor and then copying and pasting into the window.

Not perfect, but that makes a slow process a little faster.
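For anyone who prefers to script the same switch instead of clicking through the console, the inline Jenkinsfile lives in the BuildConfig under `spec.strategy.jenkinsPipelineStrategy.jenkinsfile`. The sketch below only builds a JSON merge patch; the helper name is hypothetical, and applying the patch this way has not been tried against this cluster, so treat it as an assumption rather than the documented workflow.

```python
import json


def inline_jenkinsfile_patch(jenkinsfile_text):
    """Build a JSON merge patch that embeds the Jenkinsfile in a BuildConfig."""
    patch = {
        "spec": {
            "strategy": {
                "jenkinsPipelineStrategy": {
                    # Inline pipeline definition used instead of the SCM copy.
                    "jenkinsfile": jenkinsfile_text,
                    # null in a JSON merge patch removes the SCM path field.
                    "jenkinsfilePath": None,
                }
            }
        }
    }
    return json.dumps(patch)
```

The resulting string could then be handed to something like `oc patch bc/ci-for-labs-ci-cd --type=merge -p "$PATCH"`, assuming the BuildConfig carries the same name as the pipeline shown in the console.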

sherl0cks commented 6 years ago

https://github.com/sherl0cks/labs-ci-cd/commit/ac25188f132393ef31a39f4d801cf5a6ae0b5bc4 resolves the primary issue in this ticket, but it looks like I need to pull in redhat-cop/containers-quickstarts#124 to send a reasonable PR. Will get to that next

ci-for-labs-ci-cd-pipeline is currently broken #204