Upgrade from 3.7 to 3.9 requires both the 3.8 and 3.9 repos to be enabled. Make sure you have the https://cbs.centos.org/repos/paas7-openshift-origin38-release/x86_64/os/Packages/ repo enabled.
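For reference, a minimal sketch of checking and enabling those repos on each host, assuming the repo definitions already exist and yum-utils is installed for yum-config-manager (repo ids taken from the repolist output later in this thread):
# List which OpenShift Origin repos are currently enabled
yum repolist enabled | grep -i openshift-origin
# Enable the 3.8 and 3.9 repos if they are defined but disabled
yum-config-manager --enable centos-openshift-origin38 centos-openshift-origin39
# Check `yum repolist all` and disable any 3.7-specific Origin repo that is still enabled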
@a-zcomp can we close this issue now that @vrutkovs provided the correct answer?
Tried the above and added that repo to the servers, however it still fails for the same reason.
BTW - is the above in the documentation? The documentation for origin also seems to have a line that says:
"Ensure the openshift_deployment_type parameter in your inventory file is set to openshift-enterprise."
Is this accurate?
@ubwa regarding the documentation for origin, the best place to raise that issue is the docs repo itself. In this particular case openshift_deployment_type takes two values, as mentioned here.
If you got the same error as @a-zcomp, can you please share your inventory file?
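For anyone comparing their setup, a quick way to confirm what the inventory actually sets (the inventory path is an assumption; for Origin the expected value is openshift_deployment_type=origin rather than openshift-enterprise):
# Show the deployment type configured in the inventory
grep openshift_deployment_type /path/to/hosts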
I get the same problem. I definitely have both repos enabled, and 3.7 disabled.
I also believe the docs are incorrect regarding openshift-enterprise.
[root@master2 ~]# yum repolist
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: anorien.csc.warwick.ac.uk
* epel: ftp.nluug.nl
* extras: mirror.as29550.net
* updates: centos.mirroring.pulsant.co.uk
repo id repo name status
base/7/x86_64 CentOS-7 - Base 9,911
centos-openshift-origin CentOS OpenShift Origin 185
centos-openshift-origin38 CentOS OpenShift Origin 37
centos-openshift-origin39 CentOS OpenShift Origin 36
epel/x86_64 Extra Packages for Enterprise Linux 7 - x86_64 12,561
extras/7/x86_64 CentOS-7 - Extras 291
updates/7/x86_64 CentOS-7 - Updates 626
repolist: 23,647
error
Failure summary:
1. Hosts: master2.vagrant.test, master3.vagrant.test
Play: Verify upgrade targets
Task: Fail when openshift version does not meet minium requirement for Origin upgrade
Message: This upgrade playbook must be run against OpenShift 3.8 or later
2. Hosts: localhost
Play: Gate on etcd backup
Task: fail
Message: Upgrade cannot continue. The following hosts did not complete etcd backup: master2.vagrant.test,master3.vagrant.test
After some debugging it seems that the run is failing because master1 is on 3.8 due to a previously failed upgrade, so the scripts think that all the servers are on that version.
I assume, therefore, that the playbooks can't handle a previously failed upgrade. I'm not sure what I'm supposed to do to resolve this: either hack the Ansible code directly, or downgrade by hand?
@ianmiell I think your problem is the same as https://github.com/openshift/openshift-ansible/issues/8467
FWIW I hacked the ansible code directly by forcing the 3.8 upgrade everywhere:
/usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/v3_9/upgrade_control_plane.yml
specifically, changing < 3.8 in the test to < 3.9
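In case it helps others find the same spot, a rough way to locate that check (the grep pattern is an assumption based on the error message above, and the exact task layout differs between releases):
# Find the version-gate task the playbook trips over (release-3.9 checkout assumed)
grep -rn -i "3.8 or later" /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/
# Then adjust the comparison in the matching task in upgrade_control_plane.yml and re-run the playbook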
After some debugging it seems that the run is failing because master1 is on 3.8 due to a previously failed upgrade, so the scripts think that all the servers are on that version.
You might need to clear the facts locally and on the hosts to get the actual versions. The code from release-3.9 should be able to upgrade from 3.8 to 3.9, but there is not much we can help with, since no full ansible-playbook -vvv output and inventory were provided.
How do I clear the facts?
I have a fully reproducible environment on vagrant in code - if you request it, I can provide it.
Remove $HOME/ansible/facts on the control host and /etc/ansible/facts.d on the VMs.
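A minimal sketch of those two steps (the inventory path and host group are assumptions; adjust to your setup):
# On the control host: drop the local fact cache
rm -rf "$HOME/ansible/facts"
# On the cluster hosts: remove the cached OpenShift facts
ansible -i /path/to/hosts nodes -m file -a "path=/etc/ansible/facts.d state=absent" --become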
@vrutkovs I believe the problem with "clearing facts" (at least with the containerized install) is that the facts are programmatically generated using roles/openshift_facts/library/openshift_facts.py
Basically, it detects the version of the containerized install by reading /etc/sysconfig/origin-master-controllers and parsing the current version out of the IMAGE_VERSION variable. In my case, I had to edit that file by hand on the master that failed the upgrade from 3.7.2 to 3.8, manually set the tag to v3.8.0, and then I was able to re-run the 3.9 upgrade playbook. Of course, the new labels required by 3.9 (node-role.kubernetes.io/master) weren't applied to that one failed master either, and I had to manually oc label that node. See https://github.com/openshift/openshift-ansible/issues/8467 for details.
This is, of course, in the release-3.9 tag... It seems that this has been changed in master? I've not had a chance to take a deeper look at what has changed. But this "failsafe" parsing of the /etc/sysconfig/origin-master-controllers file to get the version had also bitten me in a partially failed 3.7.0 -> 3.7.2 upgrade.
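A sketch of the manual workaround described above, assuming a single failed master (the node name is a placeholder, and keep a backup of the sysconfig file before editing):
# Inspect the tag the fact code will parse
grep IMAGE_VERSION /etc/sysconfig/origin-master-controllers
# Set it to the version the master actually reached (v3.8.0 in this case), keeping a .bak copy
sed -i.bak 's/^IMAGE_VERSION=.*/IMAGE_VERSION=v3.8.0/' /etc/sysconfig/origin-master-controllers
# After the 3.9 upgrade, re-apply the master label the failed run skipped
oc label node master1.example.com node-role.kubernetes.io/master=true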
it will detect the version of the containerized install by reading /etc/sysconfig/origin-master-controllers and parsing out current version from the IMAGE_VERSION variable
Yes, version detection is tricky here. Does it get updated earlier than actual containers land?
It seems that this has been changed in master?
Master now uses static pod deployment; it's entirely different now.
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen. If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen. If this issue is safe to close now please do so with /close.
/lifecycle rotten /remove-lifecycle stale
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.
Description
When trying to perform an upgrade from Origin 3.7 to 3.9 using playbooks/byo/openshift-cluster/upgrades/v3_9/upgrade.yml, an error is shown saying I need to upgrade from 3.8 first.
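For context, the kind of invocation this refers to (the inventory path is an assumption; the playbook path is as given above, relative to the openshift-ansible checkout):
ansible-playbook -vvv -i /path/to/inventory \
    playbooks/byo/openshift-cluster/upgrades/v3_9/upgrade.yml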
Version
Steps To Reproduce
Expected Results
OpenShift Origin is upgraded to 3.9.
Observed Results
Playbook error