okd-project / okd

The self-managing, auto-upgrading, Kubernetes distribution for everyone
https://okd.io
Apache License 2.0

Update 4.16.0-0.okd-scos-2024-09-24-151747 stuck. APIServicesAvailable: PreconditionNotReady #2036

Closed: alexminder closed this issue 1 month ago

alexminder commented 1 month ago

Describe the bug: The OKD update from 4.15.0-0.okd-scos-2024-01-18-223523 to 4.16.0-0.okd-scos-2024-09-24-151747 is stuck.

$ oc get co
NAME                                       VERSION                               AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.16.0-0.okd-scos-2024-09-24-151747   False       False         False      4d21h   APIServicesAvailable: PreconditionNotReady
baremetal                                  4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
cloud-controller-manager                   4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
cloud-credential                           4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
cluster-autoscaler                         4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
config-operator                            4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
console                                    4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      9d
control-plane-machine-set                  4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
csi-snapshot-controller                    4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      52d
dns                                        4.15.0-0.okd-scos-2024-01-18-223523   True        False         False      51d
etcd                                       4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      7d1h
image-registry                             4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      52d
ingress                                    4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      4d
insights                                   4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
kube-apiserver                             4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
kube-controller-manager                    4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
kube-scheduler                             4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
kube-storage-version-migrator              4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      17d
machine-api                                4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
machine-approver                           4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
machine-config                             4.15.0-0.okd-scos-2024-01-18-223523   True        False         False      3d
marketplace                                4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
monitoring                                 4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      3d
network                                    4.15.0-0.okd-scos-2024-01-18-223523   True        False         False      503d
node-tuning                                4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      3d23h
openshift-apiserver                        4.16.0-0.okd-scos-2024-09-24-151747   False       False         False      4d21h   APIServicesAvailable: PreconditionNotReady
openshift-controller-manager               4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      52d
openshift-samples                          4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      3d23h
operator-lifecycle-manager                 4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
operator-lifecycle-manager-catalog         4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
operator-lifecycle-manager-packageserver   4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      3d1h
service-ca                                 4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
storage                                    4.16.0-0.okd-scos-2024-09-24-151747   True        False         False      503d
$ oc adm upgrade
info: An upgrade is in progress. Unable to apply 4.16.0-0.okd-scos-2024-09-24-151747: some cluster operators are not available

Upgradeable=False

  Reason: KubeletMinorVersion_KubeletMinorVersionUnsupportedNextUpgrade
  Message: Cluster operator kube-apiserver should not be upgraded between minor versions: KubeletMinorVersionUpgradeable: Kubelet minor versions on nodes hw-testing2-okd-01.tutu.ru, hw-testing2-okd-02.tutu.ru, and hw-testing2-okd-03.tutu.ru will not be supported in the next OpenShift minor version upgrade.

Upstream: https://amd64.origin.releases.ci.openshift.org/graph
Channel: stable-scos-4
No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and may result in downtime or data loss.
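
The operator table above can be narrowed to the rows that matter for the stuck update. The following is a minimal sketch, not something taken from the thread: `lagging_operators` is a hypothetical helper name, and it assumes the default column layout of `oc get co` shown above (NAME, VERSION, AVAILABLE as the first three columns).

```shell
# Hypothetical helper: reads `oc get co` style text on stdin and prints every
# operator that is not Available or whose VERSION differs from the target ($1).
# Assumes the column order NAME VERSION AVAILABLE ... shown in the report.
lagging_operators() {
  awk -v target="$1" '
    $1 != "NAME" && NF >= 3 && ($2 != target || $3 != "True") {
      print $1, $2, $3
    }'
}

# Intended use against a live cluster (not executed here):
#   oc get co | lagging_operators 4.16.0-0.okd-scos-2024-09-24-151747
```

For the table in this report it would flag authentication and openshift-apiserver (AVAILABLE is False) plus dns, machine-config, and network (still on 4.15.0-0.okd-scos-2024-01-18-223523).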

The openshift-apiserver-operator log contains:

connectivity_check_controller.go:169] ConnectivityCheckController is waiting for transition to desired version (4.16.0-0.okd-scos-2024-09-24-151747) to be completed.
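
The APIServicesAvailable condition reported by the two unavailable operators refers to aggregated APIService objects, so listing the ones that are not Available may help narrow things down. This is only a sketch: the thread does not confirm the root cause, `unavailable_apiservices` is a hypothetical helper name, and the filter assumes `oc get apiservices` prints its usual NAME, SERVICE, AVAILABLE, AGE columns.

```shell
# Hypothetical helper: reads `oc get apiservices` style text on stdin and
# prints the name of every APIService whose AVAILABLE column is not "True".
# A failing entry typically shows as "False (<reason>)", so only the first
# word of that column is compared.
unavailable_apiservices() {
  awk '$1 != "NAME" && NF >= 3 && $3 != "True" { print $1 }'
}

# Intended use against a live cluster (not executed here):
#   oc get apiservices | unavailable_apiservices
```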

Version: 4.15.0-0.okd-scos-2024-01-18-223523

How reproducible: oc adm upgrade --force --allow-explicit-upgrade --to-image=quay.io/okd/scos-release:4.16.0-0.okd-scos-2024-09-24-151747
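
Because the Upgradeable=False message names kubelet minor versions on the three nodes, it can be useful to list what each node reports before or after forcing an upgrade. The sketch below is an assumption-laden illustration: `nodes_with_other_kubelet_minor` is a hypothetical helper, and the expected minor version is passed in by the caller rather than derived from the release.

```shell
# Hypothetical helper: reads "<node> <kubeletVersion>" lines on stdin and
# prints the nodes whose kubelet major.minor differs from the expected one ($1).
nodes_with_other_kubelet_minor() {
  awk -v want="$1" '
    NF >= 2 {
      v = $2
      sub(/^v/, "", v)            # drop the leading "v" of e.g. v1.28.7+abc
      split(v, parts, ".")
      minor = parts[1] "." parts[2]
      if (minor != want) print $1, $2
    }'
}

# Intended use against a live cluster (not executed here); <expected-minor>
# is a placeholder the operator must fill in:
#   oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' \
#     | nodes_with_other_kubelet_minor <expected-minor>
```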

Log bundle: must-gather

JaimeMagiera commented 1 month ago

Hi,

We haven’t tested any upgrades from the older SCOS builds from January, nor do we have the ability to troubleshoot them. They were nightlies, which generally aren’t expected to work with updates. The forced upgrade process is meant as a way to get from official 4.15 FCOS releases to 4.16 SCOS, all of which is still in testing. There is no official 4.16 SCOS release yet. We’ll have an update on the project in the next few days.

Thanks.

alexminder commented 1 month ago

@JaimeMagiera

"They were nightlies"

The release was in the stable channel. How can I tell which releases are nightlies and which are production-ready?