stormshift / support

This repo should serve as a central source for reporting issues with stormshift
GNU General Public License v3.0

Update rhacm cluster from 4.8.24 to 4.9.11 #60

Closed rbo closed 2 years ago

github-actions[bot] commented 2 years ago

Heads up @cluster/rhacm-admin - the "cluster/rhacm" label was applied to this issue.

rbo commented 2 years ago

Cannot update because:

Warning alert: This cluster should not be updated to 4.9. You can continue to update to patch releases in 4.8.
Kubernetes 1.22 and therefore OpenShift 4.9 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/6329921 for details and instructions.

rbo commented 2 years ago

Run:

$ oc get apirequestcounts -o jsonpath='{range .items[?(@.status.removedInRelease!="")]}{.status.removedInRelease}{"\t"}{.metadata.name}{"\t"}currentHour:{.status.currentHour.requestCount}{"\t"}last24h:{.status.requestCount}{"\n"}{end}'
1.22    certificatesigningrequests.v1beta1.certificates.k8s.io  currentHour:1   last24h:173
1.22    customresourcedefinitions.v1beta1.apiextensions.k8s.io  currentHour:32  last24h:7038
1.21    flowschemas.v1alpha1.flowcontrol.apiserver.k8s.io   currentHour:16  last24h:16
1.22    ingresses.v1beta1.extensions    currentHour:18  last24h:1120
1.22    ingresses.v1beta1.networking.k8s.io currentHour:0   last24h:8
1.22    mutatingwebhookconfigurations.v1beta1.admissionregistration.k8s.io  currentHour:0   last24h:3348
1.22    validatingwebhookconfigurations.v1beta1.admissionregistration.k8s.io    currentHour:728 last24h:16926
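
To see which removed APIs are still hit the most, the report above can be ranked by its last-24h count. This is only a rough sketch: `rank_removed_apis` is a made-up helper name (not part of `oc`), and it assumes the four whitespace-separated columns shown in the output above.

```shell
# Sketch: rank APIs removed in a given Kubernetes release by last-24h request count.
# rank_removed_apis is a hypothetical helper name, not part of oc.
# Input (stdin): the whitespace-separated report from the query above,
#   <release> <api> currentHour:<n> last24h:<n>
# Output: "<last24h count><TAB><api>", busiest API first.
rank_removed_apis() {
  release="$1"
  awk -v rel="$release" '
    ($1 "") == (rel "") {        # compare as strings, keep only the requested release
      split($4, day, ":")        # "last24h:7038" -> day[2] = "7038"
      print day[2] "\t" $2
    }' | sort -k1,1nr            # numeric sort on the count, descending
}
```

Piping the report into `rank_removed_apis 1.22` keeps only the APIs that go away with OpenShift 4.9 and drops the 1.21 entry.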
DanielFroehlich commented 2 years ago

Please make sure RHACM V2.4 is fully supported against OCP 4.9

rbo commented 2 years ago

Checked which accounts use the "old" APIs:

oc get apirequestcounts -o jsonpath='{range .items[?(@.status.removedInRelease!="")]}{.metadata.name}{"\n"}{end}' | xargs -I % oc get apirequestcounts % -o jsonpath='{range ..username}{$}{"\n"}{end}' | sort | uniq
system:kube-controller-manager
system:serviceaccount:default:rbohne-admin
system:serviceaccount:kube-system:generic-garbage-collector
system:serviceaccount:kube-system:namespace-controller
system:serviceaccount:open-cluster-management-agent-addon:klusterlet-addon-appmgr
system:serviceaccount:open-cluster-management-agent:klusterlet-work-sa
system:serviceaccount:open-cluster-management:cert-manager-cainjector
system:serviceaccount:open-cluster-management:console-chart-2bc96
system:serviceaccount:open-cluster-management:hive-operator
system:serviceaccount:open-cluster-management:managedcluster-import-controller-v2
system:serviceaccount:open-cluster-management:management-ingress-2ced2-sa
system:serviceaccount:open-cluster-management:multiclusterhub-operator
system:serviceaccount:open-cluster-management:multicluster-operators
system:serviceaccount:open-cluster-management:search-collector
system:serviceaccount:open-cluster-management:submariner-addon
system:serviceaccount:openshift-controller-manager:openshift-controller-manager-sa
system:serviceaccount:openshift-network-operator:default
system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount
system:serviceaccount:openshift-operators:gitops-operator
system:serviceaccount:sealed-secrets:sealed-secrets-operator-helm

A lot of ACM... Good point @DanielFroehlich
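
To put a number on "a lot of ACM", the account list can be grouped by namespace. A minimal sketch, assuming the `system:serviceaccount:<namespace>:<name>` format seen in the list above; `count_sa_by_namespace` is a hypothetical helper name.

```shell
# Sketch: count service accounts per namespace from a list of usernames on stdin.
# count_sa_by_namespace is a hypothetical helper name.
# Non-service-account users (e.g. system:kube-controller-manager) are skipped.
count_sa_by_namespace() {
  awk -F: '
    $1 == "system" && $2 == "serviceaccount" { count[$3]++ }   # $3 is the namespace
    END { for (ns in count) print count[ns] "\t" ns }
  ' | sort -k1,1nr -k2,2     # highest count first, then by namespace name
}
```

Appending `| count_sa_by_namespace` to the pipeline above (after `sort | uniq`) prints one line per namespace with the number of distinct service accounts still calling removed APIs.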

rbo commented 2 years ago

> Please make sure RHACM V2.4 is fully supported against OCP 4.9

OCP 4.9 is supported. Source: Red Hat Advanced Cluster Management for Kubernetes 2.4 Support Matrix

rbo commented 2 years ago

The other core components should support 4.9. We can ignore the sealed-secrets operator; it is not in use yet.

Let's ack: oc -n openshift-config patch cm admin-acks --patch '{"data":{"ack-4.8-kube-1.22-api-removals-in-4.9":"true"}}' --type=merge and hit the update button :-) Let's see what happens and learn... (and debug?)
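
For reference, the ack step can be wrapped so the result can be checked before hitting the update button. This is a sketch only: the function names are made up, nothing runs until the functions are called, and it assumes `oc` is logged in to the cluster with admin rights.

```shell
# Sketch: build and apply the admin ack, then read it back.
# build_ack_patch, apply_ack and show_acks are hypothetical helper names.
ack_key="ack-4.8-kube-1.22-api-removals-in-4.9"

# Print the merge patch that sets the given ack key to "true".
build_ack_patch() {
  printf '{"data":{"%s":"true"}}' "$1"
}

# Apply the ack to the admin-acks ConfigMap (same command as above).
apply_ack() {
  oc -n openshift-config patch cm admin-acks --type=merge \
    --patch "$(build_ack_patch "$ack_key")"
}

# Show the current contents of the ConfigMap to confirm the key is present.
show_acks() {
  oc -n openshift-config get cm admin-acks -o jsonpath='{.data}'
}
```

Running `apply_ack` followed by `show_acks` should list the ack key in the ConfigMap data if the patch went through.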

DanielFroehlich commented 2 years ago

FYI: I am seeing frequent etcd leader changes in the RHACM cluster and realized the masters are also low on memory. Bumped them from 16GB to 24GB; the change will take effect on the next reboot during your OCP upgrade.

rbo commented 2 years ago

Cluster update completed. I have to check a couple of pods:

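[image: image] https://user-images.githubusercontent.com/36604/147969306-c9b221b2-d9c0-4488-bec4-9fcd1a9181b7.png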

DanielFroehlich commented 2 years ago

IIRC, Thanos is the time series database for observability. It's configured to use object storage from @.*** Maybe something is wrong with that, e.g. the bucket is full?

DanielFroehlich commented 2 years ago

Yeah, as expected, it is a problem with the object store hosted on OCP4 OCS. Opened issue #62 for this, as it is probably an old problem, not related to the OCP update. Closing this issue, as the update completed successfully.