Attempts to replicate the upgrade issue using a Workload Cluster have failed.
Using as similar values as possible, I've tried the same upgrade from 0.18.0 to 0.20.2 that caused the issues last week with the MC and noticed a couple of differences:

This makes me think the issue may be related to some TTL expiring or similar, so I have created several test WCs to leave for a few days before attempting the same.
EDIT: The worker nodes rolling is because I mistakenly upgraded to 0.20.2 (which includes the change to the Ubuntu version), whereas last week I performed an upgrade from 0.18.0 -> 0.20.1. Performing the same upgrade on a WC does just roll the control plane nodes, as expected.
Some implementation details discovered while debugging:
- A controller in `kube-controller-manager` acts on ServiceAccounts and secrets - watching for deletions of each and cleaning up the references in the other. This does not manage generating / refreshing of tokens.
- The kubelet handles the pod side (resolving `.spec.serviceAccountName` and handling the token volume mount).
- Tokens for the `serviceAccountToken` projected volume set up by the kubelet are requested via the TokenRequest API. This API creates tokens that expire either when the pod is deleted or after a defined lifespan that is defaulted to 1 hour.
- The same API backs the `kubectl create token` command.

Based on the above, I believe the issue lies in the kubelet. Unfortunately we don't have logs persisted from the kubelet so can't investigate until we can replicate the issue. When we next perform an MC upgrade we should tail the kubelet logs on each of the (old and new) control plane nodes to look out for relevant error messages.
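As a concrete reference for the token behaviour described above, here is a minimal sketch of exercising it from the client side (the `default` service account and `<pod-name>` are placeholders, not taken from the affected clusters):

```bash
# Mint a token via the TokenRequest API - the same mechanism the kubelet uses
# for serviceAccountToken projected volumes - with an explicit 1h lifetime.
kubectl create token default --namespace kube-system --duration=1h

# Inspect the token a running pod actually has mounted; this is the standard
# projected-volume path, so its contents should be refreshed by the kubelet
# well before expiry if token rotation is working correctly.
kubectl exec -n kube-system <pod-name> -- \
  cat /var/run/secrets/kubernetes.io/serviceaccount/token
```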
The current kubelet logs available on both the tornado MC (where the issue presented last week, but which now has new control plane nodes) and the keep1 WC I created yesterday don't show anything out of the ordinary.
keep1:
- Cluster age: 22h
- Action performed: Upgrade App from 0.18.0 to 0.20.1
- Upgraded cleanly: Yes
- Notes: The apiserver logs contained a small number of authentication errors (below) but they were temporary and it doesn't seem to have affected the health of the cluster at all.
kube-apiserver-ip-10-0-212-164.eu-west-2.compute.internal kube-apiserver E1213 15:29:38.634005 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-212-164.eu-west-2.compute.internal kube-apiserver E1213 15:29:41.378151 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-212-164.eu-west-2.compute.internal kube-apiserver E1213 15:29:43.636165 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-212-164.eu-west-2.compute.internal kube-apiserver E1213 15:29:46.380304 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
While this error is similar to what was seen during the previous MC upgrades, it wasn't printed anywhere near as often, and the error includes an additional note of `service account token has been invalidated`.
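For the next upgrade, a quick way to gauge how widespread these authentication errors are across the apiserver instances - a sketch, assuming the standard kubeadm `component=kube-apiserver` label on the mirror pods:

```bash
# Count recent "Unable to authenticate" errors per kube-apiserver instance.
kubectl logs -n kube-system -l component=kube-apiserver \
  --prefix --since=1h --tail=10000 \
  | grep "Unable to authenticate the request" \
  | cut -d' ' -f1 | sort | uniq -c
```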
keep2:
- Cluster age: 2d14h
- Action performed: Upgrade App from 0.18.0 to 0.20.1
- Upgraded cleanly: Yes
- Notes: Some similar errors in the logs to keep1 but nothing persistent.
kube-apiserver-ip-10-0-80-61.eu-west-2.compute.internal kube-apiserver E1215 08:03:37.736541 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-80-61.eu-west-2.compute.internal kube-apiserver E1215 08:03:42.738910 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-78-60.eu-west-2.compute.internal kube-apiserver E1215 08:03:21.578948 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, context deadline exceeded]"
kube-apiserver-ip-10-0-228-80.eu-west-2.compute.internal kube-apiserver E1215 08:03:39.771686 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-228-80.eu-west-2.compute.internal kube-apiserver E1215 08:03:44.774180 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-228-80.eu-west-2.compute.internal kube-apiserver E1215 08:03:47.800197 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-228-80.eu-west-2.compute.internal kube-apiserver E1215 08:03:47.800999 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
kube-apiserver-ip-10-0-228-80.eu-west-2.compute.internal kube-apiserver E1215 08:03:47.801661 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, service account token has been invalidated]"
keep2 - take 2:
- Cluster age: 2d14h
- Action performed: Deleted the keep2-sa secret from the MC, then upgraded from 0.20.1 to 0.20.2
- Upgraded cleanly: No
- Notes: This upgrade causes all nodes to cycle, including workers. Some new logs in this situation:
kube-apiserver-ip-10-0-252-80.eu-west-2.compute.internal kube-apiserver E1215 08:16:43.535948 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, square/go-jose: error in cryptographic primitive]"
kube-apiserver-ip-10-0-252-80.eu-west-2.compute.internal kube-apiserver E1215 08:16:43.535934 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, square/go-jose: error in cryptographic primitive]"
kube-apiserver-ip-10-0-252-80.eu-west-2.compute.internal kube-apiserver E1215 08:16:47.776249 1 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, square/go-jose: error in cryptographic primitive]"
These logs are repeated a lot.
This causes issues similar to what we've seen during upgrades, but in a very different way and with different logs. I think it's safe to say the `[cluster]-sa` secret didn't change during the faulty upgrades.
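One way to back that up on the next faulty upgrade would be to snapshot the signing material before and after the roll - a sketch; `<cluster>` and `<namespace>` are placeholders, the secret lives on the MC, and the JWKS is served by the cluster being upgraded:

```bash
# On the MC: hash the service-account key secret for the cluster being upgraded.
kubectl get secret <cluster>-sa -n <namespace> -o jsonpath='{.data}' | sha256sum > sa-secret.before

# Against the cluster being upgraded: hash the signing keys the apiserver publishes.
kubectl get --raw /openid/v1/jwks | sha256sum > jwks.before

# Repeat both after the control plane has rolled and compare; any difference
# would mean old tokens fail signature verification ("error in cryptographic
# primitive"), matching the keep2 take-2 behaviour.
kubectl get secret <cluster>-sa -n <namespace> -o jsonpath='{.data}' | sha256sum | diff sa-secret.before -
kubectl get --raw /openid/v1/jwks | sha256sum | diff jwks.before -
```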
We hit this issue again today with the upgrade of goat.

Root cause still unknown, but I managed to rule out some possible reasons:
- The `service-account-token` secrets don't change during the upgrade, so the service accounts still have the correct token assigned to them.
- The `goat-ca` and `goat-sa` secrets didn't change, so the keys used to generate tokens remain the same.

Some differences seen compared to tests performed on workload clusters:
The kubelet logs on the first control plane node being removed show etcd crash looping because it has been removed from the etcd members. I would assume this also happens with WCs, but I believe they are deleted quicker so it doesn't get to this point.
Dec 16 09:25:01 ip-10-224-128-148 kubelet[1401]: I1216 09:25:01.699484 1401 scope.go:110] "RemoveContainer" containerID="84522cbabffe063f63f45729ccd5fd1e841bc29f640678ecad0ebe3ff448ae3a"
Dec 16 09:25:04 ip-10-224-128-148 kubelet[1401]: I1216 09:25:04.707774 1401 scope.go:110] "RemoveContainer" containerID="84522cbabffe063f63f45729ccd5fd1e841bc29f640678ecad0ebe3ff448ae3a"
Dec 16 09:25:04 ip-10-224-128-148 kubelet[1401]: I1216 09:25:04.708189 1401 scope.go:110] "RemoveContainer" containerID="1e5118c36253eb247e1a42db466f5c10854c90410a47ada01c9ee5a7821c3459"
Dec 16 09:25:04 ip-10-224-128-148 kubelet[1401]: E1216 09:25:04.708633 1401 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"etcd\" with CrashLoopBackOff: \"back-off 10s restarting failed container=etcd pod=etcd-ip-10-224-128-148.eu-north-1.compute.internal_kube-system(8f8e2f81e748ac114db703b7124de587)\"" pod="kube-system/etcd-ip-10-224-128-148.eu-north-1.compute.internal" podUID=8f8e2f81e748ac114db703b7124de587
Dec 16 09:25:05 ip-10-224-128-148 kubelet[1401]: I1216 09:25:05.711469 1401 scope.go:110] "RemoveContainer" containerID="1e5118c36253eb247e1a42db466f5c10854c90410a47ada01c9ee5a7821c3459"
Dec 16 09:25:05 ip-10-224-128-148 kubelet[1401]: E1216 09:25:05.711814 1401 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"etcd\" with CrashLoopBackOff: \"back-off 10s restarting failed container=etcd pod=etcd-ip-10-224-128-148.eu-north-1.compute.internal_kube-system(8f8e2f81e748ac114db703b7124de587)\"" pod="kube-system/etcd-ip-10-224-128-148.eu-north-1.compute.internal" podUID=8f8e2f81e748ac114db703b7124de587
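If we want to confirm the member removal next time, membership can be listed from one of the surviving etcd pods - a sketch using the standard kubeadm certificate paths; `etcd-<surviving-node>` is a placeholder:

```bash
# List current etcd members from a healthy etcd pod; the node being drained
# should no longer appear, which would explain its local etcd crash looping.
kubectl -n kube-system exec etcd-<surviving-node> -- etcdctl member list -w table \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
```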
The syslog from the first new control plane node includes the following:
Dec 16 09:24:34 ip-10-224-148-177 cloud-init[1078]: [2022-12-16 09:24:15] [certs] Generating "apiserver-kubelet-client" certificate and key
Dec 16 09:24:34 ip-10-224-148-177 cloud-init[1078]: [2022-12-16 09:24:16] [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
Dec 16 09:24:34 ip-10-224-148-177 cloud-init[1078]: [2022-12-16 09:24:16] [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
Dec 16 09:24:34 ip-10-224-148-177 cloud-init[1078]: [2022-12-16 09:24:16] [kubelet-start] Starting the kubelet
Dec 16 09:24:34 ip-10-224-148-177 cloud-init[1078]: [2022-12-16 09:24:16] [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
which is shortly followed by:
Dec 16 09:24:47 ip-10-224-148-177 kubelet[1484]: I1216 09:24:47.034920 1484 kubelet.go:1724] "Trying to delete pod" pod="kube-system/kube-apiserver-ip-10-224-148-177.eu-north-1.compute.internal" podUID=958cc32c-5119-48db-82a8-a03f90f6a89e
Dec 16 09:24:47 ip-10-224-148-177 kubelet[1484]: I1216 09:24:47.060101 1484 kubelet.go:1729] "Deleted mirror pod because it is outdated" pod="kube-system/kube-apiserver-ip-10-224-148-177.eu-north-1.compute.internal"
It looks like kubelet is set up before the apiserver has had the placeholder replaced.
I tried `systemctl restart kubelet` but that didn't seem to help.
There are many errors in the logs of the new control plane node related to not being able to access the api server ELB:
Dec 16 09:25:23 ip-10-224-148-177 kubelet[1484]: E1216 09:25:23.257848 1484 controller.go:187] failed to update lease, error: Put "https://internal-goat-apiserver-18847525199.eu-north-1.elb.amazonaws.com:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/ip-10-224-148-177.eu-north-1.compute.internal?timeout=10s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
This is possibly caused by the ELB including non-ready instances (such as the currently initialising node) which can't respond to API requests.
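To confirm that theory during the next roll, the registered instance health of the apiserver ELB could be checked while the new node is initialising - a sketch; the load balancer name below is a guess based on the DNS name in the error above:

```bash
# Show which registered instances the apiserver ELB considers InService vs
# OutOfService; if the still-initialising node is already receiving traffic,
# that would explain the client timeouts seen from the new kubelet.
aws elb describe-instance-health \
  --load-balancer-name goat-apiserver \
  --region eu-north-1
```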
keep3:
- Cluster age: 3d18h
- Action performed: Upgrade App from 0.18.0 to 0.20.1
- Upgraded cleanly: Yes
- Notes: Checking the last of the possible default TTLs (72 hours). No issues.
keep4:
- Cluster age: 3d20h
- Action performed: Upgrade App from 0.18.0 to 0.20.1
- Upgraded cleanly: Yes
- Notes: Installed Kyverno, kyverno-policies, keep4-kyverno-policies-connectivity & keep4-kyverno-policies-dx. Kyverno didn't cause any issue with the upgrade.
I have been unable to replicate the issue using workload clusters. Whatever is causing the issue seems to only be present on management clusters.

It is also entirely possible that whatever caused the problem is already fixed in the latest versions of cluster-aws and default-apps-aws. We won't know for sure until we next upgrade an MC.
Plan for testing MC upgrades:
- Identify a suitable MC to test on (e.g. grizzly). If no current MCs fit this then we may want to create a new ephemeral MC for testing this issue.
- Prepare a version of cluster-aws that introduces a change to the control plane without changing any functionality. The easiest way I can think of to ensure this is to add a new empty file to the files property of the KubeadmControlPlane (sketched below). This will cause only the control plane nodes to be replaced, but with functionally identical config.
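As a sketch of what that trigger amounts to (the real change would go through the cluster-aws chart; the file path, resource name and namespace here are placeholders, and the patch assumes spec.kubeadmConfigSpec.files already exists as a list):

```bash
# Append a harmless extra file to the KubeadmControlPlane spec; any change to
# the spec rolls the control plane machines while keeping them functionally
# identical, which isolates the node-roll from any real config change.
kubectl patch kubeadmcontrolplane <cluster>-control-plane -n <namespace> --type=json -p '[
  {
    "op": "add",
    "path": "/spec/kubeadmConfigSpec/files/-",
    "value": {
      "path": "/etc/upgrade-test-trigger",
      "permissions": "0644",
      "content": ""
    }
  }
]'
```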
We (@tuladhar and @bdehri) upgraded grizzly and golem, and didn't encounter this issue. Closing it for now, but if we encounter the issue in future we'll re-open and investigate further.
Need to investigate exactly how the `service-account-issuer` flag works in api-server when using a custom URL.

Upgrades to CAPA clusters are very unstable due to pods becoming unauthorized during rolling of control plane nodes. We need to understand why the existing service account tokens become invalid during this process even though the URL used for the issuer doesn't change.
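For reference while investigating, the relevant kube-apiserver flags fit together roughly like this (the issuer URLs are illustrative, the key paths are the kubeadm defaults):

```bash
# --service-account-issuer may be specified multiple times: tokens are minted
# with the first value, but tokens bearing any of the listed issuers are still
# accepted, which is the intended migration path when an issuer URL changes.
kube-apiserver \
  --service-account-issuer=https://irsa.example-mc.example.com \
  --service-account-issuer=https://kubernetes.default.svc.cluster.local \
  --service-account-key-file=/etc/kubernetes/pki/sa.pub \
  --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
```

Given that behaviour, an unchanged issuer URL and unchanged signing key should not invalidate existing tokens, which is exactly why the symptoms above are surprising.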