VERSION=v1.23.16+rke2r1
Infrastructure
Node(s) CPU architecture, OS, and version:
Linux 5.4.0-1041-aws x86_64 GNU/Linux
PRETTY_NAME="Ubuntu 20.04.2 LTS"
Cluster Configuration: from a Rancher-provisioned node
$ sudo cat /etc/rancher/rke2/config.yaml.d/50-rancher.yaml //intentionally falsified
{
"advertise-address": "1.1.1.9",
"agent-token": "WORD_SALAD_BANSHEE",
"cni": "calico",
"disable-kube-proxy": false,
"etcd-expose-metrics": false,
"etcd-snapshot-retention": 5,
"etcd-snapshot-schedule-cron": "0 */5 * * *",
"kube-controller-manager-arg": [
"cert-dir=/var/lib/rancher/rke2/server/tls/kube-controller-manager",
"secure-port=10257"
],
"kube-controller-manager-extra-mount": [
"/var/lib/rancher/rke2/server/tls/kube-controller-manager:/var/lib/rancher/rke2/server/tls/kube-controller-manager"
],
"kube-scheduler-arg": [
"cert-dir=/var/lib/rancher/rke2/server/tls/kube-scheduler",
"secure-port=19"
],
"kube-scheduler-extra-mount": [
"/var/lib/rancher/rke2/server/tls/kube-scheduler:/var/lib/rancher/rke2/server/tls/kube-scheduler"
],
"node-external-ip": [
"1.2.2.3"
],
"node-ip": [
"1.1.1.9"
],
"node-label": [
"super.frog.cattle.io/machine=184534de-8346-40a8-aa72-1eeeed3ad97"
],
"private-registry": "/etc/rancher/rke2/registries.yaml",
"protect-kernel-defaults": false,
"tls-san": [
"18.223.2.2"
],
"token": "BANSHEE_OF_WORLDS_SUSHI"
}
NOTE (from the original issue): "I plan to test a potential fix, but need time to set up a reliably reproducible environment due to the raciness of this issue; I've only seen this issue twice out of ~25 attempts."
v1.23.17-rc1+rke2r1
Infrastructure
Node(s) CPU architecture, OS, and Version:
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.2 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
Cluster Configuration:
3 etcd only, 2 cp only, 2 workers
1. Create a Rancher cluster using the configuration above.
2. Deploy a secret: kubectl create secret generic secret1 -n default --from-literal=mykey=mydata
3. On an etcd node, confirm the secret is present and check its encryption key using etcdctl:
# Install etcdctl
# Run the command to check how the secret is encrypted:
$ sudo ETCDCTL_API=3 etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --endpoints https://127.0.0.1:2379 --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt get /registry/secrets/default/secret1 | hexdump -C
# Result should include something like:
k8s:enc:aescbc:v1:
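As a quicker check than scanning the full hexdump, the prefix can be pulled out directly. This is only a sketch under the same assumptions as above (etcdctl installed, run on an etcd node); grep -a treats the binary value as text and -o prints only the match, so a few printable ciphertext bytes may trail the key name:
$ sudo ETCDCTL_API=3 etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --endpoints https://127.0.0.1:2379 --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt get /registry/secrets/default/secret1 | grep -ao 'k8s:enc:aescbc:v1:[A-Za-z0-9:-]*'
# Expected: the k8s:enc:aescbc:v1: prefix followed by the current key name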
4. On a CP node, run the rotation flow:
# Get initial status
$ sudo rke2 secrets-encrypt status
# Run prepare step
$ sudo rke2 secrets-encrypt prepare
# Restart all nodes -- restart ETCD first, then CP NODES, then AGENT NODES
$ sudo systemctl restart rke2-server
# Run rotate
$ sudo rke2 secrets-encrypt rotate
# Restart all nodes -- restart ETCD first, then CP NODES, then AGENT NODES
$ sudo systemctl restart rke2-server
# Run reencrypt
$ sudo rke2 secrets-encrypt reencrypt
# Restart all nodes -- restart ETCD first, then CP NODES, then AGENT NODES
$ sudo systemctl restart rke2-server
5. On the etcd node, confirm the secret encryption key has changed using etcdctl:
# Run the command to check the secret's encryption again:
$ sudo ETCDCTL_API=3 etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --endpoints https://127.0.0.1:2379 --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt get /registry/secrets/default/secret1 | hexdump -C
# Result should be different than the initial one and include the timestamp. Something like:
k8s:enc:aescbc:v1:aescbckey-2021-12-08T21:34:03Z:
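For reference, the whole prepare / rotate / reencrypt flow above can be scripted from a workstation with SSH access. This is only a sketch: the hostnames are hypothetical placeholders and passwordless sudo is assumed. Note that etcd and control-plane nodes run the rke2-server unit, while agent (worker) nodes run rke2-agent:

#!/bin/sh
# Sketch of the rotation flow above; node names are placeholders.
ETCD_NODES="etcd-1 etcd-2 etcd-3"   # restart these first
CP_NODES="cp-1 cp-2"                # then control-plane nodes
AGENT_NODES="worker-1 worker-2"     # agents last; they run rke2-agent

restart_all() {
  for n in $ETCD_NODES $CP_NODES; do ssh "$n" sudo systemctl restart rke2-server; done
  for n in $AGENT_NODES; do ssh "$n" sudo systemctl restart rke2-agent; done
}

ssh cp-1 sudo rke2 secrets-encrypt prepare   && restart_all
ssh cp-1 sudo rke2 secrets-encrypt rotate    && restart_all
ssh cp-1 sudo rke2 secrets-encrypt reencrypt && restart_all
ssh cp-1 sudo rke2 secrets-encrypt status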
Replication Results:
$ rke2 -v
rke2 version v1.23.16+rke2r1 (0124f3b3a88575a77ae75778eb82ef87a7302fc7)
go version go1.19.5 X:boringcrypto
$ sudo ETCDCTL_API=3 etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --endpoints https://127.0.0.1:2379 --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt get /registry/secrets/default/secret1 | hexdump -C
00000000 2f 72 65 67 69 73 74 72 79 2f 73 65 63 72 65 74 |/registry/secret|
00000010 73 2f 64 65 66 61 75 6c 74 2f 73 65 63 72 65 74 |s/default/secret|
00000020 31 0a 6b 38 73 3a 65 6e 63 3a 61 65 73 63 62 63 |1.k8s:enc:aescbc|
00000030 3a 76 31 3a 61 65 73 63 62 63 6b 65 79 3a 7a b2 |:v1:aescbckey:z.|
00000040 de 26 0c 69 bd 76 38 bb f6 48 be 96 08 f5 de 95 |.&.i.v8..H......|
00000050 60 16 59 9d c7 94 b5 1f bb 04 ec 03 22 27 25 56 |`.Y........."'%V|
00000060 63 40 3c 25 16 65 4a f5 c2 7f 85 75 b4 ba 9a fe |c@<%.eJ....u....|
00000070 58 ec e0 88 f0 c3 5c 00 f6 da ca c6 33 e5 74 86 |X.....\.....3.t.|
00000080 b8 09 86 c4 07 9e b3 9f 78 d9 e0 21 d0 3f ba d9 |........x..!.?..|
00000090 b6 14 29 86 32 15 ba 16 c8 0e 26 a7 e6 fc dc 36 |..).2.....&....6|
000000a0 f5 63 44 30 55 ee 8a 8e 69 06 6d c3 7a d3 d7 65 |.cD0U...i.m.z..e|
000000b0 47 a1 67 67 d9 6d 13 da 40 f4 84 b9 69 6b d0 57 |G.gg.m..@...ik.W|
000000c0 60 03 61 9b 61 b5 83 51 ea ab c6 d6 da fa a7 ef |`.a.a..Q........|
000000d0 24 da 36 bb 1b 86 87 66 3c 1a f2 4a 27 7a cf b9 |$.6....f<..J'z..|
000000e0 74 25 f0 00 96 59 ea c0 ea 13 9d 40 b2 26 4f 16 |t%...Y.....@.&O.|
000000f0 7b f7 d9 4b 5c fa 14 97 43 38 93 bd 15 f5 be dc |{..K...C8......|
00000100 cc 22 dd dc 4b ca f0 1c 92 ac 2a b6 08 b2 f7 9e |."..K.....*.....|
00000110 67 a9 16 90 b6 95 94 0f da 90 c2 e7 57 cf 2d 16 |g...........W.-.|
00000120 38 61 a1 68 c4 0a be 97 44 c0 d1 b4 38 5f 7e 0b |8a.h....D...8~.|
00000130 b8 9d ad 19 01 71 bc 7b 79 12 86 16 32 1a 0a |.....q.{y...2..|
0000013f
After full flow:
$ sudo ETCDCTL_API=3 etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --endpoints https://127.0.0.1:2379 --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt get /registry/secrets/default/secret1 | hexdump -C
00000000 2f 72 65 67 69 73 74 72 79 2f 73 65 63 72 65 74 |/registry/secret|
00000010 73 2f 64 65 66 61 75 6c 74 2f 73 65 63 72 65 74 |s/default/secret|
00000020 31 0a 6b 38 73 3a 65 6e 63 3a 61 65 73 63 62 63 |1.k8s:enc:aescbc|
00000030 3a 76 31 3a 61 65 73 63 62 63 6b 65 79 2d 32 30 |:v1:aescbckey-20|
00000040 32 33 2d 30 33 2d 30 39 54 31 36 3a 31 31 3a 34 |23-03-09T16:11:4|
00000050 31 5a 3a dc 3a 0a 51 e1 06 93 6b 3b f4 b5 92 2a |1Z:.:.Q...k;...*|
00000060 c7 be b0 4e 35 5d 6e bf 74 8b 97 c8 ae 1f e0 43 |...N5]n.t......C|
00000070 b3 13 90 fa d4 d7 da 4b 33 01 17 a7 8b 33 b5 df |.......K3....3..|
00000080 5c da 73 93 60 a6 c5 66 21 2f e4 74 fa 61 ea 06 |\.s.`..f!/.t.a..|
00000090 29 23 7f e5 59 71 b5 cc 45 9d 2c 73 87 c8 8a 90 |)#..Yq..E.,s....|
000000a0 4d 16 22 5b 29 f3 01 82 4e d7 e3 09 5f 92 06 35 |M."[)...N.._..5|
000000b0 0c 16 5c 9b bd 71 ef e4 a7 7b b9 5c 3f f0 84 37 |..\..q...{.\?..7|
000000c0 d4 18 9b 6f 00 80 90 db 0b 1a 04 c8 75 39 77 a0 |...o........u9w.|
000000d0 a7 37 12 a5 bd 4e 70 67 24 75 93 41 b4 d7 1f f3 |.7...Npg$u.A....|
000000e0 02 41 92 38 c9 54 7d 78 00 19 7c a8 da 9a 27 95 |.A.8.T}x..|...'.|
000000f0 26 49 ef 71 d7 f1 c6 4a 6a a1 dd 65 d5 3c a5 05 |&I.q...Jj..e.<..|
00000100 b3 53 93 44 1d a8 91 9e d3 ce 34 c5 7a 90 9c 22 |.S.D......4.z.."|
00000110 41 3d a8 aa b0 3e e9 36 81 c1 27 4a 27 d3 a8 0c |A=...>.6..'J'...|
00000120 ba 83 83 6a 01 cf 4e c7 16 5e 8a 59 2c 73 5d 78 |...j..N..^.Y,s]x|
00000130 a9 ad f0 d5 34 1b 60 06 b0 2c ca 59 ab 48 e9 31 |....4.`..,.Y.H.1|
00000140 34 ce a9 19 1b f3 4a b0 5c 88 32 3a 5a d0 a2 19 |4.....J.\.2:Z...|
00000150 7e 11 7a 0a |~.z.|
00000154
$ kubectl get node,pod -A
NAME                                     STATUS   ROLES                  AGE   VERSION
node/reproissue123-cp-a9be6a9b-24vlb     Ready    control-plane,master   28m   v1.23.16+rke2r1
node/reproissue123-cp-a9be6a9b-bvqlq     Ready    control-plane,master   28m   v1.23.16+rke2r1
node/reproissue123-etcd-b7e5190e-d4g62   Ready    etcd                   28m   v1.23.16+rke2r1
node/reproissue123-etcd-b7e5190e-qxfmd   Ready    etcd                   28m   v1.23.16+rke2r1
node/reproissue123-etcd-b7e5190e-rq7pb   Ready    etcd                   28m   v1.23.16+rke2r1
node/reproissue123-worker-35978ff3-qlcnq Ready    worker                 25m   v1.23.16+rke2r1
node/reproissue123-worker-35978ff3-xc25h Ready    worker                 25m   v1.23.16+rke2r1
NAMESPACE             NAME                                                          READY   STATUS      RESTARTS        AGE
calico-system         pod/calico-kube-controllers-84d87c5854-cj4qr                  1/1     Running     0               28m
calico-system         pod/calico-node-5294k                                         1/1     Running     0               25m
calico-system         pod/calico-node-6jmjd                                         1/1     Running     0               25m
calico-system         pod/calico-node-pbvz8                                         1/1     Running     0               28m
calico-system         pod/calico-node-rbfzc                                         1/1     Running     0               28m
calico-system         pod/calico-node-sctn7                                         1/1     Running     0               28m
calico-system         pod/calico-node-whw97                                         1/1     Running     0               28m
calico-system         pod/calico-node-zwppm                                         1/1     Running     0               28m
calico-system         pod/calico-typha-66fdff5b4d-d2tgx                             1/1     Running     0               27m
calico-system         pod/calico-typha-66fdff5b4d-mvzqd                             1/1     Running     0               28m
calico-system         pod/calico-typha-66fdff5b4d-njmfj                             1/1     Running     0               27m
cattle-fleet-system   pod/fleet-agent-d7c5f79fd-448h9                               1/1     Running     0               25m
cattle-system         pod/cattle-cluster-agent-5cf5bdbb8f-9sf9q                     1/1     Running     0               12m
cattle-system         pod/cattle-cluster-agent-5cf5bdbb8f-mdwfc                     1/1     Running     0               13m
cattle-system         pod/system-upgrade-controller-7f9f559b4f-dj55j                1/1     Running     0               25m
kube-system           pod/cloud-controller-manager-reproissue123-cp-a9be6a9b-24vlb   1/1    Running     0               28m
kube-system           pod/cloud-controller-manager-reproissue123-cp-a9be6a9b-bvqlq   1/1    Running     2 (13m ago)     28m
kube-system           pod/cloud-controller-manager-reproissue123-etcd-b7e5190e-d4g62 1/1    Running     0               28m
kube-system           pod/cloud-controller-manager-reproissue123-etcd-b7e5190e-qxfmd 1/1    Running     0               28m
kube-system           pod/cloud-controller-manager-reproissue123-etcd-b7e5190e-rq7pb 1/1    Running     0               28m
kube-system           pod/etcd-reproissue123-etcd-b7e5190e-d4g62                    1/1     Running     0               28m
kube-system           pod/etcd-reproissue123-etcd-b7e5190e-qxfmd                    1/1     Running     0               28m
kube-system           pod/etcd-reproissue123-etcd-b7e5190e-rq7pb                    1/1     Running     0               28m
kube-system           pod/helm-install-rke2-calico-crd-pfz2x                        0/1     Completed   0               28m
kube-system           pod/helm-install-rke2-calico-wpsbz                            0/1     Completed   2               28m
kube-system           pod/helm-install-rke2-coredns-vgftr                           0/1     Completed   0               28m
kube-system           pod/helm-install-rke2-ingress-nginx-w8zkw                     0/1     Completed   0               28m
kube-system           pod/helm-install-rke2-metrics-server-xg99v                    0/1     Completed   0               28m
kube-system           pod/kube-apiserver-reproissue123-cp-a9be6a9b-24vlb            1/1     Running     3 (6m39s ago)   28m
kube-system           pod/kube-apiserver-reproissue123-cp-a9be6a9b-bvqlq            1/1     Running     3 (7m23s ago)   28m
kube-system           pod/kube-controller-manager-reproissue123-cp-a9be6a9b-24vlb   1/1     Running     6 (6m40s ago)   28m
kube-system           pod/kube-controller-manager-reproissue123-cp-a9be6a9b-bvqlq   1/1     Running     5 (9m57s ago)   28m
kube-system           pod/kube-proxy-reproissue123-cp-a9be6a9b-24vlb                1/1     Running     0               28m
kube-system           pod/kube-proxy-reproissue123-cp-a9be6a9b-bvqlq                1/1     Running     0               28m
kube-system           pod/kube-proxy-reproissue123-etcd-b7e5190e-d4g62              1/1     Running     0               28m
kube-system           pod/kube-proxy-reproissue123-etcd-b7e5190e-qxfmd              1/1     Running     3 (8m20s ago)   28m
kube-system           pod/kube-proxy-reproissue123-etcd-b7e5190e-rq7pb              1/1     Running     2 (11m ago)     28m
kube-system           pod/kube-proxy-reproissue123-worker-35978ff3-qlcnq            1/1     Running     3               25m
kube-system           pod/kube-proxy-reproissue123-worker-35978ff3-xc25h            1/1     Running     3               25m
kube-system           pod/kube-scheduler-reproissue123-cp-a9be6a9b-24vlb            1/1     Running     2 (6m58s ago)   28m
kube-system           pod/kube-scheduler-reproissue123-cp-a9be6a9b-bvqlq            1/1     Running     2 (10m ago)     28m
kube-system           pod/rke2-coredns-rke2-coredns-775c5b4bb4-755fk                1/1     Running     0               27m
kube-system           pod/rke2-coredns-rke2-coredns-775c5b4bb4-lb65g                1/1     Running     0               28m
kube-system           pod/rke2-coredns-rke2-coredns-autoscaler-695fc554c9-vvq26     1/1     Running     0               28m
kube-system           pod/rke2-ingress-nginx-controller-b5x8z                       1/1     Running     0               24m
kube-system           pod/rke2-ingress-nginx-controller-jjnqj                       1/1     Running     0               24m
kube-system           pod/rke2-metrics-server-644f588b5-rdsfg                       1/1     Running     0               24m
tigera-operator       pod/tigera-operator-b77ddd45f-zp6fq                           1/1     Running     0               28m
**Validation Results:**
$ rke2 -v
rke2 version v1.23.17-rc1+rke2r1 (f34ca7b29816a317bf3311279aabdc70b442bf2a)
go version go1.19.6 X:boringcrypto
// Once the full flow is done, the encryption key differs. Before:
ubuntu@123issue-cp-e4ce1772-ldcpn:~$ sudo rke2 secrets-encrypt status
Encryption Status: Enabled
Current Rotation Stage: start
Server Encryption Hashes: All hashes match

Active  Key Type  Name

ubuntu@123issue-cp-e4ce1772-ldcpn:~$ sudo rke2 secrets-encrypt prepare
prepare completed successfully
ubuntu@123issue-cp-e4ce1772-ldcpn:~$ sudo systemctl restart rke2-server
ubuntu@123issue-cp-e4ce1772-ldcpn:~$ sudo rke2 secrets-encrypt rotate
rotate completed successfully
ubuntu@123issue-cp-e4ce1772-ldcpn:~$ sudo systemctl restart rke2-server
ubuntu@123issue-cp-e4ce1772-ldcpn:~$ sudo rke2 secrets-encrypt reencrypt
reencryption started
ubuntu@123issue-cp-e4ce1772-ldcpn:~$ sudo systemctl restart rke2-server
$ sudo ETCDCTL_API=3 etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --endpoints https://127.0.0.1:2379 --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt get /registry/secrets/default/secret1 | hexdump -C
00000000 2f 72 65 67 69 73 74 72 79 2f 73 65 63 72 65 74 |/registry/secret|
00000010 73 2f 64 65 66 61 75 6c 74 2f 73 65 63 72 65 74 |s/default/secret|
00000020 31 0a 6b 38 73 3a 65 6e 63 3a 61 65 73 63 62 63 |1.k8s:enc:aescbc|
00000030 3a 76 31 3a 61 65 73 63 62 63 6b 65 79 3a 8d ec |:v1:aescbckey:..|
00000040 4f 76 56 9f c8 4b bc 76 19 e0 ed 39 9a 8c 44 af |OvV..K.v...9..D.|
00000050 89 fa bb 89 1c e4 db 42 ea 0e e5 49 35 e3 4c 71 |.......B...I5.Lq|
00000060 e9 52 e2 5a b1 a5 df 87 8c 6a 37 28 b0 84 7d 46 |.R.Z.....j7(..}F|
00000070 26 b3 b4 82 bf ff f9 e4 28 22 4c 03 7d d7 b0 9d |&.......("L.}...|
00000080 24 1c 68 84 04 dd 28 41 75 5a c9 2b de d3 6f 74 |$.h...(AuZ.+..ot|
00000090 9f 37 46 b5 db 7e d6 5b b3 69 a5 a9 e8 2f 2f 27 |.7F..~.[.i...//'|
000000a0 61 e9 90 6d 95 d7 ae 69 aa f1 17 d3 57 22 a4 a3 |a..m...i....W"..|
000000b0 a2 ce cc b0 4f 82 d0 bf bc 01 10 b5 65 33 8d 2c |....O.......e3.,|
000000c0 a0 6d e7 1a b7 f2 cd 3f f8 97 ac 44 b3 28 70 88 |.m.....?...D.(p.|
000000d0 52 2a 61 96 b9 0e 23 55 17 d5 e4 13 31 42 26 9a |R*a...#U....1B&.|
000000e0 59 48 2d aa 21 1d 9b 3b 48 70 83 06 4d 69 2c 4f |YH-.!..;Hp..Mi,O|
000000f0 0a f8 3b 32 9a be 5c 80 1f 3e 06 96 77 52 e8 43 |..;2..\..>..wR.C|
00000100 9d 2a af f2 6b ae 77 e7 a7 df 06 20 08 6f 7e 90 |.*..k.w.... .o~.|
00000110 bb 10 80 ff 21 14 d2 b5 2f 89 04 77 a2 c5 e7 dd |....!.../..w....|
00000120 82 bc ed 4c 54 4e 0d d7 9a 22 84 6b 01 43 bd 56 |...LTN...".k.C.V|
00000130 75 e0 52 a6 15 85 59 0e 6a fe 5c 6e 3a 63 0a |u.R...Y.j.\n:c.|
0000013f
// After:
$ sudo ETCDCTL_API=3 etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --endpoints https://127.0.0.1:2379 --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt get /registry/secrets/default/secret1 | hexdump -C
00000000 2f 72 65 67 69 73 74 72 79 2f 73 65 63 72 65 74 |/registry/secret|
00000010 73 2f 64 65 66 61 75 6c 74 2f 73 65 63 72 65 74 |s/default/secret|
00000020 31 0a 6b 38 73 3a 65 6e 63 3a 61 65 73 63 62 63 |1.k8s:enc:aescbc|
00000030 3a 76 31 3a 61 65 73 63 62 63 6b 65 79 2d 32 30 |:v1:aescbckey-20|
00000040 32 33 2d 30 33 2d 30 39 54 30 30 3a 30 36 3a 35 |23-03-09T00:06:5|
00000050 33 5a 3a c7 74 7a 1f 22 34 10 18 fb 63 65 9c b0 |3Z:.tz."4...ce..|
00000060 48 28 f3 3a e8 2f ef 78 df 2c 58 92 71 f6 6f 1c |H(.:./.x.,X.q.o.|
00000070 b9 34 10 55 5a c6 29 e7 34 0e d8 75 0c 83 3e a9 |.4.UZ.).4..u..>.|
00000080 5c 20 e0 97 f8 17 83 b1 7f fe 4a cd fb 2a de b7 |\ ........J..*..|
00000090 5c 5c 2e 53 eb ad e5 53 42 b2 ed 7d d5 e0 54 75 |\\.S...SB..}..Tu|
000000a0 a6 db a1 55 12 6e 0d ba 56 64 6d da 9b e3 bf 2a |...U.n..Vdm....*|
000000b0 17 c7 3e b0 f5 8c b1 e5 0a 4b 7f e4 d7 56 c2 56 |..>......K...V.V|
000000c0 7d d7 46 83 d0 ea e7 d0 38 5d 1b 9f 8e c1 2c 35 |}.F.....8]....,5|
000000d0 17 35 22 13 f9 ba c8 38 63 4b c6 67 cb b4 b0 c5 |.5"....8cK.g....|
000000e0 fc c8 47 0f ff b2 bd 2f 3e fa 36 a8 5d d0 c6 71 |..G..../>.6.]..q|
000000f0 60 0a 12 0c fe 07 01 0a fc a9 b6 9c 0b e0 73 b1 |`.............s.|
00000100 2b ae d4 4d 68 56 b0 04 bb 66 40 20 a5 7f 22 4e |+..MhV...f@ .."N|
00000110 be fe 3c f0 ab 40 89 ec 3d e1 5d 25 95 b5 65 71 |..<..@..=.]%..eq|
00000120 cb 45 1f 20 9d 49 ed 48 0b 88 62 f7 9d a5 1d ea |.E. .I.H..b.....|
00000130 96 9f 57 37 19 04 55 e2 74 03 90 61 1c 87 f7 92 |..W7..U.t..a....|
00000140 7c 8a 9a 9c 89 9f 87 6a f7 2b ff 23 a4 71 2e fd ||......j.+.#.q..|
00000150 d3 b7 df
$ kubectl get node,pod -A
NAME                                 STATUS   ROLES                  AGE   VERSION
node/123issue-cp-e4ce1772-fs5q9      Ready    control-plane,master   16h   v1.23.17+rke2r1
node/123issue-cp-e4ce1772-ldcpn      Ready    control-plane,master   16h   v1.23.17+rke2r1
node/123issue-etcd-7645398d-7bdfw    Ready    etcd                   16h   v1.23.17+rke2r1
node/123issue-etcd-7645398d-gz7q4    Ready    etcd                   16h   v1.23.17+rke2r1
node/123issue-etcd-7645398d-r52sx    Ready    etcd                   16h   v1.23.17+rke2r1
node/123issue-worker-ce426224-7ws2n  Ready    worker                 16h   v1.23.17+rke2r1
node/123issue-worker-ce426224-8t2bl  Ready    worker                 16h   v1.23.17+rke2r1
NAMESPACE             NAME                                                     READY   STATUS      RESTARTS      AGE
calico-system         pod/calico-kube-controllers-554d474486-8mxst             1/1     Running     0             16h
calico-system         pod/calico-node-6t8qw                                    1/1     Running     0             16h
calico-system         pod/calico-node-8p92m                                    1/1     Running     0             16h
calico-system         pod/calico-node-9p9xg                                    1/1     Running     0             16h
calico-system         pod/calico-node-cn4dh                                    1/1     Running     0             16h
calico-system         pod/calico-node-stjrz                                    1/1     Running     0             16h
calico-system         pod/calico-node-vjrrh                                    1/1     Running     0             16h
calico-system         pod/calico-node-xhzm5                                    1/1     Running     0             16h
calico-system         pod/calico-typha-775766fb65-c2h4s                        1/1     Running     0             16h
calico-system         pod/calico-typha-775766fb65-m7qh9                        1/1     Running     0             16h
calico-system         pod/calico-typha-775766fb65-rgqkz                        1/1     Running     0             16h
cattle-fleet-system   pod/fleet-agent-d7c5f79fd-7pk2f                          1/1     Running     0             16h
cattle-system         pod/cattle-cluster-agent-7db9479c87-hj9kw                1/1     Running     0             16h
cattle-system         pod/cattle-cluster-agent-7db9479c87-p2w9b                1/1     Running     0             16h
cattle-system         pod/system-upgrade-controller-7f9f559b4f-zjbcv           1/1     Running     0             16h
kube-system           pod/cloud-controller-manager-123issue-cp-e4ce1772-fs5q9   1/1    Running     2 (16h ago)   16h
kube-system           pod/cloud-controller-manager-123issue-cp-e4ce1772-ldcpn   1/1    Running     2 (16h ago)   16h
kube-system           pod/cloud-controller-manager-123issue-etcd-7645398d-7bdfw 1/1    Running     0             16h
kube-system           pod/cloud-controller-manager-123issue-etcd-7645398d-gz7q4 1/1    Running     0             16h
kube-system           pod/cloud-controller-manager-123issue-etcd-7645398d-r52sx 1/1    Running     0             16h
kube-system           pod/etcd-123issue-etcd-7645398d-7bdfw                    1/1     Running     0             16h
kube-system           pod/etcd-123issue-etcd-7645398d-gz7q4                    1/1     Running     0             16h
kube-system           pod/etcd-123issue-etcd-7645398d-r52sx                    1/1     Running     0             16h
kube-system           pod/helm-install-rke2-calico-crd-q4pcx                   0/1     Completed   0             16h
kube-system           pod/helm-install-rke2-calico-sq8nn                       0/1     Completed   2             16h
kube-system           pod/helm-install-rke2-coredns-5d2c9                      0/1     Completed   0             16h
kube-system           pod/helm-install-rke2-ingress-nginx-wpf6f                0/1     Completed   0             16h
kube-system           pod/helm-install-rke2-metrics-server-l949v               0/1     Completed   0             16h
kube-system           pod/kube-apiserver-123issue-cp-e4ce1772-fs5q9            1/1     Running     3 (16h ago)   16h
kube-system           pod/kube-apiserver-123issue-cp-e4ce1772-ldcpn            1/1     Running     3 (16h ago)   16h
kube-system           pod/kube-controller-manager-123issue-cp-e4ce1772-fs5q9   1/1     Running     8 (16h ago)   16h
kube-system           pod/kube-controller-manager-123issue-cp-e4ce1772-ldcpn   1/1     Running     7 (16h ago)   16h
kube-system           pod/kube-proxy-123issue-cp-e4ce1772-fs5q9                1/1     Running     0             16h
kube-system           pod/kube-proxy-123issue-cp-e4ce1772-ldcpn                1/1     Running     0             16h
kube-system           pod/kube-proxy-123issue-etcd-7645398d-7bdfw              1/1     Running     1 (16h ago)   16h
kube-system           pod/kube-proxy-123issue-etcd-7645398d-gz7q4              1/1     Running     1 (16h ago)   16h
kube-system           pod/kube-proxy-123issue-etcd-7645398d-r52sx              1/1     Running     0             16h
kube-system           pod/kube-proxy-123issue-worker-ce426224-7ws2n            1/1     Running     1 (16h ago)   16h
kube-system           pod/kube-proxy-123issue-worker-ce426224-8t2bl            1/1     Running     1 (16h ago)   16h
kube-system           pod/kube-scheduler-123issue-cp-e4ce1772-fs5q9            1/1     Running     3 (16h ago)   16h
kube-system           pod/kube-scheduler-123issue-cp-e4ce1772-ldcpn            1/1     Running     3 (16h ago)   16h
kube-system           pod/rke2-coredns-rke2-coredns-775c5b4bb4-bh2bm           1/1     Running     0             16h
kube-system           pod/rke2-coredns-rke2-coredns-775c5b4bb4-dssf4           1/1     Running     0             16h
kube-system           pod/rke2-coredns-rke2-coredns-autoscaler-695fc554c9-l9t4l 1/1    Running     0             16h
kube-system           pod/rke2-ingress-nginx-controller-lwx6z                  1/1     Running     0             16h
kube-system           pod/rke2-ingress-nginx-controller-mgz4c                  1/1     Running     0             16h
kube-system           pod/rke2-metrics-server-644f588b5-gd7jp                  1/1     Running     0             16h
tigera-operator       pod/tigera-operator-59558cd85f-vnp5b                     1/1     Running     0             16h
This is a backport issue for https://github.com/rancher/rke2/issues/3801, automatically created via rancherbot by @brandond
Original issue description:
Environmental Info:
RKE2 Version: v1.25.5+rke2r2
Cluster Configuration:
1 server (rancher provisioned, but should be reproducible standalone)
Describe the bug:
Occasionally, rke2 secrets-encrypt reencrypt will fail to update a secret because the secret may be modified outside of rke2 while reencryption is in progress, which causes secrets encryption to fail.

Steps To Reproduce:
1. rke2 secrets-encrypt prepare
2. systemctl restart rke2-server
3. rke2 secrets-encrypt rotate
4. systemctl restart rke2-server
5. rke2 secrets-encrypt reencrypt

Expected behavior:
rke2 secrets-encrypt reencrypt completes successfully.

Actual behavior:
The following error is recorded:
Failed to reencrypted secret: Operation cannot be fulfilled on secrets "serving-cert": the object has been modified; please apply your changes to the latest version and try again

Additional context / logs:
The secret in question is the dynamiclistener serving-cert, which is likely being updated asynchronously. I've diagnosed the issue as coming from here: https://github.com/k3s-io/k3s/blob/1c17f05b8ee669ad309ad344dc443b0ae919328a/pkg/secretsencrypt/controller.go#L224-L228
I plan to test a potential fix, but need time to set up a reliably reproducible environment due to the raciness of this issue, I've only seen this issue twice out of ~ 25 attempts.
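For anyone trying to confirm whether a run has hit this race, the error should be visible in the rke2-server journal on the control-plane node that ran the reencrypt. A quick check (a sketch; the grep pattern is taken from the error quoted above):
$ sudo journalctl -u rke2-server | grep -i 'Failed to reencrypt'
$ sudo rke2 secrets-encrypt status
# On a healthy run, the rotation stage reported by status eventually leaves the reencrypt phase and all server hashes match.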