Closed aganesh-suse closed 5 months ago
Infrastructure
Node(s) CPU architecture, OS, and Version:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"
$ uname -m
x86_64
Cluster Configuration:
HA: 1 etcd, 2 cp, 1 agent node
Config.yaml:
ETCD server config:
token: xxxx
disable-apiserver: true
disable-controller-manager: true
disable-scheduler: true
write-kubeconfig-mode: "0644"
secrets-encryption: true
node-external-ip: 1.1.1.1
debug: true
CP only node configs:
token: xxxx
server: https://1.1.1.1:9345
disable-etcd: true
write-kubeconfig-mode: "0644"
secrets-encryption: true
node-external-ip: 1.2.3.4
debug: true
$ sudo mkdir -p /etc/rancher/rke2 && sudo cp config.yaml /etc/rancher/rke2
$ curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_COMMIT='eb2d438a2fe6b426ecd00cb8e829ddc728a246b7' INSTALL_RKE2_TYPE='server' INSTALL_RKE2_METHOD=tar sh -
$ sudo systemctl enable --now rke2-server
or
$ sudo systemctl enable --now rke2-agent
$ kubectl get nodes -o wide
$ kubectl get pods -A
$ sudo rke2 secrets-encrypt rotate-keys
$ sudo rke2 secrets-encrypt status
Validation Results:
$ rke2 -v
rke2 version v1.29.3+dev.eb2d438a (eb2d438a2fe6b426ecd00cb8e829ddc728a246b7)
go version go1.21.8 X:boringcrypto
$ kubectl get nodes
NAME               STATUS   ROLES                  AGE   VERSION
ip-172-31-17-195   Ready    control-plane,master   22m   v1.29.3+rke2r1
ip-172-31-19-236   Ready    etcd                   22m   v1.29.3+rke2r1
ip-172-31-25-125   Ready    control-plane,master   20m   v1.29.3+rke2r1
ip-172-31-28-204   Ready    <none>                 20m   v1.29.3+rke2r1
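A quick way to confirm the node table above before rotating keys is to filter the `kubectl get nodes` output for anything whose STATUS is not exactly `Ready`. This is only a sketch; the helper name `not_ready_nodes` is hypothetical and not part of kubectl or rke2.

```shell
# not_ready_nodes: read `kubectl get nodes` output on stdin and print the
# NAME of every node whose STATUS column is not exactly "Ready".
# Assumption: the first line is the header row. Statuses such as
# "Ready,SchedulingDisabled" are reported as not ready by this strict match.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage:
#   kubectl get nodes | not_ready_nodes
```

Empty output means every listed node reported `Ready`.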
Rotate-keys:
$ sudo rke2 secrets-encrypt rotate-keys
keys rotated, reencryption started
Reboot rke2 services and get status:
$ sudo rke2 secrets-encrypt status
Encryption Status: Enabled
Current Rotation Stage: reencrypt_finished
Server Encryption Hashes: All hashes match
Active  Key Type  Name
------  --------  ----
 *      AES-CBC   aescbckey-2024-04-08T21:19:13Z
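The two fields that matter in the status output above are the rotation stage and the server hash line. A minimal sketch for checking both from a script follows; the function names are hypothetical, and the parsing assumes the field labels appear exactly as in the output shown in this comment.

```shell
# rotation_stage: read `rke2 secrets-encrypt status` output on stdin and
# print the value of the "Current Rotation Stage" field.
rotation_stage() {
    sed -n 's/^Current Rotation Stage:[[:space:]]*//p'
}

# rotation_complete: succeed only if the stage is reencrypt_finished and
# the status output reports that all server hashes match.
rotation_complete() {
    status_text=$(cat)
    stage=$(printf '%s\n' "$status_text" | rotation_stage)
    [ "$stage" = "reencrypt_finished" ] &&
        printf '%s\n' "$status_text" |
            grep -q 'Server Encryption Hashes: All hashes match'
}

# Usage on a server node:
#   sudo rke2 secrets-encrypt status | rotation_complete && echo "rotation complete"
```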
Issue found on master branch with version v1.29.2-rc3+rke2r1
Environment Details
Infrastructure
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
Config.yaml:
ETCD server config:
CP only node configs:
Steps to reproduce:
Reproducing Results/Observations:
The file /var/lib/rancher/rke2/server/cred/encryption-config.json seems to get out of sync with the datastore.
The metrics server does not produce the right result and hence the rotate-keys operation never completes:
P.S. Another file to keep an eye on: /var/lib/rancher/rke2/server/cred/encryption-state.json
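To spot drift in the files mentioned above, one rough manual check is to copy them off each server node and compare digests. The sketch below assumes GNU `sha256sum` is available; the helper names are hypothetical, and this is not necessarily how rke2 computes the hashes shown by `secrets-encrypt status`.

```shell
# file_digest: print the SHA-256 digest of one file (digest only).
file_digest() {
    sha256sum "$1" | cut -d' ' -f1
}

# same_digest FILE_A FILE_B: succeed if both files have identical content.
same_digest() {
    [ "$(file_digest "$1")" = "$(file_digest "$2")" ]
}

# Usage, with copies fetched from two server nodes (file names are examples):
#   same_digest cp1-encryption-config.json cp2-encryption-config.json \
#       || echo "encryption-config.json differs between servers"
```

The files under /var/lib/rancher/rke2/server/cred are typically readable only by root, so copying them is likely to need sudo on each node.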
Expected behavior:
After rotate-keys completes successfully, retrying the
sudo rke2 secrets-encrypt status
command a few seconds later should report the reencrypt_finished stage. After rebooting the nodes in order (etcd nodes first, then cp nodes), all hashes should match.
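The "retry after a few seconds" step can be sketched as a bounded polling loop. The function name `wait_for_stage` and the `STATUS_CMD` override are hypothetical, added only so the loop can be exercised without a cluster; the attempt count and delay are arbitrary defaults, not values from rke2.

```shell
# wait_for_stage STAGE [ATTEMPTS] [DELAY_SECONDS]
# Poll the status command until "Current Rotation Stage" equals STAGE.
# Returns 0 on a match, 1 if the stage was never seen within ATTEMPTS polls.
# STATUS_CMD (assumption) overrides the command used to fetch the status.
wait_for_stage() {
    want=$1
    attempts=${2:-30}
    delay=${3:-5}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        stage=$(${STATUS_CMD:-sudo rke2 secrets-encrypt status} 2>/dev/null |
            sed -n 's/^Current Rotation Stage:[[:space:]]*//p')
        if [ "$stage" = "$want" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage on a server node after rotate-keys:
#   wait_for_stage reencrypt_finished 30 5 || echo "rotation did not finish"
```

Bounding the loop matters here because the failure described in this issue is a rotation that never completes, so an unbounded wait would hang.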