k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

[Release-1.28] - Improve performance on K3s secrets-encrypt reencrypt #10639

Closed brandond closed 1 month ago

brandond commented 2 months ago

Backport fix for "Improve performance on K3s secrets-encrypt reencrypt".

aganesh-suse commented 1 month ago

Validated on release-1.28 branch with 2701d8fca45cf675b481e927827dd1dceb51b01c

Environment Details

Infrastructure

Node(s) CPU architecture, OS, and Version:

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"

$ uname -m
x86_64

Cluster Configuration:

HA: 3 servers / 1 agent
1 etcd node, 2 control-plane (cp) nodes, and 1 agent

Config.yaml:

Etcd Node/Server1:

cat /etc/rancher/k3s/config.yaml 
token: xxxx
disable-apiserver: true
disable-controller-manager: true
disable-scheduler: true
node-taint:
- node-role.kubernetes.io/etcd:NoExecute
cluster-init: true
write-kubeconfig-mode: "0644"
secrets-encryption: true
node-external-ip: 1.1.1.1
node-label:
- k3s-upgrade=server
debug: true

CP Nodes:

$ cat /etc/rancher/k3s/config.yaml 
token: secret
server: https://1.1.1.1:6443
disable-etcd: true
node-taint:
- node-role.kubernetes.io/control-plane:NoSchedule
write-kubeconfig-mode: "0644"
secrets-encryption: true
node-external-ip: 2.2.2.2
node-label:
- k3s-upgrade=server
debug: true

Agent node:

$ cat /etc/rancher/k3s/config.yaml 
token: secret
server: https://1.1.1.1:6443
node-external-ip: 4.4.4.4
node-label:
- k3s-upgrade=agent
debug: true

Testing Steps

  1. Copy config.yaml
    $ sudo mkdir -p /etc/rancher/k3s && sudo cp config.yaml /etc/rancher/k3s
  2. Install k3s
    curl -sfL https://get.k3s.io | sudo INSTALL_K3S_COMMIT='2701d8fca45cf675b481e927827dd1dceb51b01c' sh -s - server
  3. Verify Cluster Status:
    kubectl get nodes -o wide
    kubectl get pods -A
  4. Test reencryption (refer: https://github.com/k3s-io/k3s/pull/10571) via:
    a) Traditional method: prepare/reboot, rotate/reboot, reencrypt/reboot.
    b) New method: the rotate-keys option for reencryption.
    Test 1: With 1001 basic secrets.
    Test 2: With 150 large secrets of 1000k each (plus 1 basic secret).
    Note: The large-secrets test is highly memory intensive. Use a minimum of 8G of memory on each node while testing this.
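
For anyone repeating step 4, the test data and the two reencryption paths could be driven by something like the sketch below. This is not the script used for this validation. The function names, secret names, the `DRY_RUN` switch, and reading "1000k" as 1,000,000 bytes are all assumptions made for illustration. It also assumes that "reboot" means restarting the `k3s` systemd unit, and on a multi-server cluster that restart has to happen on every server, which this single-node sketch does not do.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: every name and size below is an illustrative assumption.

# Print the command instead of running it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create COUNT small secrets named basic-secret-1 .. basic-secret-COUNT.
create_basic_secrets() {
  local count=$1 i
  for i in $(seq 1 "$count"); do
    run kubectl create secret generic "basic-secret-$i" --from-literal=key="value-$i"
  done
}

# Create COUNT secrets that each carry BYTES bytes of random data.
create_large_secrets() {
  local count=$1 bytes=$2 i payload
  payload=$(mktemp)
  head -c "$bytes" /dev/urandom > "$payload"
  for i in $(seq 1 "$count"); do
    run kubectl create secret generic "large-secret-$i" --from-file=data="$payload"
  done
  rm -f "$payload"
}

# Traditional method: prepare, rotate, reencrypt, restarting k3s between stages.
reencrypt_traditional() {
  run sudo k3s secrets-encrypt prepare
  run sudo systemctl restart k3s
  run sudo k3s secrets-encrypt rotate
  run sudo systemctl restart k3s
  run sudo k3s secrets-encrypt reencrypt
  # Reencryption runs in the background. Check the status (or the journal)
  # and restart k3s one last time only after it reports completion.
  run sudo k3s secrets-encrypt status
}

# New method: a single rotate-keys call.
reencrypt_rotate_keys() {
  run sudo k3s secrets-encrypt rotate-keys
}
```

Under these assumptions, Test 1 would be `create_basic_secrets 1001` and Test 2 would be `create_large_secrets 150 1000000` followed by `create_basic_secrets 1`. Setting `DRY_RUN=1` first prints the commands without touching the cluster.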

Compare the time taken for reencryption by watching the timestamps of the SecretsProgress events in the journal logs.
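
The elapsed time can also be computed from the journal rather than read off by eye. The helper below is a minimal sketch, and `progress_elapsed` is a hypothetical name. It assumes the default journalctl short format, where the third field is `HH:MM:SS`, and it measures only the span between the first and last SecretsProgress events on its input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads journal lines on stdin and prints the number of
# seconds between the first and the last SecretsProgress event.
progress_elapsed() {
  grep 'reason="SecretsProgress"' | awk '
    {
      # Field 3 of the short journal format is the wall-clock time HH:MM:SS.
      split($3, t, ":")
      secs = t[1] * 3600 + t[2] * 60 + t[3]
      if (NR == 1) first = secs
      last = secs
    }
    END {
      if (NR == 0) exit 1          # no SecretsProgress events found
      d = last - first
      if (d < 0) d += 86400        # run crossed midnight
      printf "%d\n", d
    }'
}
```

A possible invocation is `journalctl -u k3s | progress_elapsed`. If the journal holds more than one reencryption run, narrow the input first (for example with `--since`), otherwise the span covers all of them.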

Replication Results:

Basic secrets time taken:
  Traditional method: 3 min 18 sec
  Rotate_keys method: 3 min 20 sec

Example logs:

journalctl -xeu k3s | grep 'SecretsProgress' 
Aug 15 05:19:41 ip-172-31-30-123 k3s[41236]: I0815 05:19:41.159777   41236 event.go:307] "Event occurred" object="ip-172-31-30-123" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 10 secrets"
.
.
Aug 15 05:22:59 ip-172-31-30-123 k3s[41236]: I0815 05:22:59.563794   41236 event.go:307] "Event occurred" object="ip-172-31-30-123" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 1000 secrets"

Large secrets time taken:
  Traditional method: 27 seconds
  Rotate_keys method: 30 seconds

Aug 15 04:02:14 ip-172-31-30-123 k3s[28877]: I0815 04:02:14.692690   28877 event.go:307] "Event occurred" object="ip-172-31-30-123" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 10 secrets"
.
.
Aug 15 04:02:44 ip-172-31-30-123 k3s[28877]: I0815 04:02:44.332523   28877 event.go:307] "Event occurred" object="ip-172-31-30-123" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 160 secrets"

Validation Results:

Basic secrets time taken for reencryption:
  Traditional method: 9 secs
  Rotate_keys method: 10 secs

Example logs:

Aug 15 05:20:41 ip-172-31-22-87 k3s[41263]: I0815 05:20:41.303298   41263 event.go:307] "Event occurred" object="ip-172-31-22-87" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 50 secrets"
.
.
Aug 15 05:20:50 ip-172-31-22-87 k3s[41263]: I0815 05:20:50.362642   41263 event.go:307] "Event occurred" object="ip-172-31-22-87" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 1000 secrets"

Large secrets time taken for reencryption:
  Traditional method: 9 secs
  Rotate_keys method: 10 secs

Aug 15 03:50:59 ip-172-31-22-87 k3s[26965]: I0815 03:50:59.055379   26965 event.go:307] "Event occurred" object="ip-172-31-22-87" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 50 secrets"
Aug 15 03:51:03 ip-172-31-22-87 k3s[26965]: I0815 03:51:03.841977   26965 event.go:307] "Event occurred" object="ip-172-31-22-87" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 100 secrets"
Aug 15 03:51:08 ip-172-31-22-87 k3s[26965]: I0815 03:51:08.606684   26965 event.go:307] "Event occurred" object="ip-172-31-22-87" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 150 secrets"
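
As a rough summary, the replication and validation timings reported above can be turned into speedup factors. The `speedup` helper is a hypothetical name, and the inputs are simply the reported durations converted to seconds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints BEFORE / AFTER (both in seconds) to one decimal.
speedup() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
}

speedup 198 9    # basic secrets, traditional: 3 min 18 sec -> 9 secs
speedup 200 10   # basic secrets, rotate-keys: 3 min 20 sec -> 10 secs
speedup 27 9     # large secrets, traditional: 27 seconds -> 9 secs
speedup 30 10    # large secrets, rotate-keys: 30 seconds -> 10 secs
```

These are single-run figures from one environment, so they indicate scale rather than a benchmark.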