rancher / dashboard

The Rancher UI
https://rancher.com
Apache License 2.0

[BUG] Can't edit a cluster with encryption provider by YAML, corrupts encryption.yaml #6269

Open BobVanB opened 2 years ago

BobVanB commented 2 years ago

Still an issue in v2.6.5

https://github.com/rancher/rancher/issues/36197

[Adding context from that ticket in case things get lost. For engineering, this should be easy to replicate.]

Rancher Server Setup

Information about the Cluster

rancher_cluster/resource definition:

{
    "name": name,
    "enableNetworkPolicy": False,
    "rancherKubernetesEngineConfig": {
        "kubernetesVersion": "v1.21.8-rancher1-1",
        "services": {
            "etcd": {
                "extraArgs": {
                    "election-timeout": "5000",
                    "heartbeat-interval": "500"
                }
            },
            "kubeApi": {
                "secretsEncryptionConfig": {
                    "customConfig": {
                        "apiVersion": "apiserver.config.k8s.io/v1",
                        "kind": "EncryptionConfiguration",
                        "resources": [
                            {
                                "providers": [
                                    {
                                        "aescbc": {
                                            "keys": [
                                                {
                                                    "name": "key1",
                                                    "secret": "c2VjcmV0IGlzIHNlY3VyZQ=="
                                                }
                                            ]
                                        }
                                    }
                                ],
                                "resources": [
                                    "secrets"
                                ]
                            }
                        ]
                    },
                    "enabled": True
                }
            }
        },
        "ingress": {
            "extraArgs": {
                "default-ssl-certificate": "ingress-nginx/ingress-default-cert"
            },
            "options": {
                "error-log-level": "crit"
            }
        }
    }
}
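To rule out the key itself as the cause, here is a minimal Python sketch that decodes the secret from the definition above and checks its length. The function name is only illustrative; the one outside fact it relies on is that Kubernetes accepts aescbc keys of 16, 24, or 32 bytes.

```python
import base64

# Value copied from the "secret" field in the cluster definition above.
ENCODED_KEY = "c2VjcmV0IGlzIHNlY3VyZQ=="

# Kubernetes accepts aescbc keys of 16, 24, or 32 raw bytes.
VALID_AES_KEY_LENGTHS = (16, 24, 32)


def decoded_key_length(encoded: str) -> int:
    """Return the number of raw bytes in a base64-encoded key."""
    return len(base64.b64decode(encoded, validate=True))


if __name__ == "__main__":
    length = decoded_key_length(ENCODED_KEY)
    print(f"key length: {length} bytes, valid: {length in VALID_AES_KEY_LENGTHS}")
```

The key decodes to 16 bytes, so the configuration as submitted through the API is valid on its own.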

User Information

Default rancher container, nothing special.

Describe the bug

Editing a cluster with 'Edit as YAML' in the Rancher UI cluster management adds defaults to encryption.yaml. As a result, kube-apiserver fails to start and the cluster ends up in an error state.

To Reproduce

  1. Start Rancher: docker run -d --rm -p 443:443 --privileged --name rancher "rancher/rancher:v2.6.2"

  2. Create an API token for admin.

  3. Set the hostname to http://rancher in the global settings.

  4. Create a cluster through the API and register an encryption provider. This creates an encryption.yaml without aesgcm, kms, or secretbox. You can probably get the same result by creating a cluster through the UI and going to step 7.

  5. Register a machine with all roles: docker run --privileged -d --name test-cluster --link rancher:rancher docker:dind

  6. Take note of the local EncryptionConfiguration on the machine, in /etc/kubernetes/ssl/encryption.yaml:

    # /etc/kubernetes/ssl/encryption.yaml
    apiVersion: apiserver.config.k8s.io/v1
    kind: EncryptionConfiguration
    resources:
    - providers:
      - aescbc:
          keys:
          - name: key1
            secret: c2VjcmV0IGlzIHNlY3VyZQ==
      resources:
      - secrets

  7. Open 'Edit as YAML' on the cluster.

  8. Hit Save, with no changes.

  9. Wait for the new encryption config.

    # /etc/kubernetes/ssl/encryption.yaml
    apiVersion: apiserver.config.k8s.io/v1
    kind: EncryptionConfiguration
    resources:
    - providers:
      - aescbc:
          keys:
          - name: key1
            secret: c2VjcmV0IGlzIHNlY3VyZQ==
        aesgcm:
          keys: []
        identity: {}
        kms:
          endpoint: ""
          name: ""
        secretbox:
          keys: []
      resources:
      - secrets

Result

The EncryptionConfiguration now contains default providers that are empty. kube-apiserver rejects this file and the cluster goes into an error state.

Error: error while parsing encryption provider configuration file "/etc/kubernetes/ssl/encryption.yaml":
error while parsing file: [resources[0].providers[0]: Invalid value: config.ProviderConfiguration{AESGCM:(*config.AESConfiguration)(0xc0006bfd40), AESCBC:(*config.AESConfiguration)(0xc0006bfd28),
Secretbox:(*config.SecretboxConfiguration)(0xc0006bfd58), Identity:(*config.IdentityConfiguration)(0x78d2290),
KMS:(*config.KMSConfiguration)(0xc000c15020)}: more than one provider specified in a single element,
should split into different list elements, resources[0].providers[0].kms.name:
Required value: name is a mandatory field for a provider, resources[0].providers[0].kms.endpoint:
Invalid value: "": endpoint is a mandatory field for a kms]
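To make the first complaint in that error concrete, here is a rough Python sketch of the "one provider per list element" rule, applied to the config before and after the save. This is not the kube-apiserver validation code. The function name and the dict layout are illustrative assumptions, and the kms name/endpoint checks are not reproduced.

```python
KEY = {"name": "key1", "secret": "c2VjcmV0IGlzIHNlY3VyZQ=="}

# Config as written by the API call (step 6).
BEFORE = {
    "apiVersion": "apiserver.config.k8s.io/v1",
    "kind": "EncryptionConfiguration",
    "resources": [
        {"providers": [{"aescbc": {"keys": [KEY]}}], "resources": ["secrets"]}
    ],
}

# Config after saving from the UI with no changes (step 9).
AFTER = {
    "apiVersion": "apiserver.config.k8s.io/v1",
    "kind": "EncryptionConfiguration",
    "resources": [
        {
            "providers": [
                {
                    "aescbc": {"keys": [KEY]},
                    "aesgcm": {"keys": []},
                    "identity": {},
                    "kms": {"endpoint": "", "name": ""},
                    "secretbox": {"keys": []},
                }
            ],
            "resources": ["secrets"],
        }
    ],
}


def multi_provider_errors(config: dict) -> list:
    """List every providers[] element that names more than one provider."""
    errors = []
    for r_idx, resource in enumerate(config.get("resources", [])):
        for p_idx, provider in enumerate(resource.get("providers", [])):
            if len(provider) > 1:
                errors.append(
                    f"resources[{r_idx}].providers[{p_idx}]: more than one "
                    f"provider in a single element: {sorted(provider)}"
                )
    return errors


if __name__ == "__main__":
    print("before:", multi_provider_errors(BEFORE))
    print("after:", multi_provider_errors(AFTER))
```

Only the post-save config trips the check, at resources[0].providers[0], which is the same path the apiserver reports.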

Expected Result

The kube-apiservers are restarted without errors.

Additional context

Workaround 1 to fix it:

  1. Edit the cluster in the UI.
  2. Click the 'Edit as YAML' button.
  3. Clean up the security YAML (remove the empty values).
  4. Hit Save.
  5. Never hit the save button in the Rancher UI again.
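The "remove empty values" cleanup in Workaround 1 can be sketched in Python as a recursive prune. This is an illustration only, assuming the config has already been loaded into a dict. The helper name is hypothetical, and it would also drop an intentional `identity: {}` provider, so it only suits configs like the one in this issue.

```python
EMPTY_VALUES = ("", [], {}, None)


def prune_empty(value):
    """Recursively drop empty strings, lists, dicts, and None values."""
    if isinstance(value, dict):
        pruned = {k: prune_empty(v) for k, v in value.items()}
        return {k: v for k, v in pruned.items() if v not in EMPTY_VALUES}
    if isinstance(value, list):
        pruned = [prune_empty(v) for v in value]
        return [v for v in pruned if v not in EMPTY_VALUES]
    return value


# The corrupted config from step 9, written as a Python dict.
CORRUPTED = {
    "apiVersion": "apiserver.config.k8s.io/v1",
    "kind": "EncryptionConfiguration",
    "resources": [
        {
            "providers": [
                {
                    "aescbc": {
                        "keys": [
                            {"name": "key1", "secret": "c2VjcmV0IGlzIHNlY3VyZQ=="}
                        ]
                    },
                    "aesgcm": {"keys": []},
                    "identity": {},
                    "kms": {"endpoint": "", "name": ""},
                    "secretbox": {"keys": []},
                }
            ],
            "resources": ["secrets"],
        }
    ],
}

if __name__ == "__main__":
    cleaned = prune_empty(CORRUPTED)
    print(sorted(cleaned["resources"][0]["providers"][0]))
```

After pruning, the only provider left in the element is aescbc, which matches the file from step 6.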

Workaround 2 to fix it:

  1. Edit the cluster CRD from the management cluster with kubectl:

    export KUBECONFIG=<RANCHER_LOCAL_CLUSTER>
    kubectl edit clusters.management.cattle.io <CLUSTER_ID>

  2. Never hit the save button in the Rancher UI again.

catherineluse commented 1 year ago

I think this issue may be resolved by a fix in v2.7.0. https://github.com/rancher/dashboard/issues/6881 Previously, the UI was changing YAML values even when the user had not made any changes, but it should no longer do that since the fix.

gaktive commented 1 year ago

@bobvanb have you tried this in a newer version of Rancher? As noted, we may have fixed this in 2.7.0.

BobVanB commented 1 year ago

Yup, we tested it in version 2.7.6 and it is also broken in a different way. https://github.com/rancher/dashboard/issues/10330

mantis-toboggan-md commented 8 months ago

Blocked by https://github.com/rancher/rancher/issues/44264

gaktive commented 6 months ago

For internal coordination, SURE-8094 is our reference.

gaktive commented 3 months ago

Pushing to 2.10 since this depends on 2 blocked backend tix. Candidate for 2.9.x back port.