kcp-dev / helm-charts

Helm chart repo for KCP
Apache License 2.0

Allow for specifying storageClassName for etcd PVC #98

Closed tnthornton closed 2 weeks ago

tnthornton commented 2 weeks ago

Currently, consumers of the KCP Helm chart are unable to specify a StorageClass to use with the etcd StatefulSet. This isn't a big deal in local dev environments. However, with larger deployments where you need different storage types (e.g. SSDs), it can be problematic unless you change the default StorageClass for the cluster, which has its own impacts.
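
For context, the cluster-wide workaround looks roughly like this (a sketch; `standard` and `premium-rwo` are illustrative class names, and the annotation is the standard Kubernetes default-class marker):

    # Demote the current default StorageClass, then promote a different one
    kubectl patch storageclass standard -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
    kubectl patch storageclass premium-rwo -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'

This affects every PVC in the cluster that omits storageClassName, not just etcd's, which is why a chart-level override is preferable.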

This changeset enables specifying the storageClassName for the underlying volumeClaimTemplate via values.yaml or a Helm override argument. I've explicitly opted to render the storageClassName property only when it has been set in values.yaml, rather than shipping a default value in the values.yaml file. That way the current behavior continues to work, and operators that need a more advanced setup can specify the target StorageClass via values.yaml.
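
A minimal sketch of the resulting template logic (the file layout and surrounding fields are assumptions, not the exact diff; the `etcd.storageClassName` values key and the `etcd-data` claim name match the outputs below):

    # volumeClaimTemplates of the etcd StatefulSet in charts/kcp (sketch)
    volumeClaimTemplates:
      - metadata:
          name: etcd-data
        spec:
          accessModes: ["ReadWriteOnce"]
          {{- if .Values.etcd.storageClassName }}
          storageClassName: {{ .Values.etcd.storageClassName }}
          {{- end }}
          resources:
            requests:
              storage: 8Gi

When etcd.storageClassName is unset, the field is omitted entirely and Kubernetes falls back to the cluster's default StorageClass, preserving the existing behavior.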

How has this been tested?

I tested these changes against two clusters, kind and GKE. The kind cluster ships with just a single StorageClass, so it was used to test the basic case of not overriding the default behavior.

Kind

  1. Created the cluster and installed cert-manager, which resulted in the following resources in the cluster:
    k get pods -A                                                           ○ kind-kind
    NAMESPACE            NAME                                         READY   STATUS    RESTARTS   AGE
    cert-manager         cert-manager-654496659-grxcb                 1/1     Running   0          18s
    cert-manager         cert-manager-cainjector-6cbfdd7697-t2srl     1/1     Running   0          18s
    cert-manager         cert-manager-webhook-59786477cf-xpsmm        1/1     Running   0          18s
    kube-system          coredns-6f6b679f8f-mf75j                     1/1     Running   0          3m41s
    kube-system          coredns-6f6b679f8f-v298v                     1/1     Running   0          3m41s
    kube-system          etcd-kind-control-plane                      1/1     Running   0          3m48s
    kube-system          kindnet-6g2hd                                1/1     Running   0          3m41s
    kube-system          kube-apiserver-kind-control-plane            1/1     Running   0          3m48s
    kube-system          kube-controller-manager-kind-control-plane   1/1     Running   0          3m48s
    kube-system          kube-proxy-mvgkg                             1/1     Running   0          3m41s
    kube-system          kube-scheduler-kind-control-plane            1/1     Running   0          3m48s
    local-path-storage   local-path-provisioner-57c5987fd4-5ldpq      1/1     Running   0          3m41s
    k get storageclass                             11s ○ kind-kind
    NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
    standard (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  11s
  2. Created the following myvalues.yaml in order to closely follow the local dev instructions
    bat myvalues.yaml
    ───────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
           │ File: myvalues.yaml
    ───────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       1   │ externalHostname: kcp-front-proxy.upbound-system.svc
  3. Installed KCP into the kind cluster using the command from the README
    helm install kcp ./charts/kcp --values ./myvalues.yaml --namespace kcp --create-namespace
    NAME: kcp
    LAST DEPLOYED: Wed Aug 28 12:29:52 2024
    NAMESPACE: kcp
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
  4. Once pods were up and healthy, verified the deployment (a render-time check is also sketched after this list)
    k -n kcp get pods                                                                     ○ kind-kind
    NAME                               READY   STATUS    RESTARTS      AGE
    kcp-644f6bb8df-qhztm               1/1     Running   0             3m
    kcp-etcd-0                         1/1     Running   0             3m
    kcp-etcd-1                         1/1     Running   1 (73s ago)   106s
    kcp-etcd-2                         1/1     Running   0             102s
    kcp-front-proxy-7677c784fd-z2mf8   1/1     Running   0             3m
    k get storageclass                                                                    ○ kind-kind
    NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
    standard (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  5m18s
    k -n kcp get pvc                                                                       ○ kind-kind
    NAME                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
    etcd-data-kcp-etcd-0   Bound    pvc-81887157-5ed6-4056-96db-8eb465a7bc94   8Gi        RWO            standard       <unset>                 79s
    etcd-data-kcp-etcd-1   Bound    pvc-9344a510-d378-4b0a-8a44-b005e95d3d4c   8Gi        RWO            standard       <unset>                 10s
    etcd-data-kcp-etcd-2   Bound    pvc-6744606c-f4a9-4c36-88d7-685cf62d76e6   8Gi        RWO            standard       <unset>                 5s
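
For reference, the render-time behavior can also be checked without a cluster (a hedged sketch, not part of the original test run):

    # With storageClassName unset in myvalues.yaml, no storageClassName should be rendered
    helm template kcp ./charts/kcp --values ./myvalues.yaml | grep storageClassName || echo "no storageClassName rendered"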

GKE

  1. Created the cluster and installed cert-manager, which resulted in the following resources in the cluster:
    k get pods -A                                                  27s ○ cluster-2
    NAMESPACE         NAME                                                  READY   STATUS    RESTARTS   AGE
    cert-manager      cert-manager-69f6576f94-z4psq                         1/1     Running   0          101s
    cert-manager      cert-manager-cainjector-7746bc4d84-2xqzn              1/1     Running   0          101s
    cert-manager      cert-manager-webhook-996bcb478-crncc                  1/1     Running   0          101s
    gke-managed-cim   kube-state-metrics-0                                  2/2     Running   0          5m43s
    gmp-system        collector-pb8z5                                       2/2     Running   0          4m14s
    gmp-system        collector-qmzzv                                       2/2     Running   0          4m15s
    gmp-system        collector-rs2xt                                       2/2     Running   0          4m15s
    gmp-system        gmp-operator-7fb5bc5476-b9fjp                         1/1     Running   0          5m15s
    kube-system       event-exporter-gke-7c4fd479b6-wqm6t                   2/2     Running   0          5m38s
    kube-system       fluentbit-gke-dk66z                                   3/3     Running   0          4m42s
    kube-system       fluentbit-gke-fr8tk                                   3/3     Running   0          4m40s
    kube-system       fluentbit-gke-zq5q9                                   3/3     Running   0          4m40s
    kube-system       gke-metrics-agent-67tjg                               3/3     Running   0          4m42s
    kube-system       gke-metrics-agent-pnq7w                               3/3     Running   0          4m43s
    kube-system       gke-metrics-agent-zb65l                               3/3     Running   0          4m40s
    kube-system       konnectivity-agent-94f4cdb75-5c7dw                    2/2     Running   0          4m15s
    kube-system       konnectivity-agent-94f4cdb75-qknjs                    2/2     Running   0          5m26s
    kube-system       konnectivity-agent-94f4cdb75-rwpsd                    2/2     Running   0          4m15s
    kube-system       konnectivity-agent-autoscaler-67d4f7d5f-k57dd         1/1     Running   0          5m24s
    kube-system       kube-dns-5954f95c5-gwxt2                              5/5     Running   0          4m16s
    kube-system       kube-dns-5954f95c5-sqnnz                              5/5     Running   0          5m50s
    kube-system       kube-dns-autoscaler-79b96f5cb-2glc2                   1/1     Running   0          5m49s
    kube-system       kube-proxy-gke-cluster-2-default-pool-470bfea5-8hqh   1/1     Running   0          4m39s
    kube-system       kube-proxy-gke-cluster-2-default-pool-470bfea5-kjm9   1/1     Running   0          3m48s
    kube-system       kube-proxy-gke-cluster-2-default-pool-470bfea5-zzbz   1/1     Running   0          4m36s
    kube-system       l7-default-backend-db86fddff-46rjp                    1/1     Running   0          5m21s
    kube-system       metrics-server-v0.7.1-6767545bf-5gvkv                 2/2     Running   0          4m12s
    kube-system       pdcsi-node-bdhb5                                      2/2     Running   0          4m42s
    kube-system       pdcsi-node-k9ksb                                      2/2     Running   0          4m40s
    kube-system       pdcsi-node-w78np                                      2/2     Running   0          4m41s
    k get storageclass                                             ○ cluster-2
    NAME                     PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
    premium-rwo              pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   5m9s
    standard                 kubernetes.io/gce-pd    Delete          Immediate              true                   5m9s
    standard-rwo (default)   pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   5m9s
  2. Updated the myvalues.yaml to include the storageClass override (an equivalent command-line override is sketched after this list)
    bat myvalues.yaml
    ───────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
           │ File: myvalues.yaml
    ───────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       1   │ externalHostname: kcp-front-proxy.upbound-system.svc
       2   │ etcd:
       3   │   storageClassName: premium-rwo
  3. Installed KCP into the GKE cluster using the command from the README
    helm install kcp ./charts/kcp --values ./myvalues.yaml --namespace kcp --create-namespace
    NAME: kcp
    LAST DEPLOYED: Wed Aug 28 13:02:46 2024
    NAMESPACE: kcp
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
  4. Once pods were up and healthy, verified the deployment
    k -n kcp get pods                                                                   ○ cluster-2
    NAME                               READY   STATUS    RESTARTS   AGE
    kcp-5c5f89646c-h654r               1/1     Running   0          3m
    kcp-etcd-0                         1/1     Running   0          3m
    kcp-etcd-1                         1/1     Running   0          107s
    kcp-etcd-2                         1/1     Running   0          95s
    kcp-front-proxy-7f6b547dd6-h8stm   1/1     Running   0          3m
    k get storageclass                                                                  ○ cluster-2
    NAME                     PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
    premium-rwo              pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   8m8s
    standard                 kubernetes.io/gce-pd    Delete          Immediate              true                   8m8s
    standard-rwo (default)   pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   8m8s
    k -n kcp get pvc                                                                    ○ cluster-2
    NAME                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
    etcd-data-kcp-etcd-0   Bound    pvc-975ea596-5fb5-430a-8d50-a4bde82bbe14   8Gi        RWO            premium-rwo    <unset>                 2m16s
    etcd-data-kcp-etcd-1   Bound    pvc-e75a5900-543d-4171-b6c7-d2848d242277   8Gi        RWO            premium-rwo    <unset>                 63s
    etcd-data-kcp-etcd-2   Bound    pvc-cf990027-7adf-4e79-ad58-4d33fc8cd63a   8Gi        RWO            premium-rwo    <unset>                 51s
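
As noted in step 2, the same override can also be passed as a Helm override argument instead of being placed in myvalues.yaml (a sketch of an equivalent invocation, assuming the values file only sets externalHostname):

    helm install kcp ./charts/kcp --values ./myvalues.yaml \
      --set etcd.storageClassName=premium-rwo \
      --namespace kcp --create-namespace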
kcp-ci-bot commented 2 weeks ago

Hi @tnthornton. Thanks for your PR.

I'm waiting for a kcp-dev member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
embik commented 2 weeks ago

/ok-to-test

kcp-ci-bot commented 2 weeks ago

LGTM label has been added.

Git tree hash: 01e2b31308a8e5746adbab8e43bd7c9fd63d0fc8

kcp-ci-bot commented 2 weeks ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: embik

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/kcp-dev/helm-charts/blob/main/OWNERS)~~ [embik]

Approvers can indicate their approval by writing `/approve` in a comment. Approvers can cancel approval by writing `/approve cancel` in a comment.