helm / charts

⚠️(OBSOLETE) Curated applications for Kubernetes

alpha vs. beta annotations for PVCs #116

Closed · ligc closed this issue 7 years ago

ligc commented 8 years ago

I am using helm to install the chart stable/mariadb. The deployment was initialized successfully but never finishes; the pod status is stuck at "Init:0/1".

Helm version: 2.0.0-alpha.5

root@c910f04x19k07:~/helm/bin# helm version
Client: &version.Version{SemVer:"v2.0.0-alpha.5", GitCommit:"a324146945c01a1e2dd7eaf23caf0e55fabfd3d2", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.0.0-alpha.5", GitCommit:"1a7373e584f2b7732d902963f020fa72cc2e642f", GitTreeState:"clean"}
root@c910f04x19k07:~/helm/bin# 

Kubernetes version 1.3.6:

root@c910f04x19k07:~/helm/bin# kubectl version
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.6", GitCommit:"ae4550cc9c89a593bcda6678df201db1b208133b", GitTreeState:"clean", BuildDate:"2016-08-26T18:13:23Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3+", GitVersion:"cfc-0.1.0-dirty", GitCommit:"7577a50e5c71505e3458884f83590270bb693351", GitTreeState:"dirty", BuildDate:"2016-10-16T17:27:39Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
root@c910f04x19k07:~/helm/bin# 

Steps and output:

root@c910f04x19k07:~/helm/bin# helm search
NAME                    VERSION DESCRIPTION                                       
stable/drupal           0.3.2   One of the most versatile open source content m...
stable/jenkins          0.1.0   A Jenkins Helm chart for Kubernetes.              
stable/mariadb          0.5.1   Chart for MariaDB                                 
stable/mysql            0.1.0   Chart for MySQL                                   
stable/redmine          0.3.2   A flexible project management web application.    
stable/wordpress        0.3.0   Web publishing platform for building blogs and ...
root@c910f04x19k07:~/helm/bin# helm install stable/mariadb
Fetched stable/mariadb to mariadb-0.5.1.tgz
looping-narwha
Last Deployed: Mon Oct 17 05:21:40 2016
Namespace: default
Status: DEPLOYED

Resources:
==> extensions/Deployment
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
looping-narwha-mariadb   1         0         0            0           0s

==> v1/PersistentVolumeClaim
NAME                     STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
looping-narwha-mariadb   Pending                                      0s

==> v1/Secret
NAME                     TYPE      DATA      AGE
looping-narwha-mariadb   Opaque    2         0s

==> v1/ConfigMap
NAME                     DATA      AGE
looping-narwha-mariadb   1         0s

==> v1/Service
NAME                     CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
looping-narwha-mariadb   20.0.0.215   <none>        3306/TCP   0s

Notes:
MariaDB can be accessed via port 3306 on the following DNS name from within your cluster:
looping-narwha-mariadb.default.svc.cluster.local

To connect to your database run the following command:

   kubectl run looping-narwha-mariadb-client --rm --tty -i --image bitnami/mariadb --command -- mysql -h looping-narwha-mariadb
**========================> After 5 minutes**

root@c910f04x19k07:~/helm/bin# kubectl get deployment
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
looping-narwha-mariadb   1         1         1            0           5m
root@c910f04x19k07:~/helm/bin# kubectl get pods
NAME                                     READY     STATUS     RESTARTS   AGE
looping-narwha-mariadb-436722775-hss7g   0/1       Init:0/1   0          6m
root@c910f04x19k07:~/helm/bin# kubectl describe pod looping-narwha-mariadb-436722775-hss7g
Name:           looping-narwha-mariadb-436722775-hss7g
Namespace:      default
Node:           c910f04x19k08.pok.stglabs.ibm.com/10.4.19.8
Start Time:     Mon, 17 Oct 2016 05:21:41 -0400
Labels:         app=looping-narwha-mariadb
                chart=mariadb-0.5.1
                heritage=Tiller
                pod-template-hash=436722775
                release=looping-narwha
Status:         Pending
IP:
Controllers:    ReplicaSet/looping-narwha-mariadb-436722775
Init Containers:
  copy-custom-config:
    Container ID:
    Image:              bitnami/mariadb:10.1.18-r0
    Image ID:
    Port:
    Command:
      sh
      -c
      mkdir -p /bitnami/mariadb/conf && cp -n /bitnami/mariadb_config/my.cnf /bitnami/mariadb/conf/my_custom.cnf
    State:                      Waiting
      Reason:                   PodInitializing
    Ready:                      False
    Restart Count:              0
    Environment Variables:      <none>
Containers:
  looping-narwha-mariadb:
    Container ID:
    Image:              bitnami/mariadb:10.1.18-r0
    Image ID:
    Port:               3306/TCP
    State:              Waiting
      Reason:           PodInitializing
    Ready:              False
    Restart Count:      0
    Liveness:           exec [mysqladmin ping] delay=30s timeout=5s period=10s #success=1 #failure=3
    Readiness:          exec [mysqladmin ping] delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment Variables:
      MARIADB_ROOT_PASSWORD:    <set to the key 'mariadb-root-password' in secret 'looping-narwha-mariadb'>
      MARIADB_USER:
      MARIADB_PASSWORD:         <set to the key 'mariadb-password' in secret 'looping-narwha-mariadb'>
      MARIADB_DATABASE:
Conditions:
  Type          Status
  Initialized   False 
  Ready         False 
  PodScheduled  True 
Volumes:
  config:
    Type:       ConfigMap (a volume populated by a ConfigMap)
    Name:       looping-narwha-mariadb
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  looping-narwha-mariadb
    ReadOnly:   false
  default-token-04890:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-04890
QoS Tier:       BestEffort
Events:
  FirstSeen     LastSeen        Count   From                                            SubobjectPath   Type   Reason           Message
  ---------     --------        -----   ----                                            -------------   -------  ------          -------
  6m            6m              1       {default-scheduler }                                            Normal   Scheduled       Successfully assigned looping-narwha-mariadb-436722775-hss7g to c910f04x19k08.pok.stglabs.ibm.com
  4m            1m              2       {kubelet c910f04x19k08.pok.stglabs.ibm.com}                     Warning  FailedMount     Unable to mount volumes for pod "looping-narwha-mariadb-436722775-hss7g_default(1cd9eee8-944b-11e6-b7ca-42ae0a041307)": timeout expired waiting for volumes to attach/mount for pod "looping-narwha-mariadb-436722775-hss7g"/"default". list of unattached/unmounted volumes=[data]
  4m            1m              2       {kubelet c910f04x19k08.pok.stglabs.ibm.com}                     Warning  FailedSync      Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "looping-narwha-mariadb-436722775-hss7g"/"default". list of unattached/unmounted volumes=[data]

root@c910f04x19k07:~/helm/bin# 
prydonius commented 8 years ago

Hey @ligc, looking at the events from the pod, it looks like the PersistentVolumeClaim (PVC) isn't being bound to a PersistentVolume (PV). You can check the state of your PVC using kubectl get pvc. If your cluster doesn't support dynamic provisioning of persistent volumes, your cluster admin will need to make some available for PVCs to bind to.

By default, these charts will request a Persistent Volume using PersistentVolumeClaims. If you want, you can disable this by overriding the persistence.enabled value:

custom.yaml

persistence:
  enabled: false
$ helm install stable/mariadb --values custom.yaml
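
(Newer Helm clients also support setting this inline with --set instead of a values file; I'm not certain the alpha.5 client supports it yet, so the values file above is the safe option:)

$ helm install stable/mariadb --set persistence.enabled=false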

Does this point you in the right direction?

ligc commented 8 years ago

Hi @prydonius, thanks for the suggestion. I was able to deploy stable/mariadb with "persistence.enabled=false".

We only have NFS-based persistent volumes available in our Kubernetes cluster. I created one NFS persistent volume, but stable/mariadb could not bind to it. From searching the internet, it seems that Kubernetes does not support dynamic volume provisioning with an NFS backend. Is there any way to configure stable/mariadb to use the NFS-based persistent volume? I read through the files in stable/mariadb/templates but did not find useful hints.

root@c910f04x19k07:~/helm/bin# kubectl get pv
NAME      CAPACITY   ACCESSMODES   STATUS      CLAIM     REASON    AGE
pv0001    20Gi       RWO           Available                       5h
root@c910f04x19k07:~/helm/bin# kubectl describe pv pv0001
Name:           pv0001
Labels:         <none>
Status:         Available
Claim:
Reclaim Policy: Retain
Access Modes:   RWO
Capacity:       20Gi
Message:
Source:
    Type:       NFS (an NFS mount that lasts the lifetime of a pod)
    Server:     10.4.19.7
    Path:       /k8spv
    ReadOnly:   false
No events.

root@c910f04x19k07:~/helm/bin# helm install stable/mariadb
Fetched stable/mariadb to mariadb-0.5.1.tgz
flailing-hydra
Last Deployed: Tue Oct 18 03:43:38 2016
Namespace: default
Status: DEPLOYED

Resources:
==> v1/PersistentVolumeClaim
NAME                     STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
flailing-hydra-mariadb   Pending                                      0s

==> v1/Secret
NAME                     TYPE      DATA      AGE
flailing-hydra-mariadb   Opaque    2         0s

==> v1/ConfigMap
NAME                     DATA      AGE
flailing-hydra-mariadb   1         0s

==> v1/Service
NAME                     CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
flailing-hydra-mariadb   20.0.0.37    <none>        3306/TCP   0s

==> extensions/Deployment
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
flailing-hydra-mariadb   1         0         0            0           0s

Notes:
MariaDB can be accessed via port 3306 on the following DNS name from within your cluster:
flailing-hydra-mariadb.default.svc.cluster.local

To connect to your database run the following command:

   kubectl run flailing-hydra-mariadb-client --rm --tty -i --image bitnami/mariadb --command -- mysql -h flailing-hydra-mariadb
root@c910f04x19k07:~/helm/bin# kubectl describe pvc
Name:           flailing-hydra-mariadb
Namespace:      default
Status:         Pending
Volume:
Labels:         <none>
Capacity:
Access Modes:
Events:
  FirstSeen     LastSeen        Count   From                            SubobjectPath   Type            Reason                  Message
  ---------     --------        -----   ----                            -------------   --------        ------                  -------
  1m            4s              7       {persistentvolume-controller }                  Warning         ProvisioningFailed      No provisioner plugin found for the claim!

root@c910f04x19k07:~/helm/bin# 
prydonius commented 8 years ago

@ligc the PVC should be able to bind to your NFS PV. I think the reason it might not be binding is that, by default, MariaDB looks for PVs with the annotation:

 volume.alpha.kubernetes.io/storage-class: "generic"

If you add this annotation to your PV, does it work? Alternatively, does it work if you install using the following values file?

persistence:
  storageClass: ""

According to the documentation, an empty storage class will look for PVs without the annotation.
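
For the first option, here's a rough sketch of what your PV could look like with the annotation added, based on the NFS details you showed earlier (illustrative rather than exact):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
  annotations:
    # the annotation the chart's PVC currently selects on (value taken from the chart default)
    volume.alpha.kubernetes.io/storage-class: "generic"
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.4.19.7
    path: /k8spv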

ligc commented 8 years ago

Hi @prydonius, thanks. The "storage-class" annotation is the right direction, but the two options you mentioned above do not seem to work.

What I figured out is that the key of the annotation should be "volume.beta.kubernetes.io/storage-class" instead of "volume.alpha.kubernetes.io/storage-class". After I changed it in stable/mariadb/templates/pvc.yaml, it works perfectly. I am not exactly sure why the alpha version is not working; http://blog.kubernetes.io/2016/10/dynamic-provisioning-and-storage-in-kubernetes.html does indicate that the alpha version is still supported with Kubernetes 1.4, and I am using Kubernetes 1.3.6.
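
For reference, the change in stable/mariadb/templates/pvc.yaml amounts to swapping the annotation key, roughly like this (an approximation, not the exact template):

metadata:
  annotations:
    # before:
    # volume.alpha.kubernetes.io/storage-class: "generic"
    # after:
    volume.beta.kubernetes.io/storage-class: "generic"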

We may want to change stable/mariadb/templates/pvc.yaml to use volume.beta.kubernetes.io/storage-class. I am seeing the same issue with almost all the charts under http://storage.googleapis.com/kubernetes-charts/stable/...

prydonius commented 8 years ago

I am not exactly sure why the alpha version is not working; http://blog.kubernetes.io/2016/10/dynamic-provisioning-and-storage-in-kubernetes.html does indicate that the alpha version is still supported with Kubernetes 1.4, and I am using Kubernetes 1.3.6.

Indeed, that's really strange, especially since you're on 1.3 and the annotation was alpha then. I think we decided to keep the alpha annotation since 1.4 is supposed to support it, but if, as you say, beta works, then we should probably update the charts.

I am confused by this though, given that the way it should work was outlined in this discussion: https://github.com/kubernetes/charts/pull/43#r76327200.

Maybe @erictune or @saad-ali can comment on this.

saad-ali commented 8 years ago

The alpha version of dynamic provisioning, triggered by the volume.alpha.kubernetes.io/storage-class annotation, completely ignores the actual storage class value. As long as the annotation is present, regardless of its value, k8s triggers a hard-coded provisioner based on the cloud you are running in: EBS for AWS, Persistent Disk for Google Cloud, Cinder for OpenStack, and vSphere Volumes on vSphere. If you are not running on one of those cloud providers, the alpha version of dynamic provisioning does nothing (your claim remains unfulfilled).

The beta version of dynamic provisioning, triggered by the volume.beta.kubernetes.io/storage-class annotation actually looks at the value of the annotation and uses it to figure out what StorageClass/provisioner to use.
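
To make that concrete, a quick sketch of the two forms (the beta class name here is just an example):

# alpha: the value is ignored; provisioning is hard-coded per cloud provider
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim-alpha
  annotations:
    volume.alpha.kubernetes.io/storage-class: "anything"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
---
# beta: the value must name an existing StorageClass (or match a pre-provisioned PV)
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim-beta
  annotations:
    volume.beta.kubernetes.io/storage-class: "fast"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi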

prydonius commented 8 years ago

Thanks for the clarification. In that case, I think we should change the charts to use the beta annotation.

@saad-ali what would a good default value for the annotation be? We're currently using "generic", but I'm not sure if that has any significance. From the documentation, it sounds like a better default value would be to leave it as an empty string.

saad-ali commented 8 years ago

The beta version of dynamic provisioning requires the value of the annotation to correspond to the StorageClass to use for provisioning. That depends on what StorageClass objects the cluster admin chooses to define for that cluster (it varies from cluster to cluster). If you put the beta annotation with an arbitrary StorageClass value on your PersistentVolumeClaim (like "generic") and there are no PersistentVolume or StorageClass objects with the same storage class name, then the PersistentVolumeClaim will remain unbound. The value must correspond to an existing StorageClass or a pre-provisioned PersistentVolume with the same storage class for the claim to be fulfilled.

A value of empty string for the beta annotation is a mechanism to disable dynamic provisioning: a PVC with a beta annotation and an empty-string value is bound only to pre-provisioned, unbound PV objects that fulfill the claim and have no storage class specified. If no such volume exists, the claim remains unbound.

If you want your PersistentVolumeClaim object to be portable across clusters, leave the dynamic provisioning annotation off altogether. This will fall back to binding to pre-provisioned volumes on clusters that don't have dynamic provisioning. For clusters where the cluster administrator chooses to enable dynamic provisioning (by creating a StorageClass and marking it as default), a PVC without an annotation will automatically result in the use of the default StorageClass to dynamically provision a volume to fulfill the PVC.

This is designed so that the decision of how a request for storage is fulfilled is in the hands of the cluster administrator (via StorageClass or manually created PersistentVolume objects), and a user scheduling work on the cluster only has to create a request for storage (via a PersistentVolumeClaim object).
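
To illustrate the last two cases (names here are placeholders):

# beta annotation with an empty value: dynamic provisioning disabled; this claim
# binds only to a pre-provisioned PV that has no storage class
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim-no-dynamic
  annotations:
    volume.beta.kubernetes.io/storage-class: ""
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
---
# no annotation at all: portable; uses the cluster's default StorageClass where the
# admin has defined one, otherwise binds to a matching pre-provisioned PV
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim-portable
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi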

prydonius commented 8 years ago

Thanks a lot for the explanations @saad-ali, this is really useful. In that case, I believe we should update the charts to conditionally set the annotation only if the .Values.persistence.storageClass option is defined.
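
Something along these lines in the PVC templates would do it (a sketch only; the real charts name their values and helper templates differently, so treat the identifiers as placeholders):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: {{ template "fullname" . }}
{{- if .Values.persistence.storageClass }}
  annotations:
    volume.beta.kubernetes.io/storage-class: {{ .Values.persistence.storageClass | quote }}
{{- end }}
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: {{ .Values.persistence.size | quote }}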

Thoughts @viglesiasce?

prydonius commented 7 years ago

More on this: see https://github.com/kubernetes/kubernetes/pull/31617#issuecomment-259550502 and http://kubernetes.io/docs/user-guide/persistent-volumes/#writing-portable-configuration. Essentially, we should conditionally set the beta annotation if a storage class is provided, and fall back to the alpha annotation if not.
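
In template terms that would look roughly like this (a sketch of the shape used in #235, not an exact diff); unlike the sketch earlier in the thread, it falls back to the alpha annotation instead of omitting the annotation entirely:

metadata:
  annotations:
  {{- if .Values.persistence.storageClass }}
    volume.beta.kubernetes.io/storage-class: {{ .Values.persistence.storageClass | quote }}
  {{- else }}
    volume.alpha.kubernetes.io/storage-class: default
  {{- end }}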

The above PR will add default storage classes to AWS, GKE, GCE, and OpenStack clusters, which will enable us to always set the beta annotation. However, the earliest this might be available is 1.6.

erictune commented 7 years ago

Let's all +1 on https://github.com/kubernetes/kubernetes/pull/31617 to let the author know his PR is important to us.

krancour commented 7 years ago

I just got bit by this. The majority of the charts in this repo that use persistent volumes don't currently work in multi-zone clusters.

Can I suggest that the easiest path forward, even if it's only an interim solution, is to include both the volume.alpha.kubernetes.io/storage-class and volume.beta.kubernetes.io/storage-class annotations anywhere dynamic provisioning is currently employed?
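
Concretely, that would mean emitting both keys on the claims, something like this (the value shown is just the current chart default):

metadata:
  annotations:
    volume.alpha.kubernetes.io/storage-class: "generic"
    volume.beta.kubernetes.io/storage-class: "generic"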

afaik, k8s 1.3 will completely ignore volume.beta.kubernetes.io/storage-class since it doesn't know what to do with it, and I have tested using both annotations at once on 1.4.6 in AWS and found the results favorable (i.e. everything works as expected).

imho, this would be a great way to restore many charts to a working state for the 1.4.x crowd, and it doesn't prevent further discussion on this matter from continuing as things evolve further.

cc @prydonius @lachie83

Edit: To cover our bases, it would be good to test this idea further before committing to it, but as noted, my initial attempt has shown it to be promising.

lachie83 commented 7 years ago

Can we get agreement on the best outcome here?

It currently seems to me that we can add both annotations to ensure that it works for 1.3 and 1.4 clusters. Is that statement correct?

saad-ali commented 7 years ago

Can we get agreement on the best outcome here?

(xref discussion in https://github.com/kubernetes/kubernetes/issues/37239)

Ultimately we want users not to worry about how their storage is fulfilled (unless they really care about it); only cluster admins should worry about that. Therefore, ideally, users should just create a PVC with no annotation.

The cluster admin should decide how the PVC should be fulfilled; to do so they can either pre-provision PersistentVolume objects manually or create a StorageClass (and mark it as default) for dynamic provisioning.

A beta dynamic provisioning annotation is really something only a user should set, because it is the mechanism for selecting a custom dynamic storage provisioner (e.g. I want an SSD instead of a spinning disk, so I will select the mycluster-ssd StorageClass). More importantly, any PVCs automatically created on a user's behalf (by charts, for example) should never assume a cluster implementation, and thus should never specify a beta dynamic provisioning annotation (unless the user explicitly asks for one).

The problem, as you've all run into, is that if a cluster admin 1) did not manually create PV objects, and 2) did not create a default StorageClass, then PVCs (without an annotation) will remain unbound. The suggestions I've seen are:

  1. Create a PVC with the alpha dynamic provisioning annotation (the 1.5 release is backwards compatible, after all).
    • This will trigger dynamic provisioning based on the cloud you are running in.
      • Problem: if you are not running on one of the hardcoded cloud providers, this will cause the PVC to remain unbound indefinitely.
  2. Create a PVC with beta dynamic provisioning annotation.
    • Problem: If the specified StorageClass does not exist on that cluster, the PVC will remain unbound indefinitely.
    • Beginning with 1.6 (PR https://github.com/kubernetes/kubernetes/pull/31617), all cloud providers will have a default StorageClass, but the same problem as the alpha exists:
      • Problem: if you are not running on one of those cloud providers, the PVC will remain unbound indefinitely.
  3. Create a PVC with both the alpha and beta dynamic provisioning annotations
    • Against 1.4+ clusters this will trigger dynamic provisioning using the specified StorageClass.
      • Problem: Same as 2 above.
    • Against 1.2-1.3 clusters this will trigger dynamic provisioning based on the cloud you are running in.
      • Problem: Same as 1 above.

The problem you are trying to solve is what happens if the cluster admin doesn't specify a way to fulfill storage. And the answer, while not great, is that the request for storage remains outstanding until the resources to fulfill it are available.

My suggestion is to specify no annotation by default, but to let users pass through a beta StorageClass if they want to (to select a class they prefer). The problem with this approach is that it won't "just work" for cloud provider users in 1.4 and 1.5; that only comes with 1.6. So if you really want a better user experience for cloud provider users, then I guess you could stick with the alpha annotation for now, and once 1.6 ships, switch to no annotation (with an optional StorageClass pass-through).

krancour commented 7 years ago

That's a very academic answer. It's well thought out, and while I appreciate many of the points laid out, the fact remains that in a multi-zone 1.4 cluster, many of the charts in this repo that currently use the alpha annotation simply do not work. Waiting for 1.6 so that this can be rectified "the right way" means those charts remain unusable on such clusters in the interim.

prydonius commented 7 years ago

Thanks for the detailed thoughts @saad-ali @krancour. It's really important to us to provide a good out-of-the-box experience, so may I suggest the following approach: set the beta annotation with the user-supplied value when persistence.storageClass is provided, and fall back to the alpha annotation when it is not, so that dynamic provisioning still works out of the box on the major cloud providers.

I think this provides a good-enough default, whilst also making it easy to use storage classes (and of course there's the option to disable persistence entirely if a cluster just cannot support it).

@mgoodness uses this approach in the Prometheus chart: https://github.com/kubernetes/charts/pull/235.

If there is agreement on this approach, then I can update all the charts to follow it.

How do we want to communicate to users that a PVC may remain unbound indefinitely, causing their application not to start? Is there something we can add to the chart notes? It would be good to think of a way to make it more obvious to users that they may need to disable persistence or specify a storage class for the chart to work on their cluster.

viglesiasce commented 7 years ago

@prydonius you beat me to it!

Yes I think that is the best thing we can do for now. Let's figure out the unbound issue in a separate thread.

krancour commented 7 years ago

@prydonius what would be wrong with keeping things as they are, but simply adding the beta annotation alongside any place the alpha annotation is currently used?

It seems to me that by doing so, the current behavior of the charts on clusters running 1.3 or earlier is unchanged, while for clusters on 1.4 or later it fixes the current problem.

prydonius commented 7 years ago

@krancour I believe this will break the "just works" case for >= 1.4, since the beta annotation requires a matching storage class to exist. Since >= 1.4 still supports the alpha annotation and dynamically provisions PVs on most cloud providers without needing a storage class, I think that is the best default for when a storage class isn't provided. Does that make sense? I should have a PR for this change to all charts soon.

krancour commented 7 years ago

Since >= 1.4 still supports the alpha annotation...

It wasn't clear to me that the beta provisioner didn't replace the alpha provisioner.

bluecmd commented 7 years ago

Another data point: I'm unable to use the alpha annotation with my Ceph RBD storage class. See https://github.com/kubernetes/kubernetes/issues/39201.

ravishivt commented 7 years ago

I'm hitting this as well with a bare metal cluster and glusterfs (NFS) StorageClass. It looks like some charts have been updated (https://github.com/kubernetes/charts/search?utf8=%E2%9C%93&q=volume.beta) and some haven't (postgresql, redis, gitlab-ce). It looks like #255 was going for a chart-wide update but it got dropped. Should individual PRs be created to update the older charts?

lachie83 commented 7 years ago

@ravishivt: @prydonius is going to do an individual audit and file an issue. Happy to field any PRs that address this issue in the meantime. #235 is the standard approach we are following.

ravishivt commented 7 years ago

The only thing I don't like about the approach in #235 is that it's not possible to leverage the default StorageClass by not providing an annotation. It's not a big deal though; I just have to make sure all my k8s users get the StorageClass name right.

For clusters where the cluster administrator chooses to enable dynamic provisioning (by creating a StorageClass and marking it as default), a PVC without an annotation will automatically result in the use of the default StorageClass to dynamically provision a volume to fulfill the PVC.
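
For what it's worth, once default storage classes are available (1.6+, per the PR referenced earlier), marking a class as the default would look roughly like this; the is-default-class annotation key is my recollection and worth double-checking against the docs for your version:

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: glusterfs
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/glusterfs
parameters:
  # placeholder heketi endpoint; replace with your own
  resturl: "http://heketi.example.com:8080"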

prydonius commented 7 years ago

@ravishivt that's a good point. I think for now we should stick with the approach in #235, and the default StorageClass use case will be covered when we update the charts for k8s 1.6.

Thanks for everyone's input on this. I've created #520 to track the PRs @AmandaCameron has contributed.

lachie83 commented 7 years ago

Resolved. Closing.