fabric8io / fabric8

fabric8 is an open source microservices platform based on Docker, Kubernetes and Jenkins
http://fabric8.io/

Setting PVC attributes in POM - default volume size #5906

Open tdcox opened 8 years ago

tdcox commented 8 years ago

Setting the default volume size for a PV claim in the POM like so:

<fabric8.defaultPersistentVolumeClaimRequestsStorage>
    50M
</fabric8.defaultPersistentVolumeClaimRequestsStorage>

Results in a JSON entry like:

    "apiVersion" : "v1",
    "kind" : "PersistentVolumeClaim",
    "metadata" : {
      "annotations" : { },
      "labels" : { },
      "name" : "www"
    },
    "spec" : {
      "accessModes" : [ "ReadWriteMany" ],
      "resources" : {
        "limits" : { },
        "requests" : {
          "storage" : "50M"
        }
      },
      "volumeName" : "www"
    }

However the deployment fails with:

"Unable to mount volumes for pod "nginx-test-vol-75afg_default(d5b09186-fa6d-11e5-91bd-323534663334)": unsupported volume type"

There exists a matching PV like so:

NAME      CAPACITY   ACCESSMODES   STATUS     CLAIM         REASON    AGE
www       50M        RWX           Released   default/www             14m
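For reference, the listing above is just the standard client output, along the lines of:

# volumes and the claims they are bound to
oc get pv
# claims in the current namespace
oc get pvc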

Full JSON here: https://gist.github.com/tdcox/3fd826045c95513bdb905797026b35d9

tdcox commented 8 years ago

Bah, pasting XML doesn't render properly. Above should read:

Setting the default volume size for a PV claim in the POM like so:

<fabric8.defaultPersistentVolumeClaimRequestsStorage>50M</fabric8.defaultPersistentVolumeClaimRequestsStorage>

tdcox commented 8 years ago

OK, I think this is a combination of another bug and a poor error message. Setting defaultPersistentVolumeClaimRequestsStorage does appear to work if the value is less than the capacity of the PV. If the value is larger, the deployment fails with 'unsupported volume type'; a failure could be expected in that case, but the message is misleading.

What has been throwing me is that once you have deployed a pod against a PV, it will fail to redeploy subsequently, also with the same unhelpful error 'unsupported volume type'. To get it to redeploy, you seem to have to delete the Pod, the PV and the PVC manually, then recreate the PV (it's NFS, so the data is still there). At this point, you can redeploy again.
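In practice the workaround boils down to something like the following sketch (names taken from this example; pv.yml is the NFS PV definition shown further down the thread):

# remove the pod (or scale the RC down), then the claim and the volume
oc delete pod nginx-test-vol-75afg
oc delete pvc www
oc delete pv www

# recreate the PV (the NFS share keeps the data), then redeploy the app
oc create -f pv.yml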

So it looks like CI/CD is currently not possible where storage is involved, unless I'm missing something.

rhuss commented 8 years ago

Indeed, the error message is really not very helpful, but it comes directly from OpenShift. I wonder whether we should open an issue there.

BTW, instead of setting the default claim size, in your example you could have set

<fabric8.volume.www.requestStorage>50M</fabric8.volume.www.requestStorage>

only for your specific volume (which is named www in your example).
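For context, that property sits in the POM's <properties> section alongside the default one, roughly like this (reusing the www volume name from this example):

<properties>
  <fabric8.volume.www.requestStorage>50M</fabric8.volume.www.requestStorage>
</properties>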

tdcox commented 8 years ago

Thanks. Full POM is here if you wish to reproduce - it's just a Java archetype, tweaked.

https://gist.github.com/tdcox/de9caa18436d833c7b5b11923b29b3f0

tdcox commented 8 years ago

I can confirm that the same symptoms occur when using 'requestStorage'.

tdcox commented 8 years ago

Looks like this is also flooding the system log with the following every few seconds:

Apr  4 17:33:23 vagrant openshift: E0404 17:33:23.589436     952 persistent_claim.go:74] The volume is not yet bound to the claim. Expected to find the bind on volume.Spec.ClaimRef: &{TypeMeta:{Kind: APIVersion:} ObjectMeta:{Name:www GenerateName: Namespace: SelfLink:/api/v1/persistentvolumes/www UID:14ac6902-fa8a-11e5-91bd-323534663334 ResourceVersion:109875 Generation:0 CreationTimestamp:2016-04-04 17:24:26 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[]} Spec:{Capacity:map[storage:{Amount:1000000000.000 Format:DecimalSI}] PersistentVolumeSource:{GCEPersistentDisk:<nil> AWSElasticBlockStore:<nil> HostPath:<nil> Glusterfs:<nil> NFS:0xc22cc8bec0 RBD:<nil> ISCSI:<nil> FlexVolume:<nil> Cinder:<nil> CephFS:<nil> FC:<nil> Flocker:<nil> AzureFile:<nil>} AccessModes:[ReadWriteMany] ClaimRef:<nil> PersistentVolumeReclaimPolicy:Retain} Status:{Phase:Available Message: Reason:}}
Apr  4 17:33:23 vagrant openshift: E0404 17:33:23.590163     952 kubelet.go:1716] Unable to mount volumes for pod "nginx-test-vol-me5z9_default(3b795bdd-fa8a-11e5-91bd-323534663334)": unsupported volume type; skipping pod
Apr  4 17:33:23 vagrant openshift: E0404 17:33:23.590193     952 pod_workers.go:138] Error syncing pod 3b795bdd-fa8a-11e5-91bd-323534663334, skipping: unsupported volume type
rhuss commented 8 years ago

Just tried to reproduce the issue with the stripped-down YAML below; however, I failed.

@tdcox what type is your PV? (hostPath?) How did you create the PV?

PV

apiVersion: v1
kind: PersistentVolume
metadata:
  name: www
spec:
  capacity:
    storage: 50M
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  hostPath:
    path: /tmp

Deployment

---
apiVersion: "v1"
kind: "List"
items:
  - apiVersion: "v1"
    kind: "PersistentVolumeClaim"
    metadata: 
      name: "www"
    spec: 
      accessModes: 
        - "ReadWriteMany"
      resources: 
        requests: 
          storage: "50M"
      volumeName: "www"
  - apiVersion: "v1"
    kind: "ReplicationController"
    metadata: 
      name: "nginx-test-vol"
    spec: 
      replicas: 1
      selector: 
        project: "nginx-test-vol"
        group: "experiments"
      template: 
        metadata: 
          labels: 
            project: "nginx-test-vol"
            group: "experiments"
        spec: 
          containers: 
            - image: "nginx"
              name: "nginx-test-vol"
              volumeMounts: 
                - mountPath: "/usr/share/nginx/html"
                  name: "www"
                  readOnly: false
          volumes: 
            - name: "www"
              persistentVolumeClaim: 
                claimName: "www"
                readOnly: false
rhuss commented 8 years ago

However, when I increase the claim to 100M I get the type error:

oc describe pod nginx-test-vol-8d7yr
Name:       nginx-test-vol-8d7yr
Namespace:  default
Node:       172.28.128.4/172.28.128.4
Start Time: Mon, 04 Apr 2016 20:40:04 +0200
Labels:     group=experiments,project=nginx-test-vol
Status:     Pending
IP:
Controllers:    ReplicationController/nginx-test-vol
Containers:
  nginx-test-vol:
    Container ID:
    Image:      nginx
    Image ID:
    Port:
    QoS Tier:
      cpu:      BestEffort
      memory:       BestEffort
    State:      Waiting
      Reason:       ContainerCreating
    Ready:      False
    Restart Count:  0
    Environment Variables:
Conditions:
  Type      Status
  Ready     False
Volumes:
  www:
    Type:   PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  www
    ReadOnly:   false
  default-token-1bb89:
    Type:   Secret (a volume populated by a Secret)
    SecretName: default-token-1bb89
Events:
  FirstSeen LastSeen    Count   From            SubobjectPath   Type        Reason      Message
  --------- --------    -----   ----            -------------   --------    ------      -------
  23s       23s     1   {default-scheduler }            Normal      Scheduled   Successfully assigned nginx-test-vol-8d7yr to 172.28.128.4
  23s       2s      3   {kubelet 172.28.128.4}          Warning     FailedMount Unable to mount volumes for pod "nginx-test-vol-8d7yr_default(a5482050-fa94-11e5-ba36-080027b5c2f4)": unsupported volume type
  23s       2s      3   {kubelet 172.28.128.4}          Warning     FailedSync  Error syncing pod, skipping: unsupported volume type
rhuss commented 8 years ago

After some research, there are several issues open related to this bogus error message:

The periodic log file entries are due to periodic retries to fulfill the PVC, so I consider this to be normal behaviour.

If you don't mind, I would like to close this issue; I recommend tracking the issues above for a fix, which will eventually be included in OpenShift, too.

tdcox commented 8 years ago

PV is created using something like this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: www
spec:
  capacity:
    storage: 1G
  accessModes:
  - ReadWriteMany
  nfs:
    path: /share/pv
    server: 192.168.50.2
  persistentVolumeReclaimPolicy: Retain

then executing with oc create -f pv.yml.

I'm using NFS in these tests.

tdcox commented 8 years ago

I concur that the issues above are sufficient to cover the problem of obfuscated error messages; however, I don't see anything in them relating to the fact that we are seemingly unable to re-use a PV marked as 'retain' when we redeploy a fabric8 app. That's somewhat of a showstopper for fabric8-devops, so I would hope that there is some ongoing issue tracking this problem?

rhuss commented 8 years ago

Ah sorry, I lost track of the real issue. Will continue tomorrow ...

sorry ...

tdcox commented 8 years ago

No problem. Looking at the code, it appears that if you don't mark the claim as Read Only, it will be created as ReadWriteMany. This seems to mean that you can't use ReadWriteOnce volumes from the POM, but more interestingly, implies that the PV should support multiple containers binding to it. If that were the case, a new instance of a pod should bind straight to the volume, regardless of any existing binding?

Reading through the OpenShift docs again, it seems that 'retain' is described as 'manual recycling'. I can't see any description of a use case where a pod is undergoing a rolling upgrade and passing a PV on to a subsequent instance. I guess you could make arguments both for and against that scenario, so we should probably establish if this is a bug, or a prohibited activity to start with.

rhuss commented 8 years ago

@tdcox your observation is correct: if the PVC is not marked as ReadOnly, it's configured as ReadWriteMany.

tbh, I don't even know all the possible modes for a PVC, but I will read up on this and come back :)

tdcox commented 8 years ago

As a test, I've just created a second project that attempts to connect to the same PV in parallel. As expected, this fails with the same error.

tdcox commented 8 years ago

Ah, but I do get a different log entry with a file name and line number:

kubelet.go:1910] volume "fb8cd148-fa8f-11e5-91bd-323534663334/www", still has a container running "fb8cd148-fa8f-11e5-91bd-323534663334", skipping teardown

rhuss commented 8 years ago

@tdcox just learned a bit about how PVs and especially PVCs work.

The current plugin design has the issue that it allows you to configure how a PVC is mounted into a Pod (readOnly or not, via PersistentVolumeClaimVolumeSource) and then infers from that how the PVC is created. However, there is no configuration for what the accessModes should look like. The plugin maps the readOnly mount option to the ReadOnly and ReadWriteMany access modes of the PVC.

However according to the source the following modes are available:

type PersistentVolumeAccessMode string

const (
    // can be mounted read/write mode to exactly 1 host
    ReadWriteOnce PersistentVolumeAccessMode = "ReadWriteOnce"
    // can be mounted in read-only mode to many hosts
    ReadOnlyMany PersistentVolumeAccessMode = "ReadOnlyMany"
    // can be mounted in read/write mode to many hosts
    ReadWriteMany PersistentVolumeAccessMode = "ReadWriteMany"
)

So it seems that ReadOnly doesn't even exist (should probably be ReadOnlyMany).

For non-readOnly PVCs I wonder whether the plugin should really default to ReadWriteMany, or rather to ReadWriteOnce.

Also, there is no way to specify the reclaim policy (which is Retain by default but could also be Recycle or Delete), and like @tdcox I wonder whether the default is the proper choice.

In the end, I question whether managing PVCs should really be the task of f8-m-p, or whether it is better done by external tooling (gofabric8 for DevOps). (Of course, dealing with PersistentVolumeClaimVolumeSources for Volumes attached to Pods remains part of f8-m-p's business.)

@jdyson @jstrachan @rawlingsj wdyt ?

tdcox commented 8 years ago

I believe that not all possible storage providers will be capable of fulfilling all Access Modes, so it may be necessary to specify the mode directly.

Note also that 'Many' connections only apply to containers in the same namespace.
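For illustration, if the access mode could be chosen explicitly, the claim for this example would only need a different accessModes entry; a plain-Kubernetes sketch (outside the plugin, reusing the www names from above):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: www
spec:
  accessModes:
    - ReadWriteOnce   # instead of the ReadWriteMany the plugin currently generates
  resources:
    requests:
      storage: 50M
  volumeName: www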

I think the important questions we need answers to are:

There is a continuum of potential answers that ranges from trying to solve all problems in the data management domain, to declaring that fabric8 is for stateless functions only. It would be helpful to understand where the current thinking has reached on this.

@jdyson @jstrachan @rawlingsj