kubernetes-csi / external-resizer

Sidecar container that watches Kubernetes PersistentVolumeClaim objects and triggers a controller-side expansion operation against a CSI endpoint
Apache License 2.0

Cannot mount resized CephFS-PVC #238

Closed: ibotty closed this issue 1 year ago

ibotty commented 1 year ago

I cannot start a new pod that uses the same PVC as an already-running pod. The new pod's events include the following:

  Warning  VolumeResizeFailed  2m2s   kubelet            NodeExpandVolume.NodeExpandVolume failed for volume "pvc-ac2174e8-8e9a-4c31-832e-a845d7cd3280" : Expander.NodeExpand found CSI plugin kubernetes.io/csi/rook-ceph.cephfs.csi.ceph.com to not support node expansion

The PVC had been resized before, after running into quota issues. The new quota is reflected in the running container, and the csi-cephfsplugin-provisioner's csi-resizer signals success:

I1109 11:33:33.455618       1 main.go:93] Version : v1.6.0
I1109 11:33:34.457523       1 common.go:111] Probing CSI driver for readiness
I1109 11:33:34.458898       1 leaderelection.go:248] attempting to acquire leader lease rook-ceph/external-resizer-rook-ceph-cephfs-csi-ceph-com...
I1109 11:36:01.428423       1 leaderelection.go:258] successfully acquired lease rook-ceph/external-resizer-rook-ceph-cephfs-csi-ceph-com
I1109 11:36:01.428504       1 controller.go:255] Starting external resizer rook-ceph.cephfs.csi.ceph.com
E1118 07:51:12.251187       1 leaderelection.go:330] error retrieving resource lock rook-ceph/external-resizer-rook-ceph-cephfs-csi-ceph-com: Get "https://172.30.0.1:443/apis/coordination.k8s.io/v1/namespaces/rook-ceph/leases/external-resizer-rook-ceph-cephfs-csi-ceph-com": http2: client connection lost
W1118 07:51:12.295876       1 reflector.go:347] k8s.io/client-go/informers/factory.go:134: watch of *v1.PersistentVolume ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W1118 07:51:12.295876       1 reflector.go:347] k8s.io/client-go/informers/factory.go:134: watch of *v1.PersistentVolumeClaim ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
E1118 07:51:41.073989       1 leaderelection.go:367] Failed to update lock: rpc error: code = Unavailable desc = keepalive ping failed to receive ACK within timeout
I1123 10:36:53.109592       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"mail", Name:"dovecot-mail-storage", UID:"ac2174e8-8e9a-4c31-832e-a845d7cd3280", APIVersion:"v1", ResourceVersion:"990884444", FieldPath:""}): type: 'Normal' reason: 'Resizing' External resizer is resizing volume pvc-ac2174e8-8e9a-4c31-832e-a845d7cd3280
I1123 10:36:53.293116       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"mail", Name:"dovecot-mail-storage", UID:"ac2174e8-8e9a-4c31-832e-a845d7cd3280", APIVersion:"v1", ResourceVersion:"990884444", FieldPath:""}): type: 'Normal' reason: 'VolumeResizeSuccessful' Resize volume succeeded
I1123 10:37:09.030583       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"mail", Name:"dovecot-mail-storage", UID:"ac2174e8-8e9a-4c31-832e-a845d7cd3280", APIVersion:"v1", ResourceVersion:"990884649", FieldPath:""}): type: 'Normal' reason: 'Resizing' External resizer is resizing volume pvc-ac2174e8-8e9a-4c31-832e-a845d7cd3280
I1123 10:37:09.056571       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"mail", Name:"dovecot-mail-storage", UID:"ac2174e8-8e9a-4c31-832e-a845d7cd3280", APIVersion:"v1", ResourceVersion:"990884649", FieldPath:""}): type: 'Normal' reason: 'VolumeResizeSuccessful' Resize volume succeeded
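
(For context: the external-resizer only drives the controller-side CSI RPC; whether the kubelet must additionally call NodeExpandVolume is signalled by the NodeExpansionRequired flag in the response. Below is a minimal, illustrative sketch of that RPC, not code from the resizer itself; the socket path is a hypothetical placeholder, while the volume handle and size are taken from the PV further down.)

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/container-storage-interface/spec/lib/go/csi"
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
)

func main() {
    // Hypothetical socket path for the provisioner's CSI endpoint.
    conn, err := grpc.Dial("unix:///csi/csi-provisioner.sock",
        grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // The controller-side expansion that the resizer logs above report as
    // successful. (A real call against ceph-csi would also need the
    // controllerExpandSecretRef secrets in the request.)
    resp, err := csi.NewControllerClient(conn).ControllerExpandVolume(context.Background(),
        &csi.ControllerExpandVolumeRequest{
            VolumeId:      "0001-0009-rook-ceph-0000000000000001-a1eac437-f79e-11ec-b370-0a58ac150424",
            CapacityRange: &csi.CapacityRange{RequiredBytes: 1 << 40}, // 1Ti
        })
    if err != nil {
        log.Fatal(err)
    }
    // For a quota-based filesystem like CephFS one would expect this to be
    // false, i.e. no NodeExpandVolume needed on the kubelet side.
    fmt.Println("NodeExpansionRequired:", resp.GetNodeExpansionRequired())
}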

The csi-cephfsplugin container of the csi-cephfsplugin pod on the node does not log anything, unfortunately.
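
Since the node plugin logs nothing, one way to check what the kubelet error is complaining about is to ask the node plugin directly for its advertised capabilities. A minimal diagnostic sketch, assuming the plugin's socket sits at the usual kubelet plugin path (an assumption; adjust for your deployment):

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/container-storage-interface/spec/lib/go/csi"
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
)

func main() {
    // Assumed socket path for the cephfs node plugin on this node.
    conn, err := grpc.Dial("unix:///var/lib/kubelet/plugins/rook-ceph.cephfs.csi.ceph.com/csi.sock",
        grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    resp, err := csi.NewNodeClient(conn).NodeGetCapabilities(context.Background(),
        &csi.NodeGetCapabilitiesRequest{})
    if err != nil {
        log.Fatal(err)
    }
    for _, c := range resp.GetCapabilities() {
        if c.GetRpc().GetType() == csi.NodeServiceCapability_RPC_EXPAND_VOLUME {
            fmt.Println("node plugin advertises EXPAND_VOLUME")
            return
        }
    }
    // Matching the kubelet error: the plugin does not support node expansion.
    fmt.Println("node plugin does NOT advertise EXPAND_VOLUME")
}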

The PVC is the following:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"dovecot-mail-storage","namespace":"mail"},"spec":{"accessModes":["ReadWriteMany"],"resources":{"requests":{"storage":"1Ti"}},"storageClassName":"rook-cephfs"}}
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: rook-ceph.cephfs.csi.ceph.com
    volume.kubernetes.io/storage-provisioner: rook-ceph.cephfs.csi.ceph.com
  creationTimestamp: "2022-06-29T11:28:43Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: dovecot-mail-storage
  namespace: mail
  resourceVersion: "990884653"
  uid: ac2174e8-8e9a-4c31-832e-a845d7cd3280
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Ti
  storageClassName: rook-cephfs
  volumeMode: Filesystem
  volumeName: pvc-ac2174e8-8e9a-4c31-832e-a845d7cd3280
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 1Ti
  phase: Bound
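
Note that spec and status already agree at 1Ti, so at the API level no resize looks pending. A small client-go sketch (diagnostic only, not from this issue) to dump the PVC's sizes and resize-related status conditions; a lingering FileSystemResizePending condition is one signal that can make the kubelet attempt NodeExpandVolume on the next mount:

package main

import (
    "context"
    "fmt"
    "log"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        log.Fatal(err)
    }
    cs, err := kubernetes.NewForConfig(cfg)
    if err != nil {
        log.Fatal(err)
    }
    pvc, err := cs.CoreV1().PersistentVolumeClaims("mail").Get(context.Background(),
        "dovecot-mail-storage", metav1.GetOptions{})
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("spec:  ", pvc.Spec.Resources.Requests.Storage().String())
    fmt.Println("status:", pvc.Status.Capacity.Storage().String())
    // An empty list here means no resize is pending from the API's view.
    for _, c := range pvc.Status.Conditions {
        fmt.Printf("condition %s=%s (%s)\n", c.Type, c.Status, c.Message)
    }
}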

The corresponding PV is the following:

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rook-ceph.cephfs.csi.ceph.com
  creationTimestamp: "2022-06-29T11:28:43Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-ac2174e8-8e9a-4c31-832e-a845d7cd3280
  resourceVersion: "990884652"
  uid: e768e872-a6ce-4d78-bf50-9c4c6f7f6185
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 1Ti
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: dovecot-mail-storage
    namespace: mail
    resourceVersion: "741563720"
    uid: ac2174e8-8e9a-4c31-832e-a845d7cd3280
  csi:
    controllerExpandSecretRef:
      name: rook-csi-cephfs-provisioner
      namespace: rook-ceph
    driver: rook-ceph.cephfs.csi.ceph.com
    nodeStageSecretRef:
      name: rook-csi-cephfs-node
      namespace: rook-ceph
    volumeAttributes:
      clusterID: rook-ceph
      fsName: cephfs
      pool: cephfs-data0
      storage.kubernetes.io/csiProvisionerIdentity: 1655894034977-8081-rook-ceph.cephfs.csi.ceph.com
      subvolumeName: csi-vol-a1eac437-f79e-11ec-b370-0a58ac150424
      subvolumePath: /volumes/csi/csi-vol-a1eac437-f79e-11ec-b370-0a58ac150424/ea5d4907-8a5f-45f3-87dd-5fe71c4308a7
    volumeHandle: 0001-0009-rook-ceph-0000000000000001-a1eac437-f79e-11ec-b370-0a58ac150424
  persistentVolumeReclaimPolicy: Delete
  storageClassName: rook-cephfs
  volumeMode: Filesystem
status:
  phase: Bound
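
As far as I understand, the kubelet decides that a file-system expansion is still outstanding essentially by comparing the PV's spec capacity against the PVC's status capacity, and both read 1Ti here. A toy illustration of that comparison using apimachinery's resource.Quantity:

package main

import (
    "fmt"

    "k8s.io/apimachinery/pkg/api/resource"
)

func main() {
    pvCapacity := resource.MustParse("1Ti") // pv.spec.capacity.storage
    pvcStatus := resource.MustParse("1Ti")  // pvc.status.capacity.storage

    // Cmp returns 1 if pvCapacity > pvcStatus, i.e. a file-system
    // expansion would still be considered outstanding on the node.
    if pvCapacity.Cmp(pvcStatus) > 0 {
        fmt.Println("expansion still pending on the node")
    } else {
        fmt.Println("no pending expansion; sizes match")
    }
}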

Environment:

Madhu-1 commented 1 year ago

@gnufied, can you please help here?

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 1 year ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-csi/external-resizer/issues/238#issuecomment-1519034377):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
>
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
>
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.