openebs / mayastor

Dynamically provision stateful, persistent, replicated, cluster-wide fabric volumes & filesystems for Kubernetes, provisioned from an optimized NVMe/SPDK backend data storage stack.
Apache License 2.0

DiskPool not extended after extending block device #1631

Open todeb opened 6 months ago

todeb commented 6 months ago

Describe the bug
After resizing the block device, the DiskPool that uses that block device still reports the old size.

To Reproduce

gdisk /dev/sdx
...
partprobe -s
cryptsetup resize sdx1 -v  (if you have it encrypted)
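The steps above can be rehearsed safely on a loop-file image instead of a real disk. This is only an illustrative sketch: `disk.img` stands in for the real device (e.g. /dev/sdx), the partition layout is hypothetical, and `sgdisk` (the scriptable counterpart of the interactive `gdisk` used above) is assumed to be installed.

```shell
# Sketch of a device-extend on a throwaway image file (no root needed).
truncate -s 6G disk.img        # stand-in for the 6GB block device
sgdisk -n 1:0:0 disk.img       # partition 1 spans the whole "disk"
truncate -s 7G disk.img        # grow the device to 7GB
sgdisk -e disk.img             # move the backup GPT header to the new end
sgdisk -d 1 -n 1:0:0 disk.img  # recreate partition 1 to use the new space

# On a real device you would then re-read the table and, if the
# partition is encrypted, grow the dm-crypt mapping:
#   partprobe -s /dev/sdx
#   cryptsetup resize <mapped-name> -v
```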

Check whether Mayastor sees the resized device:

./kubectl-mayastor get block-devices node --all

Check diskpool:

kubectl get diskpool -n mayastor
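To see the mismatch side by side, the plugin's view of the device can be compared against what the DiskPool CR reports. A hedged sketch, assuming `<node>` is your node name and that the DiskPool CRD exposes `status.capacity`/`status.used` as in recent releases (adjust the field paths if your version differs):

```shell
# Size as seen by the io-engine on the node
kubectl mayastor get block-devices <node> --all

# Size as recorded in the DiskPool custom resource
kubectl get diskpool -n mayastor \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.capacity}{"\t"}{.status.used}{"\n"}{end}'
```

If the first command shows the new size but the jsonpath output still shows the old capacity, the pool was simply never grown to cover the extended device, which matches the report above.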

Expected behavior
The DiskPool should be refreshed with the new size.



Additional context
On my setup I resized the block device from 6GB to 7GB. The 7GB size is correctly reported by kubectl-mayastor, but the DiskPool that was deployed prior to resizing still reports 6GB.

todeb commented 6 months ago

/bug ?

tiagolobocastro commented 6 months ago

I thought I replied already, strange... Currently we don't support extending the DiskPool, but this is part of the roadmap.

todeb commented 6 months ago

Is there any tracking issue for that? On the roadmap this is marked as Pri 1 / Rel: Q1 2024.

tiagolobocastro commented 4 months ago

We'll create a tracking issue for this; I'll add a label here in the meantime.

teamosceola commented 3 weeks ago

What's the status of this feature? I've got a problem where I can't create a CSI snapshot of a volume due to insufficient space in the pool the volume is in. I've tried adding new pools to make additional space available, but that does not seem to help when taking a snapshot.

Just FYI, I'm using Velero to make backups of namespaces, and it's Velero that is trying to take the snapshot and failing. Here is the error message it's getting.

Errors:
  Velero:    message: /Timed out awaiting reconciliation of VolumeSnapshot, VolumeSnapshotContent snapcontent-0c6999c4-c867-4bea-b65d-299b833d5c2b has error: Failed to check and update snapshot content: failed to take snapshot of the volume 58302678-0fee-4623-b468-a664657b9814: "rpc error: code = ResourceExhausted desc = error in response: status code '507 Insufficient Storage', content: 'RestJsonError { details: \"Not enough free space in the pool\", message: \"SvcError :: NotEnoughResources: Operation failed due to insufficient resources\", kind: ResourceExhausted }'"
             message: /Fail to wait VolumeSnapshot turned to ReadyToUse: CSI got timed out with error: Failed to check and update snapshot content: failed to take snapshot of the volume 58302678-0fee-4623-b468-a664657b9814: "rpc error: code = ResourceExhausted desc = error in response: status code '507 Insufficient Storage', content: 'RestJsonError { details: \"Not enough free space in the pool\", message: \"SvcError :: NotEnoughResources: Operation failed due to insufficient resources\", kind: ResourceExhausted }'"
             name: /tsp-dev-awx-postgres-13-0 message: /Error backing up item error: /error executing custom action (groupResource=persistentvolumeclaims, namespace=argocd-tsp-awx, name=postgres-13-tsp-dev-awx-postgres-13-0): rpc error: code = Unknown desc = CSI got timed out with error: Failed to check and update snapshot content: failed to take snapshot of the volume 58302678-0fee-4623-b468-a664657b9814: "rpc error: code = ResourceExhausted desc = error in response: status code '507 Insufficient Storage', content: 'RestJsonError { details: \"Not enough free space in the pool\", message: \"SvcError :: NotEnoughResources: Operation failed due to insufficient resources\", kind: ResourceExhausted }'"

Also, it should be noted that because of this I now can't take a backup of my volumes in order to migrate to a new cluster with larger pools.