Closed masteryyh closed 10 months ago
@innobead We may need to backport the fix for this issue to the LH v1.4.x milestone. For Harvester, the planned release date for v1.2.0 is April/04. Could you please double-check whether this is possible? Thanks.
@guangbochen It's planned for 1.5.0, so it will be naturally backported to 1.4.x.
@PhanLe1010 Please help check this first to see the cause. Thanks.
It looks like the source snapshot is on a detached volume: unable to create volume: unable to create volume pvc-bf336afd-ad19-484d-bac8-60fbacedbfd6: failed to verify data source: cannot get client for volume pvc-6b2fd344-3843-48e8-9fff-bc4e5a090b3a: engine is not running. Could we verify that this problem does not occur when the source volume is attached? @masteryyh @weizhe0422
Btw, provisioning a new volume from a snapshot of a detached volume requires the enhancement https://github.com/longhorn/longhorn-manager/pull/1541. This is a big feature, so I don't think it is possible to backport it to 1.4.x. cc @innobead
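For anyone triaging a similar report, the diagnosis above can be sketched in Python: pull the two volume names out of the warning text and flag the "engine is not running" case. The helper name `parse_provision_error` and the regular expressions are illustrative assumptions based only on the message format quoted in this thread, not on Longhorn's source code.

```python
import re

# Warning text copied from this thread.
MESSAGE = (
    "unable to create volume: unable to create volume "
    "pvc-bf336afd-ad19-484d-bac8-60fbacedbfd6: failed to verify data source: "
    "cannot get client for volume pvc-6b2fd344-3843-48e8-9fff-bc4e5a090b3a: "
    "engine is not running"
)


def parse_provision_error(message):
    """Hypothetical helper: extract volume names from the warning text."""
    # The volume being provisioned from the snapshot.
    new = re.search(r"unable to create volume (pvc-[0-9a-f-]+)", message)
    # The source volume that owns the snapshot.
    src = re.search(r"cannot get client for volume (pvc-[0-9a-f-]+)", message)
    return {
        "new_volume": new.group(1) if new else None,
        "source_volume": src.group(1) if src else None,
        "source_engine_stopped": "engine is not running" in message,
    }


info = parse_provision_error(MESSAGE)
print(info["source_volume"], info["source_engine_stopped"])
```

If `source_engine_stopped` is true, the next thing to check is whether the source volume is detached.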
@mantissahz Could you help check this part first? I assume there should be no issues with the attached volume. @masteryyh will also help clarify the reproduction steps and update here.
Yes, this new behavior is only available in 1.5.0. Currently, we only need to check that the existing behavior works as expected, which means creating a volume from a snapshot of a running volume should work.
Result:
There are no issues with the attached volume: a PVC can be created normally from a VolumeSnapshot CR of the attached volume.
After scaling the deployment down to 0, the volume is detached, and creating a PVC from a VolumeSnapshot CR of the detached volume fails.
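A minimal Python sketch of a pre-check for this result, assuming the Longhorn Volume CR exposes its attachment state at `.status.state` and lives in the `longhorn-system` namespace. The helper names are hypothetical, and the rule encoded here only reflects the behavior observed in this thread for versions before 1.5.0.

```python
import subprocess


def get_volume_state(volume_name, namespace="longhorn-system"):
    """Read the volume state with kubectl (field path and namespace are assumptions)."""
    result = subprocess.run(
        [
            "kubectl", "-n", namespace, "get", "volumes.longhorn.io", volume_name,
            "-o", "jsonpath={.status.state}",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def can_restore_from_snapshot(volume_state):
    """Per the result above, restoring works only while the source volume is attached."""
    return volume_state == "attached"


# Example (needs cluster access, so it is left commented out):
# state = get_volume_state("pvc-6b2fd344-3843-48e8-9fff-bc4e5a090b3a")
# print(can_restore_from_snapshot(state))
```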
Steps:
1. Create a VolumeSnapshotClass by the manifest:

    kind: VolumeSnapshotClass
    apiVersion: snapshot.storage.k8s.io/v1
    metadata:
      name: longhorn-snapshot-vsc
    driver: driver.longhorn.io
    deletionPolicy: Delete
    parameters:
      type: snap

2. Create a VolumeSnapshot by the manifest:

    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshot
    metadata:
      name: test-csi-volume-snapshot-longhorn-snapshot
    spec:
      volumeSnapshotClassName: longhorn-snapshot-vsc
      source:
        persistentVolumeClaimName: mysql-pvc

3. Create a PVC from the VolumeSnapshot by the manifest:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: restore-from-csi-snapshot-pvc
    spec:
      storageClassName: longhorn
      dataSource:
        name: test-csi-volume-snapshot-longhorn-snapshot
        kind: VolumeSnapshot
        apiGroup: snapshot.storage.k8s.io
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 2Gi

This is as mentioned in the section csi-volume-snapshot-associated-with-longhorn-snapshot/#current-limitation.
Thanks @mantissahz .
@masteryyh If your case is the same, i.e. you can only create a volume from a snapshot of a running volume, then that is the current behavior.
Creating a volume from an inactive snapshot of a detached volume will be improved in 1.5.0.
cc @guangbochen
I'm trying to bump the Longhorn version in Harvester to v1.4.1 and repeat the reproduction steps here. The VM gets stuck in the Starting phase, and this appears in the status of the VM:
    volumeSnapshotStatuses:
    - enabled: false
      name: disk-1
      reason: 2 matching VolumeSnapshotClasses for longhorn-image-4m88v
    - enabled: false
      name: cloudinitdisk
      reason: Snapshot is not supported for this volumeSource type [cloudinitdisk]
The volume provisions successfully, though.
UPDATE: I updated Longhorn to v1.4.1 and snapshot-controller to v6.2.1, and the problem still exists :(
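One plausible reading of "2 matching VolumeSnapshotClasses" is that two VolumeSnapshotClass objects share the CSI driver used by the storage class. That reading is an assumption and is not confirmed in this thread. Under that assumption, a minimal Python sketch for spotting the duplicates looks like this. The function name and the second class in the sample data are hypothetical; the dicts mirror the VolumeSnapshotClass manifest layout, for example as loaded from `kubectl get volumesnapshotclass -o json`.

```python
def matching_snapshot_classes(snapshot_classes, provisioner):
    """Return the names of VolumeSnapshotClasses whose driver equals the provisioner."""
    return [
        item["metadata"]["name"]
        for item in snapshot_classes
        if item.get("driver") == provisioner
    ]


# Sample data: the first entry comes from the manifest in this thread,
# the other two are made up for illustration.
classes = [
    {"metadata": {"name": "longhorn-snapshot-vsc"}, "driver": "driver.longhorn.io"},
    {"metadata": {"name": "hypothetical-second-vsc"}, "driver": "driver.longhorn.io"},
    {"metadata": {"name": "hypothetical-other-vsc"}, "driver": "other.example.io"},
]

matches = matching_snapshot_classes(classes, "driver.longhorn.io")
if len(matches) > 1:
    print(f"{len(matches)} classes share the driver: {matches}")
```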
In regard to this bug, does this mean Harvester snapshot restore is unusable as of v1.1.2? In my case, snapshot restoration simply does not work.
@hunghvu Your case looks different because the error is due to "more than one engine exists", which means the source volume could be a migrating volume. I suggest creating an issue in the Harvester GitHub repo instead to clarify the cause there.
@masteryyh Can you help answer the questions from @PhanLe1010 ? thanks.
If I understand this correctly, the snapshot is on an attached volume 🤔 When I tested this before, the volume was already attached.
I can't reproduce the issue in Harvester v1.2.1 with LH v1.4.3 with the following steps:
@FrankYang0529
Create another PVC from the snapshot and update the VM to the new PVC. (We cannot use a pending PVC in GUI, so I update VM Yaml directly.)
How did you update the VM? Did you keep both old PVC and new PVC?
Yes, I used kubectl edit to update the VM and kept both PVCs.
I discussed this with @FrankYang0529, and we both agree that the original reproduction steps were a bit unusual.
Reproduce steps:
1. Install a Harvester 1.1.0-rc3 environment, with the snapshot-controller image replaced with k8s.gcr.io/sig-storage/snapshot-controller:v5.0.1 and the CRDs edited according to here;
2. Create a VM with a volume of 10Gi or another size;
3. Create a snapshot of the volume;
4. Use the snapshot to create a volume;
5. While the new volume is not yet provisioned by Longhorn, replace the VM's volume with the restored volume;
6. Try to boot the VM; it should get stuck in the Starting step;
7. SSH into one of the nodes; kubectl describe pvc should show the warning message from Longhorn.
(In step 5, pvc-6b2fd344-3843-48e8-9fff-bc4e5a090b3a was detached.)
cc @innobead I think we can close the issue for now since @FrankYang0529 has tested it and it works as expected now.
@ChanYiLin Let's add the wontfix label as well.
Describe the bug
In Harvester, use a snapshot to create a volume, and while the volume is in the Pending state, attach it to a VM immediately. Start the VM and it gets stuck in the Starting state. SSH into one of the nodes and run kubectl describe pvc <volume-name>, which shows a warning message from Longhorn.
To Reproduce
Reproduce steps here
Expected behavior
The volume should be attached to the VM after it is provisioned by Longhorn, and the VM should boot without problems.
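Until that behavior is available, one way to avoid the race in the reproduce steps is to wait for the restored PVC to leave the Pending phase before swapping it into the VM. The following is a minimal Python sketch of that idea, not a fix in Harvester or Longhorn. The names `wait_until_bound` and `pvc_phase` and the default timeout values are hypothetical.

```python
import subprocess
import time


def pvc_phase(name, namespace="default"):
    """Read the PVC phase (for example Pending or Bound) with kubectl."""
    result = subprocess.run(
        ["kubectl", "-n", namespace, "get", "pvc", name,
         "-o", "jsonpath={.status.phase}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def wait_until_bound(get_phase, timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll get_phase() until it returns "Bound"; return False once timeout is reached."""
    waited = 0.0
    while True:
        if get_phase() == "Bound":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


# Example (needs cluster access, so it is left commented out):
# if wait_until_bound(lambda: pvc_phase("restore-from-csi-snapshot-pvc")):
#     print("PVC is Bound; it should now be safe to attach it to the VM")
```

Passing the phase lookup in as a callable keeps the polling loop independent of kubectl, so it can be exercised without a cluster.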
Log or Support bundle
longhorn-support-bundle_902bc133-4666-44c4-8e51-093f4093bfdf_2022-10-27T01-38-29Z.zip
Environment