AllenXu93 opened this issue 1 month ago
To support dynamic PVC provisioning and migration for non-shared PVCs in KubeVirt, enabling live migration for VMs that use localPV, you can follow these steps:
Create a DataVolume (DV):
createDV := func() *cdiv1.DataVolume {
    sc, exist := libstorage.GetRWOFileSystemStorageClass()
    Expect(exist).To(BeTrue())
    dv := libdv.NewDataVolume(
        libdv.WithRegistryURLSource(cd.DataVolumeImportUrlForContainerDisk(cd.ContainerDiskCirros)),
        libdv.WithPVC(libdv.PVCWithStorageClass(sc),
            libdv.PVCWithVolumeSize(size),
        ),
    )
    _, err := virtClient.CdiClient().CdiV1beta1().DataVolumes(ns).Create(context.Background(),
        dv, metav1.CreateOptions{})
    Expect(err).ToNot(HaveOccurred())
    return dv
}
Create a VM with the DataVolume:
createVMWithDV := func(dv *cdiv1.DataVolume, volName string) *virtv1.VirtualMachine {
    vmi := libvmi.New(
        libvmi.WithNamespace(ns),
        libvmi.WithInterface(libvmi.InterfaceDeviceWithMasqueradeBinding()),
        libvmi.WithNetwork(virtv1.DefaultPodNetwork()),
        libvmi.WithResourceMemory("128Mi"),
        libvmi.WithDataVolume(volName, dv.Name),
        libvmi.WithCloudInitNoCloud(libvmifact.WithDummyCloudForFastBoot()),
    )
    vm := libvmi.NewVirtualMachine(vmi,
        libvmi.WithRunning(),
        libvmi.WithDataVolumeTemplate(dv),
    )
    vm, err := virtClient.VirtualMachine(ns).Create(context.Background(), vm, metav1.CreateOptions{})
    Expect(err).ToNot(HaveOccurred())
    Eventually(matcher.ThisVM(vm), 360*time.Second, 1*time.Second).Should(matcher.BeReady())
    libwait.WaitForSuccessfulVMIStart(vmi)
    return vm
}
Update the VM with a new PVC:
updateVMWithPVC := func(vm *virtv1.VirtualMachine, volName, claim string) {
    i := slices.IndexFunc(vm.Spec.Template.Spec.Volumes, func(volume virtv1.Volume) bool {
        return volume.Name == volName
    })
    Expect(i).To(BeNumerically(">", -1))
    By(fmt.Sprintf("Replacing volume %s with PVC %s", volName, claim))
    updatedVolume := virtv1.Volume{
        Name: volName,
        VolumeSource: virtv1.VolumeSource{PersistentVolumeClaim: &virtv1.PersistentVolumeClaimVolumeSource{
            PersistentVolumeClaimVolumeSource: k8sv1.PersistentVolumeClaimVolumeSource{
                ClaimName: claim,
            }}},
    }
    p, err := patch.New(
        patch.WithReplace("/spec/dataVolumeTemplates", []virtv1.DataVolumeTemplateSpec{}),
        patch.WithReplace(fmt.Sprintf("/spec/template/spec/volumes/%d", i), updatedVolume),
        patch.WithReplace("/spec/updateVolumesStrategy", virtv1.UpdateVolumesStrategyMigration),
    ).GeneratePayload()
    Expect(err).ToNot(HaveOccurred())
    vm, err = virtClient.VirtualMachine(vm.Namespace).Patch(context.Background(), vm.Name, types.JSONPatchType, p, metav1.PatchOptions{})
    Expect(err).ToNot(HaveOccurred())
    Expect(vm.Spec.Template.Spec.Volumes[i].VolumeSource.PersistentVolumeClaim.
        PersistentVolumeClaimVolumeSource.ClaimName).To(Equal(claim))
}
Wait for the migration to succeed:
waitForMigrationToSucceed := func(vmiName, ns string) {
    waitMigrationToExist(vmiName, ns)
    Eventually(func() bool {
        vmi, err := virtClient.VirtualMachineInstance(ns).Get(context.Background(), vmiName,
            metav1.GetOptions{})
        Expect(err).ToNot(HaveOccurred())
        if vmi.Status.MigrationState == nil {
            return false
        }
        if !vmi.Status.MigrationState.Completed {
            return false
        }
        Expect(vmi.Status.MigrationState.Failed).To(BeFalse())
        return true
    }, 120*time.Second, time.Second).Should(BeTrue())
}
Additionally, ensure that the VolumeMigration and VolumesUpdateStrategy feature gates are enabled in the KubeVirt CR and that the LiveMigrate workloadUpdateStrategy is set. This can be done by adding the following to the KubeVirt CR:
apiVersion: kubevirt.io/v1
kind: KubeVirt
spec:
  configuration:
    developerConfiguration:
      featureGates:
        - VMLiveUpdateFeatures
        - VolumesUpdateStrategy
        - VolumeMigration
    vmRolloutStrategy: LiveUpdate
  workloadUpdateStrategy:
    workloadUpdateMethods:
      - LiveMigrate
Monitor the migration progress by watching the corresponding VirtualMachineInstanceMigration object using the label kubevirt.io/volume-update-in-progress: <vm-name>. For example:
$ kubectl get virtualmachineinstancemigrations -l kubevirt.io/volume-update-in-progress=vmi --watch=true
NAME                            PHASE     VMI
kubevirt-workload-update-abcd   Running   vmi
Note that volume migration is supported between PersistentVolumeClaims and DataVolumes, and certain types of disks and volumes are not supported for migration [1][2].
@AllenXu93 we don't support live migration with RWO PVCs. Shareable storage is a requirement for live migration. Your proposal is interesting, but I don't think it is currently possible with the API we have for PVCs. You need to specify directly the claim you want to use in the VM definition. If virt-controller creates a new PVC, it will also need to update the claim in the VM spec. This is problematic for a couple of reasons. First, each migration would alter the VM definition, and we don't want that: a migration could be triggered automatically for multiple reasons, like workload balancing or eviction, and users would end up with a modified VM. Second, we want to be GitOps compatible. This is similar to the first case: only the users should be able to modify the VM specification.
If you want to change the claim, then you need some abstraction of the volume that won't change the VM definition. We currently don't have such an API or mechanism; if you wish, feel free to open a design proposal. I would also suggest you take a look at: https://github.com/kubevirt/community/blob/main/design-proposals/volume-update-strategy.md
I have tested this feature; it can't solve migration for RWO volumes, as described in the proposal:
This feature is not directly aimed at live migrating the VM using RWO volumes. Users will therefore continue to run into the restriction that the VM cannot be live-migrated if it uses RWO volumes.
Is supporting dynamic PVC provisioning for localPV migration the correct approach? If so, I would like to contribute. I have written a demo to support this feature, and it works well in my cluster.
Sorry for the late reply.
Is supporting dynamic PVC provisioning for localPV migration the correct approach? If so, I would like to contribute. I have written a demo to support this feature, and it works well in my cluster.
Yes, if you want to migrate the VM to another node then you need to copy the storage as well. You can take a look at my initial design: https://github.com/kubevirt/community/pull/240, which also included VM live migration with RWO volumes. The bits in the code are already there thanks to volume migration; it will be more a matter of designing the correct API. We want to distinguish the user's intentions; that's why we excluded live migration with RWO volumes.
Please keep in mind that VM live migration is triggered by the VirtualMachineInstanceMigration CRD, hence you need:
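For reference, a VirtualMachineInstanceMigration only names the VMI it should migrate. A minimal sketch in the same test style as the snippets above, reusing the virtClient, ns, and vmi variables from earlier (the GenerateName prefix is illustrative):
// Request a live migration by creating a VirtualMachineInstanceMigration;
// virt-controller then schedules the target virt-launcher pod and drives
// the migration.
migration := &virtv1.VirtualMachineInstanceMigration{
    ObjectMeta: metav1.ObjectMeta{
        GenerateName: "migration-",
        Namespace:    ns,
    },
    Spec: virtv1.VirtualMachineInstanceMigrationSpec{
        VMIName: vmi.Name,
    },
}
migration, err := virtClient.VirtualMachineInstanceMigration(ns).Create(context.Background(),
    migration, metav1.CreateOptions{})
Expect(err).ToNot(HaveOccurred())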
Is your feature request related to a problem? Please describe: Currently, if I use a localPV for a VM, the VM can't be live migrated. For example, if I create a VirtualMachineInstanceMigration for a VM that uses a localPV as its system disk, the creation request is rejected with an error. I think the reason is that KubeVirt VMs only support static PVCs; if the PVC is not shareable (not ReadWriteMany), like a localPV (OpenEBS), the migration won't succeed because the target launcher pod would be scheduled on the same node as the source pod due to the localPV's node affinity.
Describe the solution you'd like: A better way to solve this problem would be to support dynamic PVC provisioning, like volumeClaimTemplates in StatefulSets (see the hypothetical sketch below). Every time a VM migration happens, virt-controller should create a new PVC for the new launcher pod so that the migration can succeed. If the VM is stopped, or the pod is deleted unexpectedly, it can reuse the same PVC so that the VM can be resumed on the same node with the old data.
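Purely as an illustration of the StatefulSet-style idea above, a hypothetical manifest could look like the following. The volumeClaimTemplates field does not exist in the KubeVirt VirtualMachine API today, and the field name and the openebs-hostpath storage class are made up for this sketch:
# Hypothetical API sketch only -- volumeClaimTemplates is NOT an existing
# KubeVirt field. It mirrors the StatefulSet pattern the proposal refers to.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-localpv
spec:
  running: true
  volumeClaimTemplates:                     # hypothetical field
    - metadata:
        name: system-disk
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: openebs-hostpath  # illustrative local-PV storage class
        resources:
          requests:
            storage: 10Gi
  template:
    spec:
      domain:
        devices:
          disks:
            - name: system-disk
              disk:
                bus: virtio
      volumes:
        - name: system-disk
          persistentVolumeClaim:
            claimName: system-disk          # would be resolved per node by virt-controller (hypothetical)
On each migration, virt-controller would stamp out a new claim from the template on the target node, much as the StatefulSet controller does per replica, while the VM definition itself stays unchanged, which would also address the GitOps concern raised above.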