kubevirt / kubevirt

Kubernetes Virtualization API and runtime in order to define and manage virtual machines.
https://kubevirt.io
Apache License 2.0

Support dynamic PVC provision and support migration for not shared PVC #12801

Open AllenXu93 opened 1 month ago

AllenXu93 commented 1 month ago

Is your feature request related to a problem? Please describe: Currently, if I use a localPV for a VM, the VM cannot be live migrated. For example, if I create a VirtualMachineInstanceMigration for a VM that uses a localPV as its system disk, the creation request is rejected with this error:

Error from server: error when creating "migration.yaml": admission webhook "migration-create-validator.kubevirt.io" denied the request: Cannot migrate VMI, Reason: DisksNotLiveMigratable, Message: cannot migrate VMI: PVC registry-image-datavolume is not shared, live migration requires that all PVCs must be shared (using ReadWriteMany access mode)
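
For reference, the migration.yaml here is just a plain VirtualMachineInstanceMigration pointing at the VMI; the names below are placeholders:

apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: migration-job
spec:
  vmiName: my-vm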

I think the reason is that a KubeVirt VM only supports statically specified PVCs. If the PVC is not shareable (not ReadWriteMany), like a localPV (openebs), migration won't succeed because the localPV node affinity forces the target launcher pod to be scheduled on the same node as the source pod.
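
For context, a localPV (for example one provisioned by openebs) carries node affinity similar to the following, which is what pins the target launcher pod to the same node as the source; the values are illustrative:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-example
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: openebs-hostpath
  local:
    path: /var/openebs/local/vol1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node-1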

Describe the solution you'd like: A better way to solve this problem would be to support dynamic PVC provisioning, like volumeClaimTemplates in a StatefulSet. Every time a VM migration happens, virt-controller should create a new PVC for the new launcher pod so that the migration can be executed successfully. If the VM is stopped, or the pod is deleted unexpectedly, it can reuse the same PVC so that the VM can be resumed on the same node with its old data.
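
To illustrate the idea, here is a rough sketch of what such an API could look like. Note that volumeClaimTemplates does not exist in the KubeVirt VirtualMachine API today; the field and the storage class below are purely hypothetical, modelled on the StatefulSet field:

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: my-vm
spec:
  running: true
  # hypothetical field, not part of the current KubeVirt API
  volumeClaimTemplates:
  - metadata:
      name: system-disk
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: openebs-hostpath
      resources:
        requests:
          storage: 20Gi
  template:
    spec:
      domain:
        devices:
          disks:
          - name: system-disk
            disk:
              bus: virtio
      volumes:
      - name: system-disk
        persistentVolumeClaim:
          claimName: system-disk  # would resolve to a controller-generated, per-node PVC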

dosubot[bot] commented 1 month ago

To support dynamic PVC provisioning and migration for non-shared PVCs in KubeVirt, enabling live migration for VMs using localPV, you can follow these steps (the snippets below are adapted from KubeVirt's functional tests and assume the usual test helpers such as virtClient, ns, and size are already in scope):

  1. Create a DataVolume (DV):

    createDV := func() *cdiv1.DataVolume {
       sc, exist := libstorage.GetRWOFileSystemStorageClass()
       Expect(exist).To(BeTrue())
       dv := libdv.NewDataVolume(
           libdv.WithRegistryURLSource(cd.DataVolumeImportUrlForContainerDisk(cd.ContainerDiskCirros)),
           libdv.WithPVC(libdv.PVCWithStorageClass(sc),
               libdv.PVCWithVolumeSize(size),
           ),
       )
       _, err := virtClient.CdiClient().CdiV1beta1().DataVolumes(ns).Create(context.Background(),
           dv, metav1.CreateOptions{})
       Expect(err).ToNot(HaveOccurred())
       return dv
    }
  2. Create a VM with the DataVolume:

    createVMWithDV := func(dv *cdiv1.DataVolume, volName string) *virtv1.VirtualMachine {
       vmi := libvmi.New(
           libvmi.WithNamespace(ns),
           libvmi.WithInterface(libvmi.InterfaceDeviceWithMasqueradeBinding()),
           libvmi.WithNetwork(virtv1.DefaultPodNetwork()),
           libvmi.WithResourceMemory("128Mi"),
           libvmi.WithDataVolume(volName, dv.Name),
           libvmi.WithCloudInitNoCloud(libvmifact.WithDummyCloudForFastBoot()),
       )
       vm := libvmi.NewVirtualMachine(vmi,
           libvmi.WithRunning(),
           libvmi.WithDataVolumeTemplate(dv),
       )
       vm, err := virtClient.VirtualMachine(ns).Create(context.Background(), vm, metav1.CreateOptions{})
       Expect(err).ToNot(HaveOccurred())
       Eventually(matcher.ThisVM(vm), 360*time.Second, 1*time.Second).Should(matcher.BeReady())
       libwait.WaitForSuccessfulVMIStart(vmi)
    
       return vm
    }
  3. Update the VM with a new PVC:

    updateVMWithPVC := func(vm *virtv1.VirtualMachine, volName, claim string) {
       i := slices.IndexFunc(vm.Spec.Template.Spec.Volumes, func(volume virtv1.Volume) bool {
           return volume.Name == volName
       })
       Expect(i).To(BeNumerically(">", -1))
       By(fmt.Sprintf("Replacing volume %s with PVC %s", volName, claim))
       updatedVolume := virtv1.Volume{
           Name: volName,
           VolumeSource: virtv1.VolumeSource{PersistentVolumeClaim: &virtv1.PersistentVolumeClaimVolumeSource{
               PersistentVolumeClaimVolumeSource: k8sv1.PersistentVolumeClaimVolumeSource{
                   ClaimName: claim,
               }}},
       }
    
       p, err := patch.New(
           patch.WithReplace("/spec/dataVolumeTemplates", []virtv1.DataVolumeTemplateSpec{}),
           patch.WithReplace(fmt.Sprintf("/spec/template/spec/volumes/%d", i), updatedVolume),
           patch.WithReplace("/spec/updateVolumesStrategy", virtv1.UpdateVolumesStrategyMigration),
       ).GeneratePayload()
       Expect(err).ToNot(HaveOccurred())
       vm, err = virtClient.VirtualMachine(vm.Namespace).Patch(context.Background(), vm.Name, types.JSONPatchType, p, metav1.PatchOptions{})
       Expect(err).ToNot(HaveOccurred())
    
       Expect(vm.Spec.Template.Spec.Volumes[i].VolumeSource.PersistentVolumeClaim.
           PersistentVolumeClaimVolumeSource.ClaimName).To(Equal(claim))
    }
  4. Wait for the migration to succeed:

    waitForMigrationToSucceed := func(vmiName, ns string) {
       waitMigrationToExist(vmiName, ns)
       Eventually(func() bool {
           vmi, err := virtClient.VirtualMachineInstance(ns).Get(context.Background(), vmiName,
               metav1.GetOptions{})
           Expect(err).ToNot(HaveOccurred())
           if vmi.Status.MigrationState == nil {
               return false
           }
           if !vmi.Status.MigrationState.Completed {
               return false
           }
           Expect(vmi.Status.MigrationState.Failed).To(BeFalse())
           return true
       }, 120*time.Second, time.Second).Should(BeTrue())
    }

Additionally, ensure that the VMLiveUpdateFeatures, VolumesUpdateStrategy, and VolumeMigration feature gates are enabled in the KubeVirt CR, that vmRolloutStrategy is set to LiveUpdate, and that the LiveMigrate workload update method is set. This can be done by adding the following to the KubeVirt CR:

apiVersion: kubevirt.io/v1
kind: KubeVirt
spec:
  configuration:
    developerConfiguration:
      featureGates:
        - VMLiveUpdateFeatures
        - VolumesUpdateStrategy
        - VolumeMigration
    vmRolloutStrategy: LiveUpdate
  workloadUpdateStrategy:
    workloadUpdateMethods:
    - LiveMigrate

Monitor the migration progress by watching the corresponding VirtualMachineInstanceMigration object using the label kubevirt.io/volume-update-in-progress: <vm-name>. For example:

$ kubectl get virtualmachineinstancemigrations -l kubevirt.io/volume-update-in-progress=vmi --watch=true
NAME                           PHASE       VMI
kubevirt-workload-update-abcd  Running     vmi

Note that volume migration is supported between PersistentVolumeClaims and DataVolumes, and certain types of disks and volumes are not supported for migration [1][2].

alicefr commented 1 month ago

/cc

alicefr commented 1 month ago

@AllenXu93 we don't support live migration with RWO PVCs. Shareable storage is a requirement for live migration. Your proposal is interesting, but I don't think it is currently possible with the API we have for PVCs. You need to specify the claim you want to use directly in the VM definition. If virt-controller creates a new PVC, it will also need to update the claim in the VM spec. This is problematic for a couple of reasons. First, each migration would alter the VM definition, and we don't want this: migrations can be triggered automatically for multiple reasons, such as workload balancing or eviction, and users would end up with a modified VM. Secondly, we want to be GitOps compatible. This is similar to the first point: only the users should be able to modify the VM specification.
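
For context, the claim is referenced by name directly in the VM definition, so a controller-created replacement claim would have to be patched into the spec. The volume name below is a placeholder; the claim name is taken from the error above:

spec:
  template:
    spec:
      volumes:
      - name: datavolumedisk
        persistentVolumeClaim:
          claimName: registry-image-datavolume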

If you want to change the claim, then you need some abstraction of the volume which won't change the VM definition. We currently don't have such an API or mechanism; if you wish, feel free to open a design proposal. I would also suggest you take a look at: https://github.com/kubevirt/community/blob/main/design-proposals/volume-update-strategy.md

AllenXu93 commented 1 month ago

I have tested this feature; it can't solve migration for RWO volumes, as described in the proposal:

This feature is not directly aimed at live migrating the VM using RWO volumes. Users will therefore continue to run into the restriction that the VM cannot be live-migrated if it uses RWO volumes.

Is supporting dynamic PVC provisioning for localPV migration the correct approach? If so, I would like to contribute. I have written a demo to support this feature, and it works well in my cluster.

alicefr commented 1 month ago

Sorry for the late reply.

Yes, if you want to migrate the VM to another node then you need to copy the storage as well. You can take a look at my initial design: https://github.com/kubevirt/community/pull/240 which also included VM live migration with RWO volumes. The bits are already there in the code thanks to volume migration; it will mostly be a matter of designing the correct API. We want to distinguish the user's intentions; that's why we excluded live migration with RWO volumes.

Please keep in mind that VM live migration is triggered by the VirtualMachineInstanceMigration CRD, hence you need either:

  1. to somehow extend that object to include the destination volumes (a rough, hypothetical sketch is shown below), or
  2. a mechanism that replaces the source PVCs with the destination PVCs, since you cannot reuse a PVC with the same name
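
Purely as an illustration of option 1, such an extension could look roughly like the following; the destinationVolumes field is hypothetical and does not exist in the current API:

apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: migration-with-storage-copy
spec:
  vmiName: my-vm
  # hypothetical field: destination volumes to provision on the target node
  destinationVolumes:
  - name: system-disk
    claimName: my-vm-system-disk-1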