kubevirt / containerized-data-importer

Data Import Service for kubernetes, designed with kubevirt in mind.
Apache License 2.0

The disk import via 'DataVolume' fails for a warm migration of a Windows VM. #3430

Open cniket opened 2 months ago

cniket commented 2 months ago

What happened: I have a DV with a VDDK source (spec.source.vddk) that I am using to migrate a Windows VM from VMware to KubeVirt. If the Windows VM is powered on before the migration starts, the (warm) migration does not start and the corresponding importer pod fails; the attached file importer-plan-winserver2k19-warm-powered-on.log contains the errors. If the Windows VM is powered off before the migration starts, the (cold) migration completes without any issue. For Linux VMs, both warm and cold migrations work without any issue.

I also tried the migration with vSphere administrator privileges (full access), but I still get the same errors.

What you expected to happen: Both cold and warm migrations of Windows VMs should work, just as they do for Linux VMs.

How to reproduce it (as minimally and precisely as possible):

Additional context:

Environment:

❯ k get cdi cdi -oyaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: CDI
metadata:
  annotations:
    cdi.kubevirt.io/configAuthority: ""
  creationTimestamp: "2024-05-03T17:53:01Z"
  finalizers:
  - operator.cdi.kubevirt.io
  generation: 29
  name: cdi
  resourceVersion: "123180875"
  uid: adca7d62-bb11-4a86-ac47-93017635d0d9
spec:
  imagePullPolicy: IfNotPresent
  infra:
    nodeSelector:
      kubernetes.io/os: linux
    tolerations:
    - key: CriticalAddonsOnly
      operator: Exists
  workload:
    nodeSelector:
      kubernetes.io/os: linux
status:
  conditions:
[...]
    type: Degraded
  observedVersion: v1.57.0
  operatorVersion: v1.57.0
  phase: Deployed
  targetVersion: v1.57.0
❯
❯ kubectl version
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2
❯
❯ k get dv plan-winserver2k19-vm-3187-fmdg2 -n virtualmachines -oyaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
    cdi.kubevirt.io/storage.deleteAfterCompletion: "false"
    cdi.kubevirt.io/storage.usePopulator: "false"
    forklift.konveyor.io/disk-source: '[vcenter-datastore] winserver2k19/winserver2k19_1.vmdk'
    migration: 325eb7fc-e733-4a25-b105-db3422813752
    plan: 8ff9313b-4f38-4104-b623-9cc63fb17db6
    vmID: vm-3187
  generateName: plan-winserver2k19-vm-3187-
  generation: 1
  labels:
[...]
spec:
  checkpoints:
  - current: snapshot-5368
    previous: ""
  source:
    vddk:
      backingFile: '[vcenter-datastore-1] winserver2k19/winserver2k19_1.vmdk'
      initImageURL: jmv2/vddk:7.0.3
      secretRef: plan-winserver2k19-vm-3187-b5df3
      thumbprint: <thumbprint>
      url: <vsphere-url>/sdk
      uuid: <vm-uuid>
  storage:
    accessModes:
    - ReadWriteMany
    resources:
      requests:
        storage: 1Gi
    storageClassName: ceph-block
status:
[...]
❯
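Aside, for readers unfamiliar with the warm path: CDI drives a warm import through spec.checkpoints, where each entry names a pair of VMware snapshots whose delta is copied, and spec.finalCheckpoint marks the last delta at cutover. A minimal sketch of how the spec above could evolve as checkpoints are added (the second snapshot name is hypothetical, not from this environment):

spec:
  checkpoints:
  - current: snapshot-5368     # initial copy, up to the first snapshot
    previous: ""
  - current: snapshot-5401     # hypothetical later snapshot; only the delta is transferred
    previous: snapshot-5368
  finalCheckpoint: true        # set at cutover; the import completes after the last delta
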
mrnold commented 2 months ago

This looks very similar to a bug that was recently fixed: https://github.com/kubevirt/containerized-data-importer/pull/3385

Is there any chance you can try with a version that includes that change? It looks like it was backported to CDI releases 1.58, 1.59, and 1.60.
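
For reference, upgrading an operator-managed CDI deployment is usually just a matter of applying the operator manifest from the newer release tag and letting the operator reconcile the existing 'cdi' CR. A minimal sketch, assuming the standard CDI release assets and cluster-admin access (pick an actual patch tag from the releases page):

❯ export VERSION=v1.58.0   # or a later v1.59/v1.60 release tag
❯ kubectl apply -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-operator.yaml

After the new cdi-operator rolls out, the existing CDI resource should report the new version in status.observedVersion.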

cniket commented 2 months ago

Hello @mrnold ,

Thanks a lot for your reply and for pointing to the relevant PR.

I will try one of those newer CDI releases and share the results.

cniket commented 2 months ago

Hello @mrnold ,

The migration completed without those errors with CDI v1.58. Thanks a lot for your help.

However, I am now facing a new issue: after the DV migration completes, the VM comes up in the 'Running' state, but when I open its console with kubectl virt vnc <vm-name> I get a Windows blue screen error, as in the attached screenshot. (Not sure if this is the right place to raise this issue.)

[Screenshot: Windows blue screen, 2024-09-12 12:42]

FYI: secureBoot is set to 'false' in the VM template:

        firmware:
          bootloader:
            efi:
              secureBoot: false

What could be causing this boot issue after the migration?

akalenyu commented 2 months ago

Long shot, but maybe @lyarwood or @vsibirsk are able to help. EDIT: it looks like people have hit this before: https://gist.github.com/Francesco149/dc156cfd9ecfc3659469315c45fa0f96 https://bugzilla.redhat.com/show_bug.cgi?id=1908421

mrnold commented 2 months ago

Usually that means something didn't work right in the VM conversion step: either the VirtIO drivers weren't installed, or the virtual hardware configuration was not quite what the guest expected. Forklift is probably the right place to continue this discussion.
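
A quick way to tell those two cases apart (a general check, not something from this thread) is to temporarily put the boot disk on an emulated SATA bus in the VirtualMachine spec; if Windows then boots, the guest is simply missing the VirtIO storage drivers rather than suffering a wider hardware mismatch. A minimal sketch, assuming a VM whose root disk is backed by the migrated DataVolume (names are illustrative, other fields omitted):

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: winserver2k19                  # hypothetical VM name
spec:
  template:
    spec:
      domain:
        devices:
          disks:
          - name: rootdisk
            disk:
              bus: sata                # temporarily avoid virtio; switch back once drivers are installed
      volumes:
      - name: rootdisk
        dataVolume:
          name: plan-winserver2k19-vm-3187-fmdg2

Once the virtio-win drivers are installed in the guest, the bus can be switched back to virtio for better performance.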