@LamNguy: it seems that the metadata service is not reachable on your network. Do you know why? Is it possible you have static IPs configured and lost access to DHCP?
@mnaser Hi, I think the metadata service is working well, since I can create a new VM with the correct IP. I wonder how the container reaches the metadata service; can you let me know where I should run the container? Thanks.
In my environment, I run the container on a bastion node that can reach both OpenStack and VMware.
Can you run `curl http://169.254.169.254/openstack/latest/meta_data.json` successfully? Also, I wonder if this is because it's Podman; can you add `--network host` to the commands?
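To make that check repeatable, here is a small shell sketch (the URL and 5-second timeout are just illustrative defaults; `169.254.169.254` is a link-local address, so this only succeeds from inside an OpenStack instance, never from a host outside the cloud):

```shell
#!/bin/sh
# metadata_ok: returns 0 if the Nova metadata endpoint answers within 5s.
# From a bastion outside the cloud this times out, matching the
# "dial tcp 169.254.169.254:80: i/o timeout" error in the logs.
metadata_ok() {
  url="${1:-http://169.254.169.254/openstack/latest/meta_data.json}"
  curl -sf --max-time 5 "$url" >/dev/null
}

if metadata_ok; then
  echo "metadata service reachable"
else
  echo "metadata service unreachable"
fi
```

Running this inside the container (with and without `--network host`) should show whether the container's network namespace is what breaks reachability.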
@LamNguy did it end up working with `--network host`?
@mnaser Hi, let me try again with it, thank you
```
[root@registry cloud]# podman run -it --privileged --network host \
  -v /dev:/dev \
  -v /home/cloud/vmware-vix-disklib-distrib/:/usr/lib64/vmware-vix-disklib:ro \
  --env-file <(env | grep OS_) \
  registry.atmosphere.dev/library/migratekit:latest \
  migrate \
  --vmware-endpoint 10.1.0.23 \
  --vmware-username lam.nd@vsphere.local \
  --vmware-password SVTcoimo@23 \
  --vmware-path /svtechhn/vm/cloudvm/lam.ndvm/ubuntu
INFO[0000] Setting Disk Bus: virtio
Creating snapshot 100% [====================] (100/100) [0s:0s]
DEBU[0000] Running command: /usr/sbin/nbdkit --exit-with-parent --readonly --foreground --unix=/tmp/migratekit-49414492/nbdkit.sock --pidfile=/tmp/migratekit-49414492/nbdkit.pid vddk server=10.1.0.23 user=lam.nd@vsphere.local password=SVTcoimo@23 thumbprint=FB:9D:25:5D:9C:2B:B2:F5:16:12:D5:3E:DA:36:A7:AE:67:CD:F3:2C compression=skipz vm=moref=vm-14090 snapshot=snapshot-14181 [10.1.0.21_ssd03_nvme] ubuntu_4/ubuntu.vmdk
WARN[0001] Change ID mismatch, full copy needed  currentChangeId= snapshotChangeId="52 c3 51 5e e8 93 8f b1-0a bd bd 60 2d 55 8b a5/5"
INFO[0001] Attaching volume  volume_id=1b65f25f-43dc-4d5c-8393-5be787ef2ab1
Removing snapshot 100% [====================] (100/100) [0s:0s]
Error: Get "http://169.254.169.254/openstack/latest/meta_data.json": dial tcp 169.254.169.254:80: i/o timeout
Usage:
  migratekit migrate [flags]

Flags:
  -h, --help   help for migrate

Global Flags:
      --availability-zone string      Openstack availability zone for blockdevice & server
      --disk-bus-type disk-bus-type   Specifies the type of disk controller to attach disk devices to. (default virtio)
      --vmware-endpoint string        VMware endpoint (hostname or IP only)
      --vmware-password string        VMware password
      --vmware-path string            VMware VM path (e.g. '/Datacenter/vm/VM')
      --vmware-username string        VMware username
      --volume-type string            Openstack volume type
```
I tried with the `--network host` option but got the same error. I think the problem is that the container tries to reach the metadata service, but the metadata service cannot be reached from where I run the container. @mnaser So I will describe my environment again:
@LamNguy is this virtual machine running on OpenStack?
No, this is a VMware virtual machine. So the VM must be an OpenStack VM, right?
@mnaser Hi, I changed the VM to an OpenStack VM and that fixed the `dial tcp 169.254.169.254` error. I ran the command again and found some problems.
The second try:
```
[root@rhel ~]# docker run -it --privileged --network host -v /dev:/dev -v /root/vmware-vix-disklib-distrib/:/usr/lib64/vmware-vix-disklib:ro --env-file <(env | grep OS_) registry.atmosphere.dev/library/migratekit:latest migrate --vmware-endpoint 10.1.0.23 --vmware-username lam.nd@vsphere.local --vmware-password SVTcoimo@23 --vmware-path /svtechhn/vm/cloudvm/lam.ndvm/ubuntu
INFO[0000] Setting Disk Bus: virtio
Creating snapshot 100% [====================] (100/100) [0s:0s]
DEBU[0000] Running command: /usr/sbin/nbdkit --exit-with-parent --readonly --foreground --unix=/tmp/migratekit-561067734/nbdkit.sock --pidfile=/tmp/migratekit-561067734/nbdkit.pid vddk server=10.1.0.23 user=lam.nd@vsphere.local password=SVTcoimo@23 thumbprint=FB:9D:25:5D:9C:2B:B2:F5:16:12:D5:3E:DA:36:A7:AE:67:CD:F3:2C compression=skipz vm=moref=vm-14090 snapshot=snapshot-14191 [10.1.0.21_ssd03_nvme] ubuntu_4/ubuntu.vmdk
WARN[0001] Change ID mismatch, full copy needed  currentChangeId= snapshotChangeId="52 c3 51 5e e8 93 8f b1-0a bd bd 60 2d 55 8b a5/41"
INFO[0001] Attaching volume  volume_id=be03d993-7a17-4729-9675-38e05e829db8
INFO[0002] Detected instance UUID, attaching volume...  instance_uuid=ab243f15-2a37-4663-85c0-ce434d9a7c1a
INFO[0003] Device for volume not found, checking again...  volume_id=be03d993-7a17-4729-9675-38e05e829db8
INFO[0004] Device for volume not found, checking again...  volume_id=be03d993-7a17-4729-9675-38e05e829db8
INFO[0005] Device for volume not found, checking again...  volume_id=be03d993-7a17-4729-9675-38e05e829db8
INFO[0006] Device found  device=/dev/vdb volume_id=be03d993-7a17-4729-9675-38e05e829db8
INFO[0006] Starting full copy  disk="[10.1.0.21_ssd03_nvme] ubuntu_4/ubuntu.vmdk" vm=ubuntu
DEBU[0006] Running command: /usr/bin/nbdcopy --progress=3 nbd+unix:///?socket=/tmp/migratekit-561067734/nbdkit.sock /dev/vdb  destination=/dev/vdb source="nbd+unix:///?socket=/tmp/migratekit-561067734/nbdkit.sock"
munmap_chunk(): invalid pointer
nbdcopy: nbd+unix:///?socket=/tmp/migratekit-561067734/nbdkit.sock: nbd_connect_uri: recv: server disconnected unexpectedly
Removing snapshot 100% [====================] (100/100) [0s:0s]
Error: exit status 1
Usage:
```
Odd, I've never seen that error. To be honest, we've not really tested with Podman, so I wonder if there are some SELinux or other odd issues at play here. Is it possible to try and see whether it works in an Ubuntu (or Docker) based environment?
Hi @mnaser, thank you so much for helping me. Following your advice, I replaced the environment with Ubuntu 24 + Docker and tried again:
- The issue where the installer does not wait for the volume to change from `creating` to `available` status still occurs, so I need to re-run the command. I noticed the second nbdcopy run is a full copy, whereas the first run uses `--destination-is-zero` and is much faster. I tried several times and can get past this issue if I'm lucky ==> could you update the installer to have a timeout option?
- I ran the `migrate` option first, and after that I ran the `cutover` option, but when it runs `virt-v2v-inplace` there is an error (image attached). Have you met this issue, or do you have any idea?
- I have a question: if the source VM already has the virtio drivers and cloud-init installed, is using the option `--virt-v2v=false` fine for a migration?
> The issue where the installer does not wait for the volume to change from `creating` to `available` status still occurs, so I need to re-run the command. I noticed the second nbdcopy run is a full copy, whereas the first run uses `--destination-is-zero` and is much faster. I tried several times and can get past this issue if I'm lucky ==> could you update the installer to have a timeout option?
https://github.com/vexxhost/migratekit/pull/11 should address this; once it merges, it will wait 60 seconds for the volume to become available.
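For anyone working around this before the fix lands, the logic is essentially a poll-until-available loop. A minimal shell sketch (`volume_status` is a hypothetical stand-in you would implement with the `openstack` CLI, e.g. `openstack volume show "$1" -f value -c status`; the 60-attempt limit mirrors the PR's 60-second wait):

```shell
#!/bin/sh
# Wait up to ~60 seconds for a Cinder volume to leave "creating" and
# become "available" before attaching it.
wait_for_available() {
  vol="$1"
  tries=0
  while [ "$tries" -lt 60 ]; do
    if [ "$(volume_status "$vol")" = "available" ]; then
      echo "volume $vol is available"
      return 0
    fi
    tries=$((tries + 1))
    sleep 1
  done
  echo "timed out waiting for volume $vol" >&2
  return 1
}
```

Wrapping the migratekit invocation so the volume is already `available` avoids the lucky-timing re-runs described above.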
> - I ran the `migrate` option first, and after that I ran the `cutover` option, but when it runs `virt-v2v-inplace` there is an error (image attached). Have you met this issue, or do you have any idea?
> - I have a question: if the source VM already has the virtio drivers and cloud-init installed, is using the option `--virt-v2v=false` fine for a migration?
Yes, `virt-v2v` is really mostly important for installing drivers (more specifically for Windows systems); if you've got those, you're good to go. I'm working through that bug though.
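One rough way to double-check a Linux guest before setting `--virt-v2v=false` is to look for virtio modules in the running kernel's module index. A sketch (the path argument exists only for illustration; this proves the modules exist on disk, but for booting, the block driver must also be in the initramfs, which `lsinitrd` on RHEL-family or `lsinitramfs` on Debian/Ubuntu can confirm):

```shell
#!/bin/sh
# has_virtio: succeeds if the kernel's module index mentions virtio drivers.
# Run this inside the source guest before migrating without virt-v2v.
has_virtio() {
  modlist="${1:-/lib/modules/$(uname -r)/modules.dep}"
  grep -q virtio "$modlist" 2>/dev/null
}
```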
Hi @mnaser, thanks for updating the code. Can you give me any suggestions on the `virt-v2v-inplace` error?
Hi @mnaser, with your expertise, can I ask a question about migrating a VM to OpenStack using virt-v2v? For example, my VMDK is 100 GB thin-provisioned; when I convert the volume and upload it to OpenStack, it processes the full 100 GB, but the actual size of the volume is about 10 GB. I need to run a `dd` command to sparsify it:

```
dd if=/var/tmp/ubuntu-sda bs=16M conv=sparse iflag=fullblock | ssh root@192.168.30.25 "dd of=/dev/openstack/volume-ee6c1ed7-6841-4b6a-96f6-1d0e91736112 bs=16M conv=sparse oflag=direct"
```

Do you have a better approach?
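For context on why `conv=sparse` helps here: GNU `dd` detects blocks that are entirely zero and seeks over them instead of writing, so the output's allocated size stays small while its apparent size is unchanged. A self-contained demonstration (file names and sizes are arbitrary):

```shell
#!/bin/sh
# Create a 100 MiB file that is all zeroes (sparse from the start).
truncate -s 100M input.img

# A plain dd writes every zero block, allocating the full 100 MiB.
dd if=input.img of=dense.img bs=1M status=none

# With conv=sparse, dd seeks over zero blocks, allocating almost nothing.
dd if=input.img of=sparse.img bs=1M conv=sparse status=none

# Apparent sizes are identical; allocated sizes (du) differ dramatically.
ls -l dense.img sparse.img
du -k dense.img sparse.img
</ >/dev/null 2>&1 || true
```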
Hi, for a VM running UEFI, OpenStack only supports setting the firmware via image metadata: `openstack image set --property hw_firmware_type=uefi $IMAGE`. So if I export the VM directly to an OpenStack volume, I can't find any way to run it with `hw_firmware_type` set.
Sorry, I can only help you with this project here. I am trying to investigate the `passt` issue, which I've seen with a few of our customers, but you can skip it if you don't need the drivers?
@mnaser Sure, I think I will wait for the updated code to resolve this issue, since manually installing the virtio drivers is tricky. For example, I installed the virtio drivers on a Windows VM on VMware, but after migrating it, the disk does not seem to be recognized; maybe I missed some step.
@mnaser I read #14; my VMware VDDK version is also 8.0.3.
OK, the root cause for this `passt` issue is Ubuntu 24.04.
This is why my development environment uses 22.04 -- I have not run into this issue there.
OK, I will try again with Ubuntu 22.04.
@LamNguy did things work for you with 22.04?
Sure, with Ubuntu 22.04 the `passt` issue is solved. I can see the volume conversion succeeds, but at the final step the process fails and no instance is launched on OpenStack. I tested with Ubuntu and RHEL 8 guests and both have the same issue. Below are some logs from the end of the process.
@LamNguy it seems that this is because virt-v2v is taking a long time, so by the time we are shutting down the old nbdkit servers, we are getting NotAuthenticated -- it's almost like a session is timing out.
This is related to https://github.com/vmware/govmomi/issues/224
https://github.com/vexxhost/migratekit/pull/20 should help with timeouts; you can try with the new image once that is merged.
It's wild that your virt-v2v run takes almost 2 hours, which is timing out the session. It seems the SELinux relabeling is taking a really long time; you can probably get away with skipping it.
https://github.com/vexxhost/migratekit/pull/21 should help avoid doing the SELinux relabel. I don't really see the point of it, and skipping it will cut 86 minutes from your `virt-v2v` run.
I think another issue you might be seeing is the lack of nested virtualization, which is making things slower (perhaps!).
Sure, I will try again later. Our OpenStack uses nested virtualization, as it is built on top of VMware. I will check it.
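One quick check on the compute nodes is whether hardware virtualization is exposed to them at all; if the VMware guests that host OpenStack don't see the `vmx`/`svm` CPU flags, Nova falls back to pure emulation, which is much slower. A sketch (the path argument exists only so the function is testable; in practice you would just run it against `/proc/cpuinfo`):

```shell
#!/bin/sh
# hw_virt_ok: succeeds if the CPU flags expose hardware virtualization
# (vmx for Intel, svm for AMD). On a compute node that is itself a
# VMware guest, this tells you whether nested virt was passed through.
hw_virt_ok() {
  cpuinfo="${1:-/proc/cpuinfo}"
  grep -qE '(^| )(vmx|svm)( |$)' "$cpuinfo"
}
```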
Hi, thank you very much. With the new container image (which bypasses the SELinux relabel), I tested migration successfully with Ubuntu, RHEL and Windows. I will look at other cases. I have some questions:
Skipping SELinux shouldn't affect it, because we're not changing formats; we're operating at the block level.
There's no need to install anything, virt-v2v will inject the correct drivers for Windows.
Network mapping can't easily be made optional: if we removed it, how would Migratekit know which network to create the port on?
I understand, thank you.
Hi, I get this error but I cannot figure out the root cause. I run it from the bastion, which has 2 interfaces: one that can connect to VMware and one that can connect to the OpenStack cluster.

```
[root@bastion vmware-vix-disklib]# docker run -it --rm --privileged -v /dev:/dev -v /usr/lib64/vmware-vix-disklib/:/usr/lib64/vmware-vix-disklib:ro --env-file <(env | grep OS_) registry.atmosphere.dev/library/migratekit:latest cutover --vmware-endpoint xxxx --vmware-username xxxxx --vmware-password xxxxxx --vmware-path /svtechhn/vm/cloudvm/lam.ndvm/lam.nd_bootstrap_kolla --flavor a8fde5f6-56c6-452b-b07d-a40b38141fff --network-mapping mac=00:50:56:8e:b2:73,network-id=05ba5267-e72d-4f08-9006-05f058ec8df4,subnet-id=b5f7c194-ebae-4759-b47c-4f593581be49,ip=10.1.30.245
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
INFO[0000] Setting Disk Bus: virtio
INFO[0000] Ensuring OpenStack resources exist
INFO[0000] Flavor exists, ensuring network resources exist  flavor=small
INFO[0000] Port already exists  port=263a9a1e-e097-41d0-a86a-5b3f974f0052
INFO[0000] Starting migration cycle
Creating snapshot 100% [====================] (100/100) [0s:0s]
DEBU[0001] Running command: /usr/sbin/nbdkit --exit-with-parent --readonly --foreground --unix=/tmp/migratekit-1716472452/nbdkit.sock --pidfile=/tmp/migratekit-1716472452/nbdkit.pid vddk server=xxxxxx user=xxxxxx password=xxxx thumbprint=FB:9D:25:5D:9C:2B:B2:F5:16:12:D5:3E:DA:36:A7:AE:67:CD:F3:2C compression=skipz vm=moref=vm-3121 snapshot=snapshot-14067 [10.1.0.21_ssd02] lam.nd_bootstrap_kolla/lam.nd_bootstrap_kolla.vmdk
INFO[0001] Data does not exist, full copy needed
INFO[0002] Creating new volume
INFO[0002] Volume created, setting to bootable  volume_id=041975bc-236e-485d-be8c-87992032f5c8
INFO[0002] Setting volume to be UEFI  volume_id=041975bc-236e-485d-be8c-87992032f5c8
INFO[0002] Attaching volume  volume_id=041975bc-236e-485d-be8c-87992032f5c8
Removing snapshot 100% [====================] (100/100) [0s:0s]
Error: Get "http://169.254.169.254/openstack/latest/meta_data.json": dial tcp 169.254.169.254:80: i/o timeout
```