siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

Unable to import OVA 1.6.0 into vSphere #8148

Closed: wh1test closed this issue 4 days ago

wh1test commented 8 months ago

Bug Report

I cannot upload the OVA template (https://github.com/siderolabs/talos/releases/download/v1.6.0/vmware-amd64.ova) to my vSphere vCenter 7.0.3 using the vmware.sh script (see logs below). I tried two different vCenters without luck. Meanwhile, https://github.com/siderolabs/talos/releases/download/v1.5.5/vmware-amd64.ova works properly, and I am able to create a cluster with it.

Logs

govc: 400 Bad Request: {"type":"com.vmware.vapi.std.errors.not_allowed_in_current_state","value":{"error_type":"NOT_ALLOWED_IN_CURRENT_STATE","messages":[{"args":["complete","ERROR","Error transferring file disk.vmdk to ds:///vmfs/volumes/64639e18-31cd0331-f2ef-ec2a724c92bc//contentlib-bdec4bee-33df-43be-83fc-f4c9c40e736c/88fd18b2-b0a7-4496-8a3d-a378d3cffdac/disk_58747ba8-0a22-4548-9fbc-204f934131bc.vmdk?serverId=4c4d4424-981c-44bf-be16-6d19c7be40d0. Reason: IO error during transfer of ds:///vmfs/volumes/64639e18-31cd0331-f2ef-ec2a724c92bc//contentlib-bdec4bee-33df-43be-83fc-f4c9c40e736c/88fd18b2-b0a7-4496-8a3d-a378d3cffdac/disk_58747ba8-0a22-4548-9fbc-204f934131bc.vmdk?serverId=4c4d4424-981c-44bf-be16-6d19c7be40d0: Pipe closed"],"default_message":"The operation complete can not be invoked when the import session is in the ERROR state. Reason: Error transferring file disk.vmdk to ds:///vmfs/volumes/64639e18-31cd0331-f2ef-ec2a724c92bc//contentlib-bdec4bee-33df-43be-83fc-f4c9c40e736c/88fd18b2-b0a7-4496-8a3d-a378d3cffdac/disk_58747ba8-0a22-4548-9fbc-204f934131bc.vmdk?serverId=4c4d4424-981c-44bf-be16-6d19c7be40d0. Reason: IO error during transfer of ds:///vmfs/volumes/64639e18-31cd0331-f2ef-ec2a724c92bc//contentlib-bdec4bee-33df-43be-83fc-f4c9c40e736c/88fd18b2-b0a7-4496-8a3d-a378d3cffdac/disk_58747ba8-0a22-4548-9fbc-204f934131bc.vmdk?serverId=4c4d4424-981c-44bf-be16-6d19c7be40d0: Pipe closed","id":"com.vmware.vdcs.cls-main.update_session_invalid_state_for_operation_reason"}]}}
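
Because the error arrives as one long JSON line, it can help to pull out just the fields that matter. The following is a minimal sketch, assuming the log line has the shape `govc: <status>: <json>` as seen above; the function name `parse_govc_error` and the shortened `sample` string are illustrative and not part of govc or the Talos tooling.

```python
import json


def parse_govc_error(line: str) -> dict:
    """Split a 'govc: <status>: <json>' log line into status and key JSON fields."""
    start = line.find("{")
    if start == -1:
        raise ValueError("no JSON payload found in log line")
    # raw_decode parses the first JSON value and ignores any trailing text.
    payload, _end = json.JSONDecoder().raw_decode(line, start)
    value = payload.get("value", {})
    messages = value.get("messages", [])
    return {
        "status": line[:start].strip().rstrip(":").strip(),
        "error_type": value.get("error_type"),
        "message_ids": [m.get("id") for m in messages],
        "default_messages": [m.get("default_message") for m in messages],
    }


# Shortened stand-in for the log line above (paths and reasons trimmed).
sample = (
    'govc: 400 Bad Request: {"type":"com.vmware.vapi.std.errors.not_allowed_in_current_state",'
    '"value":{"error_type":"NOT_ALLOWED_IN_CURRENT_STATE","messages":[{"args":["complete","ERROR"],'
    '"default_message":"The operation complete can not be invoked when the import session is in the ERROR state.",'
    '"id":"com.vmware.vdcs.cls-main.update_session_invalid_state_for_operation_reason"}]}}'
)

info = parse_govc_error(sample)
print(info["status"])
print(info["error_type"])
for message in info["default_messages"]:
    print(message)
```

In the full log, the `default_message` field carries the underlying reason: an IO error ("Pipe closed") while transferring disk.vmdk into the content library datastore.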

Environment

vSphere 7.0.3

smira commented 8 months ago

I don't see anything meaningful that might explain this problem. There was a similar issue with the 1.5.0 VMware images, but it was fixed in later point releases. I just want to make sure you're not trying 1.5.0 instead of 1.6.0.

# diff 1.6.0/disk.ovf 1.5.5/disk.ovf
6c6
<     <File ovf:href="disk.vmdk" ovf:id="file1" ovf:size="83379200"/>
---
>     <File ovf:href="disk.vmdk" ovf:id="file1" ovf:size="83470336"/>
# diff 1.6.0/disk.mf 1.5.5/disk.mf
1,2c1,2
< SHA256(disk.vmdk)= e95d3432fb916c551b334fc1b4f68ba457f188725d8ca69fb0c1da8b798b6e7b
< SHA256(disk.ovf)= df82af696929566d1f4ea91dfa2c1dcc08ccded062263da5e40f6e2d37618076
---
> SHA256(disk.vmdk)= 5bbe9920397da11802b8e96a54dac288d57884dd36164b29230d6db1c5c4eb5f
> SHA256(disk.ovf)= ab4f5d1bd316863fdcbdc7cc9eac47dd30bb7791a8e221d14d556f903f07fe5a
$ qemu-img info 1.5.5/disk.vmdk
image: 1.5.5/disk.vmdk
file format: vmdk
virtual size: 8 GiB (8589934592 bytes)
disk size: 79.6 MiB
cluster_size: 65536
Format specific information:
    cid: 196666111
    parent cid: 4294967295
    create type: streamOptimized
    extents:
        [0]:
            compressed: true
            virtual size: 8589934592
            filename: 1.5.5/disk.vmdk
            cluster size: 65536
            format: 
Child node '/file':
    filename: 1.5.5/disk.vmdk
    protocol type: file
    file length: 79.6 MiB (83470336 bytes)
    disk size: 79.6 MiB
$ qemu-img info 1.6.0/disk.vmdk
image: 1.6.0/disk.vmdk
file format: vmdk
virtual size: 8 GiB (8589934592 bytes)
disk size: 79.5 MiB
cluster_size: 65536
Format specific information:
    cid: 4039020857
    parent cid: 4294967295
    create type: streamOptimized
    extents:
        [0]:
            compressed: true
            virtual size: 8589934592
            filename: 1.6.0/disk.vmdk
            cluster size: 65536
            format: 
Child node '/file':
    filename: 1.6.0/disk.vmdk
    protocol type: file
    file length: 79.5 MiB (83379200 bytes)
    disk size: 79.5 MiB

wh1test commented 8 months ago

Hello Andrey, I have checked many times that I am using the correct link for OVA v1.6.0: https://github.com/siderolabs/talos/releases/download/v1.6.0/vmware-amd64.ova Have you tried to import the template into vSphere? I'm available for a remote session if that would help find a solution. My Telegram: @wh1test

smira commented 8 months ago

I don't have vSphere to test with. Even if I saw the failure, it might not help me much, as I don't know what the issue is. That's why I was trying to compare the working and non-working images to see if something changed.
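
One more check along the same lines is to confirm that a downloaded OVA is internally consistent, that is, that each file in the archive matches the digest recorded in its manifest. The sketch below assumes the OVA is a plain tar archive containing a `.mf` manifest with lines in the `SHA256(name)= digest` form shown in the diff above; the helper names `parse_manifest` and `verify_ova` are made up for illustration.

```python
import hashlib
import re
import tarfile

# Matches manifest lines such as "SHA256(disk.vmdk)= e95d34...".
MANIFEST_LINE = re.compile(r"^(SHA\d+)\((.+?)\)\s*=\s*([0-9a-fA-F]+)\s*$")


def parse_manifest(text: str) -> dict:
    """Return {filename: (hash algorithm, hex digest)} parsed from .mf text."""
    entries = {}
    for line in text.splitlines():
        match = MANIFEST_LINE.match(line.strip())
        if match:
            algo, name, digest = match.groups()
            entries[name] = (algo.lower(), digest.lower())
    return entries


def verify_ova(path: str) -> dict:
    """Return {filename: True/False} for every file listed in the OVA manifest."""
    results = {}
    with tarfile.open(path) as tar:
        mf_name = next((n for n in tar.getnames() if n.endswith(".mf")), None)
        if mf_name is None:
            raise ValueError("no .mf manifest found in archive")
        manifest = parse_manifest(tar.extractfile(mf_name).read().decode())
        for name, (algo, expected) in manifest.items():
            digest = hashlib.new(algo)
            member = tar.extractfile(name)
            # Hash in 1 MiB chunks so large disk images are not loaded whole.
            for chunk in iter(lambda: member.read(1 << 20), b""):
                digest.update(chunk)
            results[name] = digest.hexdigest() == expected
    return results


# Example: verify_ova("vmware-amd64.ova")
```

A passing result only shows that the archive is self-consistent; it would not explain a vSphere-side transfer failure.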

smira commented 8 months ago

I wonder if this is similar: https://github.com/aws/eks-anywhere/issues/4507

Does it work with v1.6.1 OVA? Does it work via the UI?

wh1test commented 8 months ago

Andrey, I'm ready to provide my vSphere for investigation. Just contact me on Telegram: @wh1test

Unfortunately, the v1.6.1 OVA doesn't work either. I also tried through the UI in two different vSphere environments without luck.

wh1test commented 8 months ago

I was able to import the OVA (v1.6.0 and v1.6.1) into vSphere, but only on version 6.7. Currently, OVA v1.6.0 and v1.6.1 are not working with vSphere 7.0.3. I hope this will be fixed.

wh1test commented 8 months ago

Finally, I found a workaround for importing the Talos OVA v1.6.0 into vSphere.

  1. Deploy the OVF template.
  2. Convert the VM to a template.
  3. Clone the template into the Content Library.

Then I was able to create the cluster using the vmware.sh script.
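
For anyone who would rather script the three steps than click through the UI, here is a rough sketch that only builds the corresponding govc command lines and prints them. The thread does not say whether the steps were done in the UI or via the CLI, so this mapping is an assumption. The subcommands (`import.ova`, `vm.markastemplate`, `library.clone`) and their flags are written from memory of govc's usage text and should be confirmed with `govc <command> -h`; the VM name, library name, and item name are hypothetical placeholders.

```python
import shlex


def workaround_commands(ova_path: str, vm_name: str, library: str, item_name: str) -> list:
    """Build (but do not run) govc invocations mirroring the three steps above."""
    return [
        # 1. Deploy the OVA/OVF as a regular VM (assumed flag: -name).
        ["govc", "import.ova", f"-name={vm_name}", ova_path],
        # 2. Convert the deployed VM to a template.
        ["govc", "vm.markastemplate", vm_name],
        # 3. Clone the template into the content library (assumed flag: -vm).
        ["govc", "library.clone", "-vm", vm_name, library, item_name],
    ]


# Hypothetical names; substitute values from your own environment.
commands = workaround_commands("vmware-amd64.ova", "talos-v1.6.0", "talos", "talos-v1.6.0")
for argv in commands:
    print(shlex.join(argv))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; review the output and check each flag before running it against a real vCenter.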

rothgar commented 4 days ago

Thanks for posting the workaround. We'll close this issue since you are no longer blocked and it appears to be a vSphere issue.