josenk / terraform-provider-esxi

Terraform-provider-esxi plugin
GNU General Public License v3.0

Number of vCPUs after apply is always 1 #124

Closed swevm closed 4 years ago

swevm commented 4 years ago

Describe the bug: Setting numvcpus to anything other than the number of vCPUs of the cloned image always results in a new VM with only 1 vCPU.

To Reproduce: Apply a clone configuration with numvcpus higher than 1.

Expected behavior: A new VM with the number of vCPUs given in the specification.

The error message from running terraform apply with TF_LOG=TRACE is below. It's the only anomaly I've been able to capture related to this issue, though I haven't looked at the ESXi side to see if I can find any error there.

2020/09/06 17:39:56 [WARN] Provider "registry.terraform.io/josenk/esxi" produced an unexpected new value for esxi_guest.Default, but we are tolerating it because it is using the legacy plugin SDK. The following problems may be the cause of any confusing errors from downstream operations:

josenk commented 4 years ago

Something like this has been reported in the past... https://github.com/josenk/terraform-provider-esxi/issues/100 But it was closed due to insufficient information/debugging. I was not able to re-create this issue. If you are interested, you may need to do additional debugging and send your complete debug log.

The most common reason this could happen is that the source system has a Guest OS type that doesn't allow more than one CPU (a Generic Linux, for example).

Check your source VM (the vmx file) to see if numvcpus is set.
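One way to run that check from a shell, as a minimal sketch. `vmx_get` is a hypothetical helper (it is not part of the provider or of ESXi), and it assumes the usual layout of one `key = "value"` pair per line.

```shell
# Print the value of a key from a .vmx file, or nothing if the key is absent.
# Hypothetical helper for illustration only.
vmx_get() {
  # $1 = path to the .vmx file, $2 = key name (for example numvcpus)
  # Keys are compared case-insensitively; surrounding quotes are stripped.
  awk -F' *= *' -v key="$2" \
    'tolower($1) == tolower(key) { gsub(/"/, "", $2); print $2 }' "$1"
}
```

Called as `vmx_get path/to/source.vmx numvcpus` (the path is a placeholder), it prints the configured value. An empty result means the file has no numvcpus entry at all.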

Have you tried other sources?

swevm commented 4 years ago

I went through the source image configuration and it's set as Linux/Ubuntu 64-bit. I also verified that the clone gets the same ostype setting. In the cloned VM there is no numvcpus entry.

I ran a quick test creating an empty VM. Now there is an entry for numvcpus, set to the same amount as I have in main.tf for testing.

I tried to debug on the ESX side to see if anything goes wrong during the clone process, but I'm not able to find anything. At the moment the only trace of something going wrong during the creation process is the log line in my initial post. Do you have any idea what else can be done to debug?

Btw I'm running ESXi 7.0, but that should not make a difference afaik.

josenk commented 4 years ago

Set Terraform debugging, then do a terraform apply.

$ export TF_LOG=DEBUG

Then send me the source and destination vmx file and the debug output.
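A minimal sketch of the same setup that also writes the log to a file, so it can be attached to the issue. `TF_LOG_PATH` is a standard Terraform environment variable; the file name used here is an arbitrary choice.

```shell
# Enable debug logging and send it to a file instead of the terminal.
export TF_LOG=DEBUG
export TF_LOG_PATH="./terraform-debug.log"   # arbitrary file name

# Then run the apply as usual, in the same shell session:
#   terraform apply
```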

swevm commented 4 years ago

Haven't had time to run with debug yet, but I have some updates from testing earlier today. I tested three different ways to create a VM:

Changes in the VMX file for the different test cases:

vCPU-related config in VMX for cases 1 & 2:

$ cat cloudimg1.vmx | grep cpu
numvcpus = "2"
numa.autosize.vcpu.maxPerVirtualNode = "2"

vCPU-related config in VMX for case 3:

numa.autosize.vcpu.maxPerVirtualNode = "1"

josenk commented 4 years ago

Can you please try cloning a different source VM? I have cloned hundreds of VMs and I have not seen this issue.

To debug this further, I will need a copy of the source and destination vmx file and the debug output.

Btw, the 'numvcpus: was cty.StringVal("2"), but now cty.StringVal("1")' error is simply telling you that the current cpus is 1, but it should have been 2... There's no hint there to tell you why.

swevm commented 4 years ago

Cloned another VM. Still the same issue.

I don't see why different OSs should behave differently, as numvcpus changes are made at the control plane level and have nothing to do with the OS, apart from ostype, which controls the VM hardware config boundaries. I can manually change numvcpus in the UI, so there is no limitation from ostype or VM hardware version for this test VM.

Debug log attached.

Original vmx:

.encoding = "UTF-8"
config.version = "8"
virtualHW.version = "17"
vmci0.present = "TRUE"
floppy0.present = "FALSE"
memSize = "1024"
bios.bootRetry.delay = "10"
powerType.suspend = "soft"
tools.upgrade.policy = "manual"
sched.cpu.units = "mhz"
sched.cpu.affinity = "all"
vm.createDate = "1599495617334373"
scsi0.virtualDev = "pvscsi"
scsi0.present = "TRUE"
sata0.present = "TRUE"
usb.present = "TRUE"
ehci.present = "TRUE"
ethernet0.virtualDev = "vmxnet3"
ethernet0.networkName = "VM Home Network"
ethernet0.addressType = "generated"
ethernet0.wakeOnPcktRcv = "FALSE"
ethernet0.uptCompatibility = "TRUE"
ethernet0.present = "TRUE"
sata0:0.startConnected = "FALSE"
sata0:0.deviceType = "atapi-cdrom"
sata0:0.fileName = "auto detect"
sata0:0.present = "TRUE"
displayName = "rancheros1"
guestOS = "coreos-64"
toolScripts.afterPowerOn = "TRUE"
toolScripts.afterResume = "TRUE"
toolScripts.beforeSuspend = "TRUE"
toolScripts.beforePowerOff = "TRUE"
tools.syncTime = "FALSE"
uuid.bios = "56 4d 6e 76 54 9b 7e 0c-fe 54 b3 b5 34 6e 0a d9"
uuid.location = "56 4d 6e 76 54 9b 7e 0c-fe 54 b3 b5 34 6e 0a d9"
vc.uuid = "52 72 54 3e 41 48 fc fd-c9 a3 6e 27 f0 09 64 6c"
sched.cpu.min = "0"
sched.cpu.shares = "normal"
sched.mem.min = "0"
sched.mem.minSize = "0"
sched.mem.shares = "normal"
scsi0:0.deviceType = "scsi-hardDisk"
scsi0:0.fileName = "rancheros-vmware.vmdk"
sched.scsi0:0.shares = "normal"
sched.scsi0:0.throughputCap = "off"
scsi0:0.present = "TRUE"
ethernet0.generatedAddress = "00:0c:29:6e:0a:d9"
vmci0.id = "879626969"
cleanShutdown = "TRUE"
tools.guest.desktop.autolock = "FALSE"
nvram = "rancheros1.nvram"
pciBridge0.present = "TRUE"
svga.present = "TRUE"
pciBridge4.present = "TRUE"
pciBridge4.virtualDev = "pcieRootPort"
pciBridge4.functions = "8"
pciBridge5.present = "TRUE"
pciBridge5.virtualDev = "pcieRootPort"
pciBridge5.functions = "8"
pciBridge6.present = "TRUE"
pciBridge6.virtualDev = "pcieRootPort"
pciBridge6.functions = "8"
pciBridge7.present = "TRUE"
pciBridge7.virtualDev = "pcieRootPort"
pciBridge7.functions = "8"
hpet0.present = "TRUE"
RemoteDisplay.maxConnections = "-1"
sched.cpu.latencySensitivity = "normal"
svga.autodetect = "TRUE"
sata0:0.autodetect = "TRUE"
numa.autosize.cookie = "10012"
numa.autosize.vcpu.maxPerVirtualNode = "1"
sched.swap.derivedName = "/vmfs/volumes/5f0233e9-aace1a4c-af8e-001b211e3072/rancheros1/rancheros1-ec51ca15.vswp"
pciBridge0.pciSlotNumber = "17"
pciBridge4.pciSlotNumber = "21"
pciBridge5.pciSlotNumber = "22"
pciBridge6.pciSlotNumber = "23"
pciBridge7.pciSlotNumber = "24"
scsi0.pciSlotNumber = "160"
usb.pciSlotNumber = "32"
ethernet0.pciSlotNumber = "192"
ehci.pciSlotNumber = "33"
vmci0.pciSlotNumber = "34"
sata0.pciSlotNumber = "35"
scsi0.sasWWID = "50 05 05 66 54 9b 7e 00"
vmotion.checkpointFBSize = "16777216"
vmotion.checkpointSVGAPrimarySize = "16777216"
vmotion.svga.mobMaxSize = "16777216"
vmotion.svga.graphicsMemoryKB = "16384"
ethernet0.generatedAddressOffset = "0"
monitor.phys_bits_used = "45"
softPowerOff = "FALSE"
usb:0.present = "TRUE"
usb:0.deviceType = "hid"
usb:0.port = "0"
usb:0.parent = "-1"
usb:1.speed = "2"
usb:1.present = "TRUE"
usb:1.deviceType = "hub"
usb:1.port = "1"
usb:1.parent = "-1"
migrate.hostLog = "./rancheros1-ec51ca15.hlog"

Destination vmx:

.encoding = "UTF-8"
config.version = "8"
virtualHW.version = "17"
pciBridge0.present = "TRUE"
svga.present = "TRUE"
pciBridge4.present = "TRUE"
pciBridge4.virtualDev = "pcieRootPort"
pciBridge4.functions = "8"
pciBridge5.present = "TRUE"
pciBridge5.virtualDev = "pcieRootPort"
pciBridge5.functions = "8"
pciBridge6.present = "TRUE"
pciBridge6.virtualDev = "pcieRootPort"
pciBridge6.functions = "8"
pciBridge7.present = "TRUE"
pciBridge7.virtualDev = "pcieRootPort"
pciBridge7.functions = "8"
vmci0.present = "TRUE"
hpet0.present = "TRUE"
floppy0.present = "FALSE"
memSize = "12288"
powerType.suspend = "soft"
tools.upgrade.policy = "manual"
sched.cpu.units = "mhz"
vm.createDate = "1599496342511920"
usb.pciSlotNumber = "32"
ehci.pciSlotNumber = "33"
usb.present = "TRUE"
ehci.present = "TRUE"
scsi0.virtualDev = "pvscsi"
scsi0.pciSlotNumber = "160"
scsi0.present = "TRUE"
sata0.pciSlotNumber = "35"
sata0.present = "TRUE"
scsi0:0.deviceType = "scsi-hardDisk"
scsi0:0.fileName = "rancherostest1.vmdk"
scsi0:0.present = "TRUE"
sata0:0.startConnected = "FALSE"
sata0:0.deviceType = "atapi-cdrom"
sata0:0.fileName = "CD/DVD drive 0"
sata0:0.present = "TRUE"
vmci0.pciSlotNumber = "34"
displayName = "rancherostest1"
guestOS = "coreos-64"
toolScripts.afterPowerOn = "TRUE"
toolScripts.afterResume = "TRUE"
toolScripts.beforeSuspend = "TRUE"
toolScripts.beforePowerOff = "TRUE"
tools.syncTime = "FALSE"
uuid.bios = "56 4d dc 0f 8d 9e 43 ac-ff 39 e1 37 38 b7 51 c3"
uuid.location = "56 4d dc 0f 8d 9e 43 ac-ff 39 e1 37 38 b7 51 c3"
vc.uuid = "52 54 9a 2d c3 8d 62 cb-53 31 74 9f ff f0 1e b9"
nvram = "rancherostest1.nvram"
svga.autodetect = "TRUE"
sched.cpu.shares = "normal"
ethernet0.networkName = "VM Home Network"
ethernet0.virtualDev = "e1000"
ethernet0.present = "TRUE"

disk.EnableUUID = "TRUE"
numa.autosize.cookie = "10012"
numa.autosize.vcpu.maxPerVirtualNode = "1"
sched.swap.derivedName = "/vmfs/volumes/0acf94f8-efba6d09/rancherostest1/rancherostest1-791bdb21.vswp"
migrate.hostlog = "./rancherostest1-791bdb21.hlog"
scsi0:0.redo = ""
pciBridge0.pciSlotNumber = "17"
pciBridge4.pciSlotNumber = "21"
pciBridge5.pciSlotNumber = "22"
pciBridge6.pciSlotNumber = "23"
pciBridge7.pciSlotNumber = "24"
ethernet0.pciSlotNumber = "36"
scsi0.sasWWID = "50 05 05 6f 8d 9e 43 a0"
svga.vramSize = "16777216"
vmotion.checkpointFBSize = "16777216"
vmotion.checkpointSVGAPrimarySize = "16777216"
vmotion.svga.mobMaxSize = "16777216"
vmotion.svga.graphicsMemoryKB = "16384"
ethernet0.addressType = "generated"
ethernet0.generatedAddress = "00:0c:29:b7:51:c3"
ethernet0.generatedAddressOffset = "0"
vmci0.id = "951538115"
monitor.phys_bits_used = "45"
cleanShutdown = "FALSE"
softPowerOff = "FALSE"
usb:0.present = "TRUE"
usb:0.deviceType = "hid"
usb:0.port = "0"
usb:0.parent = "-1"
usb:1.speed = "2"
usb:1.present = "TRUE"
usb:1.deviceType = "hub"
usb:1.port = "1"
usb:1.parent = "-1"

Terraform debug.pdf
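For anyone repeating this comparison, a rough sketch of how the CPU-related lines of two vmx files could be compared from a shell. `vmx_cpu_diff` is a hypothetical helper, not part of the provider, and it assumes each file holds one `key = "value"` pair per line.

```shell
# Show CPU-related settings that differ between two .vmx files.
# Hypothetical helper for illustration only.
vmx_cpu_diff() {
  # $1 = source .vmx, $2 = destination .vmx
  src=$(mktemp)
  dst=$(mktemp)
  # Keep only lines mentioning "cpu" and sort them so ordering does not matter.
  grep -i 'cpu' "$1" | sort > "$src"
  grep -i 'cpu' "$2" | sort > "$dst"
  # "<" lines exist only in the source, ">" lines only in the destination.
  diff "$src" "$dst"
  rc=$?
  rm -f "$src" "$dst"
  return "$rc"
}
```

In the two files posted above, neither the original nor the destination contains a numvcpus line, and both have numa.autosize.vcpu.maxPerVirtualNode set to "1".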

josenk commented 4 years ago

Thanks for the logs and vmx... I was able to make some progress in figuring out the root cause. I'll do some additional testing, then I should be able to release a new version with the fix.

josenk commented 4 years ago

https://github.com/josenk/terraform-provider-esxi/releases/tag/v1.7.2 has been published. This should fix the problem.

Review your versions.tf file (or wherever you define the 'required_providers' block). Then run terraform init to download the latest provider version.
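For illustration, a sketch of what such a block might look like, written to a scratch file named versions.tf.example (a hypothetical name, chosen so that no real configuration is overwritten). The source address josenk/esxi is taken from the provider address in the log line earlier in this thread; the ">= 1.7.2" constraint is an assumption based on the release mentioned above.

```shell
# Write an example required_providers block to a scratch file.
# The file name and the version constraint are illustrative assumptions.
cat > versions.tf.example <<'EOF'
terraform {
  required_providers {
    esxi = {
      source  = "josenk/esxi"
      version = ">= 1.7.2"
    }
  }
}
EOF

# After copying the block into your real configuration, re-initialize:
#   terraform init -upgrade
```

If an older version of the provider is already installed, `terraform init -upgrade` asks Terraform to pick the newest version the constraint allows.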

swevm commented 4 years ago

Working as expected now. Thanks for a quick fix 👍