Open gred7 opened 3 years ago
Files identified in the description:
plugins/modules/vmware_guest.py
If these files are inaccurate, please update the component name section of the description or use the !component bot command.
cc @Akasurde @Tomorrow9 @goneri @lparkes @nerzhul @pdellaert @pgbidkar @warthog9
Also, I do not see the case of two networks (or rather, more than one) covered in your tests.
I can verify the behaviour that @gred7 reported: networks are added but left unconnected. Additionally, the Connect At Power On option is also not checked. In my case, I faced that issue after upgrading Ansible from 2.9.13 to 4.1.0, and there was only one network card.
This happens when VMware Tools does not report that guest OS customization completed successfully. Unfortunately, it seems to happen even when customization is not requested. I am also fairly confident that I started seeing this behavior after a recent upgrade.
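If the problem is indeed that VMware Tools never reports customization as finished, one possible mitigation is to wait for Tools explicitly before touching the NICs. This is only a sketch with placeholder variable names (vcenter_hostname, vm_name, etc.), not a confirmed fix for this issue:

    # Sketch only: wait for VMware Tools before reconfiguring the NICs.
    - name: Wait for VMware Tools to report in
      community.vmware.vmware_guest_tools_wait:
        hostname: '{{ vcenter_hostname }}'
        username: '{{ vcenter_username }}'
        password: '{{ vcenter_password }}'
        validate_certs: false
        name: '{{ vm_name }}'
      delegate_to: localhost

A follow-up task could then re-apply the connected states, as several workarounds later in this thread do.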
I am using Ansible 4.1.0 and not facing this at all. I am deploying more than 100 OVAs and templates in my vCenter. If the customization doesn't work, then it won't attach the interface.
$ pip3 show ansible
Name: ansible
Version: 4.1.0
Summary: Radically simple IT automation
Home-page: https://ansible.com/
Author: Ansible, Inc.
Author-email: info@ansible.com
License: GPLv3+
Location: /usr/local/lib/python3.8/dist-packages
Requires: ansible-core
Required-by:
For windows, here is what I am using:
- name: Deploying vm from '{{ win_temp }}'
  vmware_guest:
    hostname: '{{ vcenter_hostname }}'
    username: '{{ vcenter_username }}'
    password: '{{ vcenter_password }}'
    datacenter: '{{ vsphere_datacenter }}'
    cluster: '{{ vsphere_cluster }}'
    datastore: '{{ vsphere_datastore }}'
    name: '{{ inventory_hostname }}'
    template: '{{ win_temp }}'
    folder: '{{ folder }}'
    validate_certs: 'no'
    networks:
For linux, here is what I am using:
- name: Deploying vm from '{{ lin_temp }}'
  vmware_guest:
    hostname: '{{ vcenter_hostname }}'
    username: '{{ vcenter_username }}'
    password: '{{ vcenter_password }}'
    datacenter: '{{ vsphere_datacenter }}'
    cluster: '{{ vsphere_cluster }}'
    datastore: '{{ vsphere_datastore }}'
    name: '{{ inventory_hostname }}'
    template: '{{ lin_temp }}'
    folder: '{{ folder }}'
    validate_certs: 'no'
    networks:
Hope this may help you.
Also seeing the same issue on vSphere 6.7 deploying Linux machines, let me know if any debug output is required.
Ansible version
$ ansible --version
ansible 2.10.11
config file = /etc/ansible/ansible.cfg
configured module search path = ['/home/mattb/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3/dist-packages/ansible
executable location = /usr/bin/ansible
python version = 3.8.5 (default, May 27 2021, 13:30:53) [GCC 9.3.0]
Workstation OS
$ lsb_release -a
LSB Version: core-11.1.0ubuntu2-noarch:security-11.1.0ubuntu2-noarch
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal
As an aside, I worked around this issue by adding an arbitrary wait timeout and then applying the connected states again. I thought it might be useful for anyone finding this issue like I did.
- name: Wait 30 seconds for VM deployment
wait_for:
timeout: 30
- name: Fix unconnected network
vmware_guest:
hostname: '{{ vc_ipaddress }}'
username: '{{ vault_vc_username }}'
password: '{{ vault_vc_password }}'
validate_certs: False
datacenter: '{{ vc_datacenter }}'
name: '{{ vm_name }}'
networks:
- name: '{{ vc_vm_net_name }}'
connected: yes
start_connected: yes
wait_for is usually not needed if you use "connected: yes" in the main play, as I mentioned earlier.
@Udayendu not for me. I have connected and start_connected both set to yes in the initial play, and still the interface doesn't get connected. Perhaps that's a difference between your 4.1.0 and my 2.10.11?
connected: yes
start_connected: yes
OK. That should not be the case, because my code has been working well since 2.9. Are you facing this issue only with Ubuntu, or with other Linux OSes as well?
Honestly, I haven't tried running it from another OS. I ran into the issue a couple of days ago, which led me to the OP's post, and I thought I'd add a +1 to it. I don't really have the bandwidth right now to test from other platforms, but if I can provide any debug output from what I have, I'd be happy to help.
I also faced a similar issue when cloning RHEL 7.9/8.4 VMs. For example, I found the following error log at /var/log/vmware-imc/toolsDeployPkg.log:
[2021-06-25T22:38:57.739Z] [ error] execv failed to run (/usr/bin/cloud-init), errno=(2), error message:(No such file or directory)
[2021-06-25T22:38:57.743Z] [ info] Process exited normally after 0 seconds, returned 127
[2021-06-25T22:38:57.743Z] [ info] No more output from stdout
[2021-06-25T22:38:57.743Z] [ info] No more output from stderr
[2021-06-25T22:38:57.743Z] [ info] Customization command output:
''.
[2021-06-25T22:38:57.743Z] [ error] Customization command failed with exitcode: 127, stderr: ''.
[2021-06-25T22:38:57.743Z] [ info] cloud-init is not installed.
[2021-06-25T22:38:57.743Z] [ info] Executing traditional GOSC workflow.
[2021-06-25T22:38:57.743Z] [ debug] Command to exec : '/usr/bin/perl'.
[2021-06-25T22:38:57.743Z] [ info] sizeof ProcessInternal is 56
[2021-06-25T22:38:57.744Z] [ info] Returning, pending output from stdout
[2021-06-25T22:38:57.744Z] [ info] Returning, pending output from stderr
[2021-06-25T22:38:57.749Z] [ error] execv failed to run (/usr/bin/perl), errno=(2), error message:(No such file or directory)
[2021-06-25T22:38:57.751Z] [ info] Process exited normally after 0 seconds, returned 127
[2021-06-25T22:38:57.751Z] [ info] No more output from stdout
[2021-06-25T22:38:57.751Z] [ info] No more output from stderr
[2021-06-25T22:38:57.751Z] [ info] Customization command output:
It indicates that the script was not able to find /usr/bin/cloud-init or /usr/bin/perl. Are these packages required by the current vmware_guest module?
After I installed the perl and cloud-init packages into the template VM and cloned it using the vmware_guest module, the VM network connected as expected.
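Based on the finding above, a possible pre-flight check is to verify that the template guest actually ships the binaries the customization engine calls before cloning from it. This is a sketch that assumes the template guest is reachable as a regular Ansible host:

    # Sketch: fail early if the template guest lacks the binaries GOSC needs.
    - name: Check customization prerequisites in the template guest
      ansible.builtin.command: 'test -x {{ item }}'
      loop:
        - /usr/bin/perl
        - /usr/bin/cloud-init
      changed_when: false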
I have a similar issue. In my case I wanted to do a vm clone without any modifications. This should not involve any custom specs.
- name: Create a VM from a Template
community.vmware.vmware_guest:
hostname: "{{ vcenter.hostname }}"
username: "{{ vcenter.username }}"
password: "{{ vcenter.password }}"
datacenter: "{{ vm.datacenter }}"
validate_certs: false
name: "{{ vm.hostname }}"
template: "{{ vm.template }}"
folder: "{{ vm.folder }}"
cluster: "{{ vm.cluster }}"
datastore: "{{ vm.datastore }}"
state: poweredon
hardware:
num_cpus: "{{ vm.cpu_cores}}"
memory_mb: "{{ vm.memory_mb }}"
networks:
- name: "{{ vm.n1_name }}"
- name: "{{ vm.n2_name }}"
- name: "{{ vm.n3_name }}"
- name: "{{ vm.n4_name }}"
- name: "{{ vm.n5_name }}"
delegate_to: localhost
After launching the playbook with -vvvvv, I noticed this part of the debug output:
...
"networks": [
{
"name": "pg1",
"type": "dhcp"
},
{
"name": "pg2",
"type": "dhcp"
},
{
"name": "pg3",
"type": "dhcp"
},
{
"name": "pg4",
"type": "dhcp"
},
{
"name": "pg5",
"type": "dhcp"
}
]
...
This play resulted in the VM being deployed, but all network interfaces were in the disconnected state. Doing some research on the VMware side, I came across this event message:
Reconfigured VMNAME on ESX_NAME in DC_NAME. Modified:
config.tools.pendingCustomization: <unset> -> "/vmfs/volumes/5e29bffb-42291496-5039-0025b513aa0d/VMNAME/imcf-oRxxAd";
config.hardware.device(4002).connectable.startConnected: true -> false;
config.hardware.device(4001).connectable.startConnected: true -> false;
config.hardware.device(4004).connectable.startConnected: true -> false;
config.hardware.device(4003).connectable.startConnected: true -> false;
config.hardware.device(4000).connectable.startConnected: true -> false;
Added: config.extraConfig("tools.deployPkg.fileName"): (key = "tools.deployPkg.fileName", value = "imcf-oRxxAd");
So I assume that during customization, VMware disables the network interfaces and re-enables them after the custom spec is finalized. In my case, the template is not customizable, and therefore after some time (5-10 minutes) customization fails and the NICs are never enabled. VMware event log line:
An error occurred while customizing VMNAME. For details reference the log file <No Log> in the guest OS.
I dug a bit into the code to check why customizations are launched even when I do not provide any OS-level parameters. This brought me to these lines: https://github.com/ansible-collections/community.vmware/blob/ae8bcbbecb68999ab9a580a9c725434959e570ba/plugins/modules/vmware_guest.py#L2745-L2755
I have tested this piece of code on my inputs, and indeed it resulted in execution of a custom spec.
I think this behaviour should be changed so that if type is not provided, there are no modifications. Another idea is to introduce a third default value for the type field, such as null or None.
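Until the module behaviour changes, one workaround implied by the analysis above may be to omit the networks key entirely when no guest customization is wanted, so that no custom spec is built and the NICs inherited from the template keep their state. A sketch (untested, with placeholder variables):

    # Sketch: clone without a networks key so vmware_guest builds no
    # custom spec; the template's NICs are left as they are.
    - name: Clone VM without triggering customization
      community.vmware.vmware_guest:
        hostname: "{{ vcenter.hostname }}"
        username: "{{ vcenter.username }}"
        password: "{{ vcenter.password }}"
        validate_certs: false
        datacenter: "{{ vm.datacenter }}"
        name: "{{ vm.hostname }}"
        template: "{{ vm.template }}"
        folder: "{{ vm.folder }}"
        cluster: "{{ vm.cluster }}"
        state: poweredon
      delegate_to: localhost

The trade-off is that hardware and network changes then have to be applied in separate follow-up tasks.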
If you intend to perform OS customization, then you probably need to troubleshoot custom spec execution at the OS level.
For reference, another similar issue: https://github.com/ansible/ansible/issues/24193
We are also experiencing this with Ansible Tower 3.7.5 / Ansible 2.9.18. We create a new VM from a template and include connected: yes and start_connected: yes in the module's network params, and it fails to set them. We added a follow-on step that runs the vmware_guest_network module to force these to be set after the VM is successfully created.
##
## create a VM from template, powered off state, then add disks and tags
##
## THIS FAILS TO SET start_connected / connected to True.
##
- name: create the guest vm using template
community.vmware.vmware_guest:
validate_certs: no
hostname: "{{ vcenter[location|lower].vc }}"
datacenter: "{{ vcenter[location|lower].dc }}"
cluster: "{{ vcenter[location|lower].cl }}"
name: "{{ vm_guest_name | lower }}"
state: poweredoff
template: "{{ os_type }}"
folder: "{{ esx_folder }}"
datastore: "{{ vcenter[location|lower].ds }}"
hardware:
hotadd_cpu: yes
hotadd_memory: yes
memory_mb: "{{ vm_spec[vm_size].ram }}"
num_cpus: "{{ vm_spec[vm_size].cpu }}"
networks:
- name: "VLAN_{{ vlan }}"
type: dhcp
start_connected: yes
connected: yes
wait_for_ip_address: no
delegate_to: localhost
register: newvm
##
## ensure the network connects on startup
##
## THIS SUCCEEDS TO SET start_connected / connected to True.
##
- name: set the vm network to connect at startup
community.vmware.vmware_guest_network:
validate_certs: no
hostname: "{{ vcenter[location|lower].vc }}"
datacenter: "{{ vcenter[location|lower].dc }}"
cluster: "{{ vcenter[location|lower].cl }}"
name: "{{ vm_guest_name | lower }}"
mac_address: "{{ newvm.instance.hw_eth0.macaddress }}"
network_name: "VLAN_{{ vlan }}"
start_connected: yes
connected: yes
I'm also experiencing this issue in playbooks that previously worked fine. I'm attempting the workaround that @walterrowe is suggesting but it's a bit more work, since my VMs have multiple NICs depending on their role.
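For VMs with multiple NICs, the same workaround can be looped over the hw_eth* facts that vmware_guest returns. This is only a sketch, assuming the create task was registered as newvm and reusing the placeholder variables from the workaround above:

    # Sketch: reconnect every NIC reported back by the registered create task.
    # Assumes 'newvm' was registered from community.vmware.vmware_guest.
    - name: Force every NIC to connected / start_connected
      community.vmware.vmware_guest_network:
        hostname: '{{ vc_ipaddress }}'
        username: '{{ vault_vc_username }}'
        password: '{{ vault_vc_password }}'
        validate_certs: false
        name: '{{ vm_name }}'
        mac_address: "{{ item.value.macaddress }}"
        start_connected: true
        connected: true
      loop: "{{ newvm.instance | dict2items | selectattr('key', 'match', 'hw_eth') | list }}"
      delegate_to: localhost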
For the last few months I have been doing the same. I build my own templates for both Windows and Linux. As part of our solution we need to add multiple NICs to the VM, so the template has no NIC added by default. As required, I keep adding NICs and configuring them with the vmware_guest_network module. So far I have not seen any failure with this approach.
But the vmware_guest module needs a fix for sure, as it is not able to handle VM deployment with multiple NICs, and sometimes even with a single NIC.
I'm also experiencing this issue in playbooks that previously worked fine.
@docandrew What do you mean with "previously"? Could you give a version where it worked, and the version where it stopped working for you? Preferably the version of the community.vmware collection, but even the Ansible (community package) version would help. This might make it easier to troubleshoot.
vmware_guest module needs a fix for sure, as it is not able to handle VM deployment with multiple NICs, and sometimes even with a single NIC.
@Udayendu I've tested with Ansible 7.2.0 (community.vmware 3.3.0) today, and the issue still exists. It looks like it works fine if you create a new VM, but not if you deploy from a template. I just don't understand why. vmware_guest is a bit... complex :-/
Now this is interesting. I deliberately crash the module with self.module.fail_json(msg="Template deployed") directly after deploying from / cloning the template here:
I see the following events in the vCenter:
The first Reconfigured virtual machine event adds the NICs with startConnected = true and connected = false:
Added: config.hardware.device(4001): (dynamicProperty = <unset>, key = 4001, deviceInfo = (label = "Network adapter 2", summary = "DVSwitch: 50 0a 18 77 60 a4 91 07-02 b9 9f 0b b2 40 58 bd"), backing = (port = (switchUuid = "50 0a 18 77 60 a4 91 07-02 b9 9f 0b b2 40 58 bd", portgroupKey = "dvportgroup-103366", portKey = "2196", connectionCookie = 1778155007)), connectable = (migrateConnect = "unset", startConnected = true, allowGuestControl = true, connected = false, status = "untried"), slotInfo = null, controllerKey = 100, unitNumber = 8, numaNode = <unset>, addressType = "assigned", macAddress = "00:50:56:8a:02:4c", wakeOnLanEnabled = true, resourceAllocation = (reservation = 0, share = (shares = 50, level = "normal"), limit = -1), externalId = <unset>, uptCompatibilityEnabled = true, uptv2Enabled = <unset>); config.hardware.device(4000): (dynamicProperty = <unset>, key = 4000, deviceInfo = (label = "Network adapter 1", summary = "DVSwitch: 50 0a 18 77 60 a4 91 07-02 b9 9f 0b b2 40 58 bd"), backing = (port = (switchUuid = "50 0a 18 77 60 a4 91 07-02 b9 9f 0b b2 40 58 bd", portgroupKey = "dvportgroup-30", portKey = "91", connectionCookie = 1778151995)), connectable = (migrateConnect = "unset", startConnected = true, allowGuestControl = true, connected = false, status = "untried"), slotInfo = null, controllerKey = 100, unitNumber = 7, numaNode = <unset>, addressType = "assigned", macAddress = "00:50:56:8a:ba:c5", wakeOnLanEnabled = true, resourceAllocation = (reservation = 0, share = (shares = 50, level = "normal"), limit = -1), externalId = <unset>, uptCompatibilityEnabled = true, uptv2Enabled = <unset>)
Then the second Reconfigured virtual machine event changes startConnected to false:
config.hardware.device(4001).connectable.startConnected: true -> false; config.hardware.device(4000).connectable.startConnected: true -> false;
@mariolenz thank you for looking into this - in my case I am updating playbooks that had been written for an Ansible version prior to the splitting off of the community VMWare module. Having done some more investigation I don't think it's the same issue that others are referencing in this thread. The playbooks used to run on a single ESXi host with all the necessary portgroups attached to a vSwitch on that host, and now they are a clustered ESXi setup and the port groups were only being added to one of the hosts.
I'm not sure if this is something detectable from the vmware module or not. If it is, it might be helpful to get an outright failure to create the VM when the network isn't present on the ESXi host it is about to be created on. In any case, mea culpa with regards to my comment.
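As a possible pre-flight check for this failure mode (portgroups present on only some hosts of a cluster), something like the following could gather portgroup info before the VM is created. This is a sketch with placeholder variables; the exact return structure of the info module should be verified against its documentation before asserting on it:

    # Sketch: gather host portgroup info for the cluster so a play can
    # fail early if the target portgroup is missing from some hosts.
    - name: Gather portgroup info for the cluster
      community.vmware.vmware_portgroup_info:
        hostname: '{{ vcenter_hostname }}'
        username: '{{ vcenter_username }}'
        password: '{{ vcenter_password }}'
        validate_certs: false
        cluster_name: '{{ cluster_name }}'
      delegate_to: localhost
      register: pg_info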
I am facing a similar issue.
$ansible --version
ansible [core 2.15.1]
config file = None
configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python3.9/site-packages/ansible
ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/local/bin/ansible
python version = 3.9.17 (main, Jul 4 2023, 06:21:22) [GCC 12.2.0] (/usr/local/bin/python)
jinja version = 3.1.2
libyaml = True
I added the following in network section
networks:
- connected: true
name: "{{ network_name }}"
start_connected: true
Still, whenever the VM is created, the network interface is always disconnected. If I connect the interface from vCenter, the VM connects to the port group. Please advise.
SUMMARY
ansible 2.9.22
vmware_guest:
  hostname: "some"
  username: "someone"
  password: "somepass"
  template: "{{ vmtemplate }}"
  validate_certs: false
  folder: ""
  datacenter: qarea
  name: "{{ tempname }}"
  state: poweredon
  guest_id: ubuntu64Guest
  cluster: "DRS"
  disk:
ISSUE TYPE
COMPONENT NAME
vmware_guest
ANSIBLE VERSION
CONFIGURATION
OS / ENVIRONMENT
ubuntu linux Linux jenkins 4.4.0-210-generic #242-Ubuntu SMP Fri Apr 16 09:57:56 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
STEPS TO REPRODUCE
EXPECTED RESULTS
When connected is set to true, both (or one of the) networks should be connected on VM power-on.
ACTUAL RESULTS
Networks are added but left unconnected. In the VM I see two interfaces in NO-CARRIER/DOWN state. If I log into the vSphere web interface, I see both networks in the unconnected state. I am able to set them to connected by hand; then the VM interfaces come up and get IP addresses.