ansible-collections / community.vmware

Ansible Collection for VMware
GNU General Public License v3.0

Network not applied on Almalinux 9 after reboot #1795

Closed TheFrisianClause closed 1 year ago

TheFrisianClause commented 1 year ago
SUMMARY

After creating an AlmaLinux 9 template on vSphere 8.x and deploying a VM from it with Ansible (vmware_guest module), the new network parameters are applied. But when the VM is rebooted, the network parameters revert to the template's network parameters.

ISSUE TYPE

Creating a VM from a template, then deploying it with Ansible. The deployment is successful, but after a reboot the VM loses the IP addresses provisioned by Ansible.

COMPONENT NAME

community.vmware.vmware_guest

ANSIBLE VERSION

ansible [core 2.14.2]

COLLECTION VERSION

community.vmware  3.7.0

CONFIGURATION

CONFIG_FILE() = /etc/ansible/ansible.cfg
DEFAULT_HOST_LIST(/etc/ansible/ansible.cfg) = ['/srv/ansible/inventory/hosts']
DEFAULT_ROLES_PATH(/etc/ansible/ansible.cfg) = ['/srv/ansible/roles']
HOST_KEY_CHECKING(/etc/ansible/ansible.cfg) = False

OS / ENVIRONMENT

VMware ESXi, 8.0.0, 21203435 vCenter 8.0.1.00200

STEPS TO REPRODUCE

Create an AlmaLinux 9 template on VMware and deploy a VM from it with Ansible. Then reboot the VM.

EXPECTED RESULTS

After the Ansible deployment and a reboot, the VM should keep the IP assigned by Ansible and not fall back to the IP configured in the template.
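
For reference, a deployment of the kind described in this report might look like the minimal sketch below. The original post does not include a playbook, so every value here is a hypothetical placeholder: the `vcenter_*` variables, datacenter, folder, VM and template names, port group, and addresses. Only the module name `community.vmware.vmware_guest` comes from the report.

```yaml
# Sketch only: all names and addresses are illustrative assumptions.
- name: Deploy AlmaLinux 9 VM from template
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Clone the template and apply guest network customization
      community.vmware.vmware_guest:
        hostname: "{{ vcenter_hostname }}"   # hypothetical variable
        username: "{{ vcenter_username }}"   # hypothetical variable
        password: "{{ vcenter_password }}"   # hypothetical variable
        validate_certs: true
        datacenter: DC1                      # hypothetical datacenter
        folder: /DC1/vm
        name: alma9-test                     # hypothetical VM name
        template: alma9-template             # hypothetical template name
        state: poweredon
        networks:
          - name: VM Network                 # hypothetical port group
            type: static
            ip: 192.0.2.10
            netmask: 255.255.255.0
            gateway: 192.0.2.1
        customization:
          hostname: alma9-test
          dns_servers:
            - 192.0.2.53
        # Wait until guest customization has finished before reporting success
        wait_for_customization: true
        wait_for_ip_address: true
```

Rebooting the resulting VM and comparing the address on the interface before and after would reproduce the check described above.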

ansibullbot commented 1 year ago

Files identified in the description: None

If these files are inaccurate, please update the component name section of the description or use the !component bot command.


TheFrisianClause commented 1 year ago

Still having this issue. Did someone manage to find a solution? This looks to me like a pretty big issue.

ihumster commented 1 year ago

@TheFrisianClause Have you tried this workaround? https://kb.vmware.com/s/article/88199

TheFrisianClause commented 1 year ago

Yeah, I already tried a lot of solutions, but no fix as of yet. The problem is that the VM is deployed with the correct IP, but once the VM is rebooted the IP is gone.

After the reboot, the network device apparently shows as unknown.

ihumster commented 1 year ago

@TheFrisianClause Do you get similar behavior if you use vSphere customization when cloning a template from the GUI? (I'm just trying to figure out if this problem is a problem with the ansible module or a problem with vSphere customization in general).

TheFrisianClause commented 1 year ago

I tested it, and when cloning manually via the GUI it also says that the network device is unknown. Maybe it is something with the template or vSphere templating and not Ansible?

ihumster commented 1 year ago

Yep, this is exactly what I wanted to say. In my opinion, the first rule of troubleshooting Ansible is to make sure that Ansible is actually to blame. =)

TheFrisianClause commented 1 year ago

Sure, but how is it possible that when Ansible deploys the VM from the template, the IPs are actually assigned and the network device is up and running, yet after a reboot it no longer works?

It is also strange that this behaviour does not occur on Ubuntu.

ihumster commented 1 year ago

All customization scripts 'live' in vCenter Server. The Ansible modules only make certain API calls. There are different customization scripts for different OSes.

TheFrisianClause commented 1 year ago

Ah, seems fair. Well, it seems I have to continue troubleshooting :) Thanks for the help anyway! :)

TheFrisianClause commented 1 year ago

Sorry, I will have to reopen this. I just created a fresh template, and deploying a VM from it works. The network device is working as well. I think it didn't work last time because I had applied all of the 'solutions' that were suggested, which broke the VM a bit.

I will now try to deploy with Ansible as well.

TheFrisianClause commented 1 year ago

When deploying the template via Ansible with Perl installed, the deployment takes a long time and the network isn't configured, although the VM is deployed within VMware.

When the playbook is done, the network device is still not up. When I manually run 'nmcli con up ens33', the network comes up with the default IP of the VM template.

I have now installed cloud-init and Perl together; the deployment is a lot faster and the network is configured. However, after a restart the settings are gone. I think cloud-init causes these issues on reboot for some reason, because after the reboot the network device is still there, and after activation the IP reverts to the template IP.

TheFrisianClause commented 1 year ago

Deploying the template manually takes some time with the NetworkManager service. Also, the ens33 device has to be enabled manually in order to work.

So I am almost certain that it is cloud-init.

After removing cloud-init and creating cloud-init.disabled in the /etc/cloud directory, the VM deploys quickly, but ens33 is still disabled, so you still have to enable it manually. The IP that is given through Ansible is there. But then again, after a reboot the whole NMClient is messed up and nmcli no longer works...

NetworkManager fails to start.
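
The marker-file step mentioned above could be sketched as an Ansible play run against the template guest before it is converted to a template. This is only an illustration of that one step. The inventory group name is a hypothetical placeholder, and the sketch does not remove the cloud-init package.

```yaml
# Sketch only: 'alma9_template' is a hypothetical inventory group.
- name: Disable cloud-init in the template guest
  hosts: alma9_template
  become: true
  tasks:
    - name: Create the marker file that tells cloud-init not to run
      ansible.builtin.file:
        path: /etc/cloud/cloud-init.disabled
        state: touch
        mode: "0644"
```

As reported above, this alone did not resolve the problem in this case; it only changes which tool handles the guest customization.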

ihumster commented 1 year ago

Try to use cloud-init: https://kb.vmware.com/s/article/59557

TheFrisianClause commented 1 year ago

Unfortunately, neither Perl alone nor cloud-init is working. The network config is applied with cloud-init but not with Perl. However, after a reboot with cloud-init enabled, NetworkManager had trouble again and the IP reverted to the template IP.

TheFrisianClause commented 1 year ago

So after one week of research I found out that cloud-init is the problem here, as it still seems to have problems with the transition from ifcfg to NetworkManager on RHEL and RHEL-related distros. I am curious how long it is going to take them to fix this.

Davka commented 7 months ago

Try

- name: migrate ifcfg
  become: true
  shell: nmcli connection migrate
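
A slightly more defensive variant of the suggestion above is sketched here: it runs the migration only when ifcfg profiles are actually present. It assumes the installed NetworkManager provides the `migrate` subcommand and that profiles live under `/etc/sysconfig/network-scripts`. The inventory group and the registered variable name are hypothetical.

```yaml
# Sketch only: 'alma9_guests' and 'ifcfg_files' are illustrative names.
- name: Migrate leftover ifcfg profiles to keyfile format
  hosts: alma9_guests
  become: true
  tasks:
    - name: Look for ifcfg profiles other than loopback
      ansible.builtin.find:
        paths: /etc/sysconfig/network-scripts
        patterns: "ifcfg-*"
        excludes: "ifcfg-lo"
      register: ifcfg_files

    - name: Convert the profiles only when some were found
      ansible.builtin.command: nmcli connection migrate
      when: ifcfg_files.matched > 0
      changed_when: true
```

Using `command` instead of `shell` avoids invoking a shell for a single fixed command; the behaviour of `nmcli connection migrate` itself is unchanged.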