Closed LeifErik1995 closed 1 month ago
Interestingly, I'm running into the same issue. I'm still trying to figure out more, though, and I don't know if it is the same issue. I seem to hit this behavior only when running 2 Packer jobs at the same time; running one at a time seems to work. So you mentioning "DHCP release" gave me a good idea to look towards DHCP...
What I noticed watching my DHCP leases... When the VM starts and boots the first time:
Here, the Unique ID is a very long ID.
Just shortly after the first reboot (the autoinstall/cloudinit setup), I can see:
I seem to only have this issue with Ubuntu 22.04, not with 24.04, when both are building at pretty much the same time...
If I get another clever idea or find out something, I will update accordingly.
In my case, I am hitting the issue randomly even when running a single Packer job, so it might not be the same as what you are facing, @patschi.
Is your Packer VM the only VM your DHCP server is handling in that network? I have a dedicated network for Packer to build VM templates in, so there are no other DHCP requests going on.
Hmm, interesting. No, there is no dedicated network for Packer (the network I use in my Packer build is shared with around 500 VMs). Maybe I will ask my build infra team for a dedicated network, try this, and see.
Interesting. I was tail'ing the DHCP server logs during one template build process:
I think it is related to the multiple releases... when the IP is released and another client requests an IP, it might get reallocated to a different client.
There are 3 releases:
Still trying to figure out more...
So. I found a workaround. As per my previous guess, the issue was indeed the DHCP.
The tricky part was that, as soon as autoinstall starts, the VM immediately gets an IP address assigned from the DHCP server, and Packer picks up this specific address, even before any autoinstall configuration has been parsed. This happens before the `network:` configuration in the autoinstall definition takes effect.
In other words: the installer then applies the `network:` configuration, where it releases the prior IP and requests a new one (possibly a restart of dhclient). It looks like during this timeframe the same IP can get reassigned, which breaks Packer.
For context and as an example, I use the following `network:` configuration:
network:
  network:
    version: 2
    ethernets:
      ens192:
        dhcp4: true
        dhcp-identifier: mac
My workaround was stopping `open-vm-tools` in the `autoinstall` context so that Packer gets no chance to read the current IP (which could change, and Packer doesn't need the IP at this stage anyway):
early-commands:
  # Ensures that Packer does not connect too soon.
  - sudo systemctl stop ssh.socket # Needed for Ubuntu 24.04 as any SSH connection restarts SSH daemon.
  - sudo systemctl stop ssh # Stop the SSH daemon itself (for good measure).
  - sudo systemctl stop open-vm-tools # Stop the VM reporting the current IP to vCenter, so also to packer. Does not affect package in target build.
  - sudo systemctl disable open-vm-tools # For good measure and prevent auto-restart during open-vm-tools update.
And on the Packer side, I tell Packer to wait at least 2 minutes before checking the current IP. This prevents Packer from receiving the IP address via `open-vm-tools` before they are stopped in the early stage:
ip_settle_timeout = "2m" # wait to get final IP after autoinstall complete
So ideally packer will only get the IP address from the VM once rebooted, to get its final IP. This took a friend and me now hours to figure out...
For me, it works now as I expected it to.
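For readers following along, here is a minimal sketch of where that setting could sit in a `vsphere-iso` source. The source name `ubuntu` and the `ip_wait_timeout` value are assumptions for illustration, and all required settings (vCenter connection, ISO, hardware, SSH) are left out, so this is a fragment rather than a working template:

```hcl
# Sketch only: "ubuntu" is a hypothetical source name, and the required
# connection, ISO, hardware, and communicator settings are omitted.
source "vsphere-iso" "ubuntu" {
  # Upper bound on how long Packer waits for the VM to report an IP
  # address at all (assumed value, not taken from the thread).
  ip_wait_timeout = "30m"

  # Time Packer waits for the reported IP address to settle before using
  # it. The 2m value is the one from the workaround above.
  ip_settle_timeout = "2m"
}
```

The idea is that the settle window outlasts the early installer phase, so the address Packer ends up with is more likely the one the VM holds after the reboot.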
Thanks @patschi, I will check it out. This is my first time with such in-depth tasks on the DevOps and infra side; usually I work on the application side. Once I have tried it out, I will update the thread.
Yeah, let me know if it helps and if it is the same/similar issue. I'm very curious to know.
I have been building this for my private lab, for the 2-4 times per year I need a new, fresh Ubuntu VM... Maybe the time invested will pay off in a few decades...
Try modifying the boot command for Ubuntu to include the following:
boot_command = [
  ... other commands ...
  "autoinstall network-config=disabled"
  ... other commands ...
]
This should skip the initial IP address.
Any luck @LeifErik1995 @patschi?
@tenthirtyam I will try this in some time. Currently busy with another task. Once I am somewhat light on my workload, will check back on all the suggestions here.
That's a valid workaround, thanks for suggesting!
While I think this will help, it does have the following disadvantages:
But obviously, the latter can be done using a provisioning script.
Thanks for confirming, @patschi.
Closing based on the workaround provided for Ubuntu's installer in https://github.com/hashicorp/packer-plugin-vsphere/issues/425#issuecomment-2106511430.
Well. I stumbled upon another weird issue when using roughly the same approach as suggested above: the network was also disabled on the final, created VM after the installer finished.
Interestingly, I could only reproduce this consistently with Ubuntu 24.04. With Ubuntu 22.04 the kernel parameters are copied too, but the network was still working. Looks like there were some changes in the latest 24.04.
So in case anyone has the same behavior...
The boot commands were:
boot_command = [
  "c<wait3s>",
  "set gfxpayload=keep<enter>",
  "linux /casper/vmlinuz --- autoinstall network-config=disabled",
  "<enter><wait>",
  "initrd /casper/initrd",
  "<enter><wait>",
  "boot",
  "<enter>"
]
After almost 8 hours of research and pulling my hair out, I found out that the three dashes (`---`) essentially tell the installer to persist the kernel parameters that follow them in `/etc/default/grub`, even after the install has completed: https://serverfault.com/questions/1055649/how-can-i-add-a-kernel-argument-to-a-debian-preseed-file/1055838#1055838:~:text=that%20triple%20dash%20is%20the%20key%20to%20telling%20the%20installer%20that%20the%20kernel%20command%20line%20parameters%20following%20it%20are%20not%20just%20meant%20to%20work%20around%20issues%20during%20installation%2C%20but%20should%20also%20be%20persisted%20in%20bootloader%20configuration%20-%20such%20as%20%2Fetc%2Fdefault%2Fgrub.
I removed the `---` and now all is good: the Ubuntu 24.04 deployment works offline, with a final update when finished.
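For anyone comparing, the boot command above with the `---` removed would presumably look like the sketch below. This is a reconstruction that assumes nothing else was changed, not the exact configuration from the working build:

```hcl
boot_command = [
  "c<wait3s>",
  "set gfxpayload=keep<enter>",
  # No "---" separator: the parameters are meant to apply to the installer
  # boot only and should not be carried into the installed system's GRUB
  # configuration.
  "linux /casper/vmlinuz autoinstall network-config=disabled",
  "<enter><wait>",
  "initrd /casper/initrd",
  "<enter><wait>",
  "boot",
  "<enter>"
]
```

With this form, `network-config=disabled` should only affect the installer session, so the installed VM should boot with its normal network configuration.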
Hi folks,
I am experiencing an issue where the VM's previous IP address is no longer recognized after the reboot when building Ubuntu 22.04 on an ESXi 7.0 host. After the reboot, the IP is no longer assigned to the VM.
My cloud-init is as below
Boot command is as follows
I scoured the related issues to make some headway on this, but there seems to be some issue with my config. (BTW, I am very new to Packer, so any advice is appreciated, however trivial it might be.) I am also attaching screenshots of the VM log after the reboot and of the Packer debug log at the time of the error. After that, I get the "no route to host" error, and the VM ends up in the state of the final image.
Thanks