canonical / packer-maas

Packer templates to create MAAS deployable images

Rocky8 build on Jammy (22.04) is hanging on "Waiting for shutdown..." #180

Closed derekcat closed 9 months ago

derekcat commented 11 months ago

Ran for 51min on a VM (2 cores, 6GB RAM) and 39min on a NUC (12 core, 64GB RAM, NVMe), both running 22.04.

It was running successfully and faster on the NUC last time I tried with the same version of the repo, but seemingly no difference before and after doing git pull.

Not sure what the problem is, nor how to best introspect this issue. Thoughts?

derekcat commented 11 months ago

Update: the NUC finished after 44 min 12 sec. Is this just a really slow process?

derekcat commented 11 months ago

https://pastebin.ubuntu.com/p/4g3z8HGhrV/ Not sure what's going on... The build on the NUC is "finished" but the .tar.gz file is only 3.5KB? o_O I tried wiping and redownloading the packer-maas repo and just replaced the rocky.ks.in file with the old one, but getting the same problem.

derekcat commented 11 months ago

Meanwhile on the VM build attempt, I fired up screen to capture what happened in the end there and I got this:

=> qemu.rocky8: Trying http://download.rockylinux.org/pub/rocky/8/isos/x86_64/Rocky-x86_64-boot.iso?checksum=sha256%3A88baefca6f0e78b53613773954e0d7c2d8d28ad863f40623db75c40f505b5105
==> qemu.rocky8: http://download.rockylinux.org/pub/rocky/8/isos/x86_64/Rocky-x86_64-boot.iso?checksum=sha256%3A88baefca6f0e78b53613773954e0d7c2d8d28ad863f40623db75c40f505b5105 => /home/ubuntu/.cache/packer/092b0bf072d1911f3af75f4e1cafc1d24600c6cf.iso
==> qemu.rocky8: Starting HTTP server on port 8105
    qemu.rocky8: No communicator is set; skipping port forwarding setup.
==> qemu.rocky8: Looking for available port between 5900 and 6000 on 127.0.0.1
==> qemu.rocky8: Starting VM, booting from CD-ROM
    qemu.rocky8: The VM will be run headless, without a GUI. If you want to
    qemu.rocky8: view the screen of the VM, connect via VNC without a password to
    qemu.rocky8: vnc://127.0.0.1:5925
==> qemu.rocky8: Overriding default Qemu arguments with qemuargs template option...
==> qemu.rocky8: Waiting 3s for boot...
==> qemu.rocky8: Connecting to VM via VNC (127.0.0.1:5925)
==> qemu.rocky8: Typing the boot commands over VNC...
    qemu.rocky8: No communicator is configured -- skipping StepWaitGuestAddress
==> qemu.rocky8: Waiting for shutdown...
==> qemu.rocky8: Failed to shutdown
==> qemu.rocky8: Provisioning step had errors: Running the cleanup provisioner, if present...
==> qemu.rocky8: Deleting output directory...
Build 'qemu.rocky8' errored after 1 hour 28 seconds: Failed to shutdown

==> Wait completed after 1 hour 28 seconds

==> Some builds didn't complete successfully and had errors:
--> qemu.rocky8: Failed to shutdown

==> Builds finished but no artifacts were created.
make: *** [Makefile:17: rocky8.tar.gz] Error 1
rm http/rocky.ks

r00ta commented 11 months ago

I'd suggest running the build again with PACKER_LOG=1 sudo make

alexsander-souza commented 11 months ago

Please make sure the user you run this as is a member of the kvm group; otherwise I fear QEMU will fall back to CPU emulation.
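The access check described above can be done before starting a long build. What follows is only a minimal sketch: `check_kvm_access` is a hypothetical helper (not part of packer-maas), and it assumes the KVM device node lives at the usual `/dev/kvm` path unless another path is passed in.

```shell
#!/bin/sh
# Sketch: report whether the current user can use a KVM device node.
# check_kvm_access is a hypothetical helper, not something packer-maas ships.
check_kvm_access() {
    dev="${1:-/dev/kvm}"   # device path; defaults to the usual KVM node
    if [ ! -e "$dev" ]; then
        echo "missing: $dev does not exist"
        return 2
    fi
    if [ -r "$dev" ] && [ -w "$dev" ]; then
        echo "ok: $(id -un) can read and write $dev"
        return 0
    fi
    # Print the user's groups so a missing kvm membership is easy to spot.
    echo "denied: $(id -un) cannot open $dev (groups: $(id -Gn))"
    return 1
}
```

Running `check_kvm_access` as the same user that will run the build is the point of the exercise. Note that root passes the read/write test regardless of group membership, so the result is only informative for non-root builds.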

https://pastebin.ubuntu.com/p/4g3z8HGhrV/ Not sure what's going on... The build on the NUC is "finished" but the .tar.gz file is only 3.5KB? o_O I tried wiping and redownloading the packer-maas repo and just replaced the rocky.ks.in file with the old one, but getting the same problem.

This was fixed in #181

derekcat commented 11 months ago

@r00ta Tried that on the NUC and got this: https://pastebin.ubuntu.com/p/cxYbkQbfqV/

@alexsander-souza That resolved the hang on shutdown but I'm still getting 3.5KB outputs: https://pastebin.ubuntu.com/p/yT4p3xM5RY/

alexsander-souza commented 11 months ago

The expected build time is less than 10 minutes, ignoring the time to download the ISO. The VM spawned by Packer uses only one CPU core, so the number of cores available in the host shouldn't affect the build time.

I suspect QEMU is using CPU emulation instead of virtualisation. What's the output of:

derekcat commented 11 months ago

derek@derek-ov-nuc:~/code/packer-maas/rocky8$ kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used
derek@derek-ov-nuc:~/code/packer-maas/rocky8$ id
uid=1000(derek) gid=1000(derek) groups=1000(derek),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),110(lxd)

derekcat commented 11 months ago

Not sure the group will make much difference as I've been running the build with sudo on the NUC

alexsander-souza commented 11 months ago

True, group membership doesn't matter if you run as root.

What's the size of output-rocky8/packer-rocky8?

Looking at the output with PACKER_LOG set, it seems the VM cannot connect to the HTTP server on the host. Do you have any firewall rules in place?

derekcat commented 11 months ago

It's... Pretty small haha..

derek@derek-ov-nuc:~/code/packer-maas/rocky8$ sudo du -hs *
8.0K    http
4.0K    Makefile
4.2M    output-rocky8
4.0K    README.md
4.0K    rocky8.pkr.hcl
4.0K    rocky8.tar.gz

Doesn't look like it:

derek@derek-ov-nuc:~/code/packer-maas/rocky8$ sudo systemctl status ufw
● ufw.service - Uncomplicated firewall
     Loaded: loaded (/lib/systemd/system/ufw.service; enabled; vendor preset: enabled)
     Active: active (exited) since Thu 2023-10-19 00:17:27 UTC; 1 month 24 days ago
       Docs: man:ufw(8)
   Main PID: 1270 (code=exited, status=0/SUCCESS)
        CPU: 959us

Oct 19 00:17:27 derek-ov-nuc systemd[1]: Starting Uncomplicated firewall...
Oct 19 00:17:27 derek-ov-nuc systemd[1]: Finished Uncomplicated firewall.
derek@derek-ov-nuc:~/code/packer-maas/rocky8$ sudo ufw status
Status: inactive

alexsander-souza commented 11 months ago

That qcow2 file should be over 2GB, so the install process didn't even start and Packer is timing out after 1 hour (the shutdown_timeout value in the template). The logs show that it's failing to download the kickstart file. I also see "use detected accelerator: tcg", which means Packer cannot use KVM for some reason.
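Both symptoms mentioned here (the accelerator line in the debug log and an undersized image) can be checked with a couple of small helpers. This is a rough sketch under stated assumptions: the function names are hypothetical, the log is assumed to have been saved to a file, and the log line is assumed to contain the text "detected accelerator: <name>" exactly as quoted in this thread.

```shell
#!/bin/sh
# Hypothetical post-mortem helpers; the names are not from packer-maas.

# Print each distinct accelerator named in a saved Packer debug log.
# Assumes lines contain "detected accelerator: <name>", as quoted in this thread.
detected_accelerators() {
    sed -n 's/.*detected accelerator: \([A-Za-z0-9_-]*\).*/\1/p' "$1" | sort -u
}

# Succeed if the path uses at least min_kib KiB on disk (measured like `du -s`).
# Returns 2 if the usage could not be determined.
uses_at_least_kib() {
    used=$(du -sk "$1" | cut -f1)
    [ -n "$used" ] || return 2
    [ "$used" -ge "$2" ]
}
```

For example, `detected_accelerators packer.log` (where `packer.log` is an assumed file name for the saved log) printing `tcg` would point at software emulation, and `uses_at_least_kib output-rocky8 2097152 || echo "image is suspiciously small"` applies the 2GB expectation from the comment above to the output directory.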

Can you update Packer to 1.10.0? You are using the latest version of the qemu plugin but not the latest Packer. (should work but you never know)

derekcat commented 11 months ago

Hmmm... No luck from upgrading Packer either >_< https://pastebin.ubuntu.com/p/fS5G8JKXbW/

alexsander-souza commented 11 months ago

Have you tried building another template (e.g. centos7) just to validate your Packer install?

derekcat commented 11 months ago

Hmmm centos8 gives me this:

packer init centos8.pkr.hcl && packer build centos8.pkr.hcl
Error: 1 error(s) occurred:

* Error downloading checksum file: bad response code: 404 in "file:https://mirrors.edge.kernel.org/centos/8.4.2105/isos/x86_64/CHECKSUM"

  on centos8.pkr.hcl line 27:
  (source code not available)

make: *** [Makefile:17: centos8.tar.gz] Error 1
rm http/centos8.ks

Though I'll try centos7 next

derekcat commented 11 months ago

Hmmm CentOS 7 seems to have worked?

Build 'qemu.centos7' finished after 45 minutes 39 seconds.

==> Wait completed after 45 minutes 39 seconds

==> Builds finished. The artifacts of successful builds are:
--> qemu.centos7: VM files in directory: output-centos7
--> qemu.centos7: VM files in directory: output-centos7
rm http/centos7.ks
derek@derek-ov-nuc:~/code/packer-maas/centos7$ sudo du -hs *
[sudo] password for derek: 
4.0K    centos7.pkr.hcl
518M    centos7.tar.gz
8.0K    http
4.0K    Makefile
1.8G    output-centos7
4.0K    README.md

So something specific to Centos8 and Rocky 8? Trying Rocky 9 next.

derekcat commented 11 months ago

Hmmm Rocky 9 worked...

Build 'qemu.rocky9' finished after 10 minutes 19 seconds.

==> Wait completed after 10 minutes 19 seconds

==> Builds finished. The artifacts of successful builds are:
--> qemu.rocky9: VM files in directory: output-rocky9
--> qemu.rocky9: VM files in directory: output-rocky9
rm http/rocky.ks
derek@derek-ov-nuc:~/code/packer-maas/rocky9$ sudo du -hs *
8.0K    http
4.0K    Makefile
2.1G    output-rocky9
4.0K    README.md
4.0K    rocky9.pkr.hcl
806M    rocky9.tar.gz

derekcat commented 11 months ago

Reran Rocky9 and it's still working normally... Hermmm

alexsander-souza commented 11 months ago

CentOS 8 is past its EOL, so mirrors are starting to disappear. We are keeping the template for historical reasons only.

Did you make any changes to the Rocky8 template? The Rocky9 template is a copy of it, and I can spot just one difference that could be relevant:

qemuargs = [["-serial", "stdio"], ["-cpu", "host"]]
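A comparison like the one above can be reproduced locally. The sketch below is an assumption-laden illustration, not an official tool: `template_diff` is a hypothetical helper that masks the `rocky8`/`rocky9` names before diffing, so pure renames do not drown out settings that actually differ.

```shell
#!/bin/sh
# Hypothetical helper: show lines that differ between two Packer templates
# after masking the distro name, so "rocky8" vs "rocky9" renames add no noise.
# Exit status follows diff: 0 = identical after masking, 1 = differences found.
template_diff() {
    a=$(mktemp) && b=$(mktemp) || return 2
    sed 's/rocky[0-9][0-9]*/rockyN/g' "$1" > "$a"
    sed 's/rocky[0-9][0-9]*/rockyN/g' "$2" > "$b"
    rc=0
    diff "$a" "$b" || rc=$?
    rm -f "$a" "$b"
    return "$rc"
}
```

From the repository root, something like `template_diff rocky8/rocky8.pkr.hcl rocky9/rocky9.pkr.hcl` would list the remaining differences; those paths follow the directory listings shown earlier in this thread. The same helper works for comparing a locally modified kickstart file against a fresh copy.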

derekcat commented 11 months ago

Yeah, I'd added a few packages (under the # My packages sections in the output above). Looks like that was the issue on the NUC, but still no idea what the failure is on the VM - that one was using the included standard kickstart file.

github-actions[bot] commented 10 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 9 months ago

This issue was closed because it has been inactive for 30 days since being marked as stale.