hashicorp / packer

Packer is a tool for creating identical machine images for multiple platforms from a single source configuration.
http://www.packer.io

Unable to finish building a KVM image on Ubuntu 18.04 (Bionic Beaver) #6432

Closed: archit closed this issue 6 years ago

archit commented 6 years ago

I'm getting an error building my KVM image with the qemu builder on Ubuntu 18.04 (Bionic Beaver). Here is the output:

==> ****: Halting the virtual machine...
==> ****: Converting hard drive...
==> ****: Error converting hard drive: QemuImg error: qemu-img: Could not open 'output-****/****': Failed to get shared "write" lock
==> ****: Is another process using the image?
==> ****: Deleting output directory...
Build '****' errored: Error converting hard drive: QemuImg error: qemu-img: Could not open 'output-****/****': Failed to get shared "write" lock
Is another process using the image?

By the way, this worked just fine on Ubuntu 14.04 (trusty) and Ubuntu 16.04 (xenial). Here are the relevant versions:

(venv) ➜  provisioner git:(master) ✗ lsb_release -a
LSB Version:    core-9.20170808ubuntu1-noarch:printing-9.20170808ubuntu1-noarch:security-9.20170808ubuntu1-noarch
Distributor ID: Ubuntu
Description:    Ubuntu 18.04 LTS
Release:        18.04
Codename:       bionic
(venv) ➜  provisioner git:(master) ✗ packer --version
1.2.4
(venv) ➜  provisioner git:(master) ✗ qemu-system-x86_64 --version
QEMU emulator version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.4)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
(venv) ➜  provisioner git:(master) ✗

SwampDragons commented 6 years ago

Interesting. I wonder if the image isn't shutting down properly.

archit commented 6 years ago

@SwampDragons how would I test that? Run something like watch lsof | grep /path/to/output-folder?

SwampDragons commented 6 years ago

I think that seems like a good approach, yeah. See whether the output file ever actually gets closed.
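
A minimal sketch of that check, assuming lsof is installed on the build host. The helper names and the image path are hypothetical; `lsof -t` prints only the PIDs of processes that have the given file open.

```python
import subprocess


def parse_lsof_pids(output: str) -> list[int]:
    """Turn the output of `lsof -t` (one PID per line) into a list of ints."""
    return [int(token) for token in output.split()]


def image_holders(path: str) -> list[int]:
    """Return the PIDs that currently have `path` open (hypothetical helper)."""
    # lsof exits non-zero when nothing has the file open, so the return
    # code is deliberately not checked here.
    result = subprocess.run(
        ["lsof", "-t", "--", path],
        capture_output=True,
        text=True,
    )
    return parse_lsof_pids(result.stdout)
```

Calling image_holders("output-name/name") repeatedly around the "Converting hard drive..." step would show whether any process still has the image open at that point. The path here is a placeholder for the real output file.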

SwampDragons commented 6 years ago

You might also be able to track the actual qemu process with a ps command to see if it is still running at the time we try to convert it.

SwampDragons commented 6 years ago

Yeah, my gut instinct is that you're seeing a situation where we're trying to convert the file before the qemu build process has actually released it. If you run something like ps | grep qemu, you should be able to see the actual qemu process that Packer launches; if that process is still running when we try to run qemu-img, then we have a problem.

Example of what the process looks like:


16588 ttys001    1:12.91 /usr/local/bin/qemu-system-x86_64 -m 512 -machine type=pc,accel=tcg -display sdl -cdrom /Users/mmarsh/dev/repro_cases/packer-qemu-templates/ubuntu/packer_cache/f8fd5c3ff54d2ced0eca03e93f30f0f53477156699278433e327e4e3d6752ff8.iso -drive file=output-ubuntu1404/ubuntu1404,if=virtio,cache=writeback,discard=ignore,format=qcow2 -vnc 127.0.0.1:82 -smp cpus=1 -device virtio-net,netdev=user.0 -boot once=d -fda /var/folders/8t/0yb5q0_x6mb2jldqq_vjn3lr0000gn/T/packer006711561 -name ubuntu1404 -netdev user,id=user.0,hostfwd=tcp::2746-:22
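
Because the gap between halting the VM and converting the disk can be very short, polling ps in a loop and logging timestamps may be easier than watching by eye. The sketch below assumes a Linux procps-style ps that accepts `-eo pid=,stat=,comm=`; the function names and the 50 ms interval are illustrative choices, not anything Packer provides.

```python
import subprocess
import time


def find_qemu(ps_output: str) -> list[tuple[int, str]]:
    """Return (pid, state) for each qemu-system-* line in ps output.

    Expects lines of the form "<pid> <stat> <command>".
    """
    found = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and "qemu-system" in parts[2]:
            found.append((int(parts[0]), parts[1]))
    return found


def watch(interval: float = 0.05) -> None:
    """Print a timestamped line for every qemu process, forever (Ctrl-C to stop)."""
    while True:
        out = subprocess.run(
            ["ps", "-eo", "pid=,stat=,comm="],
            capture_output=True,
            text=True,
            check=True,
        ).stdout
        for pid, state in find_qemu(out):
            # A state starting with "Z" means the process has exited and is
            # only waiting to be reaped by its parent.
            label = "zombie" if state.startswith("Z") else "alive"
            print(f"{time.time():.3f} pid={pid} state={state} ({label})")
        time.sleep(interval)
```

Running watch() in a second terminal during the build and comparing its timestamps with Packer's log output should show whether a live qemu process overlaps with the convert step.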

archit commented 6 years ago

But if I do find it, what are our next steps? Do you think it's some kind of filesystem race condition introduced by qemu?

archit commented 6 years ago

Another system, where this build works just fine, has these details:

» qemu-system-x86_64 --version                                                                                     
QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.30), Copyright (c) 2003-2008 Fabrice Bellard
» lsb_release -a                                                                                                    
No LSB modules are available.
Distributor ID:    Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:    16.04
Codename:    xenial

SwampDragons commented 6 years ago

That is the Ubuntu 16.04 guest though, right? My guess is that the shutdown command we're sending to Ubuntu 18.04 isn't doing its job: the VM isn't shutting down, so the qemu process is still running when Packer thinks the build is done, and the lock on the file is never released.

Is there any chance you can provide the config for the 18.04 build so that I can try to reproduce locally?

archit commented 6 years ago

No. In both cases, the guest is Ubuntu 14.04 (trusty). The output I pasted earlier is from two different hosts: the first is 18.04, where the build fails, and the second is 16.04, where it succeeds.

SwampDragons commented 6 years ago

Hmm, okay. I still think it would be useful to know whether the qemu process has ended before we kick off the convert. If it hasn't, we need to figure out why it's not ending. If it has, we need to figure out where that lock is coming from. This might be as simple as wrapping the convert call in a retryable function, but I'd like to know why the problem is actually occurring.
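
To illustrate what wrapping the convert call in a retryable function could look like, here is a rough Python sketch. It is not Packer's code (Packer is written in Go). The retry and convert helpers, the attempt count, and the delay are all assumptions made for the example; only the `qemu-img convert -O <format> <src> <dst>` invocation is the real CLI.

```python
import subprocess
import time


def retry(fn, attempts=5, delay=0.5, sleep=time.sleep):
    """Call fn() until it succeeds or `attempts` tries have failed.

    Re-raises the last exception if every attempt fails.
    """
    last_err = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as err:
            last_err = err
            if attempt < attempts:
                sleep(delay)  # give qemu time to release the image lock
    raise last_err


def convert(src: str, dst: str, fmt: str = "qcow2") -> None:
    """Run qemu-img convert once; raises CalledProcessError on failure."""
    subprocess.run(["qemu-img", "convert", "-O", fmt, src, dst], check=True)
```

A call such as retry(lambda: convert("disk.img", "disk.qcow2")) would then tolerate a lock that is released a moment late. A retry like this only hides the race and does not explain where the lock comes from.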

archit commented 6 years ago

Interestingly, it was really difficult to determine that using the ps | grep approach, because the time between Halting the virtual machine... and Converting hard drive... is too short for the human eye. The most I could tell was that the ps output briefly showed qemu-system-x86_64 <defunct>. Is that sufficient proof?

SwampDragons commented 6 years ago

I tinkered a bit for you and built a version of packer that I think may solve this if it is indeed a race condition: https://github.com/hashicorp/packer/compare/retry_convert

Here's the binary built for Linux, or you can pull that branch and build it yourself if you'd prefer. packer.zip

s0undt3ch commented 6 years ago

I hit the same issue; the binary you provided, @SwampDragons, worked.

SwampDragons commented 6 years ago

Awesome, thanks for letting me know!

archit commented 6 years ago

@SwampDragons sorry for the late response, but I wanted to confirm that the issue is indeed resolved with your patched version.

Thanks again for the quick fix!
