Closed by larsyencken 10 months ago
We've encountered this before. I didn't find a proper fix, but I got it working through this workaround.
Oh, cool! In this case, it was fixed with a reboot, but ideally you wouldn't have to.
What was the workaround? The link is now dead.
@ExpatUK below is a workaround. We grab the PIDs of all lxc monitor processes and run the sudo nsenter -t ... command on each of them (as per the discussion) until we're able to destroy the container. It's super hacky, but it works. The solution from the discussion might be enough for you.
import random


def _destroy_container(host: Host, name: str):
    print(f"--- Destroying container {name} and its data")
    try:
        host.run(f"sudo lxc delete --force {name}")
    except SystemExit:
        # This is LXC bug https://discuss.linuxcontainers.org/t/lxc-delete-result-in-failed-to-destroy-zfs-filesystem-dataset-is-busy/5728
        # We get the following error when trying to delete a container:
        #   Error: Error deleting storage volume: Failed to run: zfs destroy -r lxd/containers/staging-site-scripts-relative-url:
        #   exit status 1 (cannot destroy 'lxd/containers/staging-site-scripts-relative-url': dataset is busy)
        # The workaround is to unmount the filesystem and try again.
        print("!!! Container could not be destroyed, trying workaround")
        pids = host.run('pgrep -fl "lxc monitor"', capture_output=True).replace(
            "lxd", ""
        )
        # try all pids in random order (perhaps there's a better way?)
        # skip blank lines so we never call nsenter with an empty pid
        pids = [pid.strip() for pid in pids.split("\n") if pid.strip()]
        random.shuffle(pids)
        for pid in pids:
            try:
                host.run(
                    f"sudo nsenter -t {pid} -m -- umount /var/snap/lxd/common/lxd/storage-pools/lxd_zfs/containers/{name}"
                )
            except SystemExit:
                print(f"... Could not be unmounted with pid {pid}, trying new pid")
                continue
            # unmount succeeded in this pid's mount namespace, so retry the delete
            host.run(f"sudo lxc delete --force {name}")
            print("!!! Container destroyed")
            break
        else:
            # for/else: reached only if no pid led to a successful unmount
            print("!!! Container could not be destroyed")
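On the "perhaps there's a better way?" comment: one option might be to look up which processes still have the container's dataset mounted instead of trying PIDs at random. The sketch below is untested against a real LXD host and rests on assumptions: it expects the text you would get from running something like grep containers/NAME /proc/*/mountinfo on the host, where grep prefixes each hit with the file name. The helper name pids_holding_mount is made up for illustration and is not part of the snippet above.

```python
import re

# grep prefixes each hit with the file it came from when given several files,
# e.g. "/proc/1234/mountinfo:<mountinfo entry>".
_PREFIX = re.compile(r"^/proc/(\d+)/mountinfo:(.*)$")


def pids_holding_mount(grep_output: str, name: str) -> list[str]:
    """Return PIDs whose mount namespace still has the container mounted.

    Hypothetical helper: `grep_output` is assumed to be the output of
    grepping /proc/*/mountinfo for the container name.
    """
    pids: list[str] = []
    for line in grep_output.splitlines():
        match = _PREFIX.match(line)
        if match is None:
            continue  # skips /proc/self/... and any unrelated output
        pid, entry = match.groups()
        fields = entry.split()
        # The fifth field of a mountinfo entry is the mount point. Compare
        # the whole last path component so "web" does not match "web-old".
        if len(fields) < 5 or not fields[4].endswith(f"/containers/{name}"):
            continue
        if pid not in pids:
            pids.append(pid)
    return pids
```

The returned PIDs could then replace the shuffled list in the loop above. Reading other processes' mountinfo may need sudo, and grep exits non-zero when nothing matches, so the call that produces the input would probably need the same SystemExit handling.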
When copying and moving containers, it seems that containers on foundation-1 cannot be deleted. It appears to be a snapd/lxd combination issue. We might be able to delete them if we disable lxd temporarily.
See upstream issue: https://github.com/canonical/lxd/issues/11168