lima-vm / lima

Linux virtual machines, with a focus on running containers
https://lima-vm.io/
Apache License 2.0
15.23k stars 600 forks

M1 Mac running an x86_64 Lima instance consistently freezes on Ventura 13.4.1 on a brew installed lima. #1649

Open nemonik opened 1 year ago

nemonik commented 1 year ago

Description

On an M1 Pro running Ventura 13.4.1 with a brew-installed lima and the following instance config (note the arch is x86_64), my Ubuntu Jammy Lima VM starts, but freezes when I run the following:

sudo apt-get update
sudo apt-get upgrade

The instance freezes mid-upgrade. Later, after a limactl stop... restart, it freezes again when I shell in and run sudo dpkg --configure -a.

I've tried tweaking CPU and memory settings and the same thing happens.

The problem happens consistently.

Prior to upgrading to Ventura my x86_64 instances worked fine. I'm not sure what other changes occurred prior to upgrading. Maybe I should build and install lima from source?

Here is my config file with comments stripped for brevity (it is the template except for configuring DNS and making the user's home directory writable):

arch: x86_64

images:
# Try to use release-yyyyMMdd image if available. Note that release-yyyyMMdd will be removed after several months.
- location: "https://cloud-images.ubuntu.com/releases/22.04/release-20220712/ubuntu-22.04-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:86481acb9dbd62e3e93b49eb19a40c66c8aa07f07eff10af20ddf355a317e29f"
- location: "https://cloud-images.ubuntu.com/releases/22.04/release-20220712/ubuntu-22.04-server-cloudimg-arm64.img"
  arch: "aarch64"
  digest: "sha256:e1ce033239f0038dca5ef09e582762ba0d0dfdedc1d329bc51bb0e9f5057af9d"
# Fallback to the latest release image.
# Hint: run `limactl prune` to invalidate the cache
- location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img"
  arch: "x86_64"
- location: "https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-arm64.img"
  arch: "aarch64"

cpus: null

memory: null

disk: null

mounts:
- location: "~"
  mountPoint: null
  writable: true
  sshfs:
    cache: null
    followSymlinks: null
    sftpDriver: null
  9p:
    securityModel: null
    protocolVersion: null
    msize: null
    cache: null
- location: "/tmp/lima"
  writable: true

mountType: null

ssh:
  localPort: 0
  loadDotSSHPubKeys: null
  forwardAgent: null
  forwardX11: null
  forwardX11Trusted: null

caCerts:
  removeDefaults: null

containerd:
  system: null
  user: null

cpuType:
  aarch64: null
  x86_64: null

firmware:
  legacyBIOS: null

video:
  display: null

propagateProxyEnv: null

hostResolver:
  enabled: false
  ipv6: null
  hosts:

dns:
 - 100.64.0.1

The instance spins up but freezes on:

update-initramfs: Generating /boot/initrd.img-6.2.0-20-generic

I cannot open another shell on the instance, nor can I interrupt the process. If I open another shell to watch processes before running sudo apt-get upgrade, that shell locks up too.

When I go to stop the instance, it doesn't stop smoothly:

INFO[0000] Sending SIGINT to hostagent process 61430
INFO[0000] Waiting for the host agent and the driver processes to shut down
INFO[0000] [hostagent] Received SIGINT, shutting down the host agent
INFO[0000] [hostagent] Shutting down the host agent
INFO[0000] [hostagent] Stopping forwarding "/run/lima-guestagent.sock" (guest) to "/Users/mjwalsh/.lima/mep-amd64/ga.sock" (host)
INFO[0000] [hostagent] Unmounting "/Users/mjwalsh"
INFO[0000] [hostagent] Unmounting "/tmp/lima"
INFO[0000] [hostagent] Shutting down QEMU with ACPI
INFO[0000] [hostagent] Sending QMP system_powerdown command
FATA[0180] did not receive an event with the "exiting" status

I tried uninstalling brew installed qemu 8.0.2 and installing my own tap of 8.0.0, but the same thing happens.

I'm lost as to what to do.

Is there something I can do to provide insight and are others seeing this?

nemonik commented 1 year ago

Am I the only one seeing this? @esposm03, you gave the issue a thumbs up. Does this indicate you are impacted as well?

esposm03 commented 1 year ago

Yes, I am impacted too, I just didn't want to start polluting this thread with "Me too" comments.

AkihiroSuda commented 1 year ago

Thanks for reporting. Does it work stably with vz?

esposm03 commented 1 year ago

Yes it does; however, I need to both run and debug i386 binaries, which Rosetta doesn't support. My current workaround is to manually run the binary I want to debug under qemu-i386 with the -g option, and then use gdb as normal in another shell.
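
For anyone wanting to try the same workaround, here is a minimal sketch of it as a shell helper. The function name debug_i386, the QEMU_USER variable, and port 1234 are placeholders for illustration and not part of Lima; the only thing relied on is qemu-i386's -g option, which makes the emulator wait for a gdb connection on the given port before running the binary.

```shell
# Hypothetical helper: start an i386 binary under QEMU user-mode emulation
# and have it wait for a debugger to attach.
# Assumption: qemu-i386 is on PATH; override QEMU_USER if it is named differently.
QEMU_USER="${QEMU_USER:-qemu-i386}"

debug_i386() {
  binary="$1"
  port="${2:-1234}"   # placeholder port, pick any free one
  # -g PORT: wait for a gdb connection on PORT before executing the binary
  "$QEMU_USER" -g "$port" "$binary"
}
```

In one shell, run debug_i386 ./mybinary; in a second shell, start gdb ./mybinary and enter target remote localhost:1234. Depending on the guest, a gdb build with i386 support (for example gdb-multiarch) may be needed.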

mdorier commented 1 year ago

I don't know if it's related, but I get frequent freezes on an M2 Max with an aarch64 Ubuntu VM. This happens randomly, a few times a day (sometimes even a few minutes apart). I have 3 or 4 terminals open with zsh running in the VM, and I'm usually editing code with Vim or running make, etc., when all my terminals become unresponsive. Running lima zsh hangs; all I can do is limactl stop and restart the VM.

I'm not familiar at all with the technology behind lima, so let me know if there is anything I can do when it freezes that could provide more information as to what the problem is.

Occasionally, limactl stop will also hang indefinitely on this:

$ limactl stop
INFO[0000] Sending SIGINT to hostagent process 9976
INFO[0000] Waiting for the host agent and the driver processes to shut down
INFO[0000] [hostagent] Received SIGINT, shutting down the host agent
INFO[0000] [hostagent] Shutting down the host agent
INFO[0000] [hostagent] Stopping forwarding "/run/lima-guestagent.sock" (guest) to "/Users/mdorier/.lima/default/ga.sock" (host)

When this happens, I have no other choice but to restart my computer. If I kill processes, the next limactl commands will fail or hang anyway.

Note: limactl is version 0.16.0, installed via brew.

euforic commented 1 year ago

I am encountering the same issue. Is there a good way to debug this to help identify the issue?

chrisshyi commented 1 year ago

I'm seeing something similar on my end. My VM hangs and I'm forced to stop it.

❯ limactl stop dev
INFO[0000] Sending SIGINT to hostagent process 20385
INFO[0000] Waiting for the host agent and the driver processes to shut down
INFO[0000] [hostagent] Received SIGINT, shutting down the host agent
INFO[0000] [hostagent] Shutting down the host agent
INFO[0000] [hostagent] Stopping forwarding "/run/lima-guestagent.sock" (guest) to "/Users/chrisshyi/.lima/dev/ga.sock" (host)
INFO[0000] [hostagent] Unmounting "/tmp/lima"
INFO[0000] [hostagent] Shutting down QEMU with ACPI
INFO[0000] [hostagent] Sending QMP system_powerdown command
FATA[0180] did not receive an event with the "exiting" status

Running limactl list afterwards shows that the VM is stopped:

❯ limactl list
NAME    STATUS     SSH            VMTYPE    ARCH       CPUS    MEMORY    DISK      DIR
dev     Stopped    127.0.0.1:0    qemu      aarch64    4       8GiB      200GiB    ~/.lima/dev

afbjorklund commented 1 year ago

As far as I know, ACPI works differently on ARM. That could explain why acpid is not able to shut down the OS correctly (it should still "pull the plug" eventually).

It should not affect running VMs, but we might want to try executing a software shutdown on this architecture. That is also the suggested workaround.

thejan2009 commented 12 months ago

I randomly experience a similar issue: the VM freezes completely and I can only force stop it. However, I use an arm64 VM with the vz driver, on more or less the same platform (2021 M1 Pro).

Can I collect any logs that would be useful to debug this?

ns-cshyi commented 11 months ago

@afbjorklund Thanks for chiming in. Could you elaborate on what you meant by a software shutdown on ARM?

afbjorklund commented 11 months ago

I mean if the "hardware" power down signal is not working properly, then maybe issuing a poweroff command in the shell works better?
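
A minimal sketch of what that could look like from the host, assuming the instance still accepts shell connections. The helper name guest_poweroff is made up for illustration; it only wraps limactl shell, which runs a command inside the named instance.

```shell
# Hypothetical helper: ask the guest OS to power itself off from inside,
# instead of relying on the ACPI power-down signal.
# Assumption: the instance is still responsive enough to accept a shell.
guest_poweroff() {
  instance="${1:-default}"   # Lima's default instance name
  limactl shell "$instance" sudo poweroff
}
```

For example, guest_poweroff dev would run sudo poweroff inside the instance named dev. This does not help when the VM is already frozen and no shell can be opened.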

chrisshyi commented 11 months ago

Ah, I see what you mean! But generally when the VM freezes we don't have shell access, which is why I had to stop the instance from the host.

afbjorklund commented 11 months ago

That is a different issue. I was referring to the specific case where the ACPI signal was not properly caught by the guest OS.

anthonator commented 8 months ago

I'm experiencing this as well on an M2 Max MBP. My experience seems identical to @mdorier. Happy to provide any details that would be useful for debugging.

lima version: 0.20.1 (installed via brew)
VM type: qemu
OS version: Sonoma 14.2.1
Guest VM: Ubuntu 23.10 (release 20240209)

AkihiroSuda commented 8 months ago

I'm experiencing this as well on an M2 Max MBP.

Does vz work better?

nemonik commented 8 months ago

I set the CPU count to 1 and haven't had a problem since.
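
In terms of the config posted at the top of this issue, that means changing the cpus key from null to 1. The helper below is a rough sketch of doing that from the host; set_cpus is a hypothetical name, and it assumes the instance is stopped and that its config is the lima.yaml inside the instance directory (for example ~/.lima/dev/lima.yaml for an instance named dev).

```shell
# Hypothetical helper: rewrite the top-level "cpus:" line of a Lima config file.
# Usage: set_cpus FILE COUNT
# Assumption: FILE already contains a top-level "cpus:" line, as the template does.
set_cpus() {
  file="$1"
  count="$2"
  tmp="$(mktemp)"
  # Only a line that begins with "cpus:" is changed; "cpuType:" is left alone.
  sed "s/^cpus:.*/cpus: ${count}/" "$file" > "$tmp" &&
    cat "$tmp" > "$file"   # write back in place, keeping the file's permissions
  rm -f "$tmp"
}
```

For example, set_cpus ~/.lima/dev/lima.yaml 1 followed by limactl start dev. Editing the file by hand works just as well.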