Open vsiravar opened 1 year ago
@vsiravar Is this consistently reproducible?
Before sleep, was there any CPU-intensive task running in the VM?
@vsiravar Is this consistently reproducible?
No, it's quite intermittent.
Before sleep, was there any CPU-intensive task running in the VM?
Not really, I just have a hello-world container running in the vm. I have not experienced this behaviour with qemu.
@vsiravar With current master we now have support for video display. If possible, could you enable the display and try to reproduce this?
When it hangs, you can check from the UI whether the VM is accessible. This will give an idea of whether the issue is with the network or with the VM itself.
With current master we now have support for video display. If possible, could you enable the display and try to reproduce this?
Sure, will try this out. Thanks!
I think this doesn't only happen when the computer wakes up from sleep...
I successfully initialized the VM and ran some commands normally. But after I replied to several messages in Slack and came back (around 10 mins), it started to hang and returned FATA[0928] exit status 255.
The VM service has 300%+ CPU usage.
@ningziwen Could you also try enabling the display as mentioned above and see?
Also, please share the template you used.
@balajiv113 Sorry, I didn't get what that means. Would you like me to do a screen recording and upload the video? Or use a GUI? Could you point me to the instructions if it is a GUI?
@ningziwen Steps to enable display
video:
  display: "vz"
This will give us an idea of whether there are issues with the network or with the whole VM itself.
I tried the above steps myself. I haven't seen high CPU usage, but the freeze happens.
On checking the GUI during the freeze, even that was not responsive, so I think the freeze happens at the virtualization.framework level, not in the network.
I have also raised a support ticket with Apple with the same info.
Note: This happens to me on M1 only. My Intel machine runs smoothly for weeks across sleep and wake cycles.
@balajiv113 Hey. Did you get any reply from Apple? Is the support ticket link sharable?
Updated ticket description and title based on new behaviour observed.
Maybe once https://github.com/lima-vm/lima/issues/1659 is resolved you can look at the serial.log to see if there are any related log messages.
Confirming that this is still happening (HEAD as of today, M1)
I am also experiencing the same issue. I just started using limactl instead of other VM providers. First I had to deal with the time shift, so I added the following:
timedatectl set-ntp no
apt update
apt install -y ntp
Now, every morning I find high CPU usage and cannot access my VMs.
I started a Lima virtual machine with the following commands, and logged in to the virtual machine console through the video display as root:
limactl create --name=default template://docker \
--cpus=2 --memory=4 --vm-type=vz --mount-writable=true \
--disk=5 --network=lima:user-v2 --rosetta --video
limactl start
How can I confirm whether it is a problem with the virtual machine network or with the M1 virtualization service?
I have encountered both of the following situations:
1. Running lima date -R in the terminal freezes; I can confirm from the video that the virtual machine is still running and the CPU usage is not high.
2. Running lima date -R in the terminal freezes; I can confirm from the video that the virtual machine has stopped and the Virtualization process takes up 200% of the CPU resources.
How can I help identify the problem in the above two situations?
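The two situations above can be told apart from two observations: whether the guest console still responds, and how much CPU the host-side Virtualization process is using. The following is only a rough sketch of that triage, not an official Lima diagnostic. The function names `classify_hang` and `sum_cpu` are hypothetical, and the 100% threshold is an assumption based on the CPU figures reported in this thread.

```shell
#!/bin/sh
# Hypothetical helper: classify a hang from two observations.
#   $1 = CPU percentage of the Virtualization process (integer or decimal)
#   $2 = "yes" if the guest console (video display) still responds, else "no"
classify_hang() {
    cpu=${1%%.*}      # drop any decimal part so the shell can compare integers
    cpu=${cpu:-0}     # treat an empty value as 0
    console=$2
    if [ "$console" = "yes" ]; then
        echo "guest alive: suspect network or SSH forwarding"
    elif [ "$cpu" -ge 100 ]; then   # assumed threshold, not a documented limit
        echo "guest frozen with busy host process: suspect Virtualization.framework"
    else
        echo "guest frozen with idle host process: inconclusive"
    fi
}

# Sum the CPU usage of every process whose line contains the given pattern.
# Reads "<pcpu> <command>" lines on stdin, so it works on saved output too.
sum_cpu() {
    awk -v pat="$1" 'index($0, pat) { total += $1 } END { printf "%.0f\n", total }'
}
```

On the macOS host this could be driven with something like `classify_hang "$(ps -Ao pcpu,comm | sum_cpu com.apple.Virtualization)" no`, where the second argument is what you observed in the display window.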
lima version 0.18.0
macOS version 14.0 (23A344)
When I wrote the above, the second scenario happened. See the ha.stderr.log file.
@balajiv113 Was able to catch the following in the network log when this occurs:
time="2023-11-12T19:22:25-05:00" level=info msg="new connection from to "
2023/11/12 19:22:28 tcpproxy: for incoming conn 127.0.0.1:56720, error dialing "192.168.104.1:22": connect tcp 192.168.104.1:22: connection was refused
time="2023-11-12T19:22:44-05:00" level=error msg="r.CreateEndpoint() = connection was refused"
Unsure if this is relevant. The network process seems to remain alive.
I tried disabling Rosetta but that did not help. Something interesting I noticed, though, is that after disabling Rosetta, when the VM hangs, CPU usage is pinned at half the allocated CPUs: 100% when 2 CPUs are allocated. With Rosetta enabled it is usually pinned at 200%.
After upgrading my M2 Mac mini to Sonoma, I've been encountering this issue frequently. Yesterday, I noticed that one of my Lima VMs and a UTM VM (both using virtualization.framework) froze simultaneously.
The UTM VM works after killing and restarting it. However, the Lima VM fails to restart after a lima stop -f. When I use lima start, that VM encounters errors similar to issue #1915 (from what I remember; the logs are lost). Recreating the VM solves the problem.
In addition, my Colima VM, also running on vz, has been experiencing frequent hangs as well. I can always resolve it by using the lima stop -f command and then restarting it.
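The stop-and-restart workaround described above could be scripted roughly as follows. This is a sketch under assumptions, not something provided by Lima: `run_with_timeout`, `vm_responds`, and `restart_if_hung` are hypothetical names, the 10-second default is arbitrary, and it uses `limactl shell`, `limactl stop -f`, and `limactl start` rather than the `lima` wrapper quoted above. As reported in this thread, a restart does not always recover the VM, in which case recreating it is the fallback.

```shell
#!/bin/sh
# Run a command in the background and kill it if it exceeds $1 seconds.
# Returns the command's exit status (non-zero if it failed or was killed).
run_with_timeout() {
    secs=$1; shift
    "$@" &
    cmd_pid=$!
    # Timer: after $secs seconds, terminate the command if it is still running.
    ( sleep "$secs"; kill -TERM "$cmd_pid" ) >/dev/null 2>&1 &
    timer_pid=$!
    status=0
    wait "$cmd_pid" 2>/dev/null || status=$?
    kill "$timer_pid" 2>/dev/null       # command finished first: cancel the timer
    wait "$timer_pid" 2>/dev/null || :  # reap the timer subshell
    return "$status"
}

# Succeeds if the instance answers a trivial command within the time limit.
#   $1 = instance name, $2 = timeout in seconds (assumed default: 10)
vm_responds() {
    run_with_timeout "${2:-10}" limactl shell "$1" true
}

# Force-stop and restart the instance only when it does not respond.
restart_if_hung() {
    if vm_responds "$1"; then
        echo "$1 is responsive"
    else
        echo "$1 appears hung; forcing a restart" >&2
        limactl stop -f "$1" && limactl start "$1"
    fi
}
```

Calling `restart_if_hung default` from a periodic job would apply the workaround automatically, at the cost of losing whatever was running in the guest.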
I'm able to reproduce this issue almost every time when starting a large docker compose project (which I'm unable to share unfortunately). Today I noticed something new. I opened the system log utility to view any logs related to virtualization during one of these events. Doing so I was able to get some logs that seem interesting:
default 16:56:39.933077-0500 symptomsd Received CPU usage trigger:
com.apple.Virtualization.Virtual[72861] () used 90.01s of CPU over 177.06 seconds (averaging 50%), violating a CPU usage limit of 90.00s over 180 seconds.
default 16:56:40.028006-0500 symptomsd RESOURCE_NOTIFY trigger for com.apple.Virtualization.Virtual [72861] (90009971208 nanoseconds of CPU usage over 177.00s seconds, violating limit of 90000000000 nanoseconds of CPU usage over 180.00s seconds)
default 17:18:27.814709-0500 runningboardd Periodic Run States <RBProcessState| identity:xpcservice<com.apple.Virtualization.VirtualMachine([anon<limactl>(502):72856])(502)>:72861 role:UserInteractive gpuRole:None explicitJetsamBand:0 memoryLimit:Inactive(Default) flags:60 guaranteedRunning:NO legacyFinishTaskReason:0 inheritances:<RBMutableInheritanceCollection| inheritancesByEnvironment:{
}> primitiveAssertions:[
<RBSProcessAssertionInfo| type:2 reason:20246 name:"Domain" domain:"com.apple.launchservicesd:RoleUserInteractive" expl:"uielement:72861">
]>
These logs occur very close to when the VM begins to hang. From my naive perspective, it seems like the OS may be killing the virtualization process or severely throttling it for using too much CPU. Does that seem possible? I tried setting the VM's CPU limit to the number of cores my machine has but am still able to reproduce this. Side note: strangely, I'm able to set the number of CPUs to a number larger than my machine has.
The final log occurs some time after the vm begins to hang.
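For reference, the numbers in the symptomsd trigger line can be checked by hand: 90.01 s of CPU over 177.06 s is about 50.8%, just above the limit of 90.00 s over 180 s (50%). A small sketch that extracts this from a saved log line follows; `cpu_trigger_avg` is a hypothetical helper name, and the pattern is matched only against the message format quoted above.

```shell
#!/bin/sh
# Read symptomsd "CPU usage trigger" lines on stdin and print the average
# CPU usage (used seconds / window seconds) as a percentage.
cpu_trigger_avg() {
    sed -n 's/.*used \([0-9.]*\)s of CPU over \([0-9.]*\) seconds.*/\1 \2/p' |
        awk '{ printf "%.1f%%\n", 100 * $1 / $2 }'
}
```

On macOS, lines like these can probably be pulled from the unified log with something like `log show --last 1h --predicate 'process == "symptomsd"' | cpu_trigger_avg`, although the exact predicate needed may differ between macOS versions.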
I tried the above steps myself. I haven't seen high CPU usage, but the freeze happens.
On checking the GUI during the freeze, even that was not responsive, so I think the freeze happens at the virtualization.framework level, not in the network.
I have also raised a support ticket with Apple with the same info.
Note: This happens to me on M1 only. My Intel machine runs smoothly for weeks across sleep and wake cycles.
I have the same issue with qemu. Running the same command will sometimes work and sometimes freeze the VM, requiring a stop --force, with CPU usage somewhere around 400%. However, the 400% CPU usage occurs on the qemu-system-x86_64 task. I'm on an M2 Mac and am using cpuType:\ x86_64: "max" in my config, using qemu v8.2.1.
You have already raised a ticket with Apple, but would it be possible to double-check and confirm whether, in your scenario, the behaviour is reproducible using qemu instead of vz?
Problem
The Virtualization framework intermittently starts consuming 100%-220% CPU (as reported by Activity Monitor) and becomes unresponsive. This leads to all limactl commands hanging or failing. It happens intermittently when the Lima VM is started and left alone for a while.
Behaviour observed
limactl shell <vm name>. Once the VM gets to this state, all limactl commands fail.
Workaround
The way around it is to recreate the VM.
Related issue
https://github.com/docker/for-mac/issues/6655
Expected behaviour
The VM should not hang when the computer wakes up from sleep.
Host info