thenickdude / KVM-Opencore

OpenCore disk image for running macOS VMs on Proxmox/QEMU
https://www.nicksherlock.com/2021/10/installing-macos-12-monterey-on-proxmox-7/
GNU General Public License v3.0
1.3k stars 117 forks

macOS kernel panic during boot due to non-monotonic TSC #15

Open sej7278 opened 3 years ago

sej7278 commented 3 years ago

Hi, this is possibly not even an opencore issue, but does anything jump out at you as being an obvious problem?

Got a VM running 11.6.1 with your oc 0.7.4 install on my Xeon E5-2650v2, but it simply will not take the Monterey update (one of the betas i tried and a fresh install didn't work either). It starts the upgrade, but after a couple of reboots it seems to panic and get stuck at mach reboot.

It gets past the time-out here:

1_timeout

But then panics here:

2_panic

And finally hangs completely here trying to reboot i guess:

3_mach_reboot

My config.plist is basically the same as yours but with some smbios stuff added by opencore-configurator (serials etc.); oc-validator had nothing bad to say about it.

libvirt xml is: libvirt.txt

Host is Debian Sid, qemu 6.1.0, libvirt 7.6.0, kernel 5.14.12

thenickdude commented 3 years ago

High IO on the host during the upgrade might be triggering a kernel panic due to timeouts in the guest. Try adding the parameters I identified here to your boot-args:

https://www.nicksherlock.com/2020/08/solving-macos-vm-kernel-panics-on-heavily-loaded-proxmox-qemu-kvm-servers/

tlbto_us=0 vti=9

sej7278 commented 3 years ago

ok thanks, will try in the morning and report back.

sej7278 commented 3 years ago

ok tried that and thought it was going to make a difference as cpu usage went through the roof (like 100% on half my 16 cores!) but it still ended up at the mach reboot crash.

it seems to reboot a lot at this disk crypto stage (not crash, just reboot back to opencore chooser):

disk

i might try reducing the vcpus to 4, maybe it's a thread timeout/race or something?

thenickdude commented 3 years ago

That's curious, if CPU usage increased it seems like a kernel thread is spinning in an infinite loop. With tlbto_us=0, a core that fails to respond to a TLB flush in a timely fashion is ignored completely instead of triggering a panic.
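For anyone merging these parameters into an existing boot-args string, here is a minimal sketch that replaces any existing value for the same key instead of duplicating it. The function name merge_boot_args and the starting string "keepsyms=1 tlbto_us=5000" are hypothetical; the result still has to be put into the boot-args entry of the OpenCore config.plist with whatever plist editor you normally use.

```shell
#!/bin/sh
# merge_boot_args CURRENT NEW   (hypothetical helper, not part of OpenCore)
# Prints CURRENT with the words of NEW appended. A word in CURRENT whose key
# (the part before '=') also appears in NEW is dropped, so NEW wins.
merge_boot_args() {
    current=$1
    new=$2
    result=""
    for word in $current; do
        key=${word%%=*}
        replaced=0
        for n in $new; do
            if [ "${n%%=*}" = "$key" ]; then
                replaced=1
            fi
        done
        if [ "$replaced" -eq 0 ]; then
            result="${result:+$result }$word"
        fi
    done
    for n in $new; do
        result="${result:+$result }$n"
    done
    printf '%s\n' "$result"
}

# Example with a made-up existing boot-args value
merge_boot_args "keepsyms=1 tlbto_us=5000" "tlbto_us=0 vti=9"
```

With the example input this prints "keepsyms=1 tlbto_us=0 vti=9".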

I'll check out your VM config

thenickdude commented 3 years ago

Where did you get your OVMF image by the way? You might try switching to one provided by your distro just in case

sej7278 commented 3 years ago

I think the ovmf was from osx-kvm, I'll try the Debian one, might also try a fresh install again instead of an upgrade or maybe try without gpu passthrough.

Reducing the core count made no difference nor did switching to virtio-net from vmxnet3 (didn't realize that worked).

It seems to stall at various points for a few minutes then reboot, but once it gets to Mach reboot it's definitely dead.

thenickdude commented 3 years ago

I've never observed behaviour like that so I'm a bit in the dark on what might cause it, sorry!

If you've got any passthrough devices defined, does it boot if they're removed?

sej7278 commented 3 years ago

OVMF_CODE.fd or OVMF_CODE_4M.fd from debian seem to reduce cpu usage to almost nothing and also reduced the reboots, but it still ends up at MACH Reboot. i'll try removing the gpu next as that seemed to work for someone on reddit.

thenickdude commented 3 years ago

I definitely recommend removing passthrough GPUs during upgrades because the repeated restarts ask a lot of the shitty AMD GPU drivers.

sej7278 commented 3 years ago

i'm going to close this and give up, as a completely fresh install with fresh oc15 and ovmf and no passthrough doesn't even get as far as disk utility, so i'm assuming monterey is a lot more fussy about hardware or software (as bigsur runs fine on the same vm).

thanks for your time.

thenickdude commented 3 years ago

The only real hardware the guest can even see is your CPU, which is perfectly compatible (same generation as mine)

thenickdude commented 3 years ago

Oh your CPU argument is missing +hypervisor. The macOS kernel gives all sorts of timing slack to you if it knows it's running in a VM, which requires +hypervisor
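A quick way to check a -cpu string for a given flag, as a minimal sketch. The function name has_cpu_flag and the example string are made up for illustration (the original poster's real -cpu value is in the attached libvirt.txt and is not reproduced here). It only looks for the "+flag" and "flag=on" spellings.

```shell
#!/bin/sh
# has_cpu_flag CPU_ARG FLAG   (hypothetical helper)
# Succeeds if the comma-separated -cpu value contains "+FLAG" or "FLAG=on".
has_cpu_flag() {
    case ",$1," in
        *",+$2,"*|*",$2=on,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Illustrative -cpu value that lacks +hypervisor
cpu_arg='host,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+invtsc'
for flag in hypervisor invtsc; do
    if has_cpu_flag "$cpu_arg" "$flag"; then
        echo "$flag: present"
    else
        echo "$flag: missing"
    fi
done
```

For the illustrative string this reports hypervisor as missing and invtsc as present.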

sej7278 commented 3 years ago

i tried adding that to my existing flags and i tried changing completely to:

-cpu host,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor,+invtsc

but it didn't make any difference. this is really confusing as i've never really had a problem before (other than bigsur which just needed a new opencore) but monterey seems insurmountable to me - i can't even get to the installer let alone an upgrade!

kuasha420 commented 3 years ago

@thenickdude do you have a monterey vm running with this EFI?

thenickdude commented 3 years ago

Absolutely, I installed using a recovery, a full installer, and upgraded from Big Sur. Didn't have any problems with any of those scenarios.

Passthrough of RX580 successful.

QEMU 6.0.0-4, edk2-stable202108, pc-q35-6.0

thenickdude commented 3 years ago

Here's my QEMU commandline for my VM with passthrough:

/usr/bin/kvm \
  -no-shutdown \
  -smbios 'type=1,uuid=...' \
  -drive 'if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' \
  -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/dev/zvol/rpool/vms/vm-110-disk-1' \
  -smp '16,sockets=1,cores=16,maxcpus=16' \
  -nodefaults \
  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
  -vga none \
  -nographic \
  -cpu 'Penryn,enforce,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,vendor=GenuineIntel' \
  -m 16384 \
  -object 'memory-backend-file,id=ram-node0,size=16384M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
  -numa 'node,nodeid=0,cpus=0-15,memdev=ram-node0' \
  -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg \
  -device 'vfio-pci,host=0000:03:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' \
  -device 'vfio-pci,host=0000:03:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' \
  -device 'vfio-pci,host=0000:00:1a.0,id=hostpci1,bus=ich9-pcie-port-2,addr=0x0' \
  -device 'vfio-pci,host=0000:00:1d.0,id=hostpci2,bus=ich9-pcie-port-3,addr=0x0' \
  -drive 'file=/dev/zvol/rpool/vms/vm-111-disk-0,if=none,id=drive-virtio0,cache=unsafe,discard=on,format=raw,aio=io_uring,detect-zeroes=unmap' \
  -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' \
  -netdev 'type=tap,id=net0,ifname=tap110i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
  -device 'virtio-net-pci,mac=...,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
  -machine 'type=q35+pve0' \
  -device 'isa-applesmc,osk=...' \
  -smbios 'type=2' \
  -cpu 'host,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor,+invtsc'

(Duplicate args are due to Proxmox config restrictions)
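When -cpu appears twice like this, QEMU should end up using the last one, so the effective CPU model here is the "host,..." line rather than Penryn. A minimal sketch for pulling the last -cpu value out of an argument list, assuming that last-one-wins behaviour; last_cpu_arg is a hypothetical helper name.

```shell
#!/bin/sh
# last_cpu_arg ARGS...   (hypothetical helper)
# Prints the value that follows the final "-cpu" in the given argument list,
# or an empty line if there is no -cpu option.
last_cpu_arg() {
    last=""
    while [ "$#" -gt 0 ]; do
        if [ "$1" = "-cpu" ] && [ "$#" -ge 2 ]; then
            last=$2
            shift
        fi
        shift
    done
    printf '%s\n' "$last"
}

# The two -cpu values from the command line above
last_cpu_arg \
    -cpu 'Penryn,enforce,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,vendor=GenuineIntel' \
    -m 16384 \
    -cpu 'host,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor,+invtsc'
```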

/usr/share/qemu-server/pve-q35-4.0.cfg is:

[device "ehci"]
  driver = "ich9-usb-ehci1"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.7"

[device "uhci-1"]
  driver = "ich9-usb-uhci1"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.0"
  masterbus = "ehci.0"
  firstport = "0"

[device "uhci-2"]
  driver = "ich9-usb-uhci2"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.1"
  masterbus = "ehci.0"
  firstport = "2"

[device "uhci-3"]
  driver = "ich9-usb-uhci3"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1d.2"
  masterbus = "ehci.0"
  firstport = "4"

[device "ehci-2"]
  driver = "ich9-usb-ehci2"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.7"

[device "uhci-4"]
  driver = "ich9-usb-uhci4"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.0"
  masterbus = "ehci-2.0"
  firstport = "0"

[device "uhci-5"]
  driver = "ich9-usb-uhci5"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.1"
  masterbus = "ehci-2.0"
  firstport = "2"

[device "uhci-6"]
  driver = "ich9-usb-uhci6"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1a.2"
  masterbus = "ehci-2.0"
  firstport = "4"

[device "audio0"]
  driver = "ich9-intel-hda"
  bus = "pcie.0"
  addr = "1b.0"

[device "ich9-pcie-port-1"]
  driver = "pcie-root-port"
  x-speed = "16"
  x-width = "32"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.0"
  port = "1"
  chassis = "1"

[device "ich9-pcie-port-2"]
  driver = "pcie-root-port"
  x-speed = "16"
  x-width = "32"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.1"
  port = "2"
  chassis = "2"

[device "ich9-pcie-port-3"]
  driver = "pcie-root-port"
  x-speed = "16"
  x-width = "32"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.2"
  port = "3"
  chassis = "3"

[device "ich9-pcie-port-4"]
  driver = "pcie-root-port"
  x-speed = "16"
  x-width = "32"
  multifunction = "on"
  bus = "pcie.0"
  addr = "1c.3"
  port = "4"
  chassis = "4"

[device "pcidmi"]
  driver = "i82801b11-bridge"
  bus = "pcie.0"
  addr = "1e.0"

[device "pci.0"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "1.0"
  chassis_nr = "1"

[device "pci.1"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "2.0"
  chassis_nr = "2"

[device "pci.2"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "3.0"
  chassis_nr = "3"

[device "pci.3"]
  driver = "pci-bridge"
  bus = "pcidmi"
  addr = "4.0"
  chassis_nr = "4"

kuasha420 commented 3 years ago

@thenickdude I just did the upgrade from 11.6.1 to 12.0.1 and everything went great! Here's my configuration if anyone is interested.

LibVirt Config

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>macOS</name>
  <uuid>2aca0dd6-cec9-4717-9ab2-0b7b13d111c3</uuid>
  <title>macOS</title>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <vcpu placement='static'>8</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='2' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='3' enabled='yes' hotpluggable='yes' order='4'/>
    <vcpu id='4' enabled='yes' hotpluggable='yes' order='5'/>
    <vcpu id='5' enabled='yes' hotpluggable='yes' order='6'/>
    <vcpu id='6' enabled='yes' hotpluggable='yes' order='7'/>
    <vcpu id='7' enabled='yes' hotpluggable='yes' order='8'/>
  </vcpus>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='8'/>
    <vcpupin vcpu='2' cpuset='3'/>
    <vcpupin vcpu='3' cpuset='9'/>
    <vcpupin vcpu='4' cpuset='4'/>
    <vcpupin vcpu='5' cpuset='10'/>
    <vcpupin vcpu='6' cpuset='5'/>
    <vcpupin vcpu='7' cpuset='11'/>
    <emulatorpin cpuset='0-1,6-7'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-q35-6.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/edk2-ovmf/x64/OVMF_CODE.fd</loader>
    <nvram>/home/kuasha/OSX-KVM/OVMF_VARS-1024x768.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'>
    <topology sockets='1' dies='1' cores='4' threads='2'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/home/kuasha/macOS/OpenCore-v15.img'/>
      <target dev='sda' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0xa'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0xb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0xc'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0xd'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0xe'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x6'/>
    </controller>
    <controller type='pci' index='8' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='8' port='0xf'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='qemu-xhci'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:8e:e2:66'/>
      <source bridge='br0'/>
      <target dev='tap0'/>
      <model type='vmxnet3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <input type='mouse' bus='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0e' function='0x0'/>
    </input>
    <input type='keyboard' bus='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0f' function='0x0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <sound model='ich9'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </sound>
    <audio id='1' type='none'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='no'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='no'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
    </hostdev>
    <memballoon model='none'/>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-device'/>
    <qemu:arg value='isa-applesmc,osk=ourhardworkbythesewordsguardedpleasedontsteal(c)AppleComputerInc'/>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,vendor=GenuineIntel,+hypervisor,+invtsc,kvm=on,+fma,+avx,+avx2,+aes,+ssse3,+sse4_2,+popcnt,+sse4a,+bmi1,+bmi2'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='hda-micro,audiodev=hda'/>
    <qemu:arg value='-audiodev'/>
    <qemu:arg value='pa,id=hda,server=unix:/run/user/1000/pulse/native'/>
    <qemu:arg value='-object'/>
    <qemu:arg value='input-linux,id=mouse1,evdev=/dev/input/by-id/usb-Logitech_Gaming_Mouse_G502_1263366D3336-event-mouse'/>
    <qemu:arg value='-object'/>
    <qemu:arg value='input-linux,id=kbd1,evdev=/dev/input/by-id/ckb-Corsair_STRAFE_RGB_Gaming_Keyboard_vKB_-event,grab_all=on,repeat=on'/>
  </qemu:commandline>
</domain>

System Information

                     ./o.                  kuasha@kuasha-z490ud 
                   ./sssso-                -------------------- 
                 `:osssssss+-              OS: EndeavourOS Linux x86_64 
               `:+sssssssssso/.            Host: Z490 UD 
             `-/ossssssssssssso/.          Kernel: 5.14.14-arch1-1 
           `-/+sssssssssssssssso+:`        Uptime: 11 hours, 9 mins 
         `-:/+sssssssssssssssssso+/.       Packages: 1208 (pacman) 
       `.://osssssssssssssssssssso++-      Shell: zsh 5.8 
      .://+ssssssssssssssssssssssso++:     Resolution: 1720x1440 
    .:///ossssssssssssssssssssssssso++:    DE: Plasma 5.23.1 
  `:////ssssssssssssssssssssssssssso+++.   WM: KWin 
`-////+ssssssssssssssssssssssssssso++++-   Theme: Breeze Dark [Plasma], Breeze [GTK2] 
 `..-+oosssssssssssssssssssssssso+++++/`   Icons: [Plasma], breeze-dark [GTK2/3] 
   ./++++++++++++++++++++++++++++++/:.     Terminal: konsole 
  `:::::::::::::::::::::::::------``       Terminal Font: MesloLGS NF 20 
                                           CPU: Intel i5-10600 (12) @ 4.800GHz 
                                           GPU: Intel CometLake-S GT2 [UHD Graphics 630] 
                                           GPU: AMD ATI Radeon RX 470/480/570/570X/580/580X/590 
                                           Memory: 14148MiB / 31952MiB 

Screen Shot 2021-10-29 at 3 42 13 PM

sej7278 commented 3 years ago

@kuasha420 could you list the libvirt/qemu versions you're using? i'm wondering if it's a qemu 6.1 issue, as @thenickdude is using 6.0 and it looks like you are too

kuasha420 commented 3 years ago

@sej7278

pacman -Q libvirt qemu edk2-ovmf linux
libvirt 1:7.8.0-1
qemu 6.1.0-5
edk2-ovmf 202108-1
linux 5.14.14.arch1-1

I am using qemu 6.1 as well but the machine type should be 6.0.

machine type 6.1 has issues.
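To see which machine type a libvirt domain is actually using, a minimal sketch that pulls the machine attribute out of the domain XML. machine_type is a hypothetical helper name, and the sed pattern assumes the single-quoted attribute style that libvirt prints, as in the config above.

```shell
#!/bin/sh
# machine_type   (hypothetical helper)
# Reads libvirt domain XML on stdin and prints the machine='...' value
# of the first <type> element it finds.
machine_type() {
    sed -n "s/.*<type[^>]*machine='\([^']*\)'.*/\1/p" | head -n 1
}

# Example using the <type> line from the config above
printf '%s\n' "<os>" "  <type arch='x86_64' machine='pc-q35-6.0'>hvm</type>" "</os>" | machine_type
```

On a real host the input would come from something like "virsh dumpxml macOS | machine_type", where macOS is the domain name.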

sej7278 commented 3 years ago

just compiled QEMU emulator version 6.1.50 (v6.1.0-1735-gc52d69e7) and that barely even starts macos. changing the machine type doesn't seem to make any difference for me, and i also tried your commandline. i'm lost; i wonder if it's because i'm using virtio-blk instead of sata

sickcodes commented 3 years ago

@sej7278 I've also experienced the CPU usage maxing out at 100%. It seems to be related to macOS trying to do a full APFS fsck or some similar check after a busted shutdown. I think it's just Monterey.

sej7278 commented 3 years ago

@sickcodes yes it's definitely doing that but I think I'm making it past that stage

sej7278 commented 3 years ago

i think i finally managed to get a screenshot before it crashes, if this makes any sense:

Screenshot from 2021-10-30 00-41-40

thenickdude commented 3 years ago

So that seems to be panicking due to non-monotonic time (clock going backwards).

I wonder if you're getting a warning at VM launch time that "invtsc" isn't actually available on your system. Try removing that from your CPU args if it's currently there.

Can you post the VM command/config you're currently using and also the output of this on the host:

cat /proc/cpuinfo

(You only need to paste the output from a single one of the cores)
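On Linux, an invariant TSC normally shows up in the cpuinfo flags as constant_tsc and nonstop_tsc, so those are the two to look for in that output. A minimal sketch that checks for them; tsc_flags_report is a hypothetical helper name. Note that these flags only describe the CPU, and the kernel can still reject the TSC at runtime for other reasons.

```shell
#!/bin/sh
# tsc_flags_report   (hypothetical helper)
# Reads /proc/cpuinfo-style text on stdin and reports whether the first
# "flags" line contains constant_tsc and nonstop_tsc.
tsc_flags_report() {
    # Only lines that start with "flags" are used, so "vmx flags" is skipped
    flags=$(sed -n 's/^flags[[:space:]]*:[[:space:]]*//p' | head -n 1)
    for f in constant_tsc nonstop_tsc; do
        case " $flags " in
            *" $f "*) echo "$f: yes" ;;
            *) echo "$f: no" ;;
        esac
    done
}

# Example with a shortened, made-up flags line
printf 'flags\t\t: fpu tsc constant_tsc nonstop_tsc\n' | tsc_flags_report
```

On a real host it would be run as "tsc_flags_report < /proc/cpuinfo".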

thenickdude commented 3 years ago

This user has the same panic on bare-metal:

https://www.reddit.com/r/hackintosh/comments/qhjnly/random_kernel_panics_on_x79/

OpenCore bug tracker:

https://github.com/acidanthera/bugtracker/issues/1676

Although in the case of QEMU I think it's QEMU's job to present a consistent timestamp counter, so in theory TSCAdjustReset shouldn't be needed...

thenickdude commented 3 years ago

Also, your host isn't going to sleep during the install because you aren't moving the mouse to keep it awake, is it?

sej7278 commented 3 years ago

ah i had a problem with kvm-pit with catalina hard crashing the host, the fix was to remove this lot, but i don't have that in my monterey config, i wonder what the defaults are:

<clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
</clock>

cpuinfo:

processor   : 31
vendor_id   : GenuineIntel
cpu family  : 6
model       : 62
model name  : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping    : 4
microcode   : 0x42e
cpu MHz     : 1200.000
cache size  : 20480 KB
physical id : 1
siblings    : 16
core id     : 7
cpu cores   : 8
apicid      : 47
initial apicid  : 47
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
vmx flags   : vnmi preemption_timer posted_intr invvpid ept_x_only ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips    : 5190.81
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

host is a desktop so not going to sleep.

when i got that screenshot i was running OpenCore-boot.sh instead of virt-manager.

i do get this in dmesg, don't know how to fix it though:

dmesg |grep -i kvm
[  215.019676] kvm: SMP vm created on host with unstable TSC; guest TSC will not be reliable

cat /sys/devices/system/clocksource/clocksource0/available_clocksource 
hpet acpi_pm 

cat /sys/devices/system/clocksource/clocksource0/current_clocksource 
hpet

removing +invtsc isn't making any difference

ps auxw|grep qemu
simon     499511 99.5 25.5 17803184 16838992 ?   SLl  10:19   0:28 /usr/bin/qemu-system-x86_64 -name guest=monterey,debug-threads=on -S -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/home/simon/.config/libvirt/qemu/lib/domain-4-monterey/master-key.aes"} -blockdev {"driver":"file","filename":"/data5/kvm/macos/monterey/OVMF_CODE.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"} -blockdev {"driver":"file","filename":"/data5/kvm/macos/monterey/OVMF_VARS-1024x768.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"} -machine pc-q35-6.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format,memory-backend=pc.ram -cpu host,migratable=on -m 16384 -object {"qom-type":"memory-backend-ram","id":"pc.ram","size":17179869184} -overcommit mem-lock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 5068102b-057f-43a2-8e4f-f6ded11ffbac -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=28,server=on,wait=off -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 -device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 -blockdev 
{"driver":"file","filename":"/data5/kvm/macos/monterey/OpenCore-v15.img","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-2-format","read-only":false,"driver":"raw","file":"libvirt-2-storage"} -device ide-hd,bus=ide.0,drive=libvirt-2-format,id=sata0-0-0,bootindex=1 -blockdev {"driver":"file","filename":"/data5/kvm/macos/monterey/monterey.qcow2","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":null} -device virtio-blk-pci,bus=pci.1,addr=0x0,drive=libvirt-1-format,id=virtio-disk0,write-cache=on -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=30 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:6b:84:02,bus=pci.3,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -audiodev id=audio1,driver=none -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0,audiodev=audio1 -device vfio-pci,host=0000:02:00.0,id=hostdev0,bus=pci.2,multifunction=on,addr=0x0,rombar=1 -device vfio-pci,host=0000:02:00.1,id=hostdev1,bus=pci.2,addr=0x0.0x1,rombar=1 -device usb-host,hostdevice=/dev/bus/usb/004/003,id=hostdev2,bus=usb.0,port=1 -cpu host,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor -device isa-applesmc,osk=ourhardworkbythesewordsguardedpleasedontsteal(c)AppleComputerInc -smbios type=2 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
thenickdude commented 3 years ago

Okay, I'm pretty sure that's your problem. On my system the clocksource is set to tsc, but on yours it doesn't even get offered as an option.
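The same check as a minimal sketch: it reads the two sysfs files and reports whether tsc is offered at all. clocksource_status is a hypothetical helper name, and the directory argument exists only so the example can run against a temporary copy of the values posted above instead of the real /sys tree.

```shell
#!/bin/sh
# clocksource_status [DIR]   (hypothetical helper)
# Prints the current clocksource and whether "tsc" is in the available list.
# DIR defaults to the real sysfs location.
clocksource_status() {
    dir=${1:-/sys/devices/system/clocksource/clocksource0}
    current=$(cat "$dir/current_clocksource")
    available=$(cat "$dir/available_clocksource")
    case " $available " in
        *" tsc "*) offered=yes ;;
        *) offered=no ;;
    esac
    echo "current=$current tsc_offered=$offered"
}

# Demo against a temporary directory holding the values reported above
demo=$(mktemp -d)
echo "hpet acpi_pm " > "$demo/available_clocksource"
echo "hpet" > "$demo/current_clocksource"
clocksource_status "$demo"
rm -rf "$demo"
```

For the values reported above this prints "current=hpet tsc_offered=no".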

What does "dmesg | grep -i -e tsc -e clocksource" return? Mine reports:

[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 3400.180 MHz processor
[    0.164259] TSC deadline timer available
[    0.164329] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.374869] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484882848 ns
[    0.394888] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x3102f8d1124, max_idle_ns: 440795299789 ns
[    0.614927] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.929463] clocksource: Switched to clocksource tsc-early
[    0.946617] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    2.711010] tsc: Refined TSC clocksource calibration: 3399.981 MHz
[    2.724715] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x31023c93017, max_idle_ns: 440795261805 ns
[    2.769229] clocksource: Switched to clocksource tsc

I think on my system there was an option buried in the host UEFI settings for TSC synchronisation between sockets. If yours has that too, make sure it's turned on, because otherwise it might cause the TSC to be rejected.

I guess since you only have two clocksources to choose from you could try switching to the other and see if things improve:

echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource 
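A slightly more defensive version of that command, as a minimal sketch: it refuses names that are not in available_clocksource and prints what the kernel reports afterwards. set_clocksource is a hypothetical helper name, and the optional directory argument is only there so the demo can run on temporary files. On the real sysfs path it needs root, and the change should not survive a reboot unless it is also set on the kernel command line (clocksource=).

```shell
#!/bin/sh
# set_clocksource NAME [DIR]   (hypothetical helper)
# Writes NAME to current_clocksource only if it is listed as available,
# then prints the resulting current clocksource.
set_clocksource() {
    name=$1
    dir=${2:-/sys/devices/system/clocksource/clocksource0}
    case " $(cat "$dir/available_clocksource") " in
        *" $name "*) ;;
        *) echo "clocksource '$name' is not available" >&2; return 1 ;;
    esac
    printf '%s\n' "$name" > "$dir/current_clocksource" || return 1
    cat "$dir/current_clocksource"
}

# Demo on temporary files instead of the real /sys tree
demo=$(mktemp -d)
echo "hpet acpi_pm " > "$demo/available_clocksource"
echo "hpet" > "$demo/current_clocksource"
set_clocksource acpi_pm "$demo"
set_clocksource tsc "$demo" || echo "refused"
rm -rf "$demo"
```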

Finally, check for a BIOS update for your motherboard, since this is the sort of thing they fix there.

thenickdude commented 3 years ago

Also it sounds like recent Linux kernels (5.13, 5.14) have made timing changes that cause the TSC to be disabled in more situations:

https://www.phoronix.com/forums/forum/software/general-linux-open-source/1283799-linux-5-15-rc5-x86-changes-aim-to-fix-yet-another-hardware-trainwreck

If you can try 5.12 and see if the tsc clocksource comes back that would be interesting.

sej7278 commented 3 years ago

i'll have a look in my bios tomorrow (doing a massive backup right now), but i'm in bios mode not uefi and it is the latest (dell t5610 doesn't get updated very often!). i do recall some time settings but think it was just utc.

# dmesg | grep -i -e tsc -e clocksource
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2593.895 MHz processor
[    0.023471] TSC deadline timer available
[    0.023542] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.097315] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484882848 ns
[    0.117335] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x2563b4fbe6b, max_idle_ns: 440795330438 ns
[    0.149339] TSC synchronization [CPU#0 -> CPU#1]:
[    0.149339] Measured 1430554 cycles TSC warp between CPUs, turning off TSC clock.
[    0.149339] tsc: Marking TSC unstable due to check_tsc_sync_source failed
[    0.357884] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.693344] clocksource: Switched to clocksource hpet
[    0.711473] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[  215.019676] kvm: SMP vm created on host with unstable TSC; guest TSC will not be reliable

i've had this TSC issue since 5.10 kernel as i recall, it's not new to 5.14
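The lines that matter in that output are "Marking TSC unstable" and the final "Switched to clocksource" message. A minimal sketch that classifies a kernel log on those two strings only; tsc_verdict is a hypothetical helper name and the matching is deliberately crude.

```shell
#!/bin/sh
# tsc_verdict   (hypothetical helper)
# Reads kernel log text on stdin and prints:
#   unstable - the kernel marked the TSC unstable
#   tsc      - the kernel switched to the full "tsc" clocksource
#   unknown  - neither message was found
tsc_verdict() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'Marking TSC unstable'; then
        echo "unstable"
    elif printf '%s\n' "$log" | grep -q 'Switched to clocksource tsc[[:space:]]*$'; then
        # The end-of-line anchor keeps "tsc-early" from matching
        echo "tsc"
    else
        echo "unknown"
    fi
}

# Example using one line from the output above
printf '%s\n' '[    0.149339] tsc: Marking TSC unstable due to check_tsc_sync_source failed' | tsc_verdict
```

On a real host it would be run as "dmesg | tsc_verdict".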

also noticed this: https://www.dell.com/community/Precision-Fixed-Workstations/TSC-warp-on-T5610-running-Linux/td-p/7820763

i might try cpu pinning to only use a single socket, but that usually kills performance (oddly enough!)

thenickdude commented 3 years ago

Yeah if the TSC only differs between sockets then pinning to a single socket should solve this.
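To work out which logical CPUs belong to one socket, a minimal sketch that groups the "processor" numbers in /proc/cpuinfo by their "physical id". cpus_on_socket is a hypothetical helper name; the resulting list is what would go into the vcpupin cpuset values (or taskset -c).

```shell
#!/bin/sh
# cpus_on_socket SOCKET   (hypothetical helper)
# Reads /proc/cpuinfo-style text on stdin and prints a comma-separated list
# of the logical CPU numbers whose "physical id" equals SOCKET.
cpus_on_socket() {
    awk -v want="$1" -F':' '
        /^processor/   { gsub(/[ \t]/, "", $2); cpu = $2 }
        /^physical id/ { gsub(/[ \t]/, "", $2)
                         if ($2 == want) list = list (list == "" ? "" : ",") cpu }
        END            { print list }
    '
}

# Example with a made-up three-CPU layout across two sockets
printf 'processor\t: 0\nphysical id\t: 0\nprocessor\t: 1\nphysical id\t: 1\nprocessor\t: 2\nphysical id\t: 0\n' \
    | cpus_on_socket 0
```

For the made-up layout this prints "0,2". On a real host it would be run as "cpus_on_socket 1 < /proc/cpuinfo".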

It sounds like the sockets need their RESET signal to be delivered at the same time to have in-sync TSCs. It also sounds like the BIOS can ruin the sync by attempting to set the TSC register.

I noticed there is an Intel erratum for the whole E5 v2 line that says the TSC won't be reset by a warm reboot. If this is true, then if your TSCs ever go out of sync they would stay out of sync until you perform a cold boot.

thenickdude commented 3 years ago

This kernel bug report has a patch attached to try to better sync up the TSCs in this case, if firmware updates are not available to fix the core issue:

https://bugzilla.kernel.org/show_bug.cgi?id=202525

sej7278 commented 3 years ago

cpu pinning made some sort of difference (different crash points?). will try looking into tsc/kernel next. nothing in the bios relating to tsc.

Raikerian commented 2 years ago

I am also having troubles with Monterey on my VM, albeit the issue seems to be different:

[image: screenshot of the kernel panic]

Using QEMU 6.0.0 with these options:

  -enable-kvm
  -m 16G
  -cpu host,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor,+invtsc
  -machine q35,accel=kvm
  -usb -device usb-kbd -device usb-tablet
  -smp 12,cores=2,sockets=3,threads=2,maxcpus=12 # also ran with 4,cores=4,sockets=1,threads=1,maxcpus=4 but it is same problem
  -device usb-ehci,id=ehci
  -device nec-usb-xhci,id=xhci
  -global nec-usb-xhci.msi=off
  -device isa-applesmc,osk=..
  -drive if=pflash,format=raw,readonly=on,file=..
  -drive if=pflash,format=raw,file=..
  -smbios type=2
  -device ich9-intel-hda -device hda-duplex
  -device ich9-ahci,id=sata
  -device ide-hd,bus=sata.4,drive=MacHDD
  -drive id=MacHDD,if=none,file=..,format=qcow2
  -netdev "user,id=net0,hostfwd=tcp::443-:443,hostfwd=tcp::80-:80,hostfwd=tcp::22-:22"
  -device ..,netdev=net0,id=net0,mac=..
  -monitor stdio
  -device VGA,vgamem_mb=128
  -vnc ":0" -k "en-us"

Using CPU passthrough here, however I have also tried emulating Penryn and Skylake, but it had 0 effect.

Most of the time it ends in the kernel panic above (I had to turn on full debugging to get that message; otherwise it was never shown and macOS would usually just reboot around the SMC step). Sometimes it actually boots macOS, but then crashes almost immediately. Interestingly enough, I only have this problem when trying to boot on an Intel 3.0 GHz Core i7-4578U; it works perfectly fine on an Intel 2.6 GHz Core i5-4278U. However, the i7 machine boots Linux from an external drive connected through USB, which leads me to believe that this is potentially the issue. I tried playing around with the npci=0x2000 boot flag and setting ReleaseUsbOwnership to true, but neither helped.

Big Sur is working perfectly fine on both with the same OC and boot options however.

My OVMFs are from OSX-KVM. As for configs, tried different ones already including yours and OSX-KVM ones. Appreciate any help. Perhaps someone can point me in the direction I can look more into myself as well.

Raikerian commented 2 years ago

@thenickdude @sej7278 seems like my problem was also related to clocksource and I was able to solve it, so will leave my notes here, perhaps it will help someone.

With macOS Monterey on the Intel 3.0 GHz Core i7-4578U, most runs would crash with the error from my previous message, which is:

** In Memory Panic Stackshot Succeeded ** Bytes Traced 15707 (Uncompressed 48592) **
IOPolledInterface::startIO[0] 0xe00002c7
IOPolledFileWrite(0x0xffffff8546c5f480, 0x0, 0, 0x0) : IOStartPolledIO(0x0xffffff8546c5f480, kIOPolledWrite, 0, 0x24c806000, 8192) returned 0xe00002c7
IOPolledFileWrite(gIOPolledCoreFileVars, 0, 0x0, NULL) returned 0xe00002c7
progress_notify_stage_outproc (during forwarding) returned 0xe00002c7

Sometimes it would get stuck around the SMC step, and even more rarely it would fully boot into macOS but then almost immediately reboot. On one of the runs I saw a clock-related error before the IOPolledFileWrite panic. It appeared only rarely, but from later findings it seems this error and the IOPolledFileWrite panic actually came together (it was not always displayed because of how fast the log updates, and I didn't find a way to extract the boot log to a file since I could not boot into macOS completely). Both are related to the issue linked above: https://github.com/acidanthera/bugtracker/issues/1676.

So when I dug into the boot log on the host I found a similar problem to what you guys talked about above:

$ dmesg | grep -i -e tsc -e clocksource
[    0.000000] tsc: Detected 3200.000 MHz processor
[    0.000000] tsc: Detected 3199.980 MHz TSC
[    0.071145] TSC deadline timer available
[    0.071160] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[    0.589882] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x2e2036ff8d5, max_idle_ns: 440795275316 ns
[    0.625855] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    1.147854] clocksource: Switched to clocksource tsc-early
[    1.216141] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    2.928837] tsc: Refined TSC clocksource calibration: 3191.998 MHz
[    2.931728] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2e02c314322, max_idle_ns: 440795292111 ns
[    2.936036] clocksource: Switched to clocksource tsc
[   14.563926] clocksource: timekeeping watchdog on CPU7: Marking clocksource 'tsc' as unstable because the skew is too large:
[   14.568574] clocksource:                       'acpi_pm' wd_now: d82b24 wd_last: e42ddf mask: ffffff
[   14.572515] clocksource:                       'tsc' cs_now: 33b5415888 cs_last: 30635be285 mask: ffffffffffffffff
[   14.576891] tsc: Marking TSC unstable due to clocksource watchdog
[   14.579764] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[   14.590465] clocksource: Switched to clocksource acpi_pm
[  272.392417] kvm: SMP vm created on host with unstable TSC; guest TSC will not be reliable

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
acpi_pm

On the working hardware (Intel 2.6 GHz Core i5-4278U), the clocksource was always tsc. After some digging I found that newer kernel versions disable the clocksource watchdog during boot if the CPU supports the relevant TSC features (constant_tsc, nonstop_tsc and tsc_adjust, to be precise) and has 2 or fewer sockets, which is the case for my CPU. So adding tsc=reliable to the kernel boot options in grub seems to be a temporary workaround, which I am testing (this patch, which does something similar, also has some explanation). With that option tsc works fine and the host no longer switches to acpi_pm.
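To confirm after a reboot whether the host stayed on tsc or fell back to something else, a small helper along these lines could be used. This is only a sketch: the function name clocksource_report and its optional directory argument are made up for illustration, and it just reads the same sysfs files shown above.

```shell
# Print the host's current and available clocksources and say whether tsc is in use.
# The optional argument lets the function read a copy of the sysfs files;
# by default it reads the real sysfs location.
clocksource_report() {
    dir="${1:-/sys/devices/system/clocksource/clocksource0}"
    current=$(cat "$dir/current_clocksource")
    available=$(cat "$dir/available_clocksource")
    echo "current: $current"
    echo "available: $available"
    if [ "$current" = "tsc" ]; then
        echo "host is using tsc"
    else
        echo "host is not using tsc"
    fi
}

# Hypothetical usage on the host:
#   clocksource_report
```

If it reports anything other than tsc some time after boot, the kernel has most likely marked the TSC unstable again, and dmesg should say why.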

This solved the complete hangs, and now I was getting the IOPolledFileWrite panic with the clock sync problem 100% of the time. This is where the thread linked above helped, as it also has a great explanation of why this happens on Monterey: https://github.com/acidanthera/bugtracker/issues/1676#issuecomment-881884751 Just adding the new CpuTscSync kext didn't help, so I also updated the Lilu kext to the latest version and moved CpuTscSync higher in the list in the plist file (not sure if this is necessary though). Finally Monterey boots correctly and I haven't had any issues since.
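On the kext ordering: CpuTscSync is a Lilu plugin, and OpenCore loads kexts in the order they are listed, so Lilu needs to appear before it in config.plist. A rough way to check that is sketched below. The function name kext_order_ok is hypothetical, and the check is deliberately naive: it only compares the line numbers of the two BundlePath strings and does not look at whether the entries are enabled.

```shell
# Check that Lilu.kext is listed before CpuTscSync.kext in an OpenCore config.plist.
# Naive text match on the <string> entries; does not parse the plist or check "Enabled".
kext_order_ok() {
    plist="$1"
    lilu=$(grep -n '<string>Lilu\.kext</string>' "$plist" | head -n 1 | cut -d: -f1)
    tscsync=$(grep -n '<string>CpuTscSync\.kext</string>' "$plist" | head -n 1 | cut -d: -f1)
    if [ -z "$lilu" ] || [ -z "$tscsync" ]; then
        echo "missing"
        return 1
    fi
    if [ "$lilu" -lt "$tscsync" ]; then
        echo "ok"
    else
        echo "wrong-order"
        return 1
    fi
}

# Hypothetical usage (the path depends on where your EFI partition is mounted):
#   kext_order_ok /path/to/EFI/OC/config.plist
```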

A few extra notes:

thenickdude commented 2 years ago

Thanks for those details!

sej7278 commented 2 years ago

looks like my bios is buggy, as i tried the tsc=reliable kernel param and it booted as tsc then switched back to hpet!

# dmesg | grep -i -e tsc -e clocksource
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.15.0-2-amd64 root=UUID=2b82e6c4-442e-4700-a246-3b5c37a722b5 ro tsc=reliable quiet splash slab_common.usercopy_fallback=y intel_iommu=on iommu=pt
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2593.597 MHz processor
[    0.022989] TSC deadline timer available
[    0.023056] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.030289] Kernel command line: BOOT_IMAGE=/vmlinuz-5.15.0-2-amd64 root=UUID=2b82e6c4-442e-4700-a246-3b5c37a722b5 ro tsc=reliable quiet splash slab_common.usercopy_fallback=y intel_iommu=on iommu=pt
[    0.105033] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484882848 ns
[    0.125053] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x25629b71302, max_idle_ns: 440795293176 ns
[    0.375011] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.714187] clocksource: Switched to clocksource tsc-early
[    0.730363] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    1.797086] tsc: Refined TSC clocksource calibration: 2593.749 MHz
[    1.797114] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x25632b1b102, max_idle_ns: 440795335950 ns
[    1.797236] clocksource: Switched to clocksource tsc
[  123.639964] tsc: Marking TSC unstable due to KVM discovered backwards TSC
[  123.639981] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[  123.641194] clocksource: Checking clocksource tsc synchronization from CPU 11 to CPUs 0,12,16,18-19,22,30-31.
[  123.641194] clocksource:         CPUs 0,16,18-19,22 ahead of CPU 11 for clocksource tsc.
[  123.641194] clocksource:         CPUs 12,30-31 behind CPU 11 for clocksource tsc.
[  123.641194] clocksource:         CPU 11 check durations 1065ns - 3474ns for clocksource tsc.
[  123.641293] clocksource: Switched to clocksource hpet

# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
hpet

Raikerian commented 2 years ago

Make sure your CPU has the tsc_adjust feature, which seems to be the most important one for Monterey (according to the CpuTscSync readme):

cat /proc/cpuinfo | grep tsc_adjust

Also perhaps some methods from here can help as well https://aws.amazon.com/premiumsupport/knowledge-center/manage-ec2-linux-clock-source/
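To check all three TSC-related flags mentioned earlier in one go, something like the following could work. It is only a sketch: the function name tsc_flags_report is made up, and it assumes every core reports the same flags, so it only looks at the first flags line. It matches whole words, so constant_tsc is not confused with plain tsc.

```shell
# Report which TSC-related CPU flags are present in a cpuinfo file.
# Defaults to /proc/cpuinfo; only the first "flags" line is inspected.
tsc_flags_report() {
    cpuinfo="${1:-/proc/cpuinfo}"
    flags=$(grep -m 1 '^flags' "$cpuinfo")
    for flag in constant_tsc nonstop_tsc tsc_adjust; do
        # Pad with spaces so only whole flag names match.
        case " $flags " in
            *" $flag "*) echo "$flag: present" ;;
            *)           echo "$flag: missing" ;;
        esac
    done
}

# Hypothetical usage on the host:
#   tsc_flags_report
```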

sej7278 commented 2 years ago

i seem to only have this lot:

tsc
constant_tsc
nonstop_tsc
tsc_deadline_timer

Raphitpt commented 2 years ago

Well, I tried everything but without success, I think it comes from lenovo from what I could read, so I'm waiting for a fix

thenickdude commented 2 years ago

A user on Reddit reported that they had this issue and fixed it by fully shutting down their host and starting it again (warm reboots didn't fix it). Maybe a power cycle is required to resync the TSC between cores.

sej7278 commented 2 years ago

hmm maybe, i have given up on this given that my gt710 won't even work with monterey without complicated fudging, but i might see if i can give it another go once i've found all my notes!

to me the powercycle thing sounds a bit more like a vendor reset issue with the gfx card (which mine won't have), although i do have a similar problem with a qnap 2.5gbe card - it loses its flash settings or something when you just power off the pc, but pull the cable and it boots again fine afterwards!

ku1ik commented 2 years ago

Some data points from me:

I can install both Big Sur and Monterey without problem after host cold boot. After installation they boot fine when host has just been cold booted.

After host cold boot dmesg reports clocksource=tsc and things are fine. However, after some suspend/resume cycles of the host I can't boot macOS (neither Big Sur nor Monterey). Warm rebooting doesn't help, the host reports Marking clocksource 'tsc' as unstable in dmesg and /sys/devices/system/clocksource/clocksource0/current_clocksource returns hpet.

So it seems a combination of suspend/resume/reboot makes TSC unstable (I do a lot of suspending so not sure yet if just warm reboot without suspending makes it bad as well). Either way, when my system is in this unstable TSC condition I can install Big Sur just fine, although it won't boot after installation. Monterey won't even install.
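For anyone wanting to check whether their host is in this state before starting a VM, the dmesg check can be scripted roughly as follows. This is a sketch only: the function name tsc_unstable_in_log is made up, and it simply looks for the two kernel messages quoted in this thread, so other wordings would be missed.

```shell
# Read kernel log text on stdin and report whether the TSC was marked unstable.
# Matches only the two message forms quoted in this thread.
tsc_unstable_in_log() {
    if grep -q -e "Marking TSC unstable" -e "Marking clocksource 'tsc' as unstable"; then
        echo "unstable"
    else
        echo "no-unstable-marker"
    fi
}

# Hypothetical usage on the host (dmesg may need root):
#   dmesg | tsc_unstable_in_log
```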

I tried CPU pinning but no dice.

tsc_adjust is not present in /proc/cpuinfo for my CPU.


Motherboard: ASUS x570 Pro Creator
AGESA: 1.2.0.7
CPU: Ryzen 5700G

sej7278 commented 2 years ago

sounds like some sort of way of making the linux kernel (or qemu?) ignore the tsc problem and allow it to be assigned as a stable clocksource is the workaround we need.

thenickdude commented 2 years ago

No, that won't fix the problem since the guest will still observe the skewed TSC and panic.

The TSCs need to be resynced on the host.

sej7278 commented 2 years ago

Ah I see, assumed it was just kernel/qemu not passing it to the VM at all

ku1ik commented 1 year ago

Due to other issues I had with my PC I've replaced my Corsair Vengeance LPX DDR4 sticks with G.Skill Ripjaws V ones and suddenly I'm not experiencing any of the above issues anymore. I still see messages about unstable clocksource in dmesg output after warm reboots but they don't seem to affect the ability to boot macOS whatsoever. I can boot and reboot Big Sur now without a problem every single time. So I'll assume the clocksource thing was a red herring in my case.

sej7278 commented 1 year ago

Yeah bigsur is not a problem for me, just Monterey (or later?)

mchrostowski commented 1 year ago

I just successfully resolved this issue on my own system by updating my BIOS.

For me it was indeed a BIOS TSC sync issue combined with a lack of TSC Adjust on the affected CPU. I could see my Linux kernel avoiding TSC specifically because of this, while macOS would hit the non-monotonic TSC panic.

I didn't have to make any further adjustments to my BIOS and I went to stock settings from the downstream project I'm using to launch, Docker-OSX. No extra CPU flags for hypervisor or the like.

kszczek commented 1 year ago

Same issue on my Lenovo Legion laptop, although I don't expect to get a BIOS update with the fix - Lenovo seems to focus on the "Linux certified" laptops with their patches. Fortunately there is a workaround but it requires a bit of tinkering. The workaround consists of a set of kernel patches which implement an alternative method of TSC synchronization which fixes this issue on my system. More details in this Reddit post.