canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Windows VM or ISO boot freeze when starting #11999

Closed markrattray closed 1 year ago

markrattray commented 1 year ago

Required information

Issue description

Linux VMs, even desktop ones, are fine.

Freshly rebuilt and repurposed Dell R740xd host dedicated to LXD. Now running a couple of Ubuntu containers just fine.

Tried to deploy a Windows Server 2022 VM using an image imported from another server, but it freezes during startup.

Thinking that the image had been corrupted during transit, I tried to create a completely new image by booting off a new ISO downloaded directly to the site from Microsoft, but within a few seconds of the ISO booting it just freezes as well. I have also generated an ISO via distrobuilder, with the same freeze result.

Installed linux-image-5.15.0-76-generic on host and booted into it but no difference. This kernel version is currently working with another host and Windows VMs.

Steps to reproduce

  1. lxc init ws2022std-image-template --empty --vm -c limits.cpu=4 -c limits.memory=6GiB -c security.secureboot=false -d root,size=60GiB -d winiso,boot.priority=10,source=/iso/ws2022std-lxd.iso,type=disk
  2. lxc start ws2022std-image-template
  3. Open a VGA console window and press any key to get it to boot off the ISO
  4. The spinning circle of dots below the LXD logo on the UEFI boot screen will freeze
  5. lxc list will show the instance status as ERROR
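
When the instance ends up in an error state, its status and QEMU log are usually the first things worth attaching to a report. The following is a minimal sketch, not something posted in this thread: vm_diagnostics is a hypothetical helper name, and the LXC variable is an assumption added so the commands can be printed (LXC=echo) instead of executed.

```shell
#!/bin/sh
# Set LXC=echo to print the commands instead of running them (dry run).
LXC="${LXC:-lxc}"

# vm_diagnostics INSTANCE (hypothetical helper):
# print the name and state of the instance, then its info including the log.
vm_diagnostics() {
    instance="$1"
    "$LXC" list "$instance" -c ns --format csv
    "$LXC" info "$instance" --show-log
}

# Example, using the instance name from the steps above:
#   vm_diagnostics ws2022std-image-template
# The VGA console from step 3 can be opened with:
#   lxc console ws2022std-image-template --type=vga
```

Note that lxc list treats its argument as a name filter, so other instances whose names start with the same string may also be listed.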

Information to attach

markrattray commented 1 year ago

I have tried disabling tdp_mmu, and the ISO still quickly freezes during boot: https://github.com/canonical/lxd/issues/11520

The server does have a NIC bond enabled, which the macvlan interfaces are using, so I will try disabling that and see if it helps.

markrattray commented 1 year ago

Disabling the NIC bond didn't help.

markrattray commented 1 year ago

Well I've discovered that stipulating CPU Passthrough in raw.qemu gets around this issue and I'm actually able to get to the Install Windows screen, and all the way through to a working VM.

Here is some more info from the physical host:

somehost:~$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  80
  On-line CPU(s) list:   0-79
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
    CPU family:          6
    Model:               85
    Thread(s) per core:  2
    Core(s) per socket:  20
    Socket(s):           2
    Stepping:            4
    BogoMIPS:            4800.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear flush_l1d arch_capabilities
Virtualization features:
  Virtualization:        VT-x
Caches (sum of all):
  L1d:                   1.3 MiB (40 instances)
  L1i:                   1.3 MiB (40 instances)
  L2:                    40 MiB (40 instances)
  L3:                    55 MiB (2 instances)
NUMA:
  NUMA node(s):          2
  NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78
  NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79
Vulnerabilities:
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
  Mds:                   Mitigation; Clear CPU buffers; SMT vulnerable
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT vulnerable
  Retbleed:              Mitigation; IBRS
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; IBRS, IBPB conditional, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Mitigation; Clear CPU buffers; SMT vulnerable
markrattray commented 1 year ago

tdp_mmu disablement is still needed

tomponline commented 1 year ago

What did you set raw.qemu to to get it to work?

markrattray commented 1 year ago

Hi

Each WS2022 instance on the Dell R740xd with the Intel 6148 CPU requires raw.qemu: -cpu host to stop the ISO (original or distrobuilder) or a new instance from an image freezing a few seconds into boot. This isn't needed for the WS2019 ISO on this host: I'm able to deploy a WS2019 instance via ISO (distrobuilder) without it. Older hosts with the Intel E5-2680 v2 do not need this for WS2022 instances.

For any of our physical hosts so far, the Intel PET flag needs to be disabled to stop the frequent random freezing... using the modprobe method: https://pve.proxmox.com/wiki/Upgrade_from_6.x_to_7.0#Older_Hardware_and_New_5.15_Kernel Do you think it's possible to set this per instance rather than on the host using raw.qemu? I've not been able to find anything, and what I've tried was incorrect and the VM wouldn't start. I don't know whether this is needed for WS2019 because I'm doing a clean migration to new WS2022 instances, with new domains and conventions.

The Proxmox forum has a lot more posts on WS2022, and your link to one of their posts via another GitHub LXD issue led me to the tdp_mmu workaround.

tomponline commented 1 year ago

What do you mean "per-instance", as raw.qemu is a per-instance setting?

markrattray commented 1 year ago

In the instance config.

tomponline commented 1 year ago

In the instance config.

Yes you can do that using lxc config set <instance> raw.qemu=...
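
As a concrete illustration of the command above, here is a minimal sketch that sets the value this thread found useful and reads it back. apply_cpu_host is a hypothetical helper name, and the LXC variable is an assumption added so the commands can be printed (LXC=echo) rather than executed.

```shell
#!/bin/sh
# Set LXC=echo to print the commands instead of running them (dry run).
LXC="${LXC:-lxc}"

# apply_cpu_host INSTANCE (hypothetical helper):
# set raw.qemu to "-cpu host" on one instance, then print the stored value.
apply_cpu_host() {
    instance="$1"
    "$LXC" config set "$instance" raw.qemu="-cpu host" &&
    "$LXC" config get "$instance" raw.qemu
}

# Example, using the instance name from the reproduction steps:
#   apply_cpu_host ws2022std-image-template
```

Since raw.qemu is extra arguments for the QEMU command line, a running VM most likely needs to be stopped and started again before the change has any effect.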

markrattray commented 1 year ago

sorry my question was whether the tdp_mmu disablement could be done via raw.qemu as well. I did try: raw.qemu: -cpu host,kvm.tdp_mmu=N also: raw.qemu: -cpu host,kvm.tdp_mmu=off but they would not start with the kvm.tdp_mmu key.

It looks like this tdp_mmu issue will be fixed in kernel 6.2 according to this: https://gitlab.com/qemu-project/qemu/-/issues/1198 but there is more to it, given that I also need raw.qemu: -cpu host on another processor.

tomponline commented 1 year ago

My understanding is that LXD uses the equivalent of -cpu host by default anyway, so it's odd you're needing to pass it explicitly. I suspect it's one of the extensions that is being disabled by doing that.

Can you try running lxc config set <instance> migration.stateful=true and see if that helps? That disables some of the extensions.

Possibly hv_passthrough or topoext
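
To test this suggestion in isolation, the explicit raw.qemu override can be removed first so that only migration.stateful differs from the default. This is a rough sketch under assumptions: use_stateful_flag is a hypothetical helper name, the instance is assumed to be stopped already, and the LXC variable exists only to allow a dry run with LXC=echo.

```shell
#!/bin/sh
# Set LXC=echo to print the commands instead of running them (dry run).
LXC="${LXC:-lxc}"

# use_stateful_flag INSTANCE (hypothetical helper, instance assumed stopped):
# drop any raw.qemu override, enable migration.stateful, then start the VM.
use_stateful_flag() {
    instance="$1"
    "$LXC" config unset "$instance" raw.qemu &&
    "$LXC" config set "$instance" migration.stateful=true &&
    "$LXC" start "$instance"
}

# Example:
#   use_stateful_flag ws2022std-image-template
```

migration.stateful may have other effects on the instance beyond the CPU extensions discussed here, so it is worth checking the LXD documentation before relying on it long term.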

monstermunchkin commented 1 year ago

sorry my question was whether the tdp_mmu disablement could be done via raw.qemu as well. I did try: raw.qemu: -cpu host,kvm.tdp_mmu=N also: raw.qemu: -cpu host,kvm.tdp_mmu=off but they would not start with the kvm.tdp_mmu key.

LXD currently doesn't support custom CPU flags.

tomponline commented 1 year ago

sorry my question was whether the tdp_mmu disablement could be done via raw.qemu as well. I did try: raw.qemu: -cpu host,kvm.tdp_mmu=N also: raw.qemu: -cpu host,kvm.tdp_mmu=off but they would not start with the kvm.tdp_mmu key.

LXD currently doesn't support custom CPU flags.

Well only by raw.qemu ;)

monstermunchkin commented 1 year ago

Well only by raw.qemu ;)

Since the raw.qemu flags are appended, will raw.qemu="-cpu host,kvm.tdp_mmu=N" override the fixed -cpu host flag? I don't know how QEMU handles duplicate flags.

markrattray commented 1 year ago

Trying your suggestions @tomponline

markrattray commented 1 year ago

@tomponline already better with the ISO. Got to the first screen where you choose the language and input. Couldn't get this far without that -cpu host setting. Now installing as well.

markrattray commented 1 year ago

@tomponline that worked, WS2022 desktop installed and now running Windows Update to put some stress on it.

I will try an existing lxd image which has been working since finding raw.qemu: -cpu host.

markrattray commented 1 year ago

@monstermunchkin I'm only setting raw.qemu: -cpu host to get the WS2022 ISO/instances to boot, without it they will freeze a few seconds into the boot process on this CPU. The kvm.tdp_mmu append was an experiment to see whether I could disable the feature/flag at the instance level instead of the host level. I think I'm way off but it was worth a try.

@tomponline all seems fine with the setting: migration.stateful=true on both ISO and instance.

markrattray commented 1 year ago

I think I'm beginning to understand what's going on.

Apparently the tdp_mmu feature is Intel EPT, not PET as I had mixed it up. EPT (lowercase) is listed twice in the Flags section of the lscpu output above (as ept and ept_ad).

So I tried setting raw.qemu: -cpu host,ept=off and got this when trying to start the VM:

qemu-system-x86_64: can't apply global host-x86_64-cpu.ept=off: Property 'host-x86_64-cpu.ept' not found

So host-x86_64-cpu is the QEMU CPU, and by setting CPU passthrough (-cpu host) Windows is getting the correct set of flags. Unfortunately this does not solve the EPT feature crashes that come later, when the VM is up. I gather that what Thomas suggested disables a set of QEMU CPU flags (properties) that are the problem in this scenario, and allows the VM to boot as well.

For the tdp_mmu / EPT feature, this is trickier because I am now passing the host CPU flags, which contain the EPT ones, and would need the QEMU CPU to disable these features. TBH the kernel 6.2 option seems to be the best one for this because the current 5.15 and 5.19 kernels don't help: https://gitlab.com/qemu-project/qemu/-/issues/1198
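
To see which EPT-related flags a host advertises and which kernel it is running, something along these lines can be used. It is only a sketch: ept_flags is a hypothetical helper name, and it assumes the usual /proc/cpuinfo layout where each CPU has a line beginning with "flags".

```shell
#!/bin/sh
# ept_flags (hypothetical helper): read cpuinfo-style text on stdin and
# print the unique CPU flags whose names start with "ept", one per line.
ept_flags() {
    grep '^flags' | tr ' \t' '\n\n' | grep '^ept' | sort -u
}

# Typical use on the host:
#   uname -r                    # running kernel, to compare against 6.2
#   ept_flags < /proc/cpuinfo   # EPT-related flags reported by the CPU
```

Only lines starting with "flags" are read, so a separate "vmx flags" line, where the kernel provides one, is ignored.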

tomponline commented 1 year ago

Is there anything left to do on this issue or can it be closed now?

markrattray commented 1 year ago

Hi

Well I'm not certain sorry...

On the one hand I see that you good dev folks have the impression that -cpu host is being passed through, but perhaps it's not, because on this CPU I cannot get WS2022 to boot without specifying it or your migration.stateful suggestion in the instance config. On the other, I see that these could be QEMU and kernel bugs and need to be dealt with upstream.

Depends really on whether you want to lift some stones to see whether this will come back to bite you somewhere else more important.... e.g. with a big customer.

For me I have 2 workarounds, as we've discovered so my need is catered for. Up to you.

Hope you all have a great weekend !

tomponline commented 1 year ago

Yes it is passed through here:

https://github.com/canonical/lxd/blob/main/lxd/instance/drivers/driver_qemu.go#L1396-L1413

Given that setting migration.stateful fixes it, it's likely to be this one that was causing problems:

https://github.com/canonical/lxd/blob/main/lxd/instance/drivers/driver_qemu.go#L1401

markrattray commented 1 year ago

Thanks for the info. So if I understand correctly, the migration.stateful feature only disables the hv_time flag?

Regardless now, this can be closed.

tomponline commented 1 year ago

Yes that's correct, so perhaps you manually passing the host CPU flag also wiped it out.

tomponline commented 1 year ago

I've got similar issues with running Windows on my CPU (Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz)

I found this helped:

echo N | sudo tee /sys/module/kvm/parameters/tdp_mmu
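
The command above writes to sysfs, so the setting does not survive a reboot. Below is a small sketch for checking the current value, with the persistent variant shown in comments. tdp_mmu_state is a hypothetical helper name, its optional path argument exists only to make it easy to test, and the modprobe.d file name is the one used in the Proxmox wiki article linked earlier in this thread.

```shell
#!/bin/sh
# tdp_mmu_state [PATH] (hypothetical helper):
# print the current value of the KVM tdp_mmu parameter, or "unavailable"
# when the file cannot be read (kvm not loaded, or no such parameter).
tdp_mmu_state() {
    param_file="${1:-/sys/module/kvm/parameters/tdp_mmu}"
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        echo "unavailable"
    fi
}

# Check the host:
#   tdp_mmu_state
# Persistent setting via modprobe options (applied when the kvm module loads):
#   echo "options kvm tdp_mmu=N" | sudo tee /etc/modprobe.d/kvm-disable-tdp-mmu.conf
```

A reboot, or reloading the kvm modules with no VMs running, is the safest way to be sure the persistent option has been picked up.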
markrattray commented 1 year ago

I've got similar issues with running Windows on my CPU (Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz)

I found this helped:

echo N | sudo tee /sys/module/kvm/parameters/tdp_mmu

Good afternoon. Thanks for the info. Just repeating to collate if someone's reading just this part:

  1. All my LXD servers need TDP disabled to stop WS2022 random freezes when up and running. This is required regardless of #2 below.
  2. The Dell R740xd with dual Intel 6148 CPUs requires raw.qemu: -cpu host or your instance config suggestion migration.stateful to get a WS2022 ISO or instance to boot.
tomponline commented 1 year ago

Agreed, this is what I have to use too.

markrattray commented 1 year ago

Good morning @tomponline

I see that kernel 6.2 was released to Ubuntu 22.04 yesterday, which might make this TDP workaround redundant. I cannot test in the next week because I'm taking a break, but I plan to try thereafter.

Hope you have a great weekend!

markrattray commented 1 year ago

Good morning @tomponline

On all LXD hosts, I have upgraded to kernel 6.2 via 22.04 HWE and re-enabled TDP. It's only been a day so a bit soon to tell although normally we'd see some WS2022 VMs fatally stopping.

CPU flags are still an issue on the Dell R740xd host with the Intel 6148 CPU, so one of the following instance configs is still required: