xcp-ng / xcp

Entry point for issues and wiki. Also contains some scripts and sources.
https://xcp-ng.org

Nested Virtualisation (XCP-ng hosting other virtualized hypervisors) doesn't work with Hyper-V #105

Open lurendrejer opened 5 years ago

lurendrejer commented 5 years ago

I've been fiddling with nestedHVM and HAP via cli for quite some time, without much luck. Thinking that I was the problem, I told no one.

Now that Xen Orchestra supports these options (enable nested virtualisation) and it still doesn't work, I'm starting to think that something might be up with XCP-ng.

XCP-ng 7.6; CPU: X5670 (supports SLAT); hardware platform: HP BL640c G7

Installing Hyper-V and creating a new VM gives me the following error:

[image]

I've tried different CPUs, XCP-ng 7.5, and different VMs, without any luck. A nested VMware gives the same error (unable to start VMs) and just freezes up.

olivierlambert commented 5 years ago

Please double check that your BIOS has all the virt options enabled.

lurendrejer commented 5 years ago

Just did, thank you. Everything is enabled. Virtual extensions, VT-D, etc.

[image]

olivierlambert commented 5 years ago

Does XCP-ng display an error message when you try to install it in a nested VM? (It should tell you on the installer screen whether it can use HVM.)

Also, double check you are using (in the nested VM) the same amount of RAM (dynamic min = dynamic max = static max)
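
For reference, a minimal sketch of how the memory limits could be pinned to a single value from the host's command line. It assumes the `xe` CLI on the XCP-ng host and a halted VM. The function name `set_fixed_memory` is hypothetical, and setting static-min to the same value is a simplification that goes slightly beyond the advice above.

```shell
# Sketch: pin all four memory limits of a VM to one value so that
# dynamic min = dynamic max = static max.
# Assumption: runs in dom0 where `xe` is available, with the VM halted.
set_fixed_memory() {
    vm_uuid=$1   # UUID of the VM that will host the nested hypervisor
    size=$2      # memory size in bytes, e.g. 8589934592 for 8 GiB
    xe vm-memory-limits-set uuid="$vm_uuid" \
        static-min="$size" dynamic-min="$size" \
        dynamic-max="$size" static-max="$size"
}
```

It could be called as `set_fixed_memory <vm-uuid> 8589934592`, with the UUID taken from `xe vm-list`.
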

lurendrejer commented 5 years ago

Will try XCP tomorrow. Thank you.

Ram is static.

/Mobile

lurendrejer commented 5 years ago

Argh, fudge... I made a quick test: I can't create new VMs before my 7.5 to 7.6 upgrade is done.

I'll see if I can get'r'done tomorrow.

I tried starting the installer on an existing VM. It got to the point where I select the install disk, but the disk doesn't have enough space (32 GB).

Sorry.

lurendrejer commented 5 years ago

Well, since we don't have mission-critical servers like most other companies, I went ahead and upgraded the whole pool via HTTP. Everything is working, and I'm installing the XCP-ng VM now.

lurendrejer commented 5 years ago

Well, it works with XCP-ng. Which is nice, but since we strive to educate our students in multiple hypervisors - this is a problem.

lurendrejer commented 5 years ago

And another thing: why the heck doesn't xcp-ng come with xcp-tools preinstalled? :) Networking only works with Intel e1000 emulation; emulated Realtek cards stay disconnected after bootup.

olivierlambert commented 5 years ago

  1. I can't answer why Hyper-V and VMware cannot work in a nested XCP-ng situation. This is outside my domain of expertise. Does KVM work?
  2. No VM Tools in XCP-ng: because… it's meant to run on bare metal and not in a VM? :wink:

lurendrejer commented 5 years ago

It would seem that the Hyper-V role tries to update the CPU's microcode. It works with VMware's virtual hardware version 11, but not 10, so I guess this could be a Xen/QEMU problem or a hardware issue.

// https://communities.vmware.com/thread/525611

Quote: "One interesting thing to note is that the log file indicates that Windows 2016 tried to update your CPU's microcode patch level from 0x70b to 0x710. ESX does not allow a virtual machine to update the microcode of the CPU. I'm speculating here, but it's possible that Hyper-V will not start the hypervisor in the presence of an erratum fixed by microcode patch 0x710 (erratum BT248)."

olivierlambert commented 5 years ago

Yes, this is probably the reason. If you want more insights, you can always post on the Xen mailing list, because it's a very specific and "low level" Xen question :)

lurendrejer commented 5 years ago

Does this make any sense to you when it comes to XCP? https://wiki.xenproject.org/wiki/Nested_Virtualization_in_Xen

As far as I can see, commit 58f5bcaf solves the problem.

olivierlambert commented 5 years ago

  1. As explained in the doc, you can't use an L1 guest in PV mode; you must use HVM (as I expected).
  2. To boot Hyper-V or VMware, it seems you need to mask the CPU, e.g. cpuid = ['0x1:ecx=0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx']. But this is a Xen setting; IDK if you can pass it via XAPI in XCP-ng.

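
To make the mask above easier to read, here is a small sketch that lists which bits a mask string forces. It assumes the xend-style cpuid syntax described in the xl.cfg documentation: 32 characters, most significant bit first, where `0` or `1` forces a bit and `x` leaves it alone. The helper name `forced_bits` is hypothetical.

```shell
# Sketch: print "bit=value" for every bit that a 32-character mask
# forces to 0 or 1. The leftmost character is bit 31, the rightmost bit 0.
# Assumption: xend-style mask syntax (0/1 force a bit, other characters
# such as x leave the bit untouched).
forced_bits() {
    mask=$1
    if [ "${#mask}" -ne 32 ]; then
        echo "expected 32 characters, got ${#mask}" >&2
        return 1
    fi
    bit=31
    while [ -n "$mask" ]; do
        rest=${mask#?}          # mask without its first character
        char=${mask%"$rest"}    # the first character on its own
        case $char in
            0|1) echo "$bit=$char" ;;
        esac
        mask=$rest
        bit=$((bit - 1))
    done
}
```

Under that assumption, a mask made of one `0` followed by 31 `x` characters only clears bit 31 of ECX for CPUID leaf 0x1, which is the "hypervisor present" bit that guests use to detect that they are virtualised.
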
lurendrejer commented 5 years ago

Oh, I'll give it a whirl. You are right, this must be an L1 setting; I misunderstood. There was some change regarding forced pool CPU masking around 6.5-7.0, IIRC.

lurendrejer commented 5 years ago

This is a poopshow, I'm giving up. I've been banging my head against the wall with xe param-set, XAPI and, last but not least, poorly formatted XCP template XML export files.

In my struggle to set the CPU mask, I noticed that XOA sets the exp-nested-hvm flag when you change nested virtualisation, not HAP and NestedHVM. Why is that?

Both seem to work, but I just figured that the exp-nested-hvm flag would be deprecated once the feature is no longer experimental. :)

olivierlambert commented 5 years ago

This is a long story. You can find "guides" for Xen (the hypervisor), but XS/XCP-ng aren't just Xen; they are the whole stack built around it. That's why I said I have no idea how to pass cpuid. If it's not exposed in XAPI, then you are doomed to dig for days to find out how to do so.

lurendrejer commented 5 years ago

Hi, and thank you.

Before the feature was added to XOA, I tested with xe param-set. Adding NestedHVM and HAP worked just like the Exp-nested flag. I just figured that EXP-nested would be removed from Xen one day.

/edit What I was trying to say is: NestedHVM and HAP can be set via Xapi :)
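
As an illustration of setting such a flag from the command line, a minimal sketch using the `exp-nested-hvm` platform key that appears later in this thread. The function names are hypothetical, and the commands assume the `xe` CLI on the XCP-ng host; the exact keys used for NestedHVM and HAP are not given here, so they are left out.

```shell
# Sketch: toggle and read back the nested-virtualisation platform flag.
# Assumption: runs in dom0 where `xe` is available; the key name
# exp-nested-hvm is taken from the platform output quoted in this thread.
enable_nested_hvm() {
    xe vm-param-set uuid="$1" platform:exp-nested-hvm=true
}

get_nested_hvm() {
    xe vm-param-get uuid="$1" param-name=platform param-key=exp-nested-hvm
}
```

The change would normally only take effect the next time the VM is started.
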

lurendrejer commented 5 years ago

And I think, that someone more skilled than me in the art of XML - would be able to create a XCP-template with the CPU-mask included.

olivierlambert commented 5 years ago

Maybe there is an option in the "other VM params" of the VM object that's read by Xen. Try digging on Google.

lurendrejer commented 5 years ago

CPUID can be added anywhere, without any sign of verification that it was the right place. The following doesn't work, and I could keep trying more or less forever, since Xen just accepts any given parameter. :)

xe vm-param-list uuid=3ea123c5-f36f-93f3-6979-802cc57c6dcd|grep cpuid

platform (MRW): timeoffset: 0; exp-nested-hvm: true; cpuid: ['0x1:ecx=0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx']; videoram: 8; hpet: true; device-model: qemu-upstream-compat; apic: true; device_id: 0002; cores-per-socket: 2; pae: true; vga: std; nx: true; viridian_time_ref_count: true; viridian: true; acpi: 1; viridian_reference_tsc: true
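
A small sketch for pulling a single key out of a platform line like the one above. It only confirms that the key is stored in the platform map; as noted, it says nothing about whether Xen actually consumes it. The function name `platform_get` is hypothetical, and the parsing assumes the `key: value; key: value` layout shown above.

```shell
# Sketch: read a "platform (MRW): k: v; k: v" line on stdin and print
# the value stored for one key (prints nothing if the key is absent).
# Assumption: values contain no ";" characters.
platform_get() {
    key=$1
    sed 's/^[[:space:]]*platform ([A-Z]*):[[:space:]]*//' |  # drop the label
        tr ';' '\n' |                                        # one entry per line
        sed 's/^[[:space:]]*//' |                            # trim leading blanks
        awk -v k="$key" '{
            p = k ": "
            if (index($0, p) == 1) { print substr($0, length(p) + 1); exit }
        }'
}
```

For example, `xe vm-param-list uuid=<vm-uuid> | grep cpuid | platform_get cpuid` would print the stored cpuid entry, if any.
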

lurendrejer commented 5 years ago

Maybe something you should look into: the article at https://wiki.xenproject.org/wiki/Nested_Virtualization_in_Xen states that HAP should be set to 1 to avoid poor performance on the nested hypervisor.

I don't know if the flag you set in XO (exp-nested-hvm) sets both of these options, and I wouldn't know how to test it either.

olivierlambert commented 5 years ago

Again, Xen is the engine. We don't really have control over what's passed to the hypervisor except via XAPI. Using the nested flag should enable everything in the hypervisor parameters on VM boot.

tukusejssirs commented 3 years ago

Sorry to bump an old issue, but it seems to me that, because of the nested virtualisation issues, WSL2 on Win 10 does not work (WSL1 works).

I have enabled nested virtualisation on the VM, and virtualisation is also enabled in the BIOS. When I try, for example, to import a distro as WSL2, I get the following error (the same thing happens when I try to install a fresh distro image):

> wsl --import centos_7_v2 "$env:userprofile\centos_7_v2"
  "$env:userprofile\Desktop\centos_7.tar" --version 2
Please enable the Virtual Machine Platform Windows feature and ensure virtualization
  is enabled in the BIOS.
For information please visit https://aka.ms/wsl2-install

Also note that I have enabled all required Windows features, even Hyper-V, but that error still persists.

Is there anything I could do in order to make WSL2 work? Thanks. :smiley:

michael-newsrx commented 3 years ago

I also am trying to figure out how to get wsl2 working in a windows xen guest.

Any updates?

michael-newsrx commented 3 years ago

OK, the VMs get ACPI devices, but Hyper-V requires an APIC and not an ACPI...

michael-newsrx commented 3 years ago

See: https://docs.microsoft.com/en-us/troubleshoot/windows-server/virtualization/vmbus-device-not-load

tukusejssirs commented 3 years ago

Thanks, @michael-newsrx!

Thanks to your link, I found this website, which says I can run bcdedit /set detecthal true (as an admin) to enable HAL detection.

Note that the website I linked above states (implicitly) it should not work on Win 10, however it worked on Win 10 x64 2004 Build 19041.746.

Also note that the Microsoft documentation navigates us using msconfig, but I have no Detect HAL option in there.

Update: Could this setting be set in XCP VM configuration? It’d be awesome! :wink:

michael-newsrx commented 3 years ago

Hrm.. I ran the bcdedit command, rebooted, but it still shows ACPI and not APIC ?

Do I need to do something to the VM's configuration via xe?

tukusejssirs commented 3 years ago

Actually, I don't have an APIC entry in devmgmt.msc either; see below.

Anyway, I have just noticed that there is a warning emblem on Microsoft Hyper-V Virtual Machine Bus Provider with the following warning:

Windows cannot initialise the device driver for this hardware. (Code 37)

The request is not supported.

[image: xcp_hal_acpi_acip_wsl2]

I have no idea how I could solve this.

On the other hand, I successfully ran the wsl --set-default-version 2 command. I also created a test VM in Hyper-V Manager (with no vHDD, and a virtual DVD drive with the Win 10 installation ISO inserted) and tried to start it, unsuccessfully. It showed a warning (a different one from the one in the OP); see below. I think the issue is the same as the one in devmgmt.msc.

[image: hyper_v_error]

Update:

Similar issue for VMware: https://communities.vmware.com/t5/VMware-Fusion-Discussions/Driver-conflict-MS-Hyper-V-Virtual-Machine-Bus-Provider/td-p/939114

BTW, I’ve just tried to uninstall Microsoft Hyper-V Virtual Machine Bus Provider and scanned for hardware changes in devmgmt.msc and now it disappeared and did not re-appear.

Then I restarted the VM, removed Hyper-V from system features, restarted the VM, installed Hyper-V into system features, restarted the VM. It didn’t help to solve the issue.

Also, based on a comment from @lurendrejer, I double-checked if I have set apic: true in platform. It is set to true.

michael-newsrx commented 1 year ago

Any updates on this?

aleplu83 commented 1 year ago

Hi All

I'm also struggling to get Hyper-V nested in Xen, but I get this message:

[image]

This is my Xen VM configuration:

boot='cd'
type = "hvm"
cpuid = ['0x1:ecx=0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx']

name = "nlentsv1001"
bios="ovmf"
firmware="uefi"
device_model_version="qemu-xen"

memory = 8192
pae=1
hap=1
vcpus = 4
hdtype="ahci"
acpi=1
nx=1
nestedhvm=1
xen_platform_pci=1
viridian=1
apic=1
hpet=1

vif = [ 'type=vif,bridge=blan,mac=00:16:3e:0f:bb:11' ]

disk = [ 'phy:/dev/vmstor/nlentsv1001,xvda,rw','file:/mnt/iso/20348.169.210806-2348.fe_release_svc_refresh_SERVER_EVAL_x64FRE_en-us.iso,hdc:cdrom,r','file:/mnt/iso/winpvguest.iso,hdd:cdrom,r' ]

vnc = 1
vnclisten = '0.0.0.0'
vncdisplay = '2'
keymap = 'fr-be'

usbdevice='tablet'

on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'

In the Event Viewer I get this error:

[image]

olivierlambert commented 1 year ago

Hi,

XCP-ng isn't just Xen. Anyway, I think we can consider HyperV nested in Xen broken at the moment. Hopefully, someone or some company will invest into it next year. We'll continue to monitor the situation.

michael-newsrx commented 5 months ago

It's been more than a year, has any progress been reported upstream on this?

It is starting to make XCP-ng look like a dead project for Windows guests as I understand M$ continues to incorporate more and more into their hypervisor component as a requirement. We are reaching the point in the small office I work in that we will need to try and find an alternative solution as we can't run WSL2 or Docker in Windows 11 or Windows 10.

Any hope?

olivierlambert commented 5 months ago

It's not really nice to hear we are looking like a dead project :disappointed:

This part is a responsibility of the upstream (Xen), which we are contributing to, sure, but we are not alone. Also, this is far from being a simple "problem"/limitation to fix. Feel free to ask and push for your requirement at https://matrix.to/#/#XenProject:matrix.org

michael-newsrx commented 5 months ago

I understand the feeling about looking like a "dead project", but, (IMHO) the stack is losing usability and relevance because of this. Sorry.

FYI: Really not interested in signing up for another online service. I was only wanting to know if anything had been reported or noticed from upstream.

Based on your response, I assume the issue isn't really on any contributor's radar and probably won't be anytime soon.

May you and yours enjoy prosperity.

ddelnano commented 5 months ago

Note: I'm not a Xen maintainer but I did inquire about the status of nested virt plans at the beginning of Feb in Matrix (thread).

The existing nested virt implementation is described as experimental. The maintainers are currently working on fixing this (Nested Virt Revamp effort). They are starting with revamping AMD support and will later work on the Intel implementation. In terms of time line, maintainers mentioned that it would likely take 2 years. That 2 year time frame includes what it would take to productionize the new solution (includes testing time and addressing bugfixes).

From my anecdotal evidence, it also seems that those involved with Qubes OS are very keen on these enhancements, since Windows nested virt support would benefit their project greatly.

So upstream is working on it and understands that this is an important feature to users from what I can tell.

michael-newsrx commented 4 weeks ago

The inability to run WSL2 on Windows guest machines is becoming more and more of an issue.

Is there any type of kludge, registry setting, other, that can be done while waiting on this long term fix to get nested virtualization working in Windows? It seems to work for other OS's?

olivierlambert commented 4 weeks ago

There is no shortcut, but it's an active topic in the Xen community. Everybody is aware that this issue is becoming more pressing. We'll continue to monitor the situation and contribute where we can.