Azure / AzureStackHCI-EvalGuide

Welcome to the Azure Stack HCI Evaluation Guide!
Creative Commons Attribution 4.0 International

Indicate that Physical deployment is ONLY for Intel CPUs #18

Closed AnilDesai closed 4 years ago

AnilDesai commented 4 years ago

I created my entire management infrastructure using the guides (which are very well written - both GUI and PowerShell), before I learned that nested virtualization is not supported on AMD CPUs. It's by far the most requested feature on User Voice (https://windowsserver.uservoice.com/forums/295047-general-feedback/suggestions/31734808-nested-virtualization-for-amd-epyc-and-ryzen), and I'm sure this will be a frequent issue for testers. The official documentation that states this is at https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/user-guide/nested-virtualization. There may be other issues that users will run into, so referencing that document might be useful.

I'd recommend adding the restriction information to the very beginning of the physical deployment guide (along with the discussion of hardware memory requirements and other caveats). Depending on steps taken, this could waste many hours of IT Pros' time.

The good (or at least, better) news is that support has reportedly been added in recent preview builds (I haven't checked this myself). As of right now, though, AMD-based users can't take advantage of any of the "physical" deployment options. Had I known, I would have considered setting up an Azure test environment instead.

mattmcspirit commented 4 years ago

Hi Anil,

My apologies - I'll take care of this right now. You are correct, support for Nested Virtualization with AMD CPUs has been added to Windows 10 Insider builds, as per here: https://techcommunity.microsoft.com/t5/virtualization/amd-nested-virtualization-support/ba-p/1434841, however I don't believe this is in the Windows Server Insider builds at this time.

I'll update the guide to reflect this additional option.

Apologies again.

AnilDesai commented 4 years ago

Thanks, Matt - especially for responding within minutes! If I can help by providing error messages or any other testing, please feel free to let me know! If anyone's interested, it is possible to set up the management infrastructure (DC01 and MGMT01) for testing purposes using the great GUI and PowerShell guides.

mattmcspirit commented 4 years ago

One thing to note, you only need nested if you want to start VMs on your Azure Stack HCI hosts, otherwise you should still be able to deploy the Azure Stack HCI nodes, and enable Hyper-V with the PowerShell command (I think!) however I'm not 100% sure what will happen when you run this:

Set-VMProcessor -VMName <VMName> -ExposeVirtualizationExtensions $true

on an AMD system, so if you could validate that, that would be great! If that command works, and the Hyper-V role enables correctly, Windows Admin Center should allow you to build the cluster. You just wouldn't be able to deploy a VM onto your nested nodes as the hypervisor wouldn't run. I'm speculating a bit here, so apologies if that doesn't work - I don't have an AMD system to test on unfortunately.

I'm about to PR the edit to the changes - apologies again, and thanks for your understanding.
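For anyone who wants to check this before running the command, here is a minimal sketch. It assumes the Hyper-V PowerShell module is installed on the physical host; `Test-IsAmdCpu` and `Get-NestedVirtualizationState` are hypothetical helper names used only for illustration and are not part of the guide.

```powershell
# Hypothetical helper (not from the guide): returns $true when a CPU
# manufacturer string (for example "AuthenticAMD") identifies an AMD processor.
function Test-IsAmdCpu {
    param([Parameter(Mandatory)][string]$Manufacturer)
    return $Manufacturer -match 'AMD'
}

# Hypothetical helper (not from the guide): shows whether virtualization
# extensions are currently exposed to the named VM. Run on the physical host.
function Get-NestedVirtualizationState {
    param([Parameter(Mandatory)][string]$VMName)
    Get-VMProcessor -VMName $VMName |
        Select-Object VMName, ExposeVirtualizationExtensions
}
```

On the physical host, passing `(Get-CimInstance -ClassName Win32_Processor | Select-Object -First 1).Manufacturer` to `Test-IsAmdCpu` indicates whether the AMD caveat applies, and `Get-NestedVirtualizationState -VMName <VMName>` shows the current setting for a node.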

AnilDesai commented 4 years ago

Matt,

Thanks again for your quick response. Actually, the specific line you mentioned is the first one that failed for me. I was starting to set up the nodes and was able to create the VMs (I had DC01 and MGMT01 already running fine). I forgot the exact error message that appeared, but it is in the documentation somewhere (about being unable to enable virtualization extensions). I’m running on a 3rd Generation AMD Ryzen 5-3600 with 32GB of RAM. I think the issue is lack of support for AMD-V extensions (vs. Intel’s VT).

I’d be happy to send you a SystemInfo, a screenshot, or the text of the complete message if it would help. I still have my test environment set up to that step. I have done a lot of writing and speaking about Hyper-V and Azure (and I was a 14-time MVP, most recently in the Cloud/Datacenter area), so I’d be happy to help!

Thanks again!

From: Matt McSpirit notifications@github.com Sent: Monday, July 27, 2020 10:18 PM To: Azure/AzureStackHCI-EvalGuide AzureStackHCI-EvalGuide@noreply.github.com Cc: Anil Desai Anil@anildesai.net; Author author@noreply.github.com Subject: Re: [Azure/AzureStackHCI-EvalGuide] Indicate that Physical deployment is ONLY for Intel CPUs (#18)


mattmcspirit commented 4 years ago

Hey - I thought it might fail. However, if you skip or comment out that line, were you able to enable Hyper-V at this step?

https://github.com/Azure/AzureStackHCI-EvalGuide/blob/main/nested/steps/3b_AzSHCINodesPS.md#enable-the-hyper-v-role-on-your-azure-stack-hci-node

I'm curious if you skip the extensions, and successfully enable Hyper-V, whether WAC will allow you to start the cluster creation. The section where it checks for "Hyper-V enabled" is right at the start of the cluster creation wizard. In the past, I've seen a message that the Virtualization extensions aren't enabled on a node, so it wouldn't allow you to continue, but with the Hyper-V role already enabled, it may allow you to proceed.

Could you see if that works?

Also, if it doesn't, it may work for Gen1 nodes, so that may be an alternative approach.

The alternative to all of this could be to export your existing DC01 and MGMT01 VMs, reimage the host with the W10 Insider, then re-import them and proceed on from there?

mattmcspirit commented 4 years ago

OK, could you try this? For the enable Hyper-V role with PowerShell command, run this one instead:

# Provide the domain credentials to log into the VM
$domainName = "azshci.local"
$domainAdmin = "$domainName\labadmin"
$domainCreds = Get-Credential -UserName "$domainAdmin" -Message "Enter the password for the LabAdmin account"
# Define node name
$nodeName = "AZSHCINODE01"
Invoke-Command -VMName "$nodeName" -Credential $domainCreds -ScriptBlock {
    # Enable the Hyper-V role within the Azure Stack HCI OS
    Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V
    Install-WindowsFeature -Name Hyper-V -IncludeManagementTools -restart
}

The first one enables Hyper-V without doing the "full checks" and the second enables the PowerShell stuff (which the first command won't do for some reason).

When you reboot and try to add the node to WAC, it will recognize that the Hyper-V role is enabled and should allow you to proceed, or at least it has for me, but with an Intel system. I haven't run the Set-VMProcessor -VMName <VMName> -ExposeVirtualizationExtensions $true command, so we should be in a similar place.

Update - cluster creation completed successfully! Give it a try!
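To confirm that both commands took effect after the reboot, a check along these lines may help. This is only a sketch: `Get-HyperVFeatureState` is a hypothetical wrapper name, it assumes PowerShell Direct works against the node as in the script above, and it assumes `Hyper-V-PowerShell` is the feature name for the PowerShell module on the node's OS.

```powershell
# Hypothetical wrapper (not from the guide): queries the Hyper-V feature
# state inside a node VM over PowerShell Direct.
function Get-HyperVFeatureState {
    param(
        [Parameter(Mandatory)][string]$VMName,
        [Parameter(Mandatory)][pscredential]$Credential
    )
    Invoke-Command -VMName $VMName -Credential $Credential -ScriptBlock {
        # Return a single object so both results display together
        [pscustomobject]@{
            # State reported by the optional-feature (DISM) cmdlet
            OptionalFeatureState = (Get-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V).State
            # State reported by the Server Manager cmdlet for the role
            RoleInstallState     = (Get-WindowsFeature -Name Hyper-V).InstallState
            # Assumed feature name for the Hyper-V PowerShell module
            PowerShellModule     = (Get-WindowsFeature -Name Hyper-V-PowerShell).InstallState
        }
    }
}
```

With the variables from the script above, `Get-HyperVFeatureState -VMName $nodeName -Credential $domainCreds` should report the feature states for the node once it has restarted.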

AnilDesai commented 4 years ago

Matt,

Sure – I’ll try this in about an hour or so and get back to you with the results.

Thanks,

From: Matt McSpirit notifications@github.com Sent: Tuesday, July 28, 2020 1:50 PM To: Azure/AzureStackHCI-EvalGuide AzureStackHCI-EvalGuide@noreply.github.com Cc: Anil Desai Anil@anildesai.net; Author author@noreply.github.com Subject: Re: [Azure/AzureStackHCI-EvalGuide] Indicate that Physical deployment is ONLY for Intel CPUs (#18)


AnilDesai commented 4 years ago

Matt,

I do have some more information for you. MGMT01 and DC01 are working fine, as I left them yesterday. First, when I try to start up AZSHCINODE01, I receive the following error message via Hyper-V (on my local, physical host):

[Image: screenshot of the Hyper-V error message]

It does appear that the configuration change (enabling Hyper-V) was attempted, but now I can’t start that VM. Also, it looks like the command I ran yesterday did run properly from the physical host (I got the expected response of “VERBOSE: Set-VMProcessor will configure the processor settings of the virtual machine "AZSHCINODE01".”). I just happened to have this console window still open.

I think there are several options for troubleshooting. One option would be to start by creating a NODE02 VM using the same evaluation version of Windows Server 2019. I think I’d run into the same problem, though: whether or not the command returned an error, I wouldn’t be able to start it on my AMD-based host. I could also move the VM to an Intel-based host machine (I’ll need to configure one), and try moving it back. However, it seems like the same issue would occur.

It seems like the most likely workaround is to try the Windows Server Insider build, as you mentioned. I have read several blog posts and comments from people stating that nested virtualization is working properly for them on AMD-based machines on that build (all have been posted within the last month or so). I can try following the steps using the latest Insider build that’s available to me to see if that works.

Let me know if/how you’d like me to proceed.

Thanks,


mattmcspirit commented 4 years ago

Hmm - I can't see the image, but nonetheless, if you could test with a new VM (just follow the PowerShell guide, and don't worry about a domain join), it should only take a few minutes to create the new node and run the scripts to try to enable Hyper-V. If that still doesn't get Hyper-V enabled and configured, I'd just go ahead with the Insider build - entirely your choice!

AnilDesai commented 4 years ago

No problem - I'll try that now. I think I might need to correct something below. Based on the issues that I’m experiencing, the issue seems to be with the physical host OS (Windows 10, in my case). Based on the Microsoft post AMD Nested Virtualization Support, I would need to upgrade my version of Windows 10. Unfortunately, I can’t upgrade my current Windows 10 host at the moment due to other work dependencies.

I've attached a copy of the screenshot with the error message via GitHub, so hopefully you can see it now.

AMD Hyper-V Nested Virtualization Error

I was just following the guide for testing purposes, so I will try the Insider build when I get a chance. Just let me know if you want me to send you any further feedback (I might not be able to get to it until tomorrow).

Thanks, again!

mattmcspirit commented 4 years ago

Can you disable the virtualization extensions for the VM, and try launching again?

Set-VMProcessor -VMName <VMName> -ExposeVirtualizationExtensions $false

AnilDesai commented 4 years ago

I just tried that, and it does resolve the issue with the VM not starting. With the NODE01 VM offline (since I couldn't start it), I ran the following (added -Verbose since I didn't get any output):

Set-VMProcessor AZSHCINODE01 -ExposeVirtualizationExtensions $false -Verbose

Response: VERBOSE: Set-VMProcessor will configure the processor settings of the virtual machine "AZSHCINODE01".

I retried enabling Hyper-V with the script you provided, and I got the following error after about a minute:

An error has occurred which Windows PowerShell cannot handle. A remote session might have ended.

The VM remained running, but the console shows it has power-cycled (restarted). I turned the VM off and then back on. This time, it's just stuck at the black boot screen (attempting PXE or DVD-based booting). It looks like the "trick" worked partially; at least my physical host will attempt to start the VM. Unfortunately, it looks like the VM will not actually boot into the OS in this configuration. I hope this helps. Let me know if you'd like me to try anything further.

mattmcspirit commented 4 years ago

How much memory does this VM have?

AnilDesai commented 4 years ago

The Node01 VM is configured with 4096MB (4GB) of memory, with dynamic memory disabled. The host has 32GB and is running on NVMe storage.

mattmcspirit commented 4 years ago

OK - I had issues with Hyper-V enabled and memory configured with less than 4GB, but with 4GB set, all should be fine.

I'm stumped on that one - don't waste your time on it too much though - if you have a spare few mins, deploy a fresh node and enable Hyper-V with the step above, but short of that, perhaps deploy one in Azure to test further.

Thanks again for the support here!

AnilDesai commented 4 years ago

Thanks for the update and quick response to the issue! I will give it another shot when I get a chance. Regardless, everything in the guides has been really well-written (I've reviewed both the GUI and PowerShell versions before getting started). I think using Azure, an Intel-based host, or an Insider build should all work fine.