Azure / aksArc

# Welcome to the Azure Kubernetes Service on Azure Stack HCI repo This is where the AKS-HCI team will track features and issues with AKS-HCI. We will monitor this repo in order to engage with our community and discuss questions, customer scenarios, or feature requests. Checkout our projects tab to see the roadmap for AKS-HCI!
MIT License
109 stars 45 forks source link

[BUG] AKS installation stuck on "Validate KVA" #363

Closed arp-mbender closed 7 months ago

arp-mbender commented 7 months ago

Describe the bug When attempting to install Azure Kubernetes Service via Windows Admin Center on a Windows Server 2022 the process seems to be stuck at the "Validate KVA" validation step on the "Review and Validation" phase of the process. Since the process never finishes it is also impossible to download logs.

To Reproduce Steps to reproduce the behavior:

  1. Proceed with the basic AKS setup from Windows Admin Center, using a new VM with Server 2022 Standard Core as the target.
  2. Using DHCP for the network assignment with no VLAN; i.e. simplest setup possible
  3. Get stuck during validation

Expected behavior Validation finishes in a reasonable time or throws an error in case something is wrong.

Screenshots Validation still ongoing after 10 minutes (where everything else completed in 30 seconds or so): Zrzut ekranu 2023-11-19 131356 I can wait even longer, but the result will be the same.

Environment (please complete the following information):

markamber commented 7 months ago

I can reproduce this! On server 2019 clustered and 2022 standalone.

Looks like a bootstrap VM boots and gets its cloud config but then nothing happens. I was using static addresses , no VLAN, and static addresses yes VLAN.

Seems like it might just be broken

arp-mbender commented 7 months ago

@markamber Thanks for confirming.

Quite annoying, TBH. I keep trying to get some experience with Kubernetes by setting up a basic environment, and I keep hitting strange roadblocks. Processes that should automate this... don't work (like the one described here), while manual attempts at setting things up fail due to versioning issues between some tutorial and my own environment...

abhilashaagarwala commented 7 months ago

@SummerSmith @walterov FYI

Elektronenvolt commented 7 months ago

I've noticed the same thing on one of our setups. The validation reports an issue at KVA - a VM gets created but it's not reachable by ssh, e.g. But - I'm not sure if it's a bug or the validation checks are just doing their job!

Its only one machine where it fails, no problems on many other machines. We didn't find the root cause so far, but we focus on the Hyper-V virtual switch / network card settings. I was not able to ping the VM created by KVA validation check - what usually works.

@arp-mbender @markamber - try to create a custom VM with the same virtual switch you use for AKS Hybrid and check if you get connectivity. May you have an issue on the virtual switch

At PowerShell it looks like this: image

arp-mbender commented 7 months ago

Aye, it turned out to be a connectivity issue. I'm not sure how I missed it. :(

Elektronenvolt commented 7 months ago

@arp-mbender in our case it was a network card not supporting SR-IOV, Virtual Machine Queues, e.g. Replaced it now with a "Hyper-V certified" card.

arp-mbender commented 7 months ago

@Elektronenvolt I had the EXACT same issue. Well, SR-IOV was supported, but it doesn't support MAC address spoofing (but there are no errors enabling this; it just doesn't work in the end).