Open OEI-Cgray opened 1 month ago
@OEI-Cgray I'm familiar with quite some error messages at stage "Validate KVA" but I've never seen URLs .github.com and .visualstudio.com at AKS Arc setups. What exactly are you trying to do?
@Elektronenvolt , trying to install it. I couldn't even set the config, let alone run install-akshci.
This is the same issue I've seen before, downloads in my environment, with no restrictions on outgoing traffic, fail to download some file. It's not always the same file. If I enable logging and verbose output, I can find the URL that fails to download, and then download it with invoke-webrequest from the same machine, even the same powershell session. But using the scripts, which download with the DownloadSDK module, fails.
Because the download size did not match the expected download size, downloadsdk panicked, and killed the process.
The file paths are for the DownloadSDK module, which I assume at the moment is a DLL, those would be the local source folders of whatever machine built the release DLL, or something along those lines. They are there because of the panic crash.
@OEI-Cgray - I never have seen that issue in any of the setups I did. Looks like an issue with the underlying storage. Your download with Invoke-Webrequest to the same storage you've set for AKS Arc? And these days - endpoint protections got better - download from PS may treated as malicious activity. What are the specs of the storage? SSDs only?
@Elektronenvolt
The storage is all enterprise class SSDs, and works fine when we were able to install this over a year ago before these same download issues made us abandon it. And yes, downloaded to the same storage.
Defender is our endpoint protection, and it does not stop the invoke-webrequest test. That's not to say the activity wasn't flagged when downloading via DownloadSDK, but when checking, Defender currently says it's never detected a threat on the three machines that make up the cluster. The machines have no threat detection history according to the various Azure portals either.
I'm happy to generate logs or whatever else needed to troubleshoot this... meanwhile we'll also be setting up all the things to basically do this manually by installing k8s on some vms and then installing the arc integrations afterwards... which is all less ideal than hopefully getting this issue resolved.
@OEI-Cgray Interesting issue - I'm always curious to know the root cause, a lot of issues I've seen with AKS Arc the last years had been caused by our infrastructure or 'limitations' like Firewalls, permissions, ... In case it comes down to be infra related, would be nice to know the root cause.
In which Azure region do you run the setup? I'm in West Europe
Well - try to run with debug and verbose flags on, may you see anything interesting in the debug and verbose output.
$VerbosePreference = "continue"
$DebugPreference = "continue"
If that doesn't point you to something interesting, looks like it's time for a support case.
@OEI-Cgray - by searching for something else I've seen that AKS Arc is right now supported on these three regions only: Azure-Regions You may run the setup in a non-supported region?
@Elektronenvolt The resource group is in west us, but the download error occurs on the Set-AksHciConfig, as far as I'm aware, you specify the resource group on the next command, Set-AksHciRegistration.
I'll have some time finally towards the end of this week for more testing.
@OEI-Cgray According to docs West US is not supported. Yes, true - would also expect a region-based error at the step where you set the resource group at Set-AksHciRegistration
Continue testing - the June release is out and, in your case - try the offline download and offsite setup
If that works, consider Internet connectivity issues like https://github.com/Azure/aksArc/issues/355
Hello again, I finally got some time to try the offline download. Attached is the end of the results. It did pass all the 9 tests without any issues.
Test-OfflineDownloadFiles : Cannot bind argument to parameter 'DifferenceObject' because it is null.
At C:\Program Files\WindowsPowerShell\Modules\AksHci\1.2.4\AksHci.psm1:7656 char:17
+ ... Test-OfflineDownloadFiles -destination $destination -vers ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidData: (:) [Test-OfflineDownloadFiles], ParameterBindingValidationException
+ FullyQualifiedErrorId : ParameterArgumentValidationErrorNullNotAllowed,Test-OfflineDownloadFiles
Digging into this more, this function doesn't seem to exist.
It's referenced a few lines before the line 7656 error.
$updateReleases = Get-ProductReleasesUptoVersion -Version $startVersion -moduleName $moduleName
$updateReleases | ForEach-Object {
$version = $_.Version
if ([System.Version]$version -ge [System.Version]$startVersion)
{
$destination = $global:config[$modulename]["stagingShare"] + "\" + $version
if (Test-Path $destination) {
Test-OfflineDownloadFiles -destination $destination -version $version
} else {
New-Item -Path $destination -ItemType Directory
Get-ReleaseContent -version $version -activity $activity -destination $destination -moduleName $moduleName -mode $mode
}
}
}
This does not seem to exist.
Get-ProductReleasesUptoVersion
Get-ProductReleasesUptoVersion : The term 'Get-ProductReleasesUptoVersion' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path
was included, verify that the path is correct and try again.
At line:1 char:1
+ Get-ProductReleasesUptoVersion
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (Get-ProductReleasesUptoVersion:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
Line 7656 is Test-OfflineDownloadFiles -destination $destination -version $version
and it complains that DifferenceObject
is null.
If you go to the function in line 8450 - it wants to get the name of the downloaded files:
$missingFiles = Compare-Object -ReferenceObject $releaseFiles.Name -DifferenceObject $downloadedFiles.Name
In your log the files I see a user profile path - like: Saving to: "C:\\Users\\myuser-~1\\AppData\\Local\\Temp\\dsdk-330135\\manifest.cab
May you simply have permission issues on file access.
Did you specify directories like in the docs? https://learn.microsoft.com/en-us/azure/aks/hybrid/offline-download-22h2#step-4-configure-the-deployment-onsite I don't remember these files ending up in user profile temp folders e.g. -> did you use new / non existing folder names like in the docs?
Connect-AzAccount -TenantId '6e0faf7f-ffe0-4345-bf85-9f1011650754' -Subscription '80f4e056-b7b8-4f94-b84a-469cdfd59b48'
Import-Module AksHci
$vnet = New-AksHciNetworkSetting -name k8sdhcpvnet -vswitchName "10.42.x.x" -vipPoolStart "10.42.24.40" -vipPoolEnd "10.42.25.240" -vlanid 24
Set-AksHciConfig -offlineDownload $true -mode full -stagingShare 'C:\clusterstorage\containerstorage\AKS-Staging' -imageDir 'c:\clusterstorage\containerstorage\AKS-Images' -cloudConfigLocation 'c:\clusterstorage\containerstorage\AKS-Config' -workingDir 'c:\clusterstorage\containerstorage\AKS-Working' -vnet $vnet -cloudservicecidr '10.42.0.129/23'
The folders were all empty when starting.
I don't see this function, which is called before Test-OfflineDownloadFiles, in the PSM: Get-ProductReleasesUptoVersion
I can get the initial version, but then it fails due to the non-existing (as far as I can tell) function:
PS C:\Temp> Get-AksHciVersion
VERBOSE: [06/25/2024 13:53:43] [AksHci] Initializing environment
VERBOSE: [06/25/2024 13:53:43] [AksHci] Importing Configuration
VERBOSE: [06/25/2024 13:53:43] [Moc] Importing Configuration
VERBOSE: [06/25/2024 13:53:43] [Moc] Importing Configuration Completed
VERBOSE: [06/25/2024 13:53:44] [Moc] Validating configuration
VERBOSE: [06/25/2024 13:53:44] [Moc] Get MOC Configuration
VERBOSE: [06/25/2024 13:53:44] [Moc] Installation state is: NotInstalled
VERBOSE: [06/25/2024 13:53:44] [Kva] Importing Configuration
VERBOSE: [06/25/2024 13:53:44] [Kva] Importing Configuration Completed
VERBOSE: [06/25/2024 13:53:44] [Kva] Getting configuration for Kva
VERBOSE: [06/25/2024 13:53:44] [Kva] Installation state is: NotInstalled
VERBOSE: [06/25/2024 13:53:44] [AksHci] Importing Configuration Completed
VERBOSE: [06/25/2024 13:53:44] [AksHci] Saving Configuration for Module AksHci to configuration file
VERBOSE: [06/25/2024 13:53:44] [Moc] Discovering configuration
VERBOSE: [06/25/2024 13:53:44] [Moc] Importing Configuration
VERBOSE: [06/25/2024 13:53:45] [Moc] Importing Configuration Completed
VERBOSE: [06/25/2024 13:53:45] [Moc] Validating configuration
VERBOSE: [06/25/2024 13:53:45] [Moc] Applying configuration
VERBOSE: [06/25/2024 13:53:45] [Moc] Saving Configuration for Module Moc to configuration file
VERBOSE: [06/25/2024 13:53:45] [Moc] Saving Configuration for Module Moc to configuration file
VERBOSE: [06/25/2024 13:53:45] [AksHci] Uninitializing environment
VERBOSE: [06/25/2024 13:53:45] [AksHci] Saving Configuration for Module AksHci to configuration file
1.0.23.10605
PS C:\Temp> $startVersion = (Get-AksHciVersion)
VERBOSE: [06/25/2024 13:55:18] [AksHci] Initializing environment
VERBOSE: [06/25/2024 13:55:18] [AksHci] Importing Configuration
VERBOSE: [06/25/2024 13:55:18] [Moc] Importing Configuration
VERBOSE: [06/25/2024 13:55:18] [Moc] Importing Configuration Completed
VERBOSE: [06/25/2024 13:55:18] [Moc] Validating configuration
VERBOSE: [06/25/2024 13:55:18] [Moc] Get MOC Configuration
VERBOSE: [06/25/2024 13:55:18] [Moc] Installation state is: NotInstalled
VERBOSE: [06/25/2024 13:55:19] [Kva] Importing Configuration
VERBOSE: [06/25/2024 13:55:19] [Kva] Importing Configuration Completed
VERBOSE: [06/25/2024 13:55:19] [Kva] Getting configuration for Kva
VERBOSE: [06/25/2024 13:55:19] [Kva] Installation state is: NotInstalled
VERBOSE: [06/25/2024 13:55:19] [AksHci] Importing Configuration Completed
VERBOSE: [06/25/2024 13:55:19] [AksHci] Saving Configuration for Module AksHci to configuration file
VERBOSE: [06/25/2024 13:55:19] [Moc] Discovering configuration
VERBOSE: [06/25/2024 13:55:19] [Moc] Importing Configuration
VERBOSE: [06/25/2024 13:55:19] [Moc] Importing Configuration Completed
VERBOSE: [06/25/2024 13:55:19] [Moc] Validating configuration
VERBOSE: [06/25/2024 13:55:19] [Moc] Applying configuration
VERBOSE: [06/25/2024 13:55:20] [Moc] Saving Configuration for Module Moc to configuration file
VERBOSE: [06/25/2024 13:55:20] [Moc] Saving Configuration for Module Moc to configuration file
VERBOSE: [06/25/2024 13:55:20] [AksHci] Uninitializing environment
VERBOSE: [06/25/2024 13:55:20] [AksHci] Saving Configuration for Module AksHci to configuration file
PS C:\Temp>
PS C:\Temp> $updateReleases = Get-ProductReleasesUptoVersion -Version $startVersion -moduleName $moduleName
Get-ProductReleasesUptoVersion : The term 'Get-ProductReleasesUptoVersion' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path
was included, verify that the path is correct and try again.
At line:1 char:23
+ $updateReleases = Get-ProductReleasesUptoVersion -Version $startV ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (Get-ProductReleasesUptoVersion:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
At New-AksHciNetworkSetting
I usually add settings for -ipAddressPrefix
, -gateway
, -dnsServers
like its documented here: https://learn.microsoft.com/en-us/azure/aks/hybrid/kubernetes-walkthrough-powershell#step-2-create-a-virtual-network
Otherwise I don't have any connectivity to other networks or Internet. May that's the missing thing to get it working.
@Elektronenvolt that's for a static IP setup. The existing powershell process that's failing the download is running using the OS's current DNS servers. These servers do have unrestricted outgoing internet access, and I've checked the firewall to ensure it's not blocking anything due to any of it's internal threat protection mechanisms.
Ok, I'll try the offline mode on one of my test setups - can combine it with a few other topics. But, try offline and offsite. https://learn.microsoft.com/en-us/azure/aks/hybrid/offline-download-22h2#use-offline-download-to-install-offsite You can download the images somewhere else and then copy the image to the stagingshare. It would be interesting if this works.
Describe the bug DownloadSDK passes its tests during the Set-AksHciConfig, but still fails to actually download the entire file. When it fails, it completely closes the existing powershell window as well, with no mention of a log file... To even see the error message, I had to run powershell within powershell so on the first process would be killed, leaving the error visible to see.
To Reproduce Steps to reproduce the behavior:
Expected behavior Files to actually download.
Screenshots If applicable, add screenshots to help explain your problem.
Environment (please complete the following information):
Collect log files