Azure / aksArc


[BUG] Failing to download #202

Open OEI-Cgray opened 2 years ago

OEI-Cgray commented 2 years ago

This is similar to the issues in #194. Downloads are inconsistent. Downgrading to 1.1.32 was the only way I could complete a reinstall.

The same files download without issue via Invoke-WebRequest from the same machine.
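For anyone who wants to repeat this cross-check outside the installer, the size comparison can be scripted. The following is a minimal sketch in Python, not part of DownloadSdk: the helper name `download_and_check` and its `opener` parameter are hypothetical, and the expected byte count is assumed to be taken from the log (60633811 for mocstack.cab).

```python
import shutil
import urllib.request
from pathlib import Path


def download_and_check(url, dest, expected_size, opener=urllib.request.urlopen):
    """Stream url to dest, then compare the saved size with expected_size.

    Hypothetical helper for illustration only; not part of DownloadSdk.
    Returns a tuple (ok, saved_size).
    """
    dest = Path(dest)
    with opener(url) as response, open(dest, "wb") as out:
        # Copy in chunks so a large .cab file is never held fully in memory.
        shutil.copyfileobj(response, out)
    saved_size = dest.stat().st_size
    return saved_size == expected_size, saved_size
```

If this reports the full size while the installer keeps saving a shorter file from the same machine, that points at the download path used by the installer rather than at the network link in general.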

WARNING: Install-Moc failed. Exception [GetRelease error returned by API call: run: Corrupted download: ExpectedSize=60633811 SavedSize=40422540]
Stacktrace [at Test-DownloadSDKResponse, C:\Program Files\WindowsPowerShell\Modules\DownloadSdk\1.0.13\DownloadSdk.psm1: line 390
at Get-DownloadSdkRelease, C:\Program Files\WindowsPowerShell\Modules\DownloadSdk\1.0.13\DownloadSdk.psm1: line 173
at Get-MocReleaseContent, C:\Program Files\WindowsPowerShell\Modules\Moc\1.0.28\Moc.psm1: line 2381
at Get-MocRelease, C:\Program Files\WindowsPowerShell\Modules\Moc\1.0.28\Moc.psm1: line 2328
at Install-MocInternal, C:\Program Files\WindowsPowerShell\Modules\Moc\1.0.28\Moc.psm1: line 1202
at Install-Moc, C:\Program Files\WindowsPowerShell\Modules\Moc\1.0.28\Moc.psm1: line 221
at Install-AksHciInternal, C:\Program Files\WindowsPowerShell\Modules\AksHci\1.1.36\AksHci.psm1: line 4554
at Install-AksHci, C:\Program Files\WindowsPowerShell\Modules\AksHci\1.1.36\AksHci.psm1: line 804
at <ScriptBlock>, <No file>: line 1]

Occasionally the Moc download does complete, only for the next download to fail.

In the downloadsdk.log:

2022-07-28T19:19:25-07:00 DEBUG Creating a new SFS provider
2022-07-28T19:19:25-07:00 DEBUG Will download release mocstack-stable version 1.0.11.10627 to c:\clusterstorage\containerstorage\AKS-WorkingDir\1.0.11.10707
2022-07-28T19:19:25-07:00 DEBUG Get verified files for mocstack-stable to c:\clusterstorage\containerstorage\AKS-WorkingDir\1.0.11.10707\dsdk-42733605
2022-07-28T19:19:25-07:00 DEBUG Getting download information from https://msk8s.api.cdp.microsoft.com/api/v1.1/contents/default/namespaces/default/names/mocstack-stable/versions/1.0.11.10627/files?action=generateDownloadInfo&ForegroundPriority=True
2022-07-28T19:19:25-07:00 DEBUG [DEBUG] POST https://msk8s.api.cdp.microsoft.com/api/v1.1/contents/default/namespaces/default/names/mocstack-stable/versions/1.0.11.10627/files?action=generateDownloadInfo&ForegroundPriority=True
2022-07-28T19:19:25-07:00 DEBUG Discovered 2 download(s)
2022-07-28T19:19:25-07:00 DEBUG Skipping download of file manifest.cab
2022-07-28T19:19:25-07:00 DEBUG Preparing to download: mocstack.cab (60633811 bytes)
2022-07-28T19:19:25-07:00 DEBUG Downloading http://msk8s.b.tlu.dl.delivery.mp.microsoft.com/filestreamingservice/files/494bba95-fdf8-449e-9e6a-e06512db5140?P1=1659665966&P2=404&P3=2&P4=cQvKX1wi0Ry%2b6mhupb6unyzyMARMMHu%2bZbpx7iW4pWDWZm9UujQuN725NC1Y%2b6VuLli9N9bbYGtwa3uxLJNTVQ%3d%3d to c:\clusterstorage\containerstorage\AKS-WorkingDir\1.0.11.10707\dsdk-42733605\mocstack.cab
2022-07-28T19:19:27-07:00 DEBUG Download attempt 1 of 5 failed with error: run: Corrupted download: ExpectedSize=60633811 SavedSize=40422540
2022-07-28T19:19:27-07:00 DEBUG Download will be retried in 10s
2022-07-28T19:19:38-07:00 DEBUG Download attempt 2 of 5 failed with error: run: Corrupted download: ExpectedSize=60633811 SavedSize=38692421
2022-07-28T19:19:38-07:00 DEBUG Download will be retried in 20s
2022-07-28T19:19:59-07:00 DEBUG Download attempt 3 of 5 failed with error: run: Corrupted download: ExpectedSize=60633811 SavedSize=40422540
2022-07-28T19:19:59-07:00 DEBUG Download will be retried in 40s
2022-07-28T19:20:40-07:00 DEBUG Download attempt 4 of 5 failed with error: run: Corrupted download: ExpectedSize=60633811 SavedSize=40422540
2022-07-28T19:20:40-07:00 DEBUG Download will be retried in 1m0s
2022-07-28T19:21:42-07:00 DEBUG Download attempt 5 of 5 failed with error: run: Corrupted download: ExpectedSize=60633811 SavedSize=40422540
2022-07-28T19:21:42-07:00 DEBUG Download will be retried in 1m0s
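The failed attempts in this log can be summarized with a short script. This is a rough sketch in Python that assumes the log lines keep the wording shown above. The function names `summarize_attempts` and `backoff_delays` are made up for illustration, and the doubling-with-cap schedule is inferred from the waits printed in the log (10s, 20s, 40s, 1m0s, 1m0s), not taken from the DownloadSdk source.

```python
import re

# Matches the "Download attempt N of M failed ..." entries shown in the log.
ATTEMPT_RE = re.compile(
    r"Download attempt (\d+) of (\d+) failed with error: .*?"
    r"ExpectedSize=(\d+)\s+SavedSize=(\d+)"
)


def summarize_attempts(log_text):
    """Return one dict per failed download attempt found in log_text."""
    rows = []
    for match in ATTEMPT_RE.finditer(log_text):
        attempt, total, expected, saved = map(int, match.groups())
        rows.append({
            "attempt": attempt,
            "of": total,
            "expected": expected,
            "saved": saved,
            "fraction": saved / expected,  # share of the file that arrived
        })
    return rows


def backoff_delays(attempts, base=10, cap=60):
    """Doubling delay in seconds, capped at `cap` (an inferred schedule)."""
    return [min(base * 2 ** i, cap) for i in range(attempts)]
```

Run against the entries above, this shows four of the five attempts stopping at the same 40422540 bytes, roughly two thirds of the expected 60633811. The log alone does not explain why the transfer stops there, but a repeatable cut-off point is worth including in a report.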

To Reproduce

Steps to reproduce the behavior:

  1. Run Install-AksHci
  2. See error

Expected behavior

Successful downloads.

Environment (please complete the following information):

Additional context

downloadsdk.log (attached)

madhanrm commented 2 years ago

Can we retry with the latest build and PowerShell (PS)?

@nwoodmsft - fyi

PragyaDw commented 1 year ago

@madhanrm can we close this?

OEI-Cgray commented 1 year ago

You can close this if you want.

Personally, I think it's a horrible idea to download the files after you've started making changes to the service via the installation process. Download everything first, then proceed with the installation. Breaking the install, then failing to download the files, just seems ripe for these sorts of issues.
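The ordering suggested here (fetch and verify everything, then change the system) can be sketched generically. This is only an illustration in Python of the pattern, not how AksHci or Moc are implemented: `staged_install` and the `fetch`, `verify`, and `install` callables are hypothetical placeholders supplied by the caller.

```python
def staged_install(artifacts, fetch, verify, install):
    """Fetch and verify every artifact before calling install once.

    artifacts: iterable of artifact names
    fetch(name) -> local path, verify(path) -> bool, install(paths) -> None
    All three callables are placeholders supplied by the caller.
    """
    paths = []
    for name in artifacts:
        path = fetch(name)
        if not verify(path):
            # Nothing on the system has been changed yet, so aborting is safe.
            raise RuntimeError(
                f"verification failed for {name}; install not started"
            )
        paths.append(path)
    install(paths)  # only reached once every download is complete and verified
    return paths
```

With this shape, a truncated download fails before the existing deployment is touched, so a retry starts from a working system instead of a half-removed one.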

I've basically given up on this product; every single month there is some issue with upgrading.

Eventually we might spin up a k8s cluster and tie it to Arc manually.

I could go on for probably an hour about various odd decisions here, such as there being no way to import or detect existing storage containers on a reinstallation, which forces you to re-create them and then manually copy the contents.

Elektronenvolt commented 1 year ago

Hi @OEI-Cgray

I've been using AKS-Hybrid for 3 years now, and yes, there are a few things that can hurt and are not obvious at the beginning. Most of them are connected to firewalls, proxies (avoid them wherever you can), custom certificates, and restrictions in company networks, i.e. all the custom things every company has in its own infrastructure.

Regarding your download issue: #194 was about getting corrupted images on download. I remember it; that has happened only once so far.

The download issue you describe sounds like what I see on our clusters. Downloads go out over a single public provider IP, and there is a download rate limit at the file streaming service. If I update multiple clusters at the same time, the first image downloads are fast, after a while they get slower, and by the second or third cluster update the downloads begin to run into timeouts. If I wait overnight, everything is super fast again. You only see this in PowerShell with the debug and verbose flags on. I've reported this already, but haven't created a GitHub issue so far.

As a workaround until this is fixed, we use the offline download feature to fetch the binaries once for all clusters before starting any update operation. https://learn.microsoft.com/en-us/azure/aks/hybrid/offline-download
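The idea behind this workaround (download each release once, then reuse it for every cluster) can be pictured with a small cache. This Python sketch shows only the general pattern, not the offline download feature itself, which is documented at the link above: `ReleaseCache` and the `fetch` callable are hypothetical names.

```python
class ReleaseCache:
    """Fetch each (name, version) once and reuse the local path afterwards.

    `fetch(name, version) -> path` is a placeholder supplied by the caller.
    """

    def __init__(self, fetch):
        self._fetch = fetch
        self._paths = {}

    def get(self, name, version):
        key = (name, version)
        if key not in self._paths:
            # First request for this release: download it exactly once.
            self._paths[key] = self._fetch(name, version)
        # Later cluster updates reuse the already downloaded copy.
        return self._paths[key]
```

Updating three clusters to the same version then costs one download instead of three, which keeps the traffic from a single public IP below whatever limit the streaming service applies.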