Closed paulmrozowski closed 4 years ago
@paulmrozowski
Which version of AzCopy are you using? Does this repro consistently? (It looks like the download does not finish and fails/stops in the middle.)
Could you please run it with the following command and see whether any error is reported if the file still doesn't download correctly?
Get-AzStorageBlobContent -Container $StorageContainerName -Blob $FileToDownload.FileName -Destination $DownloadPath -Context $StorageContext
Is it possible that $DownloadError is being overwritten by another command?
Also, would you mind sharing the source file URI and the download time, so I can check the server logs to investigate?
AzCopy works fine (v10.2.1), but the script uses Get-AzStorageBlobContent, and that's where the problem is. Yes, this now fails consistently after running successfully for a few months.
We know the download is corrupt (and never successfully downloads anymore) because the script does an MD5 verification against the file, and I've also checked manually: I downloaded the file via Get-AzStorageBlobContent and another copy via AzCopy and did a file comparison while trying to fix this.
I don't believe there is any other place where $DownloadError would be changed. The download is wrapped in a do...while loop to automatically attempt one retry. We push this info into a log and into Slack.
do
{
# Code to download file here
} while ($DownloadError[0].Exception.Message -like "*The client could not finish the operation within specified timeout*")
if (-not($DownloadError[0]) -and -not($DownloadTimeout))
{
Write-Log -Level 'SLACK' -Message "*Nightly Restore:* Successfully Downloaded [ *{0}* ]"`
-Arguments $FileToDownload.FileName
return $True
# Write-Log -Level 'SLACK' -Message "*Nightly Restore:* Successfully Downloaded [ *{0}* ], Time Taken: *{1}* Minutes"`
# -Arguments $FileToDownload.FileName, $([Math]::Round($Script:DownloadTime.TotalMinutes, 2))
# return $True
}
elseif ($DownloadTimeout)
{
Write-Log -Level 'SLACK' -Message "*Nightly Restore:* Download timeout after two attempts of [ {0} ]. ``ERROR:`` *{1}*"`
-Arguments $FileToDownload.FileName, $DownloadError[0].Exception.Message
return $False
}
else
{
Write-Log -Level 'SLACK' -Message "*Nightly Restore:* There was an error downloading [ {0} ]. ``ERROR:`` *{1}*"`
-Arguments $FileToDownload.FileName, $DownloadError[0].Exception.Message
return $False
}
Is there a way I can pull the server log for this vs. sharing the URL?
@paulmrozowski I have tested Get-AzStorageBlobContent with Az.Storage module 1.8.0 on my machine and could not reproduce this issue by downloading a 100GB blob. When the download fails with "The client could not finish the operation within specified timeout.", the error is put into $DownloadError and, per your code, it should retry.
I also don't see the "-Force" parameter on "Get-AzStorageBlobContent". Without it, the retry will show a prompt asking whether to overwrite the existing file, and the script can't run non-interactively.
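For reference, a minimal sketch of the download call with `-Force` added, so a retry overwrites the partially downloaded file instead of prompting (variable names are taken from the snippets in this thread):

```powershell
# -Force overwrites an existing destination file without prompting,
# which is required for unattended retries.
Get-AzStorageBlobContent -Container $StorageContainerName `
    -Blob $FileToDownload.FileName `
    -Destination $DownloadPath `
    -Context $StorageContext `
    -Force `
    -ErrorAction SilentlyContinue -ErrorVariable DownloadError
```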
Could you please share the real script you run for "# Code to download file here"? Please hide any credentials.
As for the server log: if you can give me the account name, container name, blob name, and the repro time, I can search the server logs. If you can't paste them here, you can send them to my mail weiwei@microsoft.com.
This is the full method. We delete the file in the real code before the retry.
function Get-DownloadFiles
{
param ($FileToDownload)
$CurrentFiles = Get-ChildItem -Path $DownloadPath -File -Filter *.bak
Write-Log -Level "INFO" -Message "Processing {0} file(s) for Download" -Arguments @($FileToDownload).Count
if ($CurrentFiles | Where-Object { $_.Name -eq $FileToDownload.FileName })
{
Write-Log -Level 'INFO' -Message "File found locally so will skip"
return $True
}
else
{
Write-Log -Level 'INFO' -Message "Downloading: {0}" -Arguments $FileToDownload.FileName
# TODO: Wrap this in a do...while loop for the client timeout error, but keep count and maybe only retry 3 times at most, then throw an error
# Need to also clean up the failed file, else there won't be enough space and it is corrupt anyway
$DownloadCounter = 0
$DownloadTimeout = $False
do
{
$DownloadCounter++
if ($DownloadCounter -gt 1)
{
Write-Log -Level 'INFO' -Message "Error Counter Value: [$DownloadCounter of 2]. Deleting existing download to retry"
Remove-BackupFiles -FileNameToDelete $FileToDownload.FileName -CallingFunction "Get-DownloadFiles"
}
if ($DownloadCounter -gt 2)
{
Write-Log -Level 'INFO' -Message "Error Counter Value: [$DownloadCounter of 2]. Download Operation Timeout"
$DownloadTimeout = $True
break
}
else
{
Get-AzStorageBlobContent -Container $StorageContainerName -Blob $FileToDownload.FileName -Destination $DownloadPath -Context $StorageContext -ErrorAction SilentlyContinue -ErrorVariable DownloadError
}
} while ($DownloadError[0].Exception.Message -like "*The client could not finish the operation within specified timeout*")
if (-not($DownloadError[0]) -and -not($DownloadTimeout))
{
Write-Log -Level 'SLACK' -Message "*Nightly Restore:* Successfully Downloaded [ *{0}* ]"`
-Arguments $FileToDownload.FileName
return $True
# Unreachable after the return above; commented out, as in the snippet posted earlier:
# Write-Log -Level 'SLACK' -Message "*Nightly Restore:* Successfully Downloaded [ *{0}* ], Time Taken: *{1}* Minutes"`
# -Arguments $FileToDownload.FileName, $([Math]::Round($Script:DownloadTime.TotalMinutes, 2))
# return $True
}
elseif ($DownloadTimeout)
{
Write-Log -Level 'SLACK' -Message "*Nightly Restore:* Download timeout after two attempts of [ {0} ]. ``ERROR:`` *{1}*"`
-Arguments $FileToDownload.FileName, $DownloadError[0].Exception.Message
return $False
}
else
{
Write-Log -Level 'SLACK' -Message "*Nightly Restore:* There was an error downloading [ {0} ]. ``ERROR:`` *{1}*"`
-Arguments $FileToDownload.FileName, $DownloadError[0].Exception.Message
return $False
}
}
}
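Since the script already relies on an MD5 check to detect corruption, one way to verify a download directly in PowerShell is to compare the local file's MD5 against the Content-MD5 property stored on the blob. This is only a sketch: it assumes the blob was uploaded with a Content-MD5 value, reuses the variable names from the function above, and the helper name `Test-DownloadIntegrity` is illustrative, not part of the real script.

```powershell
# Illustrative helper: compares a local file's MD5 with the blob's stored Content-MD5.
# Assumes the blob has a ContentMD5 property (populated at upload time).
function Test-DownloadIntegrity
{
    param ($FileToDownload)

    $Blob = Get-AzStorageBlob -Container $StorageContainerName `
        -Blob $FileToDownload.FileName -Context $StorageContext

    # The blob's Content-MD5 is stored as a Base64 string
    $BlobMd5 = $Blob.ICloudBlob.Properties.ContentMD5

    # Get-FileHash returns hex, so convert it to Base64 for comparison
    $LocalPath = Join-Path $DownloadPath $FileToDownload.FileName
    $HexHash   = (Get-FileHash -Path $LocalPath -Algorithm MD5).Hash
    $Bytes     = [byte[]] -split ($HexHash -replace '..', '0x$& ')
    $LocalMd5  = [Convert]::ToBase64String($Bytes)

    return ($BlobMd5 -eq $LocalMd5)
}
```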
Will follow up by mail, and update the issue later.
Discussed with @paulmrozowski over mail, so we will close the issue now.
Description
We are using Get-AzStorageBlobContent in a PowerShell script. It worked consistently well for a while, but now that one of the blobs we download is over approximately 95GB, the downloaded file is corrupt: from around offset 163300000 (hex) onward, the file is suddenly filled with 00 values. We don't see any errors in $DownloadError after it completes; as far as the cmdlet is concerned, it was successful.
If we manually download the file using AzCopy instead, it downloads successfully.
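For comparison, the manual AzCopy (v10) download was along these lines; the account, container, blob, and SAS token below are placeholders, not the real values:

```powershell
# AzCopy v10 checks the stored Content-MD5 by default when downloading
# (--check-md5 FailIfDifferent), so a corrupt download fails loudly.
azcopy copy "https://<account>.blob.core.windows.net/<container>/<blob>?<SAS>" "D:\Downloads\backup.bak"
```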
Steps to reproduce
Environment data
Module versions
Debug output
This didn't show anything beyond "Yep, starting 20 remote calls", then "Finished 20 remote calls", along with an operation ID. No errors.
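If it helps, more detail can be captured by rerunning the cmdlet with the verbose and debug streams redirected to a file (streams 4 and 5 in PowerShell's redirection syntax). A minimal sketch, reusing the variable names from the script above:

```powershell
# In Windows PowerShell, passing -Debug prompts at each message; setting the
# preference variables instead emits the messages non-interactively.
$DebugPreference   = 'Continue'
$VerbosePreference = 'Continue'
Get-AzStorageBlobContent -Container $StorageContainerName `
    -Blob $FileToDownload.FileName -Destination $DownloadPath `
    -Context $StorageContext 4>&1 5>&1 |
    Out-File -FilePath .\download-debug.log
```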
Error output
This is not generating any errors.