Azure / azure-storage-azcopy

The new Azure Storage data transfer utility - AzCopy v10
MIT License

azcopy from DataLake Gen2 to v1 Storage Account fails because blob access tier is not supported (log output also gives an incorrect 403 Authorization Failure response error) #1616

Closed: jlester-msft closed this issue 2 years ago

jlester-msft commented 2 years ago

Which version of the AzCopy was used?

azcopy version 10.13.0

Which platform are you using? (ex: Windows, Mac, Linux)

Windows PowerShell

What command did you run?

$datalake_sas_token="<SAS token from https://docs.microsoft.com/en-us/azure/open-datasets/dataset-genomics-data-lake>"
$wdl_sas_token="<storage account sas token with all permissions>"
.\azcopy copy "https://dataset1000genomes.blob.core.windows.net/dataset/data_collections/1000_genomes_project/data/ACB/HG01879/exome_alignment/HG01879.alt_bwamem_GRCh38DH.20150826.ACB.exome.cram?$datalake_sas_token" "https://wdltestcf6818421e7f.blob.core.windows.net/inputs?$wdl_sas_token"

What problem was encountered?

AzCopy starts the copy (this is a 6 GB file) with a warning: "Failed to create one or more destination container(s). Your transfers may still succeed if the container already exists". It then attempts the copy and fails. The log output shows a "403 This request is not authorized to perform this operation." response error at the top, which is confusing because the same operation works when copying from a different source such as the local file system, so it did not seem like an authorization error.

Checking further in the log file reveals the true error at the bottom: "400 Blob access tier is not supported on this storage account type.", which appears to be what is preventing the transfer. azcopy is setting "X-Ms-Access-Tier: [Hot]", and the response it gets back indicates that the destination is a v1 Storage Account, which does not support access tiers.

There's also a more verbose error message with the true failure: 400 Blob access tier is not supported on this storage account type.. When Committing block list

2021/11/08 21:23:14 ERR: [P#0-T#0] COPYFAILED: https://dataset1000genomes.blob.core.windows.net/dataset/data_collections/1000_genomes_project/data/ACB/HG01879/exome_alignment/HG01879.alt_bwamem_GRCh38DH.20150826.ACB.exome.cram?si=prod&sig=-REDACTED-&sr=c&sv=2019-10-10 : 400 : 400 Blob access tier is not supported on this storage account type.. When Committing block list. X-Ms-Request-Id: 959c29be-101e-0018-66e6-d43f99000000

If you run the same command with --block-blob-tier "None" --page-blob-tier "None", azcopy still sets the access tier to Hot and the transfer still fails. So there seems to be a bug where azcopy does not realize that the destination is a v1 Storage Account that doesn't support access tiers, and/or azcopy does not honor the specified --block-blob-tier/--page-blob-tier flags.
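For reference, the retried command with the tier flags would look like this (same source, destination, and SAS variables as above); it fails in the same way:

.\azcopy copy "https://dataset1000genomes.blob.core.windows.net/dataset/data_collections/1000_genomes_project/data/ACB/HG01879/exome_alignment/HG01879.alt_bwamem_GRCh38DH.20150826.ACB.exome.cram?$datalake_sas_token" "https://wdltestcf6818421e7f.blob.core.windows.net/inputs?$wdl_sas_token" --block-blob-tier "None" --page-blob-tier "None"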

Console output:

INFO: Scanning...
INFO: Failed to create one or more destination container(s). Your transfers may still succeed if the container already exists.
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support

Job 2a578b5e-d116-4c41-485c-40284e5dbddc has started
Log file is located at: C:\Users\jlester\.azcopy\2a578b5e-d116-4c41-485c-40284e5dbddc.log

100.0 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total,

Job 19468e1a-c3a7-004b-56b9-4ac705d37277 summary
Elapsed Time (Minutes): 0.3347
Number of File Transfers: 1
Number of Folder Property Transfers: 0
Total Number of Transfers: 1
Number of Transfers Completed: 0
Number of Transfers Failed: 1
Number of Transfers Skipped: 0
TotalBytesTransferred: 0
Final Job Status: Failed

First error message inside the log file:

2021/11/08 21:22:54 WARN: failed to initialize destination container inputs; the transfer will continue (but be wary it may fail): -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, /home/vsts/go/pkg/mod/github.com/!azure/azure-storage-blob-go@v0.13.1-0.20210823171415-e7932f52ad61/azblob/zc_storage_error.go:42
===== RESPONSE ERROR (ServiceCode=AuthorizationFailure) =====
Description=This request is not authorized to perform this operation.
RequestId:5a5f4965-b01e-0073-1ce6-d4b86d000000
Time:2021-11-08T21:22:54.7298297Z, Details: 
   Code: AuthorizationFailure
   PUT https://wdltestcf6818421e7f.blob.core.windows.net/inputs?restype=container&se=2021-11-09t05%3A11%3A22z&sig=-REDACTED-&sp=racwdli&sr=c&st=2021-11-08t21%3A11%3A22z&sv=2020-08-04&timeout=180
   User-Agent: [AzCopy/10.13.0 Azure-Storage/0.14 (go1.16; Windows_NT)]
   X-Ms-Client-Request-Id: [32327337-ec1a-4b7e-425e-89132f82799f]
   X-Ms-Version: [2019-12-12]
   --------------------------------------------------------------------------------
   RESPONSE Status: 403 This request is not authorized to perform this operation.
   Content-Length: [246]
   Content-Type: [application/xml]
   Date: [Mon, 08 Nov 2021 21:22:53 GMT]
   Server: [Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0]
   X-Ms-Client-Request-Id: [32327337-ec1a-4b7e-425e-89132f82799f]
   X-Ms-Error-Code: [AuthorizationFailure]
   X-Ms-Request-Id: [5a5f4965-b01e-0073-1ce6-d4b86d000000]
   X-Ms-Version: [2019-12-12]

Log message at the end of the log:

2021/11/08 21:23:14 ==> REQUEST/RESPONSE (Try=1/222.1641ms, OpTime=223.1834ms) -- RESPONSE STATUS CODE ERROR
   PUT https://wdltestcf6818421e7f.blob.core.windows.net/inputs/HG01879.cram?comp=blocklist&se=2021-11-09t05%3A11%3A22z&sig=-REDACTED-&sp=racwdli&sr=c&st=2021-11-08t21%3A11%3A22z&sv=2020-08-04&timeout=901
   Content-Length: [51308]
   Content-Type: [application/xml]
   User-Agent: [AzCopy/10.13.0 Azure-Storage/0.14 (go1.16; Windows_NT)]
   X-Ms-Access-Tier: [Hot]
   X-Ms-Blob-Cache-Control: []
   X-Ms-Blob-Content-Disposition: []
   X-Ms-Blob-Content-Encoding: []
   X-Ms-Blob-Content-Language: []
   X-Ms-Blob-Content-Type: [application/octet-stream]
   X-Ms-Client-Request-Id: [d6a0d1ad-05f9-41b8-4f6b-12f34a3da14a]
   X-Ms-Meta-Asperatransfer: [true]
   X-Ms-Meta-Modified: [2015-11-10 00:00:00Z]
   X-Ms-Version: [2019-12-12]
   --------------------------------------------------------------------------------
   RESPONSE Status: 400 Blob access tier is not supported on this storage account type.
   Content-Length: [272]
   Content-Type: [application/xml]
   Date: [Mon, 08 Nov 2021 21:23:14 GMT]
   Server: [Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0]
   X-Ms-Client-Request-Id: [d6a0d1ad-05f9-41b8-4f6b-12f34a3da14a]
   X-Ms-Error-Code: [BlobAccessTierNotSupportedForAccountType]
   X-Ms-Request-Id: [959c29be-101e-0018-66e6-d43f99000000]
   X-Ms-Version: [2019-12-12]
Response Details: <Code>BlobAccessTierNotSupportedForAccountType</Code><Message>Blob access tier is not supported on this storage account type. </Message>

2021/11/08 21:23:14 ERR: [P#0-T#0] COPYFAILED: https://dataset1000genomes.blob.core.windows.net/dataset/data_collections/1000_genomes_project/data/ACB/HG01879/exome_alignment/HG01879.alt_bwamem_GRCh38DH.20150826.ACB.exome.cram?si=prod&sig=-REDACTED-&sr=c&sv=2019-10-10 : 400 : 400 Blob access tier is not supported on this storage account type.. When Committing block list. X-Ms-Request-Id: 959c29be-101e-0018-66e6-d43f99000000

How can we reproduce the problem in the simplest way?

Source:

Destination:

Log files:

2a578b5e-d116-4c41-485c-40284e5dbddc-scanning.log 2a578b5e-d116-4c41-485c-40284e5dbddc.log

Have you found a mitigation/solution?

Only by copying the file locally (DataLake Gen2 to local file system) and then uploading it from the local drive to the Storage Account (local file system to v1 Storage Account).
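A minimal sketch of that two-hop workaround, using the same SAS variables as above and a hypothetical local staging path C:\temp:

# Hop 1: DataLake Gen2 -> local file system (C:\temp is a placeholder path)
.\azcopy copy "https://dataset1000genomes.blob.core.windows.net/dataset/data_collections/1000_genomes_project/data/ACB/HG01879/exome_alignment/HG01879.alt_bwamem_GRCh38DH.20150826.ACB.exome.cram?$datalake_sas_token" "C:\temp\HG01879.alt_bwamem_GRCh38DH.20150826.ACB.exome.cram"
# Hop 2: local file system -> v1 Storage Account
.\azcopy copy "C:\temp\HG01879.alt_bwamem_GRCh38DH.20150826.ACB.exome.cram" "https://wdltestcf6818421e7f.blob.core.windows.net/inputs?$wdl_sas_token"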

siminsavani-msft commented 2 years ago

Hi @jlester-msft ! I was unable to repro your scenario. I tried uploading from ADLS gen 2 (with hierarchical namespace enabled) to a v1 storage account and it was successful for me. From the logs it seems that the tier level is preventing the blobs from being copied. Do you mind trying again and setting --s2s-preserve-access-tier to false? Please let me know how it goes!

jlester-msft commented 2 years ago

Hi @siminsavani-msft, adding "--s2s-preserve-access-tier=false" allowed me to transfer successfully from the DataLake to the Storage Account. The "Failed to create one or more destination..." message still appears, but the transfer still completes.

PS E:\projects> .\azcopy copy "https://dataset1000genomes.blob.core.windows.net/dataset/data_collections/1000_genomes_project/data/ACB/HG01879/exome_alignment/HG01879.alt_bwamem_GRCh38DH.20150826.ACB.exome.cram?$datalake_sas_token" "https://wdltestcf6818421e7f.blob.core.windows.net/inputs/HG01879.alt_bwamem_GRCh38DH.20150826.ACB.exome.cram?$wdl_sas_token" --s2s-preserve-access-tier=false
INFO: Scanning...
INFO: Failed to create one or more destination container(s). Your transfers may still succeed if the container already exists.
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support

Job 47dc852c-22c8-604a-421b-ff71fe427770 has started
Log file is located at: C:\Users\jlester\.azcopy\47dc852c-22c8-604a-421b-ff71fe427770.log

99.9 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total,

Job 47dc852c-22c8-604a-421b-ff71fe427770 summary
Elapsed Time (Minutes): 0.3344
Number of File Transfers: 1
Number of Folder Property Transfers: 0
Total Number of Transfers: 1
Number of Transfers Completed: 1
Number of Transfers Failed: 0
Number of Transfers Skipped: 0
TotalBytesTransferred: 6611021194
Final Job Status: Completed

When you were unable to repro the issue, did the transfer work successfully without adding "--s2s-preserve-access-tier=false"?

jlester-msft commented 2 years ago

After digging into this a bit more, the issue comes down to the fact that the specific destination Storage Account being used here, wdltestcf6818421e7f, is a v1 Storage Account with its access tier disabled. Anyone with the same issue can check their "Default access tier" in the Azure Portal: open the Storage Account's Overview page and see whether a "Default access tier" is listed under "Blob service".
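The same check can also be done from PowerShell; a minimal sketch, assuming the Az.Storage module is installed and with a placeholder resource group name:

# Hypothetical check: a GPv1 ("Storage") account with tiering disabled shows an empty AccessTier
Get-AzStorageAccount -ResourceGroupName "<resource-group>" -Name "wdltestcf6818421e7f" | Select-Object Kind, AccessTier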

If it shows a default of 'Hot', you shouldn't encounter this issue. If no "Default access tier" is listed, you have three options: 1) use --s2s-preserve-access-tier=false, 2) upgrade to a v2 Storage Account (a sketch follows below), or 3) see whether your v1 Storage Account can be changed so its access tier is no longer disabled, i.e. defaults to Hot.

Option 1) is likely the best solution.
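For option 2), a minimal sketch of the in-place upgrade using the Az.Storage PowerShell module (the resource group name is a placeholder; the upgrade changes the account kind permanently):

# Hypothetical sketch: upgrade the GPv1 account to GPv2 and give it a Hot default access tier
Set-AzStorageAccount -ResourceGroupName "<resource-group>" -Name "wdltestcf6818421e7f" -UpgradeToStorageV2 -AccessTier Hot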

Thanks!