Azure / azure-storage-azcopy

The new Azure Storage data transfer utility - AzCopy v10
MIT License

Summarization shows incorrect information about Skips and Errors #2221

Open oliverwolfat opened 1 year ago

oliverwolfat commented 1 year ago

Which version of the AzCopy was used?

Note: The version is visible when running AzCopy without any argument

10.17.0

Which platform are you using? (ex: Windows, Mac, Linux)

Linux/CentOS 7

What command did you run?

Note: Please remove the SAS to avoid exposing your credentials. If you cannot remember the exact command, please retrieve it from the beginning of the log file.

azcopy copy 'https://s3.amazonaws.com/hosting-linde-prod/' 'https://chcloudlindeprod.blob.core.windows.net/dambucket/' --recursive

What problem was encountered?

Some objects in S3 have names that are not valid blob names and are not transferred. The transfer summary, however, reports no skipped or failed copies, which suggests that everything went fine.

Log Sample

2023/04/25 15:17:32 AzcopyVersion  10.17.0
2023/04/25 15:17:32 OS-Environment  linux
2023/04/25 15:17:32 OS-Architecture  amd64
2023/04/25 15:17:32 Log times are in UTC. Local time is 25 Apr 2023 17:17:32
2023/04/25 15:35:14 Skipping S3 object s_1/0/0/110/28264/QG4WKVFPB8J5EMN3SDQO_Business Card Chinese with Project Address_R2., as it is not a valid Blob name. Rename the object and retry the transfer
2023/04/25 15:35:14 Skipping S3 object s_1/0/0/110/28315/NYKI335VR2G11U0AK8RE_HPO Campaign Challenge_R3., as it is not a valid Blob name. Rename the object and retry the transfer
2023/04/25 15:35:14 Skipping S3 object s_1/0/0/118/30418/XEG9J6UQRGJA0BRGE4O3_Nametags_R2., as it is not a valid Blob name. Rename the object and retry the transfer
2023/04/25 15:35:26 Skipping S3 object s_1/0/0/173/44346/BPS10XXV26DD87IHYAKK_A4 Envelope_R2., as it is not a valid Blob name. Rename the object and retry the transfer
2023/04/25 15:35:26 Skipping S3 object s_1/0/0/173/44346/XWUT0ZIRPGV6WQUCUFZ4_A4 Envelope_R1., as it is not a valid Blob name. Rename the object and retry the transfer
2023/04/25 15:35:27 Skipping S3 object s_1/0/0/175/45035/UMDNTYO7JZZGNS2S8OB1_Data sheet Oxygen Enrichment for Claus_R2., as it is not a valid Blob name. Rename the object and retry the transfer

Summary Sample

[2023/04/25 17:23:40] Transfer summary:
-----------------
Total files transferred: 389531
Transfer successfully:   389531
Transfer skipped:        0
Transfer failed:         0
Elapsed time:            00.02:06:32

How can we reproduce the problem in the simplest way?

  1. Create an S3 bucket in AWS containing an object whose name ends with a dot (.).
  2. Create an Azure Blob Container
  3. Try to copy the blob with azcopy

Have you found a mitigation/solution?

Not yet. A possible solution would be to use Azure Data Factory instead.
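
Until the summary is corrected, one way to spot the discrepancy is to count the skip messages in the AzCopy log directly. The following is a minimal sketch that assumes the message format shown in the log sample above; the function name `find_skipped_objects` is hypothetical and not part of AzCopy.

```python
import re

# Assumption: skip messages follow the format shown in the log sample above.
SKIP_PATTERN = re.compile(
    r"Skipping S3 object (?P<key>.+), as it is not a valid Blob name"
)


def find_skipped_objects(log_lines):
    """Return the S3 object keys that the log reports as skipped."""
    skipped = []
    for line in log_lines:
        match = SKIP_PATTERN.search(line)
        if match:
            skipped.append(match.group("key"))
    return skipped


if __name__ == "__main__":
    # Two lines taken from the log sample in this issue.
    sample = [
        "2023/04/25 15:17:32 AzcopyVersion  10.17.0",
        "2023/04/25 15:35:14 Skipping S3 object "
        "s_1/0/0/118/30418/XEG9J6UQRGJA0BRGE4O3_Nametags_R2., "
        "as it is not a valid Blob name. Rename the object and retry the transfer",
    ]
    for key in find_skipped_objects(sample):
        print(key)
```

If the returned list is non-empty while the summary shows `Transfer skipped: 0`, the summary does not reflect what was actually skipped.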

siminsavani-msft commented 1 year ago

Hi @oliverwolfat ! I was able to repro this and I see the issue here. I will report this as a bug for now and will relay to the team.

Just FYI, the files are reported as having invalid names because of the '.' at the end of the object name, and these files are skipped. A name ending in a dot is an invalid blob name, as described in the note in this documentation: https://learn.microsoft.com/en-us/rest/api/storageservices/naming-and-referencing-containers--blobs--and-metadata#blob-names.
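
To find affected objects before starting a transfer, a pre-check along these lines could be run over a listing of S3 object keys. This is only a sketch: it tests solely for the trailing dot discussed in this thread, the function name `keys_ending_with_dot` is hypothetical, and AzCopy's own validation may apply additional rules.

```python
def keys_ending_with_dot(keys):
    """Return the object keys whose names end with a dot (.).

    Assumption: only the trailing-dot rule from this thread is checked;
    other blob naming restrictions are not covered here.
    """
    return [key for key in keys if key.endswith(".")]


if __name__ == "__main__":
    # Example keys; the first is taken from the log sample in this issue,
    # the second is a made-up key for contrast.
    keys = [
        "s_1/0/0/173/44346/BPS10XXV26DD87IHYAKK_A4 Envelope_R2.",
        "s_1/0/0/173/44346/example_valid_name.pdf",
    ]
    for key in keys_ending_with_dot(keys):
        print(key)
```

Objects flagged this way would need to be renamed in S3 before the copy, as the log message suggests.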

gapra-msft commented 10 months ago

@siminsavani-msft this also might be a good one to pick up!