Open WaitingForGuacamole opened 1 year ago
Hi there, @WaitingForGuacamole!
Thank you for reaching out.
Re: the concurrency value: we often recommend specifying AUTO rather than a specific value when performance is a concern. It looks like that was already the default here.
I would be curious to see the actual job log (not the scanning log) here. It looks like enumeration is completing, but the actual transfers are slowing down dramatically. This can happen for a wide variety of reasons, including exponential backoff, chunks getting stuck in some kind of waiting state, and so on.
Regards, Adele
Adele,
Thanks for responding! I’m going to run this again, with INFO level logging, and get you logs after it runs overnight.
Is there a way that I can get these to you without posting archives on Github?
Cheers, Steve
Hi @WaitingForGuacamole if you are still experiencing this issue, please reach out with logs to azcopydev AT microsoft.com
@gapra-msft @WaitingForGuacamole I myself have this issue as well. I think the problem is as follows:
We should have an option to first sync the directory structure and only then the files. Or something like setting a flag to always create the parent directory in advance.
What I have done as a workaround, which is pretty annoying, is to use robocopy to sync the directory structure and, after that is done, run azcopy again to sync the files as well.
But this means I have to mount the storage accounts in Windows, which I prefer not to do if possible.
Also, if you ran robocopy to sync everything, including the files, that would also take ages.
Here is the command I use for robocopy:
robocopy \\<source>.file.core.windows.net\<source_path> \\<destination>.file.core.windows.net\<destination_path> /MT:128 /e /xf *
After the folder structure has been created, azcopy runs very quickly to the end.
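The two-step workaround above can be sketched as follows. The UNC paths, hostnames, and SAS tokens are placeholders, and the azcopy invocation assumes SAS-authenticated share URLs; adjust to your own setup.

```shell
# Step 1: replicate only the directory tree with robocopy.
# /e = include empty subdirectories, /xf * = exclude all files, /MT:128 = 128 threads.
robocopy \\mysource.file.core.windows.net\myshare \\mydest.file.core.windows.net\myshare /MT:128 /e /xf *

# Step 2: with every parent directory already present, copy the files with azcopy.
# <SAS> is a placeholder for a real shared access signature.
azcopy copy "https://mysource.file.core.windows.net/myshare?<SAS>" "https://mydest.file.core.windows.net/myshare?<SAS>" --recursive
```

Step 1 requires the shares to be reachable over SMB (which is the mounting annoyance mentioned above); step 2 then never has to create a parent directory on demand.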
Here is a link to a previous issue with a statement that azcopy relies on request failures to trigger folder creation: https://github.com/Azure/azure-storage-azcopy/issues/2179#issuecomment-1528690473
Which version of the AzCopy was used?
10.18.1
Which platform are you using? (ex: Windows, Mac, Linux)
Windows 10 22H2
What command did you run?
What problem was encountered?
I'm copying a file share with 2.2 million TIFF images, each in its own folder.
azcopy copy
bogs down no matter what options I choose. Whether I leave the environment variables like AZCOPY_BUFFER_GB and AZCOPY_CONCURRENCY_VALUE at their defaults or change them, it does not seem to matter. The scan finds all of the files, and it even copies tens of thousands of files successfully, but after a couple of hours the rate at which it updates slows to virtually nothing: a few hundred files at every two-minute update. At the rate it's going, it'll take a really long time to complete.
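For reference, a minimal sketch of how those environment variables can be pinned explicitly before a run (PowerShell syntax; the values and placeholder URLs are illustrative, not recommendations):

```shell
# Example (PowerShell): set azcopy tuning variables for this session only.
# AUTO lets azcopy adjust concurrency itself; 4 GB is an illustrative buffer size.
$env:AZCOPY_CONCURRENCY_VALUE = "AUTO"
$env:AZCOPY_BUFFER_GB = "4"

# <source>, <dest>, <share>, and <SAS> are placeholders for real values.
azcopy copy "https://<source>.file.core.windows.net/<share>?<SAS>" "https://<dest>.file.core.windows.net/<share>?<SAS>" --recursive --log-level INFO
```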
My scanning log does have a number of errors stating:
despite these, it does seem to find the files.
I'm copying from a Premium File Share in East US to a Premium File Share in West US. I'd use account replicas, but that's not supported for this kind of storage account. Each has a private endpoint, and the logs suggest they are being used from the IPs that are being resolved.
Here's some of the console output (having hit return to get each update on its own line a few times):
How can we reproduce the problem in the simplest way?
Have you found a mitigation/solution?
No, I am wondering if I should just mount the shares in Linux and rsync them. Then again, it'll take a year and a day to enumerate all of those folders, so maybe there's no difference.
I'm willing to try changing the environment variables, but they're not particularly well documented. There is documentation, but some variables have defaults listed, some have a rationale for how they are calculated, and others just say they're used and should be increased (to what, I don't know) if necessary.