Azure / azure-storage-azcopy

The new Azure Storage data transfer utility - AzCopy v10

AzCopy slow performance on scanning and transferring 2-3 million folder properties #1232

Open ankur8 opened 3 years ago

ankur8 commented 3 years ago

Which version of the AzCopy was used?

10.6.0

Note: The version is visible when running AzCopy without any argument


Which platform are you using? (ex: Windows, Mac, Linux)

Linux

What command did you run?

azcopy copy /mnt/app1 "https://filesharename.file.core.windows.net/nameofshare/images-v2?"

Note: Please remove the SAS to avoid exposing your credentials. If you cannot remember the exact command, please retrieve it from the beginning of the log file.

What problem was encountered?

Scanning and transferring the folder properties is taking more than 3-4 days. We tweaked the environment variables below, but saw little improvement:

export AZCOPY_CONCURRENT_FILES=512

export AZCOPY_CONCURRENT_SCAN=512

export AZCOPY_PARALLEL_STAT_FILES=true

How can we reproduce the problem in the simplest way?

Have a source folder containing 2 million folders (each with some files) and transfer it from the source to an Azure file share.

Have you found a mitigation/solution?

Working on a workaround outside of AzCopy, as we tried tuning different environment variables but did not see a performance jump.

MihalyTorok commented 3 years ago

I had the same problem with version 10.6. I tried version 10.8 now and the issue is still in place. I want to copy the updated or new blobs from a source blob storage container to a destination container in another storage account in the same subscription. I need to check around 300,000 blobs in different folders, totaling 130 GB; of these, probably 10,000 are different or new and need to be copied. azcopy.exe ran in the scanning step for 45 minutes, after which the authentication expired; in 30 minutes it had scanned only the destination. I tried the same operation with an earlier version of AzCopy (10.4.3) and it finished in 10 minutes. Please check the scanning functionality.
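For reference, the operation here is essentially a container-to-container sync; the invocation was presumably something along these lines (account and container names are placeholders, SAS tokens omitted):

azcopy sync "https://sourceaccount.blob.core.windows.net/container?<SAS>" "https://destaccount.blob.core.windows.net/container?<SAS>" --recursive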

MihalyTorok commented 3 years ago

I tested the version 10.9 and the issue still exists.

worldspawn commented 3 years ago

I have also been totally disappointed with the performance of this tool. My requirements are even less fancy: I just want to copy an entire container to another empty container (on a different account). It spends 98% of the time scanning and only 2% of the time copying anything, and progress absolutely snails along. I think the core problem with copying in general is that the Blob storage REST API has no batch copy endpoint; you must perform a single HTTP request per copy, which I imagine is what AzCopy has to do. Still, it seems way too slow.

I've been playing around with rolling my own:

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using System.Xml;

class Program
{
    static async Task Main(string[] args)
    {
        var sourceSas = "itsasecret";
        var destinationSas = "itsasecret";

        var listClient = new HttpClient();
        var stopWatch = Stopwatch.StartNew();
        string nextMarker = null;

        var files = new ConcurrentQueue<string>();

        // Producer: page through the source container via List Blobs
        // (comp=list) and enqueue every blob name.
        var scanTask = Task.Run(async () =>
        {
            do
            {
                var request = new HttpRequestMessage(HttpMethod.Get,
                    $"https://source.blob.core.windows.net/grain-blobs?restype=container&marker={nextMarker}&comp=list&include=metadata&{sourceSas}");

                nextMarker = null;
                var response = await listClient.SendAsync(request);
                using (var reader = XmlReader.Create(await response.Content.ReadAsStreamAsync(),
                    new XmlReaderSettings { Async = true }))
                {
                    while (await reader.ReadAsync())
                    {
                        if (reader.Name == "Name" && reader.NodeType == XmlNodeType.Element)
                        {
                            await reader.ReadAsync();
                            files.Enqueue(reader.Value);
                        }

                        if (reader.Name == "NextMarker" && reader.NodeType == XmlNodeType.Element && !reader.IsEmptyElement)
                        {
                            await reader.ReadAsync();
                            nextMarker = reader.Value;
                        }
                    }
                }

                // Back-pressure: pause listing while the copy side catches up.
                while (files.Count > 6000)
                {
                    await Task.Delay(500);
                }
            } while (!string.IsNullOrEmpty(nextMarker)); // empty NextMarker means last page
        });

        var socketsHandler = new SocketsHttpHandler
        {
            PooledConnectionLifetime = TimeSpan.FromMinutes(10),
            PooledConnectionIdleTimeout = TimeSpan.FromMinutes(5),
            MaxConnectionsPerServer = 100,
            EnableMultipleHttp2Connections = true
        };
        // One client over the pooled handler; a client per request is unnecessary.
        var copyClient = new HttpClient(socketsHandler, disposeHandler: false);

        var copyRequests = new List<Task>();
        var filesProcessed = 0;
        var filesFailed = 0;
        var filesCompleted = 0;
        var activeCopyRequests = 0;

        // Consumer: issue one server-side Copy Blob (x-ms-copy-source) per name.
        // Keep draining until the scanner has finished AND the queue is empty,
        // otherwise names enqueued by the last page would be dropped.
        while (!scanTask.IsCompleted || !files.IsEmpty)
        {
            if (files.TryDequeue(out var file))
            {
                var fileMessage = new HttpRequestMessage(HttpMethod.Put,
                    new Uri($"https://destination.blob.core.windows.net/grain-blobs/{file}?{destinationSas}"));
                fileMessage.Headers.Add("x-ms-copy-source", $"https://source.blob.core.windows.net/grain-blobs/{file}?{sourceSas}");
                fileMessage.Headers.Add("x-ms-date", DateTime.UtcNow.ToString("R"));

                var fileTask = copyClient.SendAsync(fileMessage);
                copyRequests.Add(fileTask.ContinueWith(task =>
                {
                    Interlocked.Decrement(ref activeCopyRequests);
                    // SendAsync only faults on transport errors, so also check the HTTP status.
                    if (task.IsCompletedSuccessfully && task.Result.IsSuccessStatusCode)
                    {
                        Interlocked.Increment(ref filesCompleted);
                    }
                    else
                    {
                        Console.WriteLine(task.Exception?.Message);
                        Interlocked.Increment(ref filesFailed);
                    }
                }));
                Interlocked.Increment(ref activeCopyRequests);
                filesProcessed++;

                if (filesProcessed % 100 == 0)
                {
                    Console.WriteLine($"{filesProcessed}/{filesFailed} - Pending {files.Count} - {stopWatch.Elapsed}");
                }

                // Throttle: once the connection pool is saturated, wait for the
                // in-flight batch to finish before dispatching more.
                if (activeCopyRequests >= socketsHandler.MaxConnectionsPerServer)
                {
                    await Task.WhenAll(copyRequests);
                    copyRequests.Clear();
                }
            }
            else
            {
                await Task.Delay(50);
            }
        }

        // Drain whatever is still in flight.
        await Task.WhenAll(copyRequests);
        await scanTask;
    }
}

It's throttled to 100 concurrent copy requests, and the scanning continues while the copy requests are being made. Running on my home PC, this is how it's going so far (in my case it's many, many small files):

166800/0 - Pending 7804 - 00:14:29.2234237
166900/0 - Pending 7704 - 00:14:29.7278800
167000/0 - Pending 7604 - 00:14:30.2788035
167100/0 - Pending 7504 - 00:14:30.7121184
167200/0 - Pending 7404 - 00:14:31.1373527
167300/0 - Pending 7304 - 00:14:35.3622993
167400/0 - Pending 7204 - 00:14:35.9175653

So it has sent 167,400 copy requests in 14m 35s, roughly 190 requests per second. Note that does not mean 167,400 copy requests have finished. AzCopy did about 600k in 6 hours (under 30 per second) before I cancelled it.

@MihalyTorok this might work for you in conjunction with an If-Unmodified-Since condition. The list query does return the Last-Modified field per blob; I'm just not making use of it. You could adapt the code to store that with the blob URI and pass it with the copy request. In theory :D
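Something like this, sketched against the same placeholder account/container names as above (untested; Copy Blob also accepts x-ms-source-if-unmodified-since, x-ms-source-if-match, and x-ms-source-if-none-match conditions):

using System;
using System.Net.Http;

static class ConditionalCopy
{
    // Hypothetical sketch: make the server-side copy conditional, so the service
    // only performs it when the source blob changed after `lastSync` (e.g. the
    // timestamp of the previous run). An unchanged blob comes back as
    // 412 Precondition Failed and can simply be counted as skipped.
    public static HttpRequestMessage Build(string blob, string sourceSas, string destinationSas, DateTime lastSync)
    {
        var msg = new HttpRequestMessage(HttpMethod.Put,
            new Uri($"https://destination.blob.core.windows.net/grain-blobs/{blob}?{destinationSas}"));
        msg.Headers.Add("x-ms-copy-source", $"https://source.blob.core.windows.net/grain-blobs/{blob}?{sourceSas}");
        msg.Headers.Add("x-ms-date", DateTime.UtcNow.ToString("R"));
        // Conditional on the source: only copy blobs modified since the last run.
        msg.Headers.Add("x-ms-source-if-modified-since", lastSync.ToString("R"));
        return msg;
    }
}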

worldspawn commented 3 years ago

Cleaned up this code into two services and got my whole storage account migrated in under an hour.

https://github.com/DrawboardLtd/blobcopy/tree/master/src

ZaoDigital commented 1 year ago

Can we get Microsoft looking at this one again? This program is painfully slow going from S3 to Azure when a lot of items are involved; the self-rolled programs above work far better. I want to use AzCopy because in theory its functionality is everything I need, but in practice it's very inefficient.

trixomixolydian commented 7 months ago

I am encountering the same issue with 10.23.0 on both Windows and Linux. I have fiddled around with AzCopy's environment variables and none really seem to make any difference.

I need to copy 6 million files from Azure Files to Azure Blob. After testing with 100,000 on Windows, the job was still running over six hours later, with only 70,000 processed, or perhaps merely scanned.

At these rates, this tool is basically useless.

PS C:\Utilities\AzCopy> .\azcopy.exe copy "https://-REDACTED-.file.core.windows.net/expertadmin-ivr-01/conf/?se=2024-03-25t00%3A00%3A00z&sig=-REDACTED-&sp=rwdlacupiytfx&spr=https&srt=sco&ss=bfqt&st=2024-03-23t00%3A00%3A00z&sv=2022-11-02" "https://-REDACTED-.blob.core.windows.net/liveconnects-ivr-01?se=2024-03-25t00%3A25%3A14z&sig=-REDACTED-&sp=rwdlacupiytfx&spr=https&srt=sco&ss=bfqt&st=2024-03-23t00%3A00%3A00z&sv=2022-11-02" --recursive=true --log-level=error --check-length=false --list-of-files .\listoffiles\folders100000.txt --overwrite=false

INFO: Scanning...
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support

Job d668f885-8520-0b40-4164-6066cbdbe416 has started
Log file is located at: C:\Users\username\.azcopy\d668f885-8520-0b40-4164-6066cbdbe416.log

0.0 %, 0 Done, 0 Failed, 0 Pending, 70000 Skipped, 70000 Total (scanning...),

The job finally completed after nearly 24 hours, which is pretty slow for only 100,000 folders. This job is to reconcile what is in blob storage with what we had in an Azure file share that needed to be in place for on-premises access via Azure Storage Sync Services; we are aging the data out of the file share after confirming that it is in the blob container. Once the files are confirmed, we use AzCopy "rm" in a second job to remove them from the Azure file share. The second job takes nearly as much time as the first one. I would expect a delete-only job to be dramatically faster than a copy job. This utility needs nourishment.
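For reference, a sketch of what that removal job would look like (flags assumed to mirror the copy job above; SAS redacted, same file list):

PS C:\Utilities\AzCopy> .\azcopy.exe rm "https://-REDACTED-.file.core.windows.net/expertadmin-ivr-01/conf/?<SAS>" --recursive=true --log-level=error --list-of-files .\listoffiles\folders100000.txt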