Azure / azure-storage-azcopy

The new Azure Storage data transfer utility - AzCopy v10

AZCopy Failing Due to Consuming All Memory #2355

Open Improving-Lee opened 1 year ago

Improving-Lee commented 1 year ago

Which version of the AzCopy was used?

10.20.1

Which platform are you using? (ex: Windows, Mac, Linux)

Windows

What command did you run?

azcopy sync "https://fileshare.file.core.windows.net/share?&< SAS >" "https://fileshare2.file.core.windows.net/share&< SAS >" --preserve-smb-info --preserve-smb-permissions --recursive --delete-destination=TRUE

What problem was encountered?

I have an Azure VM with 4 cores and 16 GB of RAM. I've set the memory buffer environment variable using $env:AZCOPY_BUFFER_GB=0.5. My azcopy sync job is processing around 15M small files, but the job always fails with the error "fatal error: out of memory." Looking at the VM insights, I can see the VM runs out of all available memory while processor utilization sits at 100%.

If I am reading the documentation correctly, the memory buffer variable sets a total job memory limit, not a per-core limit. Even so, I would not expect this job to consume more than about 2 GB of RAM, yet it eventually consumes all 16 GB and fails. The screenshot confirms the environment variable is set in the same PowerShell session in which azcopy is being run.
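For reference, a minimal sketch of setting and checking the cap from the same PowerShell session (assumes azcopy is on PATH; the 0.5 value is the one from this report):

# Cap AzCopy's in-flight buffer memory for this session
$env:AZCOPY_BUFFER_GB = "0.5"

# Confirm azcopy sees the variable before launching the sync job
azcopy env | Select-String "AZCOPY_BUFFER_GB"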

How can we reproduce the problem in the simplest way?

Copying files between two Azure File Shares in different regions.

Have you found a mitigation/solution?

No solution. I originally had the memory buffer variable set to 4 GB, and while it was set I observed memory consumption increase at a much faster rate. With it set to 0.5 GB, memory consumption grows much more slowly, but azcopy still consumes all available memory until it fails.

[Screenshot 2023-08-30 165232: environment variable set in the PowerShell session]

ttourougui commented 7 months ago

Check this. There are multiple environment variables that you can adjust to prevent OOM during a massive upload to an Azure storage account using the azcopy CLI.

For example:

set AZCOPY_CONCURRENCY_VALUE=4
set AZCOPY_CONCURRENT_FILES=50
set AZCOPY_BUFFER_GB=4
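Note that set is cmd.exe syntax; since the original poster is running azcopy from PowerShell, the equivalents there (same illustrative values) would be:

$env:AZCOPY_CONCURRENCY_VALUE = "4"
$env:AZCOPY_CONCURRENT_FILES = "50"
$env:AZCOPY_BUFFER_GB = "4"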

lchircop commented 1 month ago

Having the exact same problem. I've tried tuning the command with different values, as @ttourougui mentioned, but the sync job still fails once it reaches 100% of memory.

For some reason, when there is a large number of files to be synced, most of the data is held in memory until the job fails. I think this only happens while azcopy compares the files at the source and destination.
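If enumeration is indeed the bottleneck, one possible workaround is to split the sync into smaller scopes so each pass enumerates fewer entries. A PowerShell sketch, assuming the share's top-level directories are known (the directory names and <SAS> tokens below are placeholders, not from the original report):

# Sync one top-level directory at a time so each enumeration pass
# holds fewer entries in memory (directory names are illustrative)
$dirs = @("dir1", "dir2", "dir3")
foreach ($d in $dirs) {
    azcopy sync "https://fileshare.file.core.windows.net/share/$d?<SAS>" `
                "https://fileshare2.file.core.windows.net/share/$d?<SAS>" `
                --preserve-smb-info --preserve-smb-permissions --recursive --delete-destination=TRUE
}

One caveat with this approach: --delete-destination only applies within each directory scope, so destination directories that no longer exist at the source would need a separate cleanup pass.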