Which version of the AzCopy was used?
Note: The version is visible when running AzCopy without any argument
10.23.0
10.24.0
10.25.0-Preview-1
Which platform are you using? (ex: Windows, Mac, Linux)
Linux: 6.5.0-1017-azure #17~22.04.1-Ubuntu SMP Sat Mar 9 10:04:07 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
What command did you run?
Note: Please remove the SAS to avoid exposing your credentials. If you cannot remember the exact command, please retrieve it from the beginning of the log file.
azure-storage-azcopy jobs show <jobID> --with-status=Failed
azure-storage-azcopy jobs show <jobID>
What problem was encountered?
Out of memory kill from the OS
How can we reproduce the problem in the simplest way?
Run the above commands with any of the above AzCopy versions on an Ubuntu VM against a large completed job (mine had 225 million files).
Have you found a mitigation/solution?
The only workaround I have for inspecting errors is to grep the job's logs for COPYFAILED and redirect the matches to a separate file for further examination.
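A minimal sketch of that workaround, not the exact command used in this report. The function name `extract_failed` is hypothetical, and it assumes the default AzCopy log directory (`$HOME/.azcopy`) and log files whose names start with the job ID. Because `grep` streams the file line by line, its memory use stays flat regardless of how many transfers the job had.

```shell
# Hypothetical helper: collect COPYFAILED lines from a job's log(s).
# Assumptions: logs live in $HOME/.azcopy unless a directory is passed
# as the third argument, and log file names start with the job ID.
extract_failed() {
    job_id=$1
    out=$2
    log_dir=${3:-"$HOME/.azcopy"}
    # -h drops the file-name prefix if more than one log file matches.
    grep -h 'COPYFAILED' "$log_dir/$job_id"*.log > "$out"
}
```

Usage would look like `extract_failed <jobID> failed-transfers.txt`, after which the output file can be inspected with `less` or `wc -l`.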
I noticed that when running azure-storage-azcopy jobs show <jobID> --with-status=Failed for a large job (~370 TB over 225 million files), the command exits with status 137 and prints Killed to stderr. Since 137 is 128 + 9, this looks like the kernel's out-of-memory handling sending SIGKILL (signal 9) to the azcopy process.
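For anyone checking the same symptom, here is a small sketch of how a shell exit status above 128 maps back to a signal number. The function name `describe_exit` is hypothetical. It only decodes the status and cannot tell why the signal was sent.

```shell
# Hypothetical helper: decode a shell exit status.
# Statuses above 128 mean "terminated by signal (status - 128)".
describe_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l N prints the name of signal N, e.g. KILL for 9.
        echo "killed by SIG$(kill -l "$((status - 128))")"
    else
        echo "exited with status $status"
    fi
}
```

Calling `describe_exit $?` right after the azcopy command would report `killed by SIGKILL` for status 137. To confirm that the OOM killer was responsible, the kernel log (for example via `dmesg`) is the place to look.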
Is this a known bug?
Some data
I captured some really crude logs with free on an Ubuntu 22.04 ARM64 VM in Azure running nothing but azure-storage-azcopy jobs show <jobID> --with-status=Failed in a tmux session and saw that system RAM usage grows monotonically until the OS kills azcopy (haven't correlated it fully with azcopy's invocation, but azcopy definitely gets killed before my memory sample collection is complete).
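A rough sketch of fixed-length memory sampling along these lines. It is not the script used for the logs in this report: it reads `MemAvailable` from `/proc/meminfo` rather than parsing `free` output, and the function name `sample_memory` and its arguments are hypothetical.

```shell
# Hypothetical helper: append COUNT samples of "<epoch seconds> <MemAvailable kB>"
# to OUT, sleeping INTERVAL seconds between samples.
# Assumption: Linux with /proc/meminfo exposing a MemAvailable field.
sample_memory() {
    count=$1
    interval=$2
    out=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
        printf '%s %s\n' "$(date +%s)" "$avail_kb" >> "$out"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Running `sample_memory 600 1 mem.log &` in one tmux pane before starting the azcopy command in another would give roughly ten minutes of one-second samples. As in the report, the sampler runs for a fixed duration, so it may still be collecting after azcopy has been killed.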
I've reproduced this with various combinations of Go and AzCopy versions:
I also captured a single free sample with azcopy 10.25.0-Preview-1 and Go 1.22.2 just running azure-storage-azcopy jobs show <jobID>, and that also shows a monotonic memory increase, but the azcopy command completes before it runs out of memory: azcopy-10.25.0-Preview-1-go1.22.2-linux-arm64-summary-memprofile.log
Here's how the system memory usage for each of these scenarios looks when plotted together: