Azure / azure-storage-cpp

Microsoft Azure Storage Client Library for C++
http://azure.github.io/azure-storage-cpp
Apache License 2.0

Azure Blob storage throttling upload blobs #394

Open ajs97 opened 3 years ago

ajs97 commented 3 years ago

Hi, I am facing performance issues while uploading blobs to Azure Blob Storage using this SDK. I am uploading 64 MB blobs and experimenting with various values of parallelism_factor (4/8/16). When I upload around 1 GB of data with parallelism 8 or 16, I get around 110 MB/s, but when I increase the total to about 5 GB, the overall throughput drops to around 50 MB/s. Checking the intermediate throughput, I see that the initial few blobs reach 80-90 MB/s, but subsequent blobs drop to 40-50 MB/s, and sometimes as low as 20 MB/s.

Note that I am uploading these blobs sequentially.

Do you know what could cause throughput to vary with the total amount uploaded, and whether there is some configuration that would give better throughput for large uploads?

Note that for my use case it is important to upload data in 64 MB blobs, and the total amount of data uploaded will be in the tens of GBs, so I would like to optimize for this case. Thanks.
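For reference, the upload parallelism described above is controlled in this library through blob_request_options. Below is a minimal, untested sketch of how parallelism_factor and the block size can be set for a single 64 MB blob upload; the connection string, container name, and file path are placeholders, not values from this thread:

```cpp
// Sketch only: configure upload parallelism in azure-storage-cpp.
// <connection-string>, "mycontainer", "blob_0", and "data.bin" are placeholders.
#include <was/storage_account.h>
#include <was/blob.h>

int main()
{
    auto account = azure::storage::cloud_storage_account::parse(
        U("<connection-string>"));
    auto client = account.create_cloud_blob_client();
    auto container = client.get_container_reference(U("mycontainer"));

    azure::storage::blob_request_options options;
    // Number of blocks uploaded in parallel for a single blob.
    options.set_parallelism_factor(8);
    // Block size: 4 MB blocks -> a 64 MB blob is split into 16 blocks.
    options.set_stream_write_size_in_bytes(4 * 1024 * 1024);

    auto blob = container.get_block_blob_reference(U("blob_0"));
    blob.upload_from_file(U("data.bin"),
                          azure::storage::access_condition(),
                          options,
                          azure::storage::operation_context());
    return 0;
}
```

This only tunes per-blob parallelism; since the blobs are uploaded sequentially, overall throughput is still bounded by one blob's transfer at a time.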

ljluestc commented 10 months ago

```python
import asyncio

from azure.storage.blob import BlobType
from azure.storage.blob.aio import BlobServiceClient


async def upload_blob(blob_service_client, container_name, blob_name, data):
    container_client = blob_service_client.get_container_client(container_name)
    blob_client = container_client.get_blob_client(blob_name)
    await blob_client.upload_blob(data, blob_type=BlobType.BlockBlob, overwrite=True)


async def main():
    connection_string = ""  # your storage connection string
    container_name = ""     # your container name
    total_size = 5 * 1024 * 1024 * 1024  # 5 GB
    blob_size = 64 * 1024 * 1024         # 64 MB

    async with BlobServiceClient.from_connection_string(connection_string) as blob_service_client:
        tasks = []
        for i in range(total_size // blob_size):
            blob_name = f"blob_{i}"
            data = b"Your 64MB data here"  # Replace with your actual data
            tasks.append(asyncio.create_task(
                upload_blob(blob_service_client, container_name, blob_name, data)))
        await asyncio.gather(*tasks)


if __name__ == "__main__":
    asyncio.run(main())
```