Azure / azure-sdk-for-go

This repository is for active development of the Azure SDK for Go. For consumers of the SDK we recommend visiting our public developer docs at:
https://docs.microsoft.com/azure/developer/go/
MIT License
1.64k stars 844 forks source link

Azure Blob Storage CommitBlockList API is not idempotent. #23794

Open prince776 opened 1 day ago

prince776 commented 1 day ago

Bug Report

Azure Blob Storage (azblob)'s API: CommitBlockList is not idempotent. REST API Reference: https://learn.microsoft.com/en-us/rest/api/storageservices/put-block-list?tabs=microsoft-entra-id

The above documented API reference does not state that this is an idempotent API, but I think it should be. If it is intended to be, please consider the scenario mentioned below. If this is not a bug, please do recommend the suggested/correct usage.

What happened?

Storage Account Config:

  1. Versioning is enabled
  2. We are talking about Block blobs

Consider the following scenario:

  1. We stage a bunch of blocks using StageBlock
  2. We commit the Blob using CommitBlockList

In very rare scenarios, we've noticed this creates 2 versions for the blob.

What did you expect or want to happen?

We expect only 1 version because we've called it only once.

Analysis

We noticed that the 2 created versions' timestamps were 3 seconds. Which is same as the retry policy's first retry we provide the azblob client.

    p := azblob.NewPipeline(
        creds,
        azblob.PipelineOptions{
            Retry:     azblob.RetryOptions{
                Policy:        azblob.RetryPolicyExponential,
                MaxTries:      int32(5),
                RetryDelay:    3 * time.Second,
                MaxRetryDelay: 30 * time.Second,
                TryTimeout:    5 * time.Minute,
            },
            HTTPSender: newDefaultHTTPClientFactory(),
        },
    )
    container := azblob.NewContainerURL(*containerURI, p)

Which means this most probably happens in the following scenario:

  1. We send ComimitBlockList req
  2. Azure servers process it
  3. Azure servers fail to send the response OR Azure servers send the response but it doesn't reach us
  4. the retry happens
  5. retry causes creation of another blob version.

Generally, to avoid such scenarios, servers can ask for a requestID, and then not do anything if that request was completed, resulting in a no op. The API Ref: https://learn.microsoft.com/en-us/rest/api/storageservices/put-block-list?tabs=microsoft-entra-id mentions such an id x-ms-client-request-id but it seems it's being used only for metrics.

This request ID is generated uniquely for each request (not retry) using NewUniqueRequestIDPolicyFactory Even if we had the same requestID, azure ends up creating a new version.

There's no way for client to to avoid this scenario, unless we add a blob existence check before the retry somehow. (Which will be very tedious, but i think we can do that by using the provided pipeline).

How to reproduce

Since simulating

  1. Azure servers fail to send the response OR Azure servers send the response but it doesn't reach us

will be very hard, you can just manually call the request twice, with same requestID. Image

Thanks!

github-actions[bot] commented 1 day ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.