Bug Report

Azure Blob Storage (`azblob`)'s API `CommitBlockList` is not idempotent.
REST API Reference: https://learn.microsoft.com/en-us/rest/api/storageservices/put-block-list?tabs=microsoft-entra-id

The API reference above does not state that this is an idempotent API, but I think it should be. If it is intended to be idempotent, please consider the scenario described below. If this is not a bug, please recommend the correct usage.

What happened?

Storage Account Config:
- Versioning is enabled
- We are talking about block blobs

Consider the following scenario:
- We stage a bunch of blocks using `StageBlock`
- We commit the blob using `CommitBlockList`

In very rare scenarios, we've noticed this creates 2 versions of the blob.

What did you expect or want to happen?

We expect only 1 version, because we called `CommitBlockList` only once.

Analysis

We noticed that the timestamps of the 2 created versions were 3 seconds apart, which matches the delay before the first retry in the retry policy we give the azblob client.

This most probably means the first attempt was committed on the server but its response never reached us, so the retry policy sent the commit again and the server created a second version.

Generally, to avoid such scenarios, a server can ask for a request ID and do nothing if a request with that ID has already completed, which makes the retry a no-op. The API reference (https://learn.microsoft.com/en-us/rest/api/storageservices/put-block-list?tabs=microsoft-entra-id) mentions such an ID, `x-ms-client-request-id`, but it seems to be used only for metrics.

This request ID is generated once per request (not per retry) by `NewUniqueRequestIDPolicyFactory`. Even with the same request ID, Azure ends up creating a new version.

There's no way for the client to avoid this scenario unless we somehow add a blob existence check before the retry (which would be very tedious, but I think we can do that using the provided pipeline).

How to reproduce

Simulating either of the following will be very hard:
- Azure servers fail to send the response
- Azure servers send the response but it doesn't reach us

Instead, you can just manually send the request twice with the same request ID.
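To make the reproduction step concrete without touching a real storage account, here is a minimal sketch in Go that uses only the standard library. The `fakeStore` type and the `commitTwice` helper are hypothetical names invented for this illustration. The server is a local stand-in, not Azure: it only models "every applied commit adds a version", plus an optional request-ID dedupe mode that is an assumption about how an idempotent server could behave. It sends the same `PUT ...?comp=blocklist` request twice with the same `x-ms-client-request-id`, as a retry would.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// fakeStore is a hypothetical stand-in for a versioned blob store.
// It is NOT Azure; it only counts how many versions commits create.
type fakeStore struct {
	mu       sync.Mutex
	dedupe   bool            // assumption: server ignores request IDs it has already applied
	seen     map[string]bool // request IDs already applied
	versions int             // versions created so far
}

func (s *fakeStore) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPut || r.URL.Query().Get("comp") != "blocklist" {
		http.Error(w, "unsupported operation", http.StatusBadRequest)
		return
	}
	id := r.Header.Get("x-ms-client-request-id")
	s.mu.Lock()
	defer s.mu.Unlock()
	// Without dedupe, every commit creates a version, even a replayed one.
	if !(s.dedupe && s.seen[id]) {
		s.seen[id] = true
		s.versions++
	}
	w.WriteHeader(http.StatusCreated)
}

// commitTwice sends the same commit twice with the same request ID,
// which is what a retry after a lost response looks like to the server.
func commitTwice(dedupe bool) (int, error) {
	store := &fakeStore{dedupe: dedupe, seen: map[string]bool{}}
	srv := httptest.NewServer(store)
	defer srv.Close()

	for attempt := 0; attempt < 2; attempt++ {
		req, err := http.NewRequest(http.MethodPut,
			srv.URL+"/container/blob?comp=blocklist",
			strings.NewReader("<BlockList></BlockList>"))
		if err != nil {
			return 0, err
		}
		req.Header.Set("x-ms-client-request-id", "same-id-for-both-attempts")
		resp, err := srv.Client().Do(req)
		if err != nil {
			return 0, err
		}
		resp.Body.Close()
		if resp.StatusCode != http.StatusCreated {
			return 0, fmt.Errorf("unexpected status %d", resp.StatusCode)
		}
	}
	store.mu.Lock()
	defer store.mu.Unlock()
	return store.versions, nil
}

func main() {
	for _, dedupe := range []bool{false, true} {
		n, err := commitTwice(dedupe)
		if err != nil {
			panic(err)
		}
		fmt.Printf("dedupe=%v versions=%d\n", dedupe, n)
	}
}
```

With dedupe off, the sketch reports 2 versions, which mirrors the behaviour described in this report. With dedupe on, it reports 1, which is the behaviour we would expect from an idempotent commit.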
Thanks!