Open marco-rossi29 opened 4 years ago
Hi @marco-rossi29, thanks for reaching out!
This behavior is by design, as the SDK needs to guarantee the data integrity: once the transfer starts, the downloader will lock onto the blob's Etag, as we cannot possibly know which part of the blob changed.
When given if_match='*', the method checks the condition by performing a GET request on the beginning of the blob; and the Etag to lock onto is obtained in the same call.
You could work around this problem by having your own version of the downloader, would that be a satisfactory solution for you?
yes, I'm happy to have my own version of the downloader.
I've found two ways to work around it by modifying this line:
https://github.com/Azure/azure-storage-python/blob/master/azure-storage-blob/azure/storage/blob/_download_chunking.py#L130
to be either:
1) self.if_match = None
2) comment out the line and passing if_match='*'
in get_blob_to_path() (this in my opinion is the expected behavior of if_match='*'
)
Do you have other suggestions?
option 1 seems to be the easiest. 😄
Thank you for your answer. I'm now puzzeled by the if_match='*'
specified in the docs:
if_match (str) – An ETag value, or the wildcard character (*). Specify this header to perform the operation only if the resource’s ETag matches the value specified.
My expectation is that when using if_match='*'
the blob is downloaded no matter what the ETag is. But, as we saw above, this is not the case.
Can you give me an example of when using if_match='*'
would give a behaviour different than the default, i.e., if_match=None
?
if_match='*'
guarantees that a blob exists (i.e. there is an actual Etag to look at). if_none_match='*'
guarantees that a blob does not exist.
I see. Thanks. So the bottom line is that the current API doesn't allow me to download a file while it is modified by an append operation. Assuming the user takes responsibility about the integrity of the file, is it possible to add a flag or any other mechanism on your side to ensure that get_blobto*() would work in this append scenario?
Would something like this PR be acceptable? https://github.com/Azure/azure-storage-python/pull/630
Which service(blob, file, queue) does this issue concern?
BlockBlobService.get_blob_to_path()
Which version of the SDK was used? Please provide the output of
pip freeze
.Same issue is encountered in versions: 0.37.1, 1.3.0, and 2.1.0
Version 0.37.1
Version 1.3.0
Latest version
What problem was encountered?
When trying to download a blob that is currently been written in append mode (i.e., the ETag changes during time of download),
BlockBlobService.get_blob_to_path()
crashes even when usingif_match='*'
giving ConditionNotMet Error:Client-Request-ID=9cce10c8-b739-11e9-9366-b32a86ae676b Retry policy did not allow for a retry: Server-Timestamp=Mon, 05 Aug 2019 04:29:28 GMT, Server-Request-ID=86bb1780-a01e-002e-6146-4b01e3000000, HTTP status code=412, Exception=The condition specified using HTTP conditional header(s) is not met. ErrorCode: ConditionNotMet<?xml version="1.0" encoding="utf-8"?><Error><Code>ConditionNotMet</Code><Message>The condition specified using HTTP conditional header(s) is not met.RequestId:86bb1780-a01e-002e-6146-4b01e3000000Time:2019-08-05T04:29:28.3320963Z</Message></Error>
Have you found a mitigation/solution?
max_connections=1
mitigates the problemCode to repro: