Azure / azure-storage-python

Microsoft Azure Storage Library for Python
https://azure-storage.readthedocs.io
MIT License
338 stars 240 forks source link

BlockBlobService inconsistent handling of container SAS URLs and tokens, particularly those generated by make_container_url #543

Open matthchr opened 5 years ago

matthchr commented 5 years ago

Which service(blob, file, queue) does this issue concern?

Blob

Which version of the SDK was used? Please provide the output of pip freeze.

azure-storage-blob==1.4.0

What problem was encountered?

There seems to be an inconsistency in how SAS's are generated and managed in the Python SDK.

First let me give a bit of background about my use case. Basically, we have folks providing us a SAS to one of their containers, which we then will upload some files to. They provide the SAS to us as one big URL, that is to say something which looks like this: https://<account>.blob.core.windows.net/<container>?se=2019-01-16T22%3A26%3A51Z&sp=w&sv=2018-03-28&sr=c&sig=<sig>

If I generate a SAS in Python like so I get a value which looks just like the query string in my sample above:

sas = blob_service.generate_container_shared_access_signature(
    container_name=container_name,
    permission=azure.storage.blob.ContainerPermissions(write=True),
    expiry=(datetime.datetime.utcnow() + datetime.timedelta(hours=3)))

But since the URL provided to me needs to also include the container name, etc, I use blob_service.make_container_url(container_name, sas_token=sas) to create a URL. This URL looks a bit different than I expect: https://<account>.blob.core.windows.net/<container>?restype=container&se=2019-01-16T22%3A26%3A51Z&sp=w&sv=2018-03-28&sr=c&sig=<sig> - specifically it's got this extra restype=container in it.

Later, when I want to actually perform a file upload using the provided URL, I have a few options.

  1. I can construct a BlockBlobService like this (using the full URL): blob_service = azure.storage.blob.BlockBlobService(account, sas_token='https://<account>.blob.core.windows.net/<container>?restype=container&se=2019-01-16T22%3A26%3A51Z&sp=w&sv=2018-03-28&sr=c&sig=<sig>')
  2. or like this (using just the query string): blob_service = azure.storage.blob.BlockBlobService(account, sas_token='restype=container&se=2019-01-16T22%3A26%3A51Z&sp=w&sv=2018-03-28&sr=c&sig=<sig>')

Number 2 above doesn't work, because of the restype=container. Number 1 works, but only if the sas URL contains restype=container - if somebody provides a URL generated from say the C# SDK which doesn't have restype=container I end up getting the following error when trying to use the SAS:

azure.common.AzureHttpError: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. ErrorCode: AuthenticationFailed
<?xml version="1.0" encoding="utf-8"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:72c1719b-201e-0054-078a-ae25a1000000
Time:2019-01-17T17:31:25.7442116Z</Message><AuthenticationErrorDetail>Access without signed identifier cannot have time window more than 1 hour: Start [Thu, 17 Jan 2019 17:31:25 GMT] - Expiry [Thu, 17 Jan 2019 19:29:18 GMT]</AuthenticationErrorDetail></Error>

Reading the documentation from the Python SDK, it seems like the sas_token parameter of BlockBlobService is supposed to just be the SAS token (i.e. this bit: se=2019-01-16T22%3A26%3A51Z&sp=w&sv=2018-03-28&sr=c&sig=<sig> - but the problem is that when presented with a full URL containing restype=container in the query string, it's not clear to me how to pull out "just the SAS" given an arbitrary query string. What I had been doing was to use the full query string as the SAS, but since restype=container is valid in the query string, but also not valid in the SAS token I'm now at a loss as to how to properly extract the SAS.

To summarize, my questions/issues are:

  1. Why does blob_service.make_container_url include restype=container in the URL? The URL generated already clearly points to a container so it seems like this extra restype=container is not useful.
  2. Given a URL like https://<account>.blob.core.windows.net/<container>?restype=container&se=2019-01-16T22%3A26%3A51Z&sp=w&sv=2018-03-28&sr=c&sig=<sig> how can I extract just the sas part, so that I can then pass that to the BlockBlobService constructor as the sas_token parameter?
  3. Why does BlockBlobService accept a full URL as the sas_token, but it only works if that full URL contains restype=container?
  4. Is the intent of the sas_token in BlockBlobService that it accepts both a full URL or the bare SAS token, or should it accept only the token?

Have you found a mitigation/solution?

The only mitigation/solution I've found is to tell people to avoid using make_container_url entirely, and to generate their URLs manually - which isn't really a great solution.

zezha-msft commented 5 years ago

Hi @matthchr, thanks for reaching out!

I've logged this issue for further investigation. We'll look into how to improve the user experience. Thank you for the feedbacks.

jdthorpe commented 5 years ago

I have the same issue (which I accidentally reported in the wrong repo...). Anyhow, the presence of the query parameter restype=container invalidates the Shared Access Signature (sig=c4W13uCl...) as the SAS is just a hash of other query parameters, and obviously inserting an additional query parameter would change the hash.

For what it's worth, you can create your own SAS URL like so:

from azure.storage import CloudStorageAccount
import datetime
from azure.storage.blob.models import ContainerPermissions

_CONTAINER_NAME = "test-container"
_ACCOUNT_NAME = "MyAccount"

account = CloudStorageAccount(
    account_name=_ACCOUNT_NAME, account_key="AAAAAAAAAAAAAAAA"
)
blockblob_service = account.create_block_blob_service()
container_sas = blockblob_service.generate_container_shared_access_signature(
    _CONTAINER_NAME,
    ContainerPermissions.READ
    + ContainerPermissions.WRITE
    + ContainerPermissions.DELETE
    + ContainerPermissions.LIST,
    datetime.datetime.utcnow() + datetime.timedelta(hours=1),
    start=datetime.datetime.utcnow(),
)

# Don't use!
invalid_sas_url = blockblob_service.make_container_url(
    _CONTAINER_NAME, sas_token=container_sas
)

# this one works :)
valid_sas_url = "https://{}.blob.core.windows.net/{}?{}".format(
    _STORAGE_ACCOUNT_NAME, _CONTAINER_NAME, container_sas
)
matthchr commented 5 years ago

@jdthorpe - totally agree that there is a workaround... but it doesn't make any sense to provide a helper method (make_container_url) which actually doesn't help!

@zezha-msft is there any plan to explain this behavior...? This issue has been open for quite a long time without a response.

v36372 commented 5 years ago

Hey @matthchr, @jdthorpe, I am having the same issue today. The latest version failed to produce a working sas url for my blob files. I had to trace back to the version I used sometimes ago and use an outdated one. So hopefully there weren't any security fixes from that time until now.

The working versions of the packages are:

azure-storage-common==1.4.0
azure-storage-blob==1.5.0

Hope this helps.

bgklein commented 5 years ago

Is there a status update on when/if this will be fixed?

brettwisner commented 5 years ago

Agree this is crazy to list the resource type when its already implied. Glad I found this question!