Azure / azure-storage-python

Microsoft Azure Storage Library for Python
https://azure-storage.readthedocs.io
MIT License
338 stars 240 forks source link

generating sas tokens fails with Arrow datetimes #553

Closed kodonnell closed 5 years ago

kodonnell commented 5 years ago

Which service(blob, file, queue) does this issue concern?

I'm only using blob, not sure if it affects others.

Which version of the SDK was used? Please provide the output of pip freeze.

1.4.0

What problem was encountered?

A SAS token was generated when using an Arrow value. This "worked" fine, but the token was invalid. This is (I think) because isinstance(date, <arrow value>) returns False, and hence _to_str is used (see here). For an Arrow value, calling str on e.g. arrow.utcnow() returns the same string as should occur, except with the TZ +00:00 appended. The resulting token (from e.g. generate_container_shared_access_signature has e.g. se=2019-02-06T09%3A56%3A25%2B00%3A00. When I try to use this SAS token, it's not authorized etc.

Have you found a mitigation/solution?

Well, there are a bunch. Since you don't want to be dependent on Arrow specifically, you could just do if hasattr(value, 'strftime') to test for "date-like-ness". Or, decide not to worry about it, and make Arrow users figure it out. (That said, adding a "warning" when _to_str get's called but there's a strftime attr might be worthwhile ...)

Not sure if this applies to other date libraries etc.

zezha-msft commented 5 years ago

Hi @kodonnell, thanks for reaching out!

You could pass in a str instead. Here is the doc.

kodonnell commented 5 years ago

Thanks @zezha-msft - I'm not sure that's a great solution, as it depends on the exact format of the string, etc. As suggested above, a better solution is to probably replace isinstance(date, val) with if hasattr(val, 'strftime') etc. That way your code can handle the exact format needed by the blob APIs. Otherwise, if these change, and people have been using strings, as you suggest, then bad things will happen.

As an aside, I'd suggest getting rid of _to_str - it'd probably be better to fail if it wasn't a str or date-y thing, as otherwise it will 'work' with any value I provide (e.g. None) - but I'll get an invalid token. In the case of it being a string, I'd also ensure it's of the correct format (via a quick strptime). This guarantees that any input will either cause an error, or be valid - whereas now, most inputs will result in 'silent' errors. I'm happy to submit a PR if you wish.