Azure / azure-storage-python

Microsoft Azure Storage Library for Python
https://azure-storage.readthedocs.io
MIT License

BaseBlobService.get_blob_to_stream doesn't accept io.StringIO as a stream #536

Open enricorotundo opened 5 years ago

enricorotundo commented 5 years ago

Which service (blob, file, queue) does this issue concern?

Blob

Which version of the SDK was used? Please provide the output of pip freeze.

azure-storage-blob==1.4.0
azure-storage-common==1.4.0
azure-storage-file==1.4.0
azure-storage-nspkg==3.0.0
azure-storage-queue==1.4.0

What problem was encountered?

from azure.storage.blob.baseblobservice import BaseBlobService
import io

file_uri = "..."
blob_service = BaseBlobService(connection_string="...")

# Text-mode stream: its write() accepts only str
string_stream = io.StringIO()
# Raises TypeError, because the SDK writes the blob content as bytes
string_result = blob_service.get_blob_to_stream("...", "....csv", stream=string_stream)

Fails with:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-a15e4b35d24d> in <module>()
      1 string_stream = io.StringIO()
----> 2 string_result = blob_service.get_blob_to_stream("...", ".....csv", stream=string_stream)
      3 #print("content type: {}".format(type(string_stream)))

~/miniconda3/envs/py36/lib/python3.6/site-packages/azure/storage/blob/baseblobservice.py in get_blob_to_stream(self, container_name, blob_name, stream, snapshot, start_range, end_range, validate_content, progress_callback, max_connections, lease_id, if_modified_since, if_unmodified_since, if_match, if_none_match, timeout)
   2052         # Clear blob content since output has been written to user stream
   2053         if blob.content is not None:
-> 2054             stream.write(blob.content)
   2055             blob.content = None
   2056 

TypeError: string argument expected, got 'bytes'

This happens even though, according to the docs, the stream param is an io.IOBase, and issubclass(io.StringIO, io.IOBase) is True. Am I missing something?
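The mismatch can be reproduced without the SDK. The sketch below is only an illustration: `content` is a made-up stand-in for `blob.content`, assumed to be `bytes` as the traceback indicates.

```python
import io

# Hypothetical stand-in for blob.content, which the traceback shows is bytes.
content = b"col_a,col_b\n1,2\n"

# io.StringIO is indeed a subclass of io.IOBase.
is_iobase = issubclass(io.StringIO, io.IOBase)

# A text stream rejects bytes on write().
text_stream = io.StringIO()
try:
    text_stream.write(content)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# A binary stream accepts the same bytes.
byte_stream = io.BytesIO()
written = byte_stream.write(content)

print(is_iobase, error_message, written)
```

Being an `io.IOBase` subclass therefore says nothing about whether `write` takes `str` or `bytes`; that depends on whether the stream is text or binary.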

enricorotundo commented 5 years ago

The get_blob_to_stream docstring specifies the type of the stream param as io.IOBase, but the call fails when stream is an io.StringIO.

The method stream.write (see the link below) is called with blob.content, which in my case is a bytes object. That works when stream is an instance of io.BytesIO but fails when it is an instance of io.StringIO, because in the latter case the inherited write method requires a str.

https://github.com/Azure/azure-storage-python/blob/0a92c379a3de5b7b53d55d791a80851934729d88/azure-storage-blob/azure/storage/blob/baseblobservice.py#L2054
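One possible workaround, sketched here under assumptions rather than taken from the SDK, is a small adapter that accepts `bytes` and forwards decoded text to a text stream. `DecodingWriter` is a hypothetical helper. It assumes the content is UTF-8 and that chunks arrive sequentially. Whether get_blob_to_stream accepts such an object in every configuration (for example with parallel connections) has not been verified here.

```python
import codecs
import io


class DecodingWriter:
    """Hypothetical adapter: takes bytes, writes decoded text to a text stream."""

    def __init__(self, text_stream, encoding="utf-8"):
        self._text_stream = text_stream
        # An incremental decoder copes with multi-byte characters split across chunks.
        self._decoder = codecs.getincrementaldecoder(encoding)()

    def write(self, data):
        self._text_stream.write(self._decoder.decode(data))
        return len(data)

    def flush(self):
        # Emit anything still buffered; raises if the input ended mid-character.
        self._text_stream.write(self._decoder.decode(b"", final=True))
        self._text_stream.flush()


# Usage with made-up data, split in the middle of a two-byte character.
string_stream = io.StringIO()
writer = DecodingWriter(string_stream)
payload = "na\u00efve,1\n".encode("utf-8")
writer.write(payload[:3])
writer.write(payload[3:])
writer.flush()
result = string_stream.getvalue()
print(repr(result))
```

A simpler alternative is to pass an io.BytesIO as the stream and call getvalue().decode(...) on it afterwards, at the cost of holding the raw bytes and the decoded text in memory at the same time.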

enricorotundo commented 5 years ago

When could Blob.content be str instead of bytes?

zezha-msft commented 5 years ago

Hi @enricorotundo, thanks for reaching out!

I've logged this issue for further investigation. But upon a quick look, I think you have a really good point. We'll get back to you shortly.