Azure / azure-storage-python

Microsoft Azure Storage Library for Python
https://azure-storage.readthedocs.io
MIT License
338 stars 240 forks

Add recoding possibility to create_blob_from_bytes #558

Open romanovacca opened 5 years ago

romanovacca commented 5 years ago

Which service (blob, file, queue) does this issue concern?

Blob

What problem was encountered?

When creating a blob, it would be useful to be able to specify the file's current encoding together with the desired encoding. So if you have a file stored in big5 encoding, but you want it to be in utf-8, you would simply add two parameters and the conversion would happen during the upload.

Currently, one has to re-encode the file oneself, which can be time-consuming with large files.

Why should this be implemented?

To create a table in Azure SQL DWH using PolyBase, the only allowed encodings are utf-8 and utf-16. This means you have to re-encode all the files yourself before creating tables for them. Enabling this feature would make the process much easier, since the files would be converted during upload.

zezha-msft commented 5 years ago

Hi @romanovacca, thanks for reaching out!

I'm a bit confused by your feature request.

So if you have a file stored that is in big5 encoding, but you want it to be in utf-8, you would simply add two parameters and that could be used.

Do you mean that you want the service to automatically decode your data and re-encode it? If the SDK did it, the process would still be time-consuming, since the computation would happen on the client side.

Perhaps you could explore using create_blob_from_stream instead, and wrap your source file stream so that it decodes and re-encodes the content on the fly.
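
As a rough illustration of the wrapping idea, here is a minimal sketch of a read-only stream that transcodes while it is being read. The class name TranscodingStream, its parameters, and the chunk size are hypothetical and not part of the SDK; the sketch relies only on Python's standard codecs and io modules. Whether create_blob_from_stream handles a non-seekable stream of unknown length well for a given workload is an assumption worth testing.

```python
import codecs
import io


class TranscodingStream(io.RawIOBase):
    """Hypothetical read-only wrapper that re-encodes a binary stream on the fly."""

    def __init__(self, source, source_encoding, target_encoding="utf-8",
                 chunk_size=64 * 1024):
        super().__init__()
        self._source = source
        # Incremental codecs keep state, so a multi-byte character that is
        # split across two chunks is still decoded correctly.
        self._decoder = codecs.getincrementaldecoder(source_encoding)()
        self._encoder = codecs.getincrementalencoder(target_encoding)()
        self._chunk_size = chunk_size
        self._pending = b""   # transcoded bytes not yet handed to the reader
        self._eof = False

    def readable(self):
        return True

    def readinto(self, b):
        # Pull from the source until there is output to return or it is exhausted.
        while not self._pending and not self._eof:
            raw = self._source.read(self._chunk_size)
            final = not raw
            text = self._decoder.decode(raw, final=final)
            self._pending = self._encoder.encode(text, final=final)
            self._eof = final
        n = min(len(b), len(self._pending))
        b[:n] = self._pending[:n]
        self._pending = self._pending[n:]
        return n  # 0 signals end of stream


# Possible usage with the legacy SDK (not executed here; names in angle
# brackets are placeholders):
#
#   from azure.storage.blob import BlockBlobService
#   service = BlockBlobService(account_name="<account>", account_key="<key>")
#   with open("input_big5.csv", "rb") as f:
#       service.create_blob_from_stream(
#           "<container>", "output_utf8.csv", TranscodingStream(f, "big5"))
```

Because only one chunk is held in memory at a time, the whole file never has to be re-encoded up front, although the conversion work itself still runs on the client.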