Closed: dkapitan closed this issue 6 years ago.
@dkapitan `Blob.download_to_file` does what you want (it takes a file object, versus the filename taken by `Blob.download_to_filename`).
import io
from google.cloud import storage
client = storage.Client()
bucket = client.get_bucket('my-bucket-name')
blob = bucket.get_blob('my-blob-name')
buffer = io.BytesIO()
blob.download_to_file(buffer)
For anyone working with this later, don't forget to call buffer.seek(0)
before reading it.
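A quick self-contained illustration of why the rewind matters: after writing (or downloading) into a `BytesIO`, the cursor sits at the end, so a read returns nothing until you seek back to the start.

```python
import io

buffer = io.BytesIO()
buffer.write(b"parquet bytes...")  # download_to_file leaves the cursor at the end too

print(buffer.read())   # b'' -- reading from the current position (the end) yields nothing
buffer.seek(0)         # rewind to the start
print(buffer.read())   # b'parquet bytes...'
```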
Use-case
We use Google Cloud Storage to store .parquet files as part of our data processing. We often want to load a .parquet file into memory and read it directly into a pandas DataFrame, without first writing it to disk.
The code below does the trick. My question is: would it be useful to include a download_as_buffer method in storage.blob? Or am I overlooking a similar method that is already included elsewhere in the API?

Feature request

Add a method, or modify download_as_string, to have an option to return the BytesIO buffer rather than getvalue().
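In the meantime, the proposed method is easy to emulate with a small helper built on the existing API (`download_as_buffer` is a hypothetical name here, not part of the library):

```python
import io


def download_as_buffer(blob):
    """Download a blob into an in-memory BytesIO, rewound and ready to read.

    `blob` is any object exposing download_to_file(file_obj), e.g. a
    google.cloud.storage Blob.
    """
    buffer = io.BytesIO()
    blob.download_to_file(buffer)  # writes the blob's bytes into the buffer
    buffer.seek(0)                 # rewind so callers can read immediately
    return buffer
```

A caller can then pass the returned buffer straight to pandas, e.g. `pd.read_parquet(download_as_buffer(blob))`.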
Environment