Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License

Asynchronous blob_client.query_blob() #29625

Open olearyl94 opened 1 year ago

olearyl94 commented 1 year ago

I am having trouble getting the query_blob() method of azure.storage.blob.aio._blob_client_async.BlobClient to work.

Steps to reproduce the behavior:

    from azure.storage.blob.aio import BlobServiceClient

    async def main():
        blob_service_client = BlobServiceClient.from_connection_string(
            self.connection_string_srct.get_secret_value()
        )

        # Name of the container and blob to query
        container_name = self.container
        blob_name = blob

        # Get a reference to the blob
        blob_client = blob_service_client.get_blob_client(container_name, blob_name)

        # Query the blob asynchronously
        async with blob_client:
            data_reader = await blob_client.query_blob(  # nosec
                f"SELECT * FROM BlobStorage where measurement_source_id in {in_string}",
                blob_format=parquet_format,
                output_format=csv_format
            )

I expected this to return a data reader object in the same way that the synchronous version does, but instead I get the error: 'TypeError: cannot unpack non-iterable coroutine object'

Is it possible to get this method working asynchronously, and if so, how? My end goal is to filter the parquet file on blob storage before downloading it to the VM we use to run the script.

kashifkhan commented 1 year ago

Thank you for the feedback @olearyl94. We will investigate and get back to you asap.

olearyl94 commented 1 year ago

It looks like there is no asynchronous query_blob method defined in the aio.BlobClient class; it only inherits the one from the synchronous BlobClient. Are there any plans to add an async version of the query_blob method soon?
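To illustrate why an inherited synchronous method could fail this way, here is a minimal sketch that does not use the SDK at all. The class and method names (SyncClient, AsyncClient, _send, query) are hypothetical stand-ins, and the assumption is that the inherited method unpacks the result of an internal call that has become a coroutine function in the async subclass.

```python
import warnings


class SyncClient:
    """Hypothetical stand-in for a synchronous client (not the SDK class)."""

    def _send(self, query):
        # Synchronous transport call: returns (headers, body) directly.
        return {"status": 200}, f"rows for: {query}"

    def query(self, query):
        # Written for the sync transport: unpacks the result immediately.
        headers, body = self._send(query)
        return body


class AsyncClient(SyncClient):
    """Hypothetical async client that inherits query() unchanged."""

    async def _send(self, query):
        # Async transport call: calling it returns a coroutine, not a tuple.
        return {"status": 200}, f"rows for: {query}"


def reproduce():
    """Call the inherited sync method on the async client and return the error text."""
    with warnings.catch_warnings():
        # Silence the "coroutine ... was never awaited" RuntimeWarning.
        warnings.simplefilter("ignore", RuntimeWarning)
        try:
            AsyncClient().query("SELECT * FROM BlobStorage")
        except TypeError as exc:
            return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```

In this sketch the TypeError is raised inside the inherited method, before any await in the caller is reached, which would explain why adding await in front of the call does not help.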

jalauzon-msft commented 1 year ago

Hi @olearyl94, you are correct that we do not seem to have an implementation of query_blob in our async client. It seems like this is something we may have missed during our initial implementation. Assuming we don't find a technical reason why this wasn't included, we will work on adding this in the near future. In the meantime, you can try using the sync client if you are able to.

I do also want to call out that while we may be able to add this to our async client, support for parquet format is actually in a limited preview from the service and therefore may not work in your region or may not work as intended. The service team is aware of these issues but has not given a timeline for when this feature may be fully supported. You are welcome to try it, but you may see unexpected results. See #28963.
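For anyone who needs to stay inside an asyncio application, one possible way to apply the sync-client workaround is to run the blocking call in a worker thread. This is only a sketch: the helper name query_blob_in_thread is hypothetical, it assumes Python 3.9+ for asyncio.to_thread, and it assumes the client passed in is the synchronous azure.storage.blob.BlobClient, whose query_blob() returns a reader with a readall() method.

```python
import asyncio


async def query_blob_in_thread(sync_blob_client, query, **kwargs):
    """Run the synchronous query_blob() call off the event loop.

    `sync_blob_client` is assumed to be an azure.storage.blob.BlobClient
    (the sync class, not the aio one). The helper name is hypothetical.
    """

    def _run():
        # Blocking call: issues the query and reads the full filtered result.
        reader = sync_blob_client.query_blob(query, **kwargs)
        return reader.readall()

    # asyncio.to_thread (Python 3.9+) keeps the event loop responsive
    # while the blocking SDK call runs in a worker thread.
    return await asyncio.to_thread(_run)
```

The sync client could be created with azure.storage.blob.BlobClient.from_connection_string(conn_str, container_name, blob_name), and keyword arguments such as blob_format and output_format are passed through to query_blob() unchanged. The parquet caveat above still applies to this approach.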

dizhouwu commented 1 year ago

Is query_blob officially released yet, or is it still in preview? Where can I find this information? Thanks!

github-actions[bot] commented 4 months ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jalauzon-msft @vincenttran-msft.