Azure / azure-storage-python

Microsoft Azure Storage Library for Python
https://azure-storage.readthedocs.io
MIT License

Blob batch delete (container_client.delete_blobs) errors with: 'AttributeError: 'NoneType' object has no attribute 'on_request'' #650

Open nofunatall opened 4 years ago

nofunatall commented 4 years ago

Which service(blob, file, queue) does this issue concern?

Blob

Which version of the SDK was used? Please provide the output of pip freeze.

rio@rio-t460p:~$ pip3 freeze | grep az
azure-core==1.2.0
azure-identity==1.2.0
azure-storage-blob==12.1.0

What problem was encountered?

from azure.storage.blob import BlobServiceClient

# STORAGE_URL, STORAGE_SAS and STORAGE_CONTAINER are defined elsewhere in blob_cleanup.py
def run_delete_blobs():
    blob_service_client = BlobServiceClient(account_url=STORAGE_URL, credential=STORAGE_SAS)
    container_client = blob_service_client.get_container_client(STORAGE_CONTAINER)
    blob_list = [blob.name for blob in container_client.list_blobs()]
    container_client.delete_blobs(*blob_list)

run_delete_blobs()

This results in the following error:

Traceback (most recent call last):
  File "blob_cleanup.py", line 108, in <module>
    run_delete_blobs()
  File "blob_cleanup.py", line 106, in run_delete_blobs
    container_client.delete_blobs(*blob_list)
  File "/home/rio/.local/lib/python3.7/site-packages/azure/core/tracing/decorator.py", line 71, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/home/rio/.local/lib/python3.7/site-packages/azure/storage/blob/_container_client.py", line 1094, in delete_blobs
    return self._batch_send(*reqs, **options)
  File "/home/rio/.local/lib/python3.7/site-packages/azure/storage/blob/_shared/base_client.py", line 265, in _batch_send
    request, **kwargs
  File "/home/rio/.local/lib/python3.7/site-packages/azure/core/pipeline/_base.py", line 197, in run
    self._prepare_multipart_mixed_request(request)
  File "/home/rio/.local/lib/python3.7/site-packages/azure/core/pipeline/_base.py", line 185, in _prepare_multipart_mixed_request
    _ for _ in executor.map(prepare_requests, requests)
  File "/home/rio/.local/lib/python3.7/site-packages/azure/core/pipeline/_base.py", line 185, in <listcomp>
    _ for _ in executor.map(prepare_requests, requests)
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
    yield fs.pop().result()
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/rio/.local/lib/python3.7/site-packages/azure/core/pipeline/_base.py", line 180, in prepare_requests
    _await_result(policy.on_request, pipeline_request)
AttributeError: 'NoneType' object has no attribute 'on_request'

Have you found a mitigation/solution?

The workaround is to call the single-blob container_client.delete_blob in a for loop, but that is fairly slow, so it is not ideal for large containers.
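One way to make the single-blob workaround less slow might be to keep several delete_blob calls in flight at once. The sketch below is only an illustration: delete_blobs_concurrently is a hypothetical helper (not part of the SDK), and it assumes the container client can safely be shared across threads, which I have not verified for this SDK version.

```python
from concurrent.futures import ThreadPoolExecutor


def delete_blobs_concurrently(container_client, blob_names, max_workers=8):
    """Delete blobs one request at a time, with several requests in flight.

    Hypothetical helper: it only relies on container_client.delete_blob(name).
    Returns a dict mapping each blob name that failed to its exception.
    """
    def _delete(name):
        try:
            container_client.delete_blob(name)
            return name, None
        except Exception as exc:  # collect the failure instead of aborting the run
            return name, exc

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(_delete, list(blob_names)))

    return {name: exc for name, exc in results if exc is not None}
```

Usage would look like failures = delete_blobs_concurrently(container_client, blob_list), after which any names left in failures can be retried or logged.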

xiafu-msft commented 4 years ago

Hi @nofunatall, sorry about the late response. I wasn't able to reproduce the problem with your code. Here is the code I ran; could you copy, paste, and run it?

from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient

def delete_multiple_blobs():
    # Instantiate a BlobServiceClient using an account URL and a SAS token
    blob_service_client = BlobServiceClient("https://youraccountname.blob.core.windows.net", "yoursastoken")

    # Instantiate a ContainerClient
    container_client = blob_service_client.get_container_client("containerforbatchblobdelete")

    # Create new Container
    try:
        container_client.create_container()
    except ResourceExistsError:
        # Container already created
        pass

    # Upload a blob to the container
    upload_data = b"Hello World"
    container_client.upload_blob(name="my_blob1", data=upload_data)
    container_client.upload_blob(name="my_blob2", data=upload_data)
    container_client.upload_blob(name="my_blob3", data=upload_data)

    # [START delete_multiple_blobs]
    # Delete multiple blobs in the container by name
    blobs = [blob.name for blob in list(container_client.list_blobs())]
    container_client.delete_blobs(*blobs)

    # Delete multiple blobs by properties iterator
    my_blobs = container_client.list_blobs(name_starts_with="my_blob")
    container_client.delete_blobs(*my_blobs)
    # [END delete_multiple_blobs]
    container_client.delete_container()

if __name__ == '__main__':
    delete_multiple_blobs()

BTW, the current repo is for azure-storage-blob versions <= 2.1.0. The version you are using is 12.1.0; the code base for 12.1.0 is here: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/storage/azure-storage-blob.

More samples can be found in this folder: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/storage/azure-storage-blob/swagger

Feel free to send me the result after you run that code!

cg6 commented 4 years ago

Hi @xiafu-msft, I've run into this same problem. I believe the problem is in _batch_send from azure\storage\blob\_shared\base_client.py:

        request.set_multipart_mixed(
            *reqs,
            policies=[
                StorageHeadersPolicy(),
                self._credential_policy
            ],
            enforce_https=False
        )

self._credential_policy can be None. My container_client has a _query_str. The credential and _credential_policy are None.

Additionally, if I add container_client._credential_policy = 'NA' to my code, the error changes from AttributeError: 'NoneType' object has no attribute 'on_request' to AttributeError: 'str' object has no attribute 'on_request'.
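To illustrate the diagnosis without touching the SDK, here is a minimal stand-alone sketch. Everything in it is hypothetical (HeadersPolicy, prepare_request and prepare_request_skipping_none are stand-ins, not azure-core code): it just shows how a None entry in a policy list produces this exact AttributeError, and how filtering out None entries would avoid it. Whether the library should fix it that way is a separate question.

```python
class HeadersPolicy:
    """Hypothetical stand-in for a real policy such as StorageHeadersPolicy."""

    def on_request(self, request):
        request.setdefault("headers", {})["x-example"] = "1"


def prepare_request(request, policies):
    """Call on_request on every policy, roughly as the traceback suggests."""
    for policy in policies:
        policy.on_request(request)
    return request


def prepare_request_skipping_none(request, policies):
    """Same as prepare_request, but ignores policies that are None."""
    return prepare_request(request, [p for p in policies if p is not None])


# With SAS (query string) authentication the credential policy is None.
credential_policy = None

try:
    prepare_request({}, [HeadersPolicy(), credential_policy])
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Filtering out the None entry lets preparation complete.
prepared = prepare_request_skipping_none({}, [HeadersPolicy(), credential_policy])
```

Here error_message ends up holding the same text as the last line of the traceback, while prepared is a request that went through the remaining policy normally.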

cg6 commented 4 years ago

@xiafu-msft

I should add: I had this problem when calling container_client.set_standard_blob_tier_blobs. However, it appears to be the same underlying issue as with container_client.delete_blobs.

The problem occurs when the BlobServiceClient is created via BlobServiceClient.from_connection_string(SAS_connect_str) or BlobServiceClient(f"https://{account}.blob.core.windows.net", SAS_query_str). Both of these methods result in query_str-based authentication, so a credential policy is not needed.

Switching the authentication to a Storage Account Key rather than a SAS token mitigated the issue, but that is undesirable.
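For anyone who has to stay on SAS tokens until this is fixed, a possible stopgap is to try the batch call and fall back to single deletes when it raises. This is only a sketch under assumptions: delete_blobs_with_fallback is a hypothetical helper, it assumes the AttributeError is raised while the batch is being prepared (as in the traceback above), so nothing in that chunk has been deleted yet, and the default chunk size of 256 is based on the subrequest limit I understand the Blob Batch REST documentation to describe.

```python
def delete_blobs_with_fallback(container_client, blob_names, batch_size=256):
    """Delete blobs in batches, falling back to single deletes if batching fails.

    Hypothetical helper: relies only on container_client.delete_blobs(*names)
    and container_client.delete_blob(name). Returns the number of blobs for
    which a delete was submitted.
    """
    names = list(blob_names)
    submitted = 0
    for start in range(0, len(names), batch_size):
        chunk = names[start:start + batch_size]
        try:
            container_client.delete_blobs(*chunk)
        except AttributeError:
            # Assumption: this is the 'NoneType' ... 'on_request' failure,
            # raised before any subrequest was sent.
            for name in chunk:
                container_client.delete_blob(name)
        submitted += len(chunk)
    return submitted
```

Catching AttributeError this broadly could hide unrelated bugs, so it probably makes sense to remove the fallback once the batch call works with SAS authentication.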