fsspec / adlfs

fsspec-compatible Azure Datalake and Azure Blob Storage access
BSD 3-Clause "New" or "Revised" License
166 stars 102 forks

await file_obj.credential.close() : TypeError: object NoneType can't be used in 'await' expression #431

Open ELToulemonde opened 9 months ago

ELToulemonde commented 9 months ago

My problem

At the end of my Python script, I get a cleanup error: TypeError: object NoneType can't be used in 'await' expression

The complete traceback is:

Traceback (most recent call last):
  File ".../lib/python3.10/weakref.py", line 667, in _exitfunc
    f()
  File ".../lib/python3.10/weakref.py", line 591, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File ".../lib/python3.10/site-packages/adlfs/utils.py", line 78, in close_credential
    await file_obj.credential.close()
TypeError: object NoneType can't be used in 'await' expression

Which is weird because I don't do anything asynchronous.

A reproducible example

At least as much as I can:

import pandas as pd
from azure.identity import AzureCliCredential, ChainedTokenCredential, ManagedIdentityCredential

# Try managed identity first, then fall back to the Azure CLI login.
azure_cli = AzureCliCredential()
managed_identity = ManagedIdentityCredential()
credential_chain = ChainedTokenCredential(managed_identity, azure_cli)

# The credential is handed to adlfs through storage_options.
pd.read_parquet("abfs://blob-name@datalake.blob.core.windows.net/path_to_parquets.parquet", storage_options={"credential": credential_chain})
print("Done")

I do get the "Done" printed before failure.
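The traceback hints at the mechanism: the cleanup coroutine awaits whatever credential.close() returns, and a synchronous close() presumably returns None. Below is a minimal sketch of that failure mode using only the standard library. FakeSyncCredential and run_sync are hypothetical stand-ins for illustration, not the azure.identity or fsspec implementations.

```python
import asyncio


class FakeSyncCredential:
    """Hypothetical stand-in for a synchronous credential."""

    def close(self):
        # A plain method: it returns None, not an awaitable.
        return None


async def close_credential(credential):
    # Mirrors the failing line in the traceback: await the result of close().
    await credential.close()


def run_sync(coro):
    # Simplified stand-in for a sync wrapper that drives a coroutine to completion.
    return asyncio.run(coro)


error = None
try:
    run_sync(close_credential(FakeSyncCredential()))
except TypeError as exc:
    # Awaiting None is rejected even though the calling script is fully synchronous.
    error = exc
    print(type(exc).__name__)
```

The exact wording of the TypeError message varies between Python versions, so the sketch only prints the exception type.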

My config

ubuntu, python 3.10, azure-storage-blob==12.16.0 pandas==2.0.0 pyarrow==11.0.0 adlfs==2023.9.0

TomAugspurger commented 9 months ago

fsspec / adlfs use async internally. Can you try using the credentials from azure.identity.aio instead?

davidsteinar commented 6 months ago

@TomAugspurger @ELToulemonde I get the same error; it makes no difference whether I import DefaultAzureCredential from azure.identity.aio or from azure.identity. Did you solve this?

mkp-jansen commented 6 months ago

Same error here, any solutions?

marktodisco commented 6 months ago

Importing DefaultAzureCredential from azure.identity.aio silenced that error for me.

Python 3.10.13 on Ubuntu.

Package                     Version  
--------------------------- ---------
adlfs                       2023.12.0
azure-ai-ml                 1.12.1
azure-common                1.1.28
azure-core                  1.29.6
azure-datalake-store        0.0.53
azure-identity              1.15.0
azure-mgmt-core             1.4.0
azure-mgmt-resource         23.0.1
azure-mgmt-storage          21.1.0
azure-mgmt-subscription     3.1.1
azure-storage-blob          12.19.0
azure-storage-file-datalake 12.14.0
azure-storage-file-share    12.15.0
pyarrow                     14.0.2

mhtrinh commented 1 month ago

I am aware that using azure.identity.aio silences the error, but why is async involved in a non-async call?

TomAugspurger commented 1 month ago

fsspec uses asyncio internally.

I'd recommend people use credentials from azure.identity.aio. If someone wants, we could add an inspect.iscoroutine check before we call .close.
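The suggested check could look roughly like the sketch below. It is only an illustration under assumptions: FakeSyncCredential and FakeAsyncCredential are hypothetical stand-ins, and this close_credential is not the actual adlfs function. The idea is to call close() first and await the result only when it is a coroutine.

```python
import asyncio
import inspect


class FakeSyncCredential:
    """Hypothetical credential whose close() is a plain method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True  # returns None


class FakeAsyncCredential:
    """Hypothetical credential whose close() is a coroutine function."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


async def close_credential(credential):
    result = credential.close()
    # Only await when close() actually produced a coroutine object.
    if inspect.iscoroutine(result):
        await result


sync_cred = FakeSyncCredential()
async_cred = FakeAsyncCredential()
asyncio.run(close_credential(sync_cred))
asyncio.run(close_credential(async_cred))
print(sync_cred.closed, async_cred.closed)
```

A broader variant would use inspect.isawaitable, which also accepts awaitables that are not coroutine objects.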

mhtrinh commented 1 month ago

That may solve one of our issues: I am using adlfs as part of complex code that uses a ThreadPool. At the end of the run, I get this message. It does not change the exit code, so it is not fatal, but it looks a bit ugly:

Traceback (most recent call last):
  File ".../lib/python3.10/weakref.py", line 667, in _exitfunc
    f()
  File ".../lib/python3.10/weakref.py", line 591, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File ".../lib/python3.10/site-packages/adlfs/utils.py", line 78, in close_credential
    await file_obj.credential.close()
TypeError: object NoneType can't be used in 'await' expression

I did not manage to create a small reproducible example ...