Apache Airflow Provider(s)
microsoft-azure
Versions of Apache Airflow Providers
apache-airflow-providers-microsoft-azure 11.1.0
Apache Airflow version
2.9.2
Operating System
Ubuntu 22.04.4
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
What happened
Scenario: I need to use Azure storage for Airflow remote logging.
Step 1 is verifying that the connection works, so I'm using ADLSListOperator as a test case. On the connection I have set the following properties:
Azure Client ID:
Azure Client Secret:
Azure Tenant ID:
Azure DataLake Store Name: <e.g. mystorageaccount>
The store name's fully qualified URL is https://mystorageaccount.blob.core.windows.net/
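For reference, here is roughly how I am defining the connection; a minimal sketch using Airflow's Connection object, assuming the un-prefixed extra keys tenant and account_name documented for the azure_data_lake connection type (the conn_id and all values are placeholders):

```python
# Hedged sketch: building the azure_data_lake connection programmatically to
# inspect the URI Airflow would use. The extra keys "tenant" and "account_name"
# are my assumption from the provider docs; all values are placeholders.
from airflow.models.connection import Connection

conn = Connection(
    conn_id="azure_data_lake_default",
    conn_type="azure_data_lake",
    login="<client_id>",
    password="<client_secret>",
    extra={"tenant": "<tenant_id>", "account_name": "mystorageaccount"},
)
# Export the printed URI as AIRFLOW_CONN_AZURE_DATA_LAKE_DEFAULT to register it.
print(conn.get_uri())
```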
I know the client ID, secret, and tenant ID are all valid: the same credentials successfully work against the storage account from a PythonOperator using the azure.storage.blob library (sketched below). If I try to use the ADLS connection with ADLSListOperator from apache-airflow-providers-microsoft-azure (11.1.0), it fails. The error log indicates it is trying to connect to the wrong domain, and the none host prefix suggests the account name is not being picked up at all - e.g. ConnectionError(MaxRetryError("HTTPSConnectionPool(host='none.azuredatalakestore.net'
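For concreteness, this is the kind of check that does succeed with the same service principal; a sketch of the callable I run via PythonOperator (account and container names are placeholders):

```python
# Hedged sketch of the working verification: the same service principal
# listing blobs over blob.core.windows.net. Names below are placeholders.
from azure.identity import ClientSecretCredential
from azure.storage.blob import BlobServiceClient

def list_blobs():
    credential = ClientSecretCredential(
        tenant_id="<tenant_id>",
        client_id="<client_id>",
        client_secret="<client_secret>",
    )
    client = BlobServiceClient(
        account_url="https://mystorageaccount.blob.core.windows.net",
        credential=credential,
    )
    for blob in client.get_container_client("mycontainer").list_blobs():
        print(blob.name)
```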
The azuredatalakestore.net domain is used by legacy Azure Data Lake Storage Gen1 accounts. New storage accounts cannot use this domain; they are served from blob.core.windows.net (and, for ADLS Gen2, dfs.core.windows.net).
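For comparison, a Gen2-aware client addresses the account at dfs.core.windows.net. Here is a minimal sketch using the azure-storage-file-datalake SDK directly, with placeholder account, container, and directory names:

```python
# Hedged sketch: listing paths on an ADLS Gen2 account via the azure SDK.
# This targets dfs.core.windows.net rather than azuredatalakestore.net.
from azure.identity import ClientSecretCredential
from azure.storage.filedatalake import DataLakeServiceClient

credential = ClientSecretCredential(
    tenant_id="<tenant_id>",
    client_id="<client_id>",
    client_secret="<client_secret>",
)
service = DataLakeServiceClient(
    account_url="https://mystorageaccount.dfs.core.windows.net",
    credential=credential,
)
fs = service.get_file_system_client("mycontainer")
for path in fs.get_paths(path="some/dir"):
    print(path.name)
```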
If anyone has successfully used ADLSListOperator against a storage account hosted at blob.core.windows.net, I'd be curious to know the configuration used. The documentation and examples I've found are sparse and inconsistent.
I've tried the connection type azure_data_lake (as described above) as well as the adls and wasb types; a sketch of the wasb variant follows.
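My understanding of the wasb service-principal setup comes from the provider connection docs and is not verified against the hook source, so treat this shape as an assumption:

```python
# Hedged sketch of the wasb connection variant I tried. My assumption is that
# with "tenant_id" in extra, login/password are read as client_id/client_secret
# and host supplies the account URL. All values are placeholders.
from airflow.models.connection import Connection

wasb_conn = Connection(
    conn_id="wasb_default",
    conn_type="wasb",
    host="https://mystorageaccount.blob.core.windows.net",
    login="<client_id>",
    password="<client_secret>",
    extra={"tenant_id": "<tenant_id>"},
)
print(wasb_conn.get_uri())
```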
What you think should happen instead
I would expect ADLSListOperator to list files, but the task times out, presumably because it is trying to connect to the wrong domain.
How to reproduce
Create an Azure storage account that uses the blob.core.windows.net domain (which should be any new storage account on Azure).
Set up an azure_data_lake connection with a valid client ID, client secret, tenant ID, and account name.
Run an ADLSListOperator task against that connection, as in the sketch below.
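A minimal DAG that reproduces the failure for me looks roughly like this (the dag_id, path, and start date are placeholders):

```python
# Hedged sketch of the reproducing DAG; dag_id, path, and dates are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.microsoft.azure.operators.adls import ADLSListOperator

with DAG(
    dag_id="adls_list_test",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
):
    # Fails by resolving host='none.azuredatalakestore.net' instead of the
    # storage account's blob.core.windows.net endpoint.
    ADLSListOperator(
        task_id="list_adls_files",
        path="*",
        azure_data_lake_conn_id="azure_data_lake_default",
    )
```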
Anything else
Always; it hasn't worked successfully yet.
Are you willing to submit PR?
Code of Conduct