Closed ahmadfarhan97 closed 1 year ago
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
You are using WasbTaskHandler, which needs delete_local_copy to be passed. You need to either update WASB_REMOTE_HANDLERS with this or pass delete_local_copy through REMOTE_TASK_HANDLER_KWARGS.
This was changed in the commit below to retrieve the value from configuration:
commit b6392ae5fd466fa06ca92c061a0f93272e27a26b Author: Hussein Awala houssein.awala.96@gmail.com Date: Tue Mar 7 17:30:56 2023 +0100
You can fix the issue by updating the provided config file:
  elif REMOTE_BASE_LOG_FOLDER.startswith("wasb"):
      WASB_REMOTE_HANDLERS: dict[str, dict[str, str | bool | None]] = {
          "task": {
              "class": "airflow.providers.microsoft.azure.log.wasb_task_handler.WasbTaskHandler",
              "formatter": "airflow",
              "base_log_folder": str(os.path.expanduser(BASE_LOG_FOLDER)),
              "wasb_log_folder": REMOTE_BASE_LOG_FOLDER,
              "wasb_container": "airflow-logs",
              "filename_template": FILENAME_TEMPLATE,
+             "delete_local_copy": False,
          },
      }
or by upgrading the provider to the latest version, which uses False as the default value. In Airflow 2.6.0 you will be able to configure delete_local_copy through the Airflow config.
I'll check why you have this problem, but can you test whether this change fixes it?
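The other option mentioned above, passing delete_local_copy as an extra handler keyword argument instead of editing the handler dict by hand, can be sketched roughly as follows. This is a minimal sketch, not the provider's own code: the helper name add_handler_kwargs and the BASE_CONFIG dict are hypothetical stand-ins. In a real log_config.py the base would typically be a deep copy of airflow.config_templates.airflow_local_settings.DEFAULT_LOGGING_CONFIG.

```python
from copy import deepcopy


def add_handler_kwargs(logging_config, handler_name, **kwargs):
    """Return a copy of a dictConfig-style logging config with extra
    keyword arguments merged into one handler's definition."""
    patched = deepcopy(logging_config)  # leave the caller's dict untouched
    patched["handlers"][handler_name].update(kwargs)
    return patched


# Hypothetical stand-in for the "task" handler section of the logging config.
BASE_CONFIG = {
    "handlers": {
        "task": {
            "class": "airflow.providers.microsoft.azure.log.wasb_task_handler.WasbTaskHandler",
            "formatter": "airflow",
            "wasb_container": "airflow-logs",
        }
    }
}

# The object that logging_config_class would point at (log_config.LOGGING_CONFIG).
LOGGING_CONFIG = add_handler_kwargs(BASE_CONFIG, "task", delete_local_copy=False)
```

Working on a copy keeps the default config unchanged, so the same helper can be reused if other handler arguments need to be overridden later.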
@hussein-awala I added the line you suggested, but now I'm getting a different error. The following is what I get for the server pod:
{wasb_task_handler.py:133} ERROR - can't list blobs
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/microsoft/azure/log/wasb_task_handler.py", line 129, in _read_remote_logs
blob_names = self.hook.get_blobs_list(container_name=self.wasb_container, prefix=prefix)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/microsoft/azure/hooks/wasb.py", line 276, in get_blobs_list
for blob in blobs:
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/paging.py", line 132, in __next__
return next(self._page_iterator)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/paging.py", line 76, in __next__
self._response = self._get_next(self.continuation_token)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_list_blobs_helper.py", line 100, in _get_next_cb
process_storage_error(error)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_shared/response_handlers.py", line 97, in process_storage_error
raise storage_error
File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_list_blobs_helper.py", line 93, in _get_next_cb
return self._command(
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
return func(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_generated/operations/_container_operations.py", line 2605, in list_blob_hierarchy_segment
pipeline_response = self._client._pipeline.run( # type: ignore # pylint: disable=protected-access
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 205, in run
return first_node.send(pipeline_request) # type: ignore
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 69, in send
response = self.next.send(request)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 69, in send
response = self.next.send(request)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 69, in send
response = self.next.send(request)
[Previous line repeated 2 more times]
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/pipeline/policies/_redirect.py", line 160, in send
response = self.next.send(request)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 69, in send
response = self.next.send(request)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_shared/policies.py", line 546, in send
raise err
File "/home/airflow/.local/lib/python3.8/site-packages/azure/storage/blob/_shared/policies.py", line 520, in send
response = self.next.send(request)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 69, in send
response = self.next.send(request)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 69, in send
response = self.next.send(request)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/pipeline/policies/_authentication.py", line 115, in send
self.on_request(request)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/core/pipeline/policies/_authentication.py", line 92, in on_request
self._token = self._credential.get_token(*self._scopes)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/identity/_credentials/default.py", line 168, in get_token
return super(DefaultAzureCredential, self).get_token(*scopes, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/azure/identity/_credentials/chained.py", line 101, in get_token
raise ClientAuthenticationError(message=message)
azure.core.exceptions.ClientAuthenticationError: DefaultAzureCredential failed to retrieve a token from the included credentials.
Attempted credentials:
EnvironmentCredential: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
Visit https://aka.ms/azsdk/python/identity/environmentcredential/troubleshoot to troubleshoot.this issue.
ManagedIdentityCredential: ManagedIdentityCredential authentication unavailable, no response from the IMDS endpoint.
SharedTokenCacheCredential: SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
AzureCliCredential: Azure CLI not found on path
AzurePowerShellCredential: PowerShell is not installed
To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/python/identity/defaultazurecredential/troubleshoot.
I passed the blob storage secret as a connection string.
And this is how the values.yaml is referring to the secret:
secret:
  - envName: "AIRFLOW_CONN_ADLS"
    secretName: "azure-blob-storage-secret"
    secretKey: "AIRFLOW_CONN_ADLS"
config:
  logging:
    remote_logging: 'True'
    logging_config_class: log_config.LOGGING_CONFIG
    remote_log_conn_id: ADLS
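One thing that may be worth checking from inside the failing pod is whether the connection variable is actually visible to the process. The sketch below assumes Airflow's convention of reading connections from environment variables named AIRFLOW_CONN_ followed by the upper-cased connection id; the helper names are hypothetical, and this only inspects the environment, it does not confirm the cause of the error above.

```python
import os

CONN_ENV_PREFIX = "AIRFLOW_CONN_"


def conn_env_var_name(conn_id):
    """Environment variable name expected for a given connection id."""
    return CONN_ENV_PREFIX + conn_id.upper()


def describe_connection_env(conn_id, environ=None):
    """Report whether the connection variable is present, without printing
    its value (it may contain a secret)."""
    env = os.environ if environ is None else environ
    name = conn_env_var_name(conn_id)
    value = env.get(name)
    if not value:
        return f"{name}: missing or empty"
    return f"{name}: present ({len(value)} characters)"


if __name__ == "__main__":
    # "ADLS" matches remote_log_conn_id in the values.yaml above.
    print(describe_connection_env("ADLS"))
```

Running this with the same user and container as the webserver shows whether the secret was mounted into that pod at all.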
Official Helm Chart version
1.9.0 (latest released)
Apache Airflow version
2.5.3
Kubernetes Version
1.26.3
Helm Chart configuration
Docker Image customizations
What happened
I was trying to connect to Azure Blob Storage for logs from my local machine. I used the Airflow documentation to set up the configuration in values.yaml, and I created a log_config.py file and copied it into the Docker image (as shown in the Docker Image customizations section). I keep getting the following error in the airflow-run-airflow-migrations-xxxxx pod:
What you think should happen instead
The pods keep getting errors.
How to reproduce
The following script is the log_config.py file, which is copied into the image by the Dockerfile, and the empty __init__.py file is copied to the image in the same directory.
Anything else
No response