airflow-helm / charts

The User-Community Airflow Helm Chart is the standard way to deploy Apache Airflow on Kubernetes with Helm. Originally created in 2017, it has since helped thousands of companies create production-ready deployments of Airflow on Kubernetes.
https://github.com/airflow-helm/charts/tree/main/charts/airflow
Apache License 2.0

Remote logging doesn't work #835

Open jkleinkauff opened 3 months ago

jkleinkauff commented 3 months ago

Chart Version

8.8.0

Kubernetes Version

Client Version: v1.29.0-eks-5e0fdde
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.5-eks-5e0fdde

Helm Version

version.BuildInfo{Version:"v3.12.3", GitCommit:"3a31588ad33fe3b89af5a2a54ee1d25bfe6eaa5e", GitTreeState:"clean", GoVersion:"go1.20.7"}

Description

Hey folks! I'm bumping our Airflow version from 2.2.5 to 2.7.3, and I also upgraded the chart version to 8.8.0.

During this upgrade, remote logging seems to have stopped working. I'm unsure in which version it stopped, though.

My config:

airflow:
  config:
    AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: "aws"
    AIRFLOW__LOGGING__ENCRYPT_S3_LOGS: "True"

  connections:
    - id: aws
      type: aws

In the Terraform helm_release resource:

  set {
    name  = "airflow.config.AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER"
    value = "s3://${var.s3_bucket_logs}"
  }

  set {
    name  = "airflow.connections[0].login"
    value = aws_id
  }

  set_sensitive {
    name  = "airflow.connections[0].password"
    value = aws_secret
  }
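
For reference, once Helm merges those set flags into the values above, the effective chart values should look roughly like this (bucket name and credentials are placeholders):

airflow:
  config:
    AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: "aws"
    AIRFLOW__LOGGING__ENCRYPT_S3_LOGS: "True"
    AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: "s3://<logs-bucket>"

  connections:
    - id: aws
      type: aws
      login: <aws-access-key-id>
      password: <aws-secret-access-key>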

IAM policy:

    actions = [
      "s3:ListBucket",
      "s3:List*",
      "s3:Get*",
      "s3:PutObject*",
      "s3:PutBucketPublicAccessBlock",
      "s3:GetBucketAcl",
      "s3:GetBucketLocation",
      "s3:DeleteObject",
    ]

    resources = [
      "arn:aws:s3:::loadsmart-airflow-logs-${var.env}/*",
    ]

I know the logs point to a permissions problem, but there must be more to it, since these are the same settings that worked before. I would love any tips, thank you friends!
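
Worth noting for anyone hitting the same trace: the failing call is ListObjectsV2 (the provider's S3 log handler lists keys under the log prefix, as visible in the traceback below), and s3:ListBucket is authorized against the bucket ARN itself, not the /* object ARN, which is the only resource in the policy above. If that is the cause, the resources list would need both ARNs, something like:

    resources = [
      "arn:aws:s3:::loadsmart-airflow-logs-${var.env}",
      "arn:aws:s3:::loadsmart-airflow-logs-${var.env}/*",
    ]

That could also explain why the object-only ARN worked with the older Airflow/provider version, if it read the exact log key without listing.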

Relevant Logs

kubectl logs airflow-staging-web-7bd6d8c75b-c9bst -n airflow:

[2024-03-06T00:17:13.350+0000] {base.py:73} INFO - Using connection ID 'aws' for task execution.
[2024-03-06T00:17:13.351+0000] {connection_wrapper.py:378} INFO - AWS Connection (conn_id='aws', conn_type='aws') credentials retrieved from login and password.
[2024-03-06T00:17:14.267+0000] {app.py:1744} ERROR - Exception on /api/v1/dags/airflow_db_cleanup_dag/dagRuns/manual__2024-03-06T00:16:59.449220+00:00/taskInstances/print_configuration/logs/1 [GET]
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/flask/app.py", line 2529, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/airflow/.local/lib/python3.9/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/airflow/.local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/airflow/.local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/decorators/decorator.py", line 68, in wrapper
    response = function(request)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/decorators/uri_parsing.py", line 149, in wrapper
    response = function(request)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/decorators/validation.py", line 399, in wrapper
    return function(request)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/decorators/response.py", line 113, in wrapper
    return _wrapper(request, response)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/decorators/response.py", line 90, in _wrapper
    self.operation.api.get_connexion_response(response, self.mimetype)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/apis/abstract.py", line 366, in get_connexion_response
    return cls._framework_to_connexion_response(response=response, mimetype=mimetype)
  File "/home/airflow/.local/lib/python3.9/site-packages/connexion/apis/flask_api.py", line 165, in _framework_to_connexion_response
    body=response.get_data() if not response.direct_passthrough else None,
  File "/home/airflow/.local/lib/python3.9/site-packages/werkzeug/wrappers/response.py", line 314, in get_data
    self._ensure_sequence()
  File "/home/airflow/.local/lib/python3.9/site-packages/werkzeug/wrappers/response.py", line 376, in _ensure_sequence
    self.make_sequence()
  File "/home/airflow/.local/lib/python3.9/site-packages/werkzeug/wrappers/response.py", line 391, in make_sequence
    self.response = list(self.iter_encoded())
  File "/home/airflow/.local/lib/python3.9/site-packages/werkzeug/wrappers/response.py", line 50, in _iter_encoded
    for item in iterable:
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/log/log_reader.py", line 87, in read_log_stream
    logs, metadata = self.read_log_chunks(ti, current_try_number, metadata)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/log/log_reader.py", line 64, in read_log_chunks
    logs, metadatas = self.log_handler.read(ti, try_number, metadata=metadata)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/log/file_task_handler.py", line 413, in read
    log, out_metadata = self._read(task_instance, try_number_element, metadata)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/amazon/aws/log/s3_task_handler.py", line 149, in _read
    return super()._read(ti, try_number, metadata)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/log/file_task_handler.py", line 313, in _read
    remote_messages, remote_logs = self._read_remote_logs(ti, try_number, metadata)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/amazon/aws/log/s3_task_handler.py", line 123, in _read_remote_logs
    keys = self.hook.list_keys(bucket_name=bucket, prefix=prefix)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 89, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 820, in list_keys
    for page in response:
  File "/home/airflow/.local/lib/python3.9/site-packages/botocore/paginate.py", line 269, in __iter__
    response = self._make_request(current_kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/botocore/paginate.py", line 357, in _make_request
    return self._method(**current_kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/botocore/client.py", line 980, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
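
One way to confirm whether this is purely an IAM problem is to replay the failing call with the same access key the aws connection uses (bucket name and prefix below are placeholders):

export AWS_ACCESS_KEY_ID=<aws-access-key-id>
export AWS_SECRET_ACCESS_KEY=<aws-secret-access-key>
aws s3api list-objects-v2 --bucket <logs-bucket> --prefix <dag-log-prefix>

If this returns the same AccessDenied, the chart configuration is not the issue.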

Custom Helm Values

airflow:
  config:
    AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: "aws"
    AIRFLOW__LOGGING__ENCRYPT_S3_LOGS: "True"

  connections:
    - id: aws
      type: aws

tusharraichand commented 3 months ago

I have a similar issue, but with GCP: my logs are being pushed to the GCS bucket, but the Airflow UI cannot read them. I see the following error in the Airflow UI:

*** No logs found in GCS; ti=%s <TaskInstance: xyz>
*** Could not read served logs: [Errno -2] Name or service not known

Also, I am new to Helm. What is the correct way of passing these values to values.yaml?

airflow:
  config:
    AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
    AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: "gs://airflow/logs"
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: "my_conn"

  connections:
    - id: my_conn
      type: google_cloud_platform
      description: my GCP connection

When I add these values to values.yaml and run a helm upgrade, I see no changes happening to any pod/deployment.
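
In case it helps: assuming the release is named airflow in namespace airflow and was installed from the user-community repo, the usual way to apply a values file is:

helm repo add airflow-stable https://airflow-helm.github.io/charts
helm repo update
helm upgrade airflow airflow-stable/airflow --namespace airflow --values ./values.yaml

If the upgrade succeeds but nothing restarts, double-check the file is actually being picked up: helm get values airflow -n airflow shows what the release really received.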

b0kky commented 3 months ago

+1. I found a workaround: passing the creds through as ENV variables, see https://github.com/airflow-helm/charts/discussions/833#discussioncomment-8617171
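
In case it is useful to others, a minimal sketch of that workaround using this chart's airflow.extraEnv value (the Secret name and keys here are hypothetical and must be created separately):

airflow:
  extraEnv:
    - name: AWS_ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: airflow-aws-creds
          key: access-key-id
    - name: AWS_SECRET_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: airflow-aws-creds
          key: secret-access-key

With those set, boto3 can pick the credentials up from the environment instead of the Airflow connection.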