apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

PermissionDenied : Unable to use docker-compose due to UID conflicts #17320

Closed. ImadYIdrissi closed this issue 3 years ago.

ImadYIdrissi commented 3 years ago

Apache Airflow version: 2.1.0

Environment:

What happened: When trying to run $ sudo docker-compose run airflow-init bash, the initialization fails with:

Creating be-api_airflow-init_run ... done
....................
ERROR! Maximum number of retries (20) reached.

Last check result:
$ airflow db check
Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 5, in <module>
    from airflow.__main__ import main
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/__init__.py", line 34, in <module>
    from airflow import settings
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/settings.py", line 35, in <module>
    from airflow.configuration import AIRFLOW_HOME, WEBSERVER_CONFIG, conf  # NOQA F401
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/configuration.py", line 1115, in <module>
    conf = initialize_config()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/configuration.py", line 836, in initialize_config
    with open(AIRFLOW_CONFIG, 'w') as file:
PermissionError: [Errno 13] Permission denied: '/home/airflow/airflow.cfg'
ERROR: 1

What you expected to happen:

I expected to see a correct initialization of the container, with the proper file permissions for the UID specified in the docker-compose.yml file, and an output that resembles this:

Creating be-api_airflow-init_run ... done
BACKEND=postgresql+psycopg2
DB_HOST=postgres
DB_PORT=5432

DB: postgresql+psycopg2://airflow:***@postgres/airflow
[2021-07-29 16:25:03,687] {db.py:695} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
Upgrades done
airflow already exist in the db
airflow@7a15c956e187:/opt/airflow$ cd /home/airflow/
airflow@7a15c956e187:~$

P.S.: This output is achieved by using UID=50000 in the .env file that accompanies the docker-compose.yml file.

When using a different UID (1001 in my case) in order to match the file permissions of ./dags, ./logs, and ./plugins, the error occurs. I think UID=50000 was enforced at some point in the Dockerfile of the Airflow image and is not correctly substituted when docker-compose.yml tries to override it, so the files under /home/airflow are still created with UID 50000 as owner, while the sub-directories ./dags, ./logs, and ./plugins carry the UID/GID of the host system.

There are 2 major issues with the approach of using a fixed UID:

  1. If I have to create and use a single UID=50000 for all Airflow operations, then the Airflow file tree on the host cannot be worked on by different users, e.g. devs pulling new changes from git...
  2. Even if this works properly and we can use a UID other than 50000, it still restricts actions to a single user bound to GID=0 (a requirement from Airflow). The result is the same limitation as mentioned earlier, i.e. only one UID will be able to change the host file system. (Maybe I need to create a separate issue for this.)
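
A quick way to confirm this ownership mismatch is to compare the numeric owners of the bind-mounted folders on the host with what the container sees. A minimal sketch (paths and service name taken from the compose file below; the entrypoint is overridden so no database is needed for the check):

    # host side: numeric owners of the bind-mounted folders (UID 1001 here)
    ls -ln src/dags src/logs src/plugins
    # container side: which identity the container runs as and who owns AIRFLOW_HOME
    docker-compose run --rm --no-deps --entrypoint bash airflow-init -c 'id; ls -ln /home/airflow'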

How to reproduce it: Create a project with the following structure:

custom-project
 ┣ src
 ┃ ┣ dags
 ┃ ┃ ┗ hello_geeks.py
 ┃ ┣ logs
 ┃ ┗ plugins
 ┣ .env
 ┣ README.md
 ┗ docker-compose.yml

Use the following files, then run sudo docker-compose run airflow-init bash.

docker-compose.yml file:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

# Basic Airflow cluster configuration for CeleryExecutor with Redis and PostgreSQL.
#
# WARNING: This configuration is for local development. Do not use it in a production deployment.
#
# This configuration supports basic configuration using environment variables or an .env file
# The following variables are supported:
#
# AIRFLOW_IMAGE_NAME           - Docker image name used to run Airflow.
#                                Default: apache/airflow:|version|
# AIRFLOW_UID                  - User ID in Airflow containers
#                                Default: 50000
# AIRFLOW_GID                  - Group ID in Airflow containers
#                                Default: 50000
#
# Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode
#
# _AIRFLOW_WWW_USER_USERNAME   - Username for the administrator account (if requested).
#                                Default: airflow
# _AIRFLOW_WWW_USER_PASSWORD   - Password for the administrator account (if requested).
#                                Default: airflow
# _PIP_ADDITIONAL_REQUIREMENTS - Additional PIP requirements to add when starting all containers.
#                                Default: ''
#
# Feel free to modify this file to suit your needs.
---
    version: '3'
    x-airflow-common:
      &airflow-common
      image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.1.0}
      environment:
        &airflow-common-env
        AIRFLOW__CORE__EXECUTOR: CeleryExecutor
        AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
        AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
        AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
        AIRFLOW__CORE__FERNET_KEY: ''
        AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
        AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
        AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
        AIRFLOW_HOME: '${AIRFLOW_HOME:-/opt/airflow}'
        _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
      volumes:
        - ./src/dags:${AIRFLOW_HOME:-/opt/airflow}/dags
        - ./src/logs:${AIRFLOW_HOME:-/opt/airflow}/logs
        - ./src/plugins:${AIRFLOW_HOME:-/opt/airflow}/plugins
      user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}"
      depends_on:
        &airflow-common-depends-on
        redis:
          condition: service_healthy
        postgres:
          condition: service_healthy

    services:
      postgres:
        image: postgres:13
        environment:
          POSTGRES_USER: airflow
          POSTGRES_PASSWORD: airflow
          POSTGRES_DB: airflow
        volumes:
          - postgres-db-volume:/var/lib/postgresql/data
        healthcheck:
          test: ["CMD", "pg_isready", "-U", "airflow"]
          interval: 5s
          retries: 5
        restart: always

      redis:
        image: redis:latest
        expose:
          - 6379
        healthcheck:
          test: ["CMD", "redis-cli", "ping"]
          interval: 5s
          timeout: 30s
          retries: 50
        restart: always

      airflow-webserver:
        <<: *airflow-common
        command: webserver
        ports:
          - 9999:8080
        healthcheck:
          test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully

      airflow-scheduler:
        <<: *airflow-common
        command: scheduler
        healthcheck:
          test: ["CMD-SHELL", 'airflow jobs check --job-type SchedulerJob --hostname "$${HOSTNAME}"']
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully

      airflow-worker:
        <<: *airflow-common
        command: celery worker
        healthcheck:
          test:
            - "CMD-SHELL"
            - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully

      airflow-init:
        <<: *airflow-common
        command: version
        environment:
          <<: *airflow-common-env
          _AIRFLOW_DB_UPGRADE: 'true'
          _AIRFLOW_WWW_USER_CREATE: 'true'
          _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow}
          _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow}

      airflow-cli:
        <<: *airflow-common
        profiles:
          - debug
        environment:
          <<: *airflow-common-env
          CONNECTION_CHECK_MAX_COUNT: "0"
        # Workaround for entrypoint issue. See: https://github.com/apache/airflow/issues/16252
        command:
          - bash
          - -c
          - airflow

      flower:
        <<: *airflow-common
        command: celery flower
        ports:
          - 5555:5555
        healthcheck:
          test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully

    volumes:
      postgres-db-volume:

.env file:

AIRFLOW_UID=1001
AIRFLOW_GID=0
AIRFLOW_HOME=/home/airflow

ImadYIdrissi commented 3 years ago

I did find this thread that deals with the issue, but I believe the fix should be incorporated into the public community image directly instead of forking with a custom image. For example, the custom image by puckel that solves this issue targets Airflow 1.10.9.

potiuk commented 3 years ago
  1. The Docker Compose setup of Airflow is not production ready. You should use it only for development and testing; if you want a more production-grade setup, I recommend using the community's official Helm Chart (https://airflow.apache.org/docs/helm-chart/stable/index.html) and K8S.

  2. If you mount your local folders as volumes into Airflow, you should make sure that you use your host UID as the user id and GID=0 as the group. This is spelled out in the "Initializing Environment" section of the Docker Compose quick start documentation (https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#initializing-environment). Apparently you missed that step, so let me copy it here (you need to run it once on the host, in the directory where you have the docker-compose file):

    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env

How it works: it changes the user that Airflow runs as to be the same as your host user, and sets the group ID to 0 (a best practice borrowed from OpenShift that makes it possible to run the container image as an arbitrary user).
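
A quick way to check that the substitution actually took effect is to look at the identity a container runs with. A minimal sketch (--no-deps and the overridden entrypoint just avoid starting the database for this check):

    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
    docker-compose run --rm --no-deps --entrypoint bash airflow-init -c 'id'
    # expected: uid=<your host uid> gid=0(root) groups=0(root)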

  3. No other configuration is supported for Docker Compose when you mount your local folders on Linux. This is a Docker limitation, not an Airflow or image limitation. The Airflow image is defined according to OpenShift best practices and allows running as an arbitrary user id, but when you mount a local volume from the host, the user from your host volume is the owner. There are various ways YOU can deal with the problem when you run the container; one of them is the very one that Airflow proposes, where you define and use the host UID and GID=0 to run the image. You can read more here: https://airflow.apache.org/docs/docker-stack/entrypoint.html#allowing-arbitrary-user-to-run-the-container

  4. The solution you copied is to build the image manually and hard-code your UID as the user there. That (obviously) cannot be done in the public image, because we do not know your user id when we prepare the image and each user has a different UID. It's just impossible.
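
For illustration, "building it manually" looks roughly like this. This is only a sketch: it assumes the build context contains the official Airflow Dockerfile, which exposes AIRFLOW_UID as a build argument, and the resulting image is tied to that one UID.

    # sketch: bake the current host UID in at build time (only valid for that UID)
    docker build . \
      --build-arg AIRFLOW_UID="$(id -u)" \
      --tag "my-airflow:2.1.0"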

potiuk commented 3 years ago

BTW, I think you are not aware that your user changes when you run sudo. DO NOT use sudo when you run docker-compose or docker commands, because then they run as the "root" user (UID=0), which is likely the root cause of your problem. Most likely your logs/, dags/, etc. files have been created as owned by that root user, and this is causing all your permission problems. Make sure you do what the Docker installation guide suggests (https://docs.docker.com/engine/install/linux-postinstall/): add your user to the docker group so you do not have to use sudo to run docker commands. Then check what permissions/ownership you have for the dags/, logs/, etc. folders (and the files inside) and change them to be owned by your user rather than root.

And then follow the steps from the quick start.
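
In shell terms, those host-side steps look roughly like this (a sketch; the folder names are taken from the layout in this issue):

    # allow running docker without sudo (log out and back in afterwards)
    sudo usermod -aG docker "$USER"
    # hand the mounted folders back to your own user instead of root
    sudo chown -R "$(id -u):$(id -g)" src/dags src/logs src/plugins
    ls -ln src/dags src/logs src/plugins   # verify: no more root-owned files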

ImadYIdrissi commented 3 years ago

I don't understand why this thread was closed. The causes of this issue are still not clearly identified or confirmed.

The Docker Compose setup of Airflow is not production ready. You should use it only for development and testing

I am not trying to use it for production, we are merely testing this approach, and trying to understand it.

you should make sure that you use your host UID as the user id and GID=0 as the group.

I thought I did set AIRFLOW_UID to 1001 and AIRFLOW_GID to 0 (manually) in the .env file. Is it not the same as doing echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env?
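
A quick sanity check that the hand-written .env matches what the suggested command would generate:

    id -u      # if this prints 1001 for the user running docker-compose, the two are equivalent
    cat .env   # should show AIRFLOW_UID=1001 and AIRFLOW_GID=0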

BTW, I think you are not aware that your user changes when you run sudo... Most likely your logs/, dags/, etc. files have been created as owned by that root user, and this is causing all your permission problems.

My dags, logs, etc. are not root-owned; they are user-owned (UID 1001). I hope I am not badly mistaken about this, but if so, could you kindly clarify the misconception? (see the attached screenshots)

The solution you copied is to build the image manually and hard-code your UID as the user there.

Could you clarify what you mean by manually? Is it because I put it in the .env myself? As I have asked earlier, wouldn't the above-mentioned command do just that?

I will follow your recommendation about adding the user to the docker group to avoid using sudo. Thank you.

P.S.: Please find the .env file content below the docker-compose.yml enclosed in the initial post; it may have been easy to miss since the main content is much longer.

potiuk commented 3 years ago

The reason it was closed is that you indicated you expect things to work out of the box, even though you said yourself that you tried to change users (first 50000 and then 1001) and that things had changed since the previous run. That indicated you expected more from the docker-compose setup than it is intended to provide.

The "quick start" is really to get you quick-started and if you want to change anything (like change the user and experiment with the setting), there is not much we can do to "solve" the issue you raised as bug. From your message it seems that you had a history of using this setup and that you "expect" it to work under this different circumstances (one of the problems was that you used sudo to run the docker compose which is guaranteed not to work).

You also indicated "If I have to create and use a single UID=50000 for all Airflow operations, then the Airflow file tree on the host cannot be worked on by different users, e.g. devs pulling new changes from git...", which means you wanted to make the docker-compose setup work for many users, but that is definitely not its intention. It is there to let a single user quick-start and run Airflow on that user's machine. That's it. There are other ways to make Airflow work for multiple users in a development environment, but the quick-start docker-compose is not one of them. That is the "production" use I was referring to, which was indeed a bit too narrow; "multi-user" would be more appropriate. It is not designed to be used by multiple users, so it is not a bug that it does not work this way.

The quick-start documentation is just that: a quick start. No more, no less. You have to follow it strictly to get it working; if you deviate from it, it might or might not work. When you open an issue labelled "bug", I think you expect it to be fixed, but there is no reasonable action anyone can take here to "fix" the problems you were experiencing. That's why the issue was closed.

The best thing you can do is wipe out your whole setup, start from scratch, follow the quick start precisely with no deviations, and see if it works. If it does not, please report it here with all the information you can and I will be happy to reopen the issue (if it indicates that our quick-start instructions are wrong), or even better, open a PR fixing it straight away.
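
For reference, a from-scratch restart along those lines is roughly the following (a sketch; `down --volumes` also wipes the metadata database, and the paths follow the layout from this issue):

    docker-compose down --volumes --remove-orphans   # stop everything and drop volumes
    rm -rf src/logs/*                                # optional: remove logs created under the old UID
    echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env
    docker-compose up airflow-init                   # run the init step once
    docker-compose up                                # then start the stack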

If you want to discuss the usage and results of your experiments, you can use GitHub Discussions or Slack. But opening a "bug" in this case is clearly a mistake resulting from misuse and not following the instructions, rather than a bug in Airflow. This is precisely why it was marked as "invalid" and closed.

And if you want to propose a new feature or change how the docker-compose setup works, feel free. We actually have an open feature request for a more "versatile" docker-compose setup with more examples and possibly wizard-like generation: https://github.com/apache/airflow/issues/16031. It would be great if you could contribute to that.

potiuk commented 3 years ago

Could you clarify what you mean by manually? Is it because I put it in the .env myself? As I have asked earlier, wouldn't the above-mentioned command do just that?

"Manually" means adding it to the Dockerfile and specifying it verbatim at docker build time. This is at least one of the solutions in the thread you mentioned (you did not say which one you meant, so I picked the one that concluded the thread).

potiuk commented 3 years ago

I thought I did set AIRFLOW_UID to 1001 and AIRFLOW_GID to 0 (manually) in the .env file. Is it not the same as doing echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env?

This looks good. As mentioned above, I recommend you wipe it out, restart, and see if you still have problems. In your earlier comments you mentioned that you previously used a different user and that you used sudo to run the docker command (which is guaranteed not to work, because then your containers run as root rather than as your own user). So I guessed (as I wrote) that you missed the .env setup and ran it without it as the sudo user (which would create those directories as the root user). That was my line of thought.

MingTaLee commented 1 year ago

I ran into a similar issue to @ImadYIdrissi's.

I'm setting up an Airflow development/test environment with the Airflow 2.5.3 Docker image and the docker-compose.yaml file from the official Apache Airflow website.

The server starting/running docker-compose is an AWS EC2 instance running Ubuntu 20.04.6 LTS. The user running docker-compose up airflow-init and docker-compose up has UID 1002, and dags/logs/plugins are owned by that user on the host side. (I did not use sudo when running the docker-compose commands.)

I already set AIRFLOW_UID=1002 in the .env file as suggested by @potiuk.

Checking inside the airflow-airflow-worker-1 container after the containers are initialized and ready, I can see three users: root, airflow, and default. User "airflow" has UID 50000 and user "default" has UID 1002, but the latter is a nologin user and its home is set to the same as user airflow's (/home/airflow).

If I change the line user: "${AIRFLOW_UID:-50000}:0" to user: "${AIRFLOW_UID:-1002}:0" in docker-compose.yaml, there is no user account named "default"; however, inside the containers the account "airflow" still uses UID 50000, and dags/logs/plugins are still owned by 1002:root.

I have scrapped the containers and restarted with docker-compose down/up several times with different settings, and none gave me proper ownership of the volume bind paths (i.e. dags/logs/plugins).

Are there any other settings that I missed?

Any input is appreciated! Thanks!

potiuk commented 1 year ago

What exact stack trace do you have? You mentioned "similar", but you forgot to attach the error or stack trace you got (unless you also have some modifications in your image or in the variables you pass to it).

I believe the problem might be that you also override the AIRFLOW_HOME variable or something similar.

By default, when you set the AIRFLOW_UID variable, the following things happen:

1) A new user is created, and what you observe is correct: it should have its home set to /home/airflow, UID=1002, and GID=0. So far so good.

This is what I have when I enter the image with UID=1002. This is exactly as expected:

default@aa130926bd39:/opt/airflow$ cat /etc/passwd | grep default
default:x:1002:0:default user:/home/airflow:/sbin/nologin

2) The AIRFLOW_HOME variable should be set to /opt/airflow, and the default airflow.cfg should already be created there:

default@aa130926bd39:/opt/airflow$ set |grep AIRFLOW_HOME
AIRFLOW_HOME=/opt/airflow
default@aa130926bd39:/opt/airflow$ ls ${AIRFLOW_HOME}
airflow.cfg  airflow.db  dags  logs  webserver_config.py

3) I can run airflow config list and it works fine, showing the values from airflow.cfg, because the default user belongs to group 0, the /opt/airflow folder is owned by group 0, and the group has all the permissions there. I can even remove airflow.cfg and run airflow help or another command and it will get recreated:

default@aa130926bd39:/opt/airflow$ airflow config list
[core]
dags_folder = /opt/airflow/dags
hostname_callable = airflow.utils.net.getfqdn

(removed for brevity)

default@aa130926bd39:/opt/airflow$ ls
airflow.cfg  airflow.db  dags  logs  webserver_config.py
default@aa130926bd39:/opt/airflow$ rm airflow.cfg
default@aa130926bd39:/opt/airflow$ ls
airflow.db  dags  logs  webserver_config.py
default@aa130926bd39:/opt/airflow$ airflow help
usage: airflow [-h] GROUP_OR_COMMAND ...

(removed for brevity)

default@aa130926bd39:/opt/airflow$ ls
airflow.cfg  airflow.db  dags  logs  webserver_config.py
default@aa130926bd39:/opt/airflow$ ls -la
total 84
drwxrwxr-x 1 airflow root  4096 May 29 09:33 .
drwxr-xr-x 1 root    root  4096 Mar 31 22:55 ..
-rw------- 1 default root 51721 May 29 09:33 airflow.cfg
-rw-r--r-- 1 default root     0 May 29 09:24 airflow.db
drwxrwxr-x 2 airflow root  4096 Mar 31 22:55 dags
drwxrwxr-x 1 airflow root  4096 May 29 09:24 logs
-rw-rw-r-- 1 default root  4771 May 29 09:24 webserver_config.py
default@aa130926bd39:/opt/airflow$

So I wonder: what's your error, @MingTaLee, and what do the above commands show?

MingTaLee commented 1 year ago

@potiuk Thank you very much for your reply, really appreciated!

Our purpose is to set up a development and testing environment for my colleagues to test their DAGs. We would like to have the dags folder bound to another directory on the server side using Docker volume settings. My colleagues will use the account "airflow" to SSH into the worker container, modify DAGs, git push to the repository, and test them.

However, since the volume bound into the container is now owned by the user named "default", user "airflow" does not have permission to do the work (modifications and git). And if I modify the owner or permissions inside the container, I will mess them up on the server side...

Below is what I did / not did:

  1. I didn't use Docker Swarm to manage the service, simply docker-compose up airflow-init and then docker-compose up to start. So unfortunately I don't have a stack trace to provide here (or do you mean there is something else called a stack trace? If so, please help me with how I can find it!).

  2. Another important issue I forgot to mention earlier: in my first few tests starting the service with docker-compose up (with the 2.5.3 Docker image from Airflow directly, no modifications at this stage), the tests failed. After investigation, I found that the webserver_config.py file was not created properly: instead of a file, an empty folder with that name was created, and no airflow.cfg file was to be found. I had to manually provide the airflow.cfg and webserver_config.py I had copied from my previous test (back when testing Airflow 1.10.12). NOT SURE WHETHER SOMETHING GOT MESSED UP IN THIS STEP... I simply assumed that since the service started, those settings should be OK...

  3. Later I did modify the image a bit to add SSH and PyMySQL. Here is the modified Dockerfile:

    
    FROM apache/airflow:2.5.3-python3.8

    LABEL description="Modify from Airflow 2.5.3 image by Apache. Add openssh-server / PyMySQL. New_name: airflow253:v2.01" version="2.01"

    RUN export DEBIAN_FRONTEND=noninteractive \
        && python3 -m pip install --no-cache-dir --upgrade pip \
        && python3 -m pip install --upgrade setuptools \
        && python3 -m pip install --no-cache-dir pymysql

    USER root

    RUN export DEBIAN_FRONTEND=noninteractive \
        && apt-get update && apt-get -y upgrade \
        && apt-get install -y openssh-server git \
        && apt-get purge && apt-get clean && apt-get autoclean && apt-get remove && apt-get -y autoremove \
        && rm -Rf /root/.cache/pip \
        && rm -rf /var/lib/apt/lists/*

    CMD ["/bin/bash"]


I don't think these modifications change the variables you mentioned, but I may be wrong.

Below is the output of the commands you mentioned:

default@51d57fa9c7f2:/opt/airflow$ cat /etc/passwd | grep default
default:x:1002:0:default user:/home/airflow:/sbin/nologin

default@51d57fa9c7f2:/opt/airflow$ set |grep AIRFLOW_HOME
AIRFLOW_HOME=/opt/airflow
default@51d57fa9c7f2:/opt/airflow$ ls ${AIRFLOW_HOME}
airflow-worker.pid  airflow.cfg  dags  logs  plugins  webserver_config.py

default@51d57fa9c7f2:/opt/airflow$ airflow config list [core] dags_folder = /opt/airflow/dags hostname_callable = airflow.utils.net.getfqdn default_timezone = utc executor = CeleryExecutor parallelism = 32 max_active_tasks_per_dag = 16 dags_are_paused_at_creation = True max_active_runs_per_dag = 16 load_examples = True plugins_folder = /opt/airflow/plugins execute_tasks_new_python_interpreter = False fernet_key = donot_pickle = True dagbag_import_timeout = 30.0 dagbag_import_error_tracebacks = True dagbag_import_error_traceback_depth = 2 dag_file_processor_timeout = 50 task_runner = StandardTaskRunner default_impersonation = security = unit_test_mode = False enable_xcom_pickling = False allowed_deserialization_classes = airflow..* killed_task_cleanup_time = 60 dag_run_conf_overrides_params = True dag_discovery_safe_mode = True dag_ignore_file_syntax = regexp default_task_retries = 0 default_task_retry_delay = 300 default_task_weight_rule = downstream default_task_execution_timeout = min_serialized_dag_update_interval = 30 compress_serialized_dags = False min_serialized_dag_fetch_interval = 10 max_num_rendered_ti_fields_per_task = 30 check_slas = True xcom_backend = airflow.models.xcom.BaseXCom lazy_load_plugins = True lazy_discover_providers = True hide_sensitive_var_conn_fields = True sensitive_var_conn_names = default_pool_task_slot_count = 128 max_map_length = 1024 daemon_umask = 0o077 sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@postgres/airflow

[database] sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@postgres/airflow sql_engine_encoding = utf-8 sql_alchemy_pool_enabled = True sql_alchemy_pool_size = 5 sql_alchemy_max_overflow = 10 sql_alchemy_pool_recycle = 1800 sql_alchemy_pool_pre_ping = True sql_alchemy_schema = load_default_connections = True max_db_retries = 3

[logging] base_log_folder = /opt/airflow/logs remote_logging = False remote_log_conn_id = google_key_path = remote_base_log_folder = encrypt_s3_logs = False logging_level = INFO celery_logging_level = fab_logging_level = WARNING logging_config_class = colored_console_log = True colored_log_format = [%(blue)s%(asctime)s%(reset)s] {%(blue)s%(filename)s:%(reset)s%(lineno)d} %(log_color)s%(levelname)s%(reset)s - %(log_color)s%(message)s%(reset)s colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter log_format = [%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s simple_log_format = %(asctime)s %(levelname)s - %(message)s dag_processor_log_target = file dag_processor_log_format = [%(asctime)s] [SOURCE:DAG_PROCESSOR] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s log_formatter_class = airflow.utils.log.timezone_aware.TimezoneAware task_log_prefix_template = log_filename_template = dag_id={{ ti.dag_id }}/run_id={{ ti.run_id }}/task_id={{ ti.task_id }}/{% if ti.map_index >= 0 %}map_index={{ ti.map_index }}/{% endif %}attempt={{ try_number }}.log log_processor_filename_template = {{ filename }}.log dag_processor_manager_log_location = /opt/airflow/logs/dag_processor_manager/dag_processor_manager.log task_log_reader = task extra_logger_names = worker_log_server_port = 8793

[metrics] statsd_on = False statsd_host = localhost statsd_port = 8125 statsd_prefix = airflow statsd_allow_list = stat_name_handler = statsd_datadog_enabled = False statsd_datadog_tags =

[secrets] backend = backend_kwargs =

[cli] api_client = airflow.api.client.local_client endpoint_url = http://localhost:8080

[debug] fail_fast = False

[api] enable_experimental_api = False auth_backends = airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session maximum_page_limit = 100 fallback_page_limit = 100 google_oauth2_audience = google_key_path = access_control_allow_headers = access_control_allow_methods = access_control_allow_origins =

[lineage] backend =

[atlas] sasl_enabled = False host = port = 21000 username = password =

[operators] default_owner = airflow default_cpus = 1 default_ram = 512 default_disk = 512 default_gpus = 0 default_queue = default allow_illegal_arguments = False

[hive] default_hive_mapred_queue =

[webserver] base_url = http://localhost:8080 default_ui_timezone = UTC web_server_host = 0.0.0.0 web_server_port = 8080 web_server_ssl_cert = web_server_ssl_key = session_backend = database web_server_master_timeout = 120 web_server_worker_timeout = 120 worker_refresh_batch_size = 1 worker_refresh_interval = 6000 reload_on_plugin_change = False secret_key = BSlegi2JIGb8pADrl2RNYw== workers = 4 worker_class = sync access_logfile = - error_logfile = - access_logformat = expose_config = False expose_hostname = False expose_stacktrace = False dag_default_view = grid dag_orientation = LR log_fetch_timeout_sec = 5 log_fetch_delay_sec = 2 log_auto_tailing_offset = 30 log_animation_speed = 1000 hide_paused_dags_by_default = False page_size = 100 navbar_color = #fff default_dag_run_display_number = 25 enable_proxy_fix = False proxy_fix_x_for = 1 proxy_fix_x_proto = 1 proxy_fix_x_host = 1 proxy_fix_x_port = 1 proxy_fix_x_prefix = 1 cookie_secure = False cookie_samesite = Lax default_wrap = False x_frame_enabled = True show_recent_stats_for_completed_runs = True update_fab_perms = True session_lifetime_minutes = 43200 instance_name_has_markup = False auto_refresh_interval = 3 warn_deployment_exposure = True audit_view_excluded_events = gantt,landing_times,tries,duration,calendar,graph,grid,tree,tree_data

[email] email_backend = airflow.utils.email.send_email_smtp email_conn_id = smtp_default default_email_on_retry = True default_email_on_failure = True

[smtp] smtp_host = localhost smtp_starttls = True smtp_ssl = False smtp_port = 25 smtp_mail_from = airflow@example.com smtp_timeout = 30 smtp_retry_limit = 5

[sentry] sentry_on = False sentry_dsn =

[local_kubernetes_executor] kubernetes_queue = kubernetes

[celery_kubernetes_executor] kubernetes_queue = kubernetes

[celery] celery_app_name = airflow.executors.celery_executor worker_concurrency = 16 worker_prefetch_multiplier = 1 worker_enable_remote_control = True broker_url = redis://:@redis:6379/0 flower_host = 0.0.0.0 flower_url_prefix = flower_port = 5555 flower_basic_auth = sync_parallelism = 0 celery_config_options = airflow.config_templates.default_celery.DEFAULT_CELERY_CONFIG ssl_active = False ssl_key = ssl_cert = ssl_cacert = pool = prefork operation_timeout = 1.0 task_track_started = True task_adoption_timeout = 600 stalled_task_timeout = 0 task_publish_max_retries = 3 worker_precheck = False result_backend = db+postgresql://airflow:airflow@postgres/airflow

[celery_broker_transport_options]

[dask] cluster_address = 127.0.0.1:8786 tls_ca = tls_cert = tls_key =

[scheduler] job_heartbeat_sec = 5 scheduler_heartbeat_sec = 5 num_runs = -1 scheduler_idle_sleep_time = 1 min_file_process_interval = 30 parsing_cleanup_interval = 60 dag_dir_list_interval = 300 print_stats_interval = 30 pool_metrics_interval = 5.0 scheduler_health_check_threshold = 30 enable_health_check = True scheduler_health_check_server_port = 8974 orphaned_tasks_check_interval = 300.0 child_process_log_directory = /opt/airflow/logs/scheduler scheduler_zombie_task_threshold = 300 zombie_detection_interval = 10.0 catchup_by_default = True ignore_first_depends_on_past_by_default = True max_tis_per_query = 512 use_row_level_locking = True max_dagruns_to_create_per_loop = 10 max_dagruns_per_loop_to_schedule = 20 schedule_after_task_execution = True parsing_processes = 2 file_parsing_sort_mode = modified_time standalone_dag_processor = False max_callbacks_per_loop = 20 dag_stale_not_seen_duration = 600 use_job_schedule = True allow_trigger_in_future = False trigger_timeout_check_interval = 15

[triggerer] default_capacity = 1000

[kerberos] ccache = /tmp/airflow_krb5_ccache principal = airflow reinit_frequency = 3600 kinit_path = kinit keytab = airflow.keytab forwardable = True include_ip = True

[elasticsearch] host = log_id_template = {dag_id}-{task_id}-{run_id}-{map_index}-{try_number} end_of_log_mark = end_of_log frontend = write_stdout = False json_format = False json_fields = asctime, filename, lineno, levelname, message host_field = host offset_field = offset

[elasticsearch_configs] use_ssl = False verify_certs = True

[kubernetes_executor] pod_template_file = worker_container_repository = worker_container_tag = namespace = default delete_worker_pods = True delete_worker_pods_on_failure = False worker_pods_creation_batch_size = 1 multi_namespace_mode = False in_cluster = True kube_client_request_args = delete_option_kwargs = enable_tcp_keepalive = True tcp_keep_idle = 120 tcp_keep_intvl = 30 tcp_keep_cnt = 6 verify_ssl = True worker_pods_pending_timeout = 300 worker_pods_pending_timeout_check_interval = 120 worker_pods_queued_check_interval = 60 worker_pods_pending_timeout_batch_size = 100

[sensors] default_timeout = 604800



Thanks for your valuable help, and please let me know if there is any information you need.

potiuk commented 1 year ago

This is your problem to solve. I am not reading all the details, but if you wish to continue this discussion, please open a new one: you are piggybacking on someone else's different error and different case, and hijacking this closed issue for a somewhat related but different issue.

And please do not pile on even more issues. What you are dealing with is designing a company-wide solution based on docker-compose, which goes quite a bit beyond "why the quick-start docker-compose does not work as advertised". You may need professional, paid help to solve these problems if you cannot design it on your own.

I am not going to have time (in my free time) to solve the problem you have and design a solution that will work for your company and team, but I can tell you some assumptions of the image we have and point you to the right docs to read so you understand thoroughly what's going on and can solve it yourself. If you want to build a company-wide solution, you need to map it to those assumptions as you see fit.

If you are trying to use our docker-compose ("quick start") for a "company-wide" deployment, you are mostly on your own to make it work well for your case (this is how docker-compose works), and you should modify it to fit your needs. Our quick-start docker-compose is just a starting point and a reference for writing your own (if you wish), and (as indicated in the file itself) it has plenty of things that you will likely have to modify when you design the docker-compose you need.

I will not have time to dive deep into your solution (we are all helping here in our free time), so I can give at most some generic advice.

The Airflow image works with the assumption that files and folders are owned either by the "airflow" user or by group "0" (this is for OpenShift compatibility). See all the details in the docs: https://airflow.apache.org/docs/docker-stack/entrypoint.html. What happens when you use a different UID is that the entrypoint (as described in the docs) creates a new user, makes it belong to group "0", and sets its home to the same as the "airflow" user's. All the files and folders should then be owned by (or created with) group "0", with read/write access for the group, to make it work. So if you wish to share files and folders somehow, you need to make sure group "0" owns them; if you are using the "airflow" user by default, you should just make sure that your volumes are read/write for the "airflow" user.
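
As a concrete illustration of the group-"0" variant on the host side (a sketch; the directory names are just examples for a shared checkout, not a prescribed layout):

    # make a shared DAGs checkout writable for any UID that runs with GID=0
    sudo chgrp -R 0 dags logs plugins
    sudo chmod -R g+rwX dags logs plugins
    sudo find dags logs plugins -type d -exec chmod g+s {} +   # new files inherit group 0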

How to do it exactly, if you build some kind of sharing and git workflow on top of the docker-compose setup, is primarily your job to figure out; it depends on what you want to do. I have no ready recipes here, I am afraid.

MingTaLee commented 1 year ago

Thanks for your input and info. I will test with a fresh VM to pinpoint the issues.