bitnami / charts

Bitnami Helm Charts
https://bitnami.com

Unable to install pip requirements via extraVolumeMounts due to Read-only file system #28124

Open Gdtav opened 1 month ago

Gdtav commented 1 month ago

Name and Version

bitnami/airflow 18.3.9

What architecture are you using?

amd64

What steps will reproduce the bug?

Install the Helm Chart with a requirements.txt file mounted as a configMap (extraDeploy) using extraVolumes and extraVolumeMounts as described in the documentation.

Are you using any custom parameters or values?

relevant values:

extraVolumes:
  - name: requirements-volume
    configMap:
      name: airflow-requirements
extraVolumeMounts:
  - name: requirements-volume
    mountPath: /bitnami/python
extraDeploy:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: airflow-requirements
    data:
      requirements.txt: |
        apache-airflow[google]
        apache-airflow-providers-postgres
        pandas
        google-cloud-bigquery
        openlineage-python
        pandas-gbq

What is the expected behavior?

During the first initialization, the scheduler, web, and worker containers should run pip install -r /bitnami/python/requirements.txt successfully and install the dependencies my DAGs require.

What do you see instead?

This is the container log, which repeats in a crash loop (I truncated most of the "Requirement already satisfied" lines):

airflow-scheduler 15:47:58.77 INFO  ==> 
airflow-scheduler 15:47:58.78 INFO  ==> Welcome to the Bitnami airflow-scheduler container
airflow-scheduler 15:47:58.78 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
airflow-scheduler 15:47:58.78 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
airflow-scheduler 15:47:58.78 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
airflow-scheduler 15:47:58.79 INFO  ==> 
airflow-scheduler 15:47:58.79 INFO  ==> Enabling non-root system user with nss_wrapper
WARNING: The directory '/opt/bitnami/airflow/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Requirement already satisfied: apache-airflow-providers-postgres in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 2)) (5.11.1)
Requirement already satisfied: pandas in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 3)) (2.1.4)
Requirement already satisfied: google-cloud-bigquery in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 4)) (3.20.1)
Collecting openlineage-python (from -r /bitnami/python/requirements.txt (line 5))
  Downloading openlineage_python-1.18.0-py3-none-any.whl.metadata (1.7 kB)
Requirement already satisfied: pandas-gbq in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 6)) (0.23.0)
Requirement already satisfied: apache-airflow[google] in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 1)) (2.9.2)
[...]
Requirement already satisfied: pydantic-core==2.18.4 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from pydantic<3->google-cloud-aiplatform>=1.42.1->apache-airflow-providers-google->apache-airflow[google]->-r /bitnami/python/requirements.txt (line 1)) (2.18.4)
Requirement already satisfied: importlib-resources>=1.3 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from limits>=2.8->Flask-Limiter<4,>3->flask-appbuilder==4.4.1->apache-airflow-providers-fab>=1.0.2->apache-airflow[google]->-r /bitnami/python/requirements.txt (line 1)) (6.4.0)
Downloading openlineage_python-1.18.0-py3-none-any.whl (44 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.2/44.2 kB 1.1 MB/s eta 0:00:00
Installing collected packages: openlineage-python
ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/opt/bitnami/airflow/venv/lib/python3.11/site-packages/openlineage'
[notice] A new release of pip is available: 24.1 -> 24.1.2
[notice] To update, run: pip install --upgrade pip

vorandrew commented 1 month ago

Same here

airflow-worker 15:55:42.14 INFO ==>
airflow-worker 15:55:42.14 INFO ==> Welcome to the Bitnami airflow-worker container
airflow-worker 15:55:42.15 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
airflow-worker 15:55:42.15 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
airflow-worker 15:55:42.15 INFO ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
airflow-worker 15:55:42.15 INFO ==>
airflow-worker 15:55:42.16 INFO ==> Enabling non-root system user with nss_wrapper
WARNING: The directory '/opt/bitnami/airflow/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Collecting corsound-airflow@ https://_json_key_base64:****@us-python.pkg.dev/mvp-2023-10-10/corsound/corsound-airflow/corsound_airflow-2.0.0-py3-none-any.whl (from -r /bitnami/python/requirements.txt (line 2))
Downloading https://_json_key_base64:****@us-python.pkg.dev/mvp-2023-10-10/corsound/corsound-airflow/corsound_airflow-2.0.0-py3-none-any.whl (4.5 kB)
Requirement already satisfied: flask in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 1)) (2.2.5)
Requirement already satisfied: Werkzeug>=2.2.2 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from flask->-r /bitnami/python/requirements.txt (line 1)) (2.2.3)
Requirement already satisfied: Jinja2>=3.0 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from flask->-r /bitnami/python/requirements.txt (line 1)) (3.1.4)
Requirement already satisfied: itsdangerous>=2.0 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from flask->-r /bitnami/python/requirements.txt (line 1)) (2.2.0)
Requirement already satisfied: click>=8.0 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from flask->-r /bitnami/python/requirements.txt (line 1)) (8.1.7)
Collecting peppercorn (from corsound-airflow@ https://_json_key_base64:=@us-python.pkg.dev/mvp-2023-10-10/corsound/corsound-airflow/corsound_airflow-2.0.0-py3-none-any.whl->-r /bitnami/python/requirements.txt (line 2))
Downloading peppercorn-0.6-py3-none-any.whl.metadata (3.4 kB)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from Jinja2>=3.0->flask->-r /bitnami/python/requirements.txt (line 1)) (2.1.5)
Downloading peppercorn-0.6-py3-none-any.whl (4.8 kB)
Installing collected packages: peppercorn, corsound-airflow
ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/opt/bitnami/airflow/venv/lib/python3.11/site-packages/peppercorn'
[notice] A new release of pip is available: 24.0 -> 24.1.2
[notice] To update, run: pip install --upgrade pip

kav commented 1 month ago

Apologies for the drive-by, but wouldn't you want to provide a custom container image with the dependencies baked in? Installing onto a vanilla image at pod start feels like an anti-pattern.

Gdtav commented 1 month ago

> Apologies for the drive-by, but wouldn't you want to provide a custom container image with the dependencies baked in? Installing onto a vanilla image at pod start feels like an anti-pattern.

I was following the instructions for adding dependencies as described on the chart page. I tried the custom-image approach with the official Airflow chart, but it didn't work and I couldn't figure out why, and the Bitnami chart had no instructions on how to do it that way. Either solution would be fine for me, as long as I can finally deploy this.

matheuscarreirod commented 1 month ago

Same problem here.

heizerbalazs commented 1 month ago

As a temporary workaround, you can install the chart with podSecurityContext and containerSecurityContext disabled for the web, scheduler, and worker deployments.
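
A minimal sketch of what that could look like in the values file. The key names below follow the usual Bitnami values layout (`<component>.podSecurityContext.enabled` and `<component>.containerSecurityContext.enabled`) and are an assumption here; they should be checked against the values.yaml of the chart version in use.

```yaml
# Sketch only, not verified against chart 18.3.9: disables the pod and
# container security contexts for the three Airflow components.
web:
  podSecurityContext:
    enabled: false
  containerSecurityContext:
    enabled: false
scheduler:
  podSecurityContext:
    enabled: false
  containerSecurityContext:
    enabled: false
worker:
  podSecurityContext:
    enabled: false
  containerSecurityContext:
    enabled: false
```

Disabling the contexts entirely also drops the other hardening they provide. If the chart version exposes a `containerSecurityContext.readOnlyRootFilesystem` setting, switching only that to false may be a narrower change, but that is untested here.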

rafariossaa commented 1 month ago

Hi, the filesystem is read-only, so in this case you would need to set the pod and container security contexts accordingly.

Gdtav commented 1 month ago

> Hi, the filesystem is read-only, so in this case you would need to set the pod and container security contexts accordingly.

Alright! It would be nice to update the documentation to reflect that caveat. I ended up using the official Airflow chart with a custom image, as @kav suggested. I tried to do the same with this chart but couldn't get it to work; is it possible at all? If it's considered a best practice, shouldn't there be a short example of it on the chart page? (Anyway, as far as I'm concerned, the issue can be closed.)

fmulero commented 3 weeks ago

Another workaround, instead of setting the container security contexts, is to mount a writable volume with the contents of the virtualenv. Here is an example:

extraVolumes:
  - name: requirements-volume
    configMap:
      name: airflow-requirements
  - name: venv
    emptyDir: {}
extraVolumeMounts:
  - name: requirements-volume
    mountPath: /bitnami/python
  - name: venv
    mountPath: /opt/bitnami/airflow/venv/lib
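# Init container: copy the image's virtualenv libraries into the emptyDir
# so that the path pip writes to is backed by a writable volume.
initContainers:
  - name: copy-python-env
    image: bitnami/airflow
    command:
      - /bin/bash
    args:
      - -ec
      - |
        #!/bin/bash
        cp -r /opt/bitnami/airflow/venv/lib/* /venv
    volumeMounts:
      - name: venv
        mountPath: /venv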
extraDeploy:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: airflow-requirements
    data:
      requirements.txt: |
        apache-airflow[google]
        apache-airflow-providers-postgres
        pandas
        google-cloud-bigquery
        openlineage-python
        pandas-gbq

keelamp commented 3 weeks ago

I wanted to point out that, as things stand, no one can use PythonVirtualenvOperator, since it requires an optional Airflow dependency to be installed: apache-airflow[virtualenv]==2.9.1. Could there be a fix that lets us define optional Airflow dependencies without touching the security contexts?

cheeyeelim commented 2 weeks ago

> Another workaround, instead of setting the container security contexts, is to mount a writable volume with the contents of the virtualenv. Here is an example:
>
> extraVolumes:
>   - name: requirements-volume
>     configMap:
>       name: airflow-requirements
>   - name: venv
>     emptyDir: {}
> extraVolumeMounts:
>   - name: requirements-volume
>     mountPath: /bitnami/python
>   - name: venv
>     mountPath: /opt/bitnami/airflow/venv/lib
> initContainers:
>   - name: copy-python-env
>     image: bitnami/airflow
>     command:
>       - /bin/bash
>     args:
>       - -ec
>       - |
>         #!/bin/bash
>         cp -r /opt/bitnami/airflow/venv/lib/* /venv
>     volumeMounts:
>       - name: venv
>         mountPath: /venv
> extraDeploy:
>   - apiVersion: v1
>     kind: ConfigMap
>     metadata:
>       name: airflow-requirements
>     data:
>       requirements.txt: |
>         apache-airflow[google]
>         apache-airflow-providers-postgres
>         pandas
>         google-cloud-bigquery
>         openlineage-python
>         pandas-gbq

I can confirm that this works for me. The only small modification I made was to change /opt/bitnami/airflow/venv/lib to /opt/bitnami/airflow/venv, as some Python libraries install into venv/bin as well.
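
For reference, a sketch of how the affected parts of the values might look with that change. It assumes the path was widened in both places where it appears (the mount path and the copy source in the init container); the extraVolumes and extraDeploy sections stay as in the example being confirmed.

```yaml
extraVolumeMounts:
  - name: requirements-volume
    mountPath: /bitnami/python
  - name: venv
    # Mount the writable copy over the whole virtualenv, not only lib/,
    # so that files installed into venv/bin can also be written.
    mountPath: /opt/bitnami/airflow/venv
initContainers:
  - name: copy-python-env
    image: bitnami/airflow
    command:
      - /bin/bash
    args:
      - -ec
      - |
        # Assumption: the copy source is widened to match the new mount path.
        cp -r /opt/bitnami/airflow/venv/* /venv
    volumeMounts:
      - name: venv
        mountPath: /venv
```

Because the emptyDir starts out empty on every pod start, the copy and the pip install run again each time the pod is recreated.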

fmulero commented 1 week ago

Thanks, @cheeyeelim, for sharing your results.