bitnami / charts

Bitnami Helm Charts
https://bitnami.com

[bitnami/airflow] OpenMetaData configuration #12786

Closed mustafa-rmd closed 1 year ago

mustafa-rmd commented 1 year ago

Name and Version

bitnami/airflow

What steps will reproduce the bug?

Hey guys,

I am trying to configure Airflow as the main pipeline service for OpenMetadata in my Kubernetes cluster. I am using the official OpenMetadata Helm chart along with the Airflow chart maintained by Bitnami. See the chart links under Additional information.

What I have done in the bitnami/airflow values.yaml:

I installed the Python packages listed on the official page: https://docs.open-metadata.org/openmetadata/connectors/pipeline/airflow/lineage-backend

openmetadata-ingestion
openmetadata-airflow-managed-apis

and set the following environment variables to configure a lineage backend:

AIRFLOW__LINEAGE__BACKEND="airflow_provider_openmetadata.lineage.openmetadata.OpenMetadataLineageBackend"
AIRFLOW__LINEAGE__AIRFLOW_SERVICE_NAME="airflow_helm"
AIRFLOW__LINEAGE__OPENMETADATA_API_ENDPOINT="http://openmetadata:8585/api"
AIRFLOW__LINEAGE__AUTH_PROVIDER_TYPE="no-auth"

For OpenMetadata, I disabled the default Airflow pod that ships with the OpenMetadata chart and added the configuration for bitnami/airflow.

Custom parameters

extraDeploy:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: requirements
    data:
      requirements.txt: |
        openmetadata-ingestion
        openmetadata-airflow-managed-apis
extraVolumeMounts:
  - name: requirements
    mountPath: /bitnami/python/requirements.txt
    subPath: requirements.txt
extraVolumes:
  - name: requirements
    configMap:
      name: requirements
extraEnvVars:
  - name: AIRFLOW__LINEAGE__BACKEND
    value: airflow_provider_openmetadata.lineage.openmetadata.OpenMetadataLineageBackend
  - name: AIRFLOW__LINEAGE__AIRFLOW_SERVICE_NAME
    value: airflow_helm
  - name: AIRFLOW__LINEAGE__OPENMETADATA_API_ENDPOINT
    value: "http://openmetadata:8585/api"
  - name: AIRFLOW__LINEAGE__AUTH_PROVIDER_TYPE
    value: "no-auth"

What is the expected behavior?

OpenMetadata should detect Airflow.

What do you see instead?

When I open the OpenMetadata UI, I get the following error:

Airflow Exception [Failed to get Pipeline Service host IP.] due to [Failed to get Pipeline Service host IP due to { "detail": "The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.", "status": 404, "title": "Not Found", "type": "about:blank" } ].

Basically, OpenMetadata cannot find Airflow. Note that both are in the same namespace, which is not the default one. I spent hours debugging and could not find the source of the issue.

Additional information

OpenMetadata Helm chart - https://github.com/open-metadata/openmetadata-helm-charts
Airflow Helm chart - https://github.com/bitnami/charts/tree/master/bitnami/airflow

Helm version: 3.2.1

javsalgar commented 1 year ago

Hi,

If OpenMetadata cannot find Airflow, could you try checking with the Openmetadata chart devs? If we need to change anything in the chart please let us know, but it seems that the chart allows you to add the custom settings you require.

mustafa-rmd commented 1 year ago

Hi @javsalgar

Thanks. I have already informed the OpenMetadata devs. I am trying to determine which chart causes the issue.

I have noticed that setting the extraEnvVars section doesn't update airflow.cfg with the lineage backend settings:

extraEnvVars:
  - name: AIRFLOW__LINEAGE__BACKEND
    value: airflow_provider_openmetadata.lineage.openmetadata.OpenMetadataLineageBackend
  - name: AIRFLOW__LINEAGE__AIRFLOW_SERVICE_NAME
    value: airflow_helm
  - name: AIRFLOW__LINEAGE__OPENMETADATA_API_ENDPOINT
    value: "http://openmetadata.datafabric.svc.cluster.local:8585/api"
  - name: AIRFLOW__LINEAGE__AUTH_PROVIDER_TYPE
    value: "no-auth"

airflow.cfg path is: opt/bitnami/airflow

[Screenshot: airflow_env_setting]

javsalgar commented 1 year ago

Hi,

As far as I remember, the AIRFLOW__ variables do not update airflow.cfg, but they are applied anyway because Airflow reads them directly from the environment.
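To illustrate the naming convention, here is a minimal Python sketch of how variables of the form AIRFLOW__{SECTION}__{KEY} map to a config section and key. It is a simplified illustration, not Airflow's own code: the function name parse_airflow_env is hypothetical, and special suffixes such as _CMD or _SECRET are ignored.

```python
import os

PREFIX = "AIRFLOW__"


def parse_airflow_env(environ):
    """Map AIRFLOW__{SECTION}__{KEY} variables to {(section, key): value}.

    Hypothetical helper for illustration only; it mirrors the naming
    convention, not Airflow's actual implementation.
    """
    overrides = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue
        # Split on the first double underscore after the prefix so that keys
        # containing single underscores (AIRFLOW_SERVICE_NAME) stay intact.
        section, sep, key = name[len(PREFIX):].partition("__")
        if not sep or not section or not key:
            continue  # not of the form AIRFLOW__SECTION__KEY
        overrides[(section.lower(), key.lower())] = value
    return overrides


if __name__ == "__main__":
    # Print every override visible to the current process, e.g. inside the pod.
    for (section, key), value in sorted(parse_airflow_env(os.environ).items()):
        print(f"[{section}] {key} = {value}")
```

Run inside the Airflow container, this would list the [lineage] entries set through extraEnvVars even though airflow.cfg on disk is unchanged.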

mustafa-rmd commented 1 year ago

Hi @javsalgar

But how can I confirm that they are being applied?

javsalgar commented 1 year ago

Hi,

I'd check the Airflow logs to see if there's any mention of the configuration.
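Another way to reason about it is to compare the value in airflow.cfg with the value in the environment. The sketch below is an assumption-laden illustration: effective_setting is a hypothetical helper, and the precedence rule it encodes (environment first, then the file) is a simplification of Airflow's lookup order. If the Airflow 2 CLI is available in the container, `airflow config get-value lineage backend` should report the resolved value directly.

```python
import configparser


def effective_setting(section, key, cfg_text, environ):
    """Return (value, source) for one setting, preferring the environment.

    Hypothetical helper: a simplified model of env-over-file precedence,
    not Airflow's real resolution logic.
    """
    env_name = f"AIRFLOW__{section.upper()}__{key.upper()}"
    if env_name in environ:
        return environ[env_name], f"env:{env_name}"
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if parser.has_option(section, key):
        return parser.get(section, key), "airflow.cfg"
    return None, "unset"


if __name__ == "__main__":
    # Sample file content with an empty backend, standing in for airflow.cfg.
    sample_cfg = "[lineage]\nbackend =\n"
    sample_env = {
        "AIRFLOW__LINEAGE__BACKEND":
            "airflow_provider_openmetadata.lineage.openmetadata.OpenMetadataLineageBackend",
    }
    value, source = effective_setting("lineage", "backend", sample_cfg, sample_env)
    print(f"lineage.backend = {value!r} (from {source})")
```

The point of the sketch is that an unchanged airflow.cfg does not by itself show that the extraEnvVars settings were ignored.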

github-actions[bot] commented 1 year ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] commented 1 year ago

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

akash-jain-10 commented 1 year ago

Hello. Providing closure on this issue here. It turned out to be caused by using a deprecated Python package with a newer version of OpenMetadata. @mustafa-rmd confirmed the fix in the issue ticket raised here.