Describe the bug
I see this error during Airflow pipeline ingestion when a DAG is owned by a user that does not exist in OpenMetadata:
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest CreatePipelineRequest [redacted_dag_name] due to api request failure: Team of type Organization can't own entities. Only Team of type Group can own entities.
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest Pipeline Status [Airflow.redacted_dag_name] due to api request failure: pipeline instance for Airflow.redacted_dag_name not found
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest Pipeline Status [Airflow.redacted_dag_name] due to api request failure: pipeline instance for Airflow.redacted_dag_name not found
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest Pipeline Status [Airflow.redacted_dag_name] due to api request failure: pipeline instance for Airflow.redacted_dag_name not found
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest Pipeline Status [Airflow.redacted_dag_name] due to api request failure: pipeline instance for Airflow.redacted_dag_name not found
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest Pipeline Status [Airflow.redacted_dag_name] due to api request failure: pipeline instance for Airflow.redacted_dag_name not found
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest Pipeline Status [Airflow.redacted_dag_name] due to api request failure: pipeline instance for Airflow.redacted_dag_name not found
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest Pipeline Status [Airflow.redacted_dag_name] due to api request failure: pipeline instance for Airflow.redacted_dag_name not found
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest Pipeline Status [Airflow.redacted_dag_name] due to api request failure: pipeline instance for Airflow.redacted_dag_name not found
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest Pipeline Status [Airflow.redacted_dag_name] due to api request failure: pipeline instance for Airflow.redacted_dag_name not found
[2024-06-07, 09:20:46 PDT] {status.py:76} WARNING - Failed to ingest Pipeline Status [Airflow.redacted_dag_name] due to api request failure: pipeline instance for Airflow.redacted_dag_name not found
[2024-06-07, 12:26:49 PDT] {topology_runner.py:231} DEBUG - Processing stage: type_=<class 'metadata.ingestion.models.ometa_classification.OMetaTagAndClassification'> processor='yield_tag' nullable=True must_return=False overwrite=True consumer=None context='tags' store_all_in_context=False clear_context=False store_fqn=False cache_entities=False use_cache=False
[2024-06-07, 12:26:49 PDT] {topology_runner.py:231} DEBUG - Processing stage: type_=<class 'metadata.generated.schema.entity.data.pipeline.Pipeline'> processor='yield_pipeline' nullable=False must_return=False overwrite=True consumer=['pipeline_service'] context='pipeline' store_all_in_context=False clear_context=False store_fqn=False cache_entities=False use_cache=True
[2024-06-07, 12:26:49 PDT] {metadata_rest.py:135} DEBUG - Processing Create request <class 'metadata.generated.schema.api.data.createPipeline.CreatePipelineRequest'>
[2024-06-07, 12:26:49 PDT] {status.py:76} WARNING - Failed to ingest CreatePipelineRequest [redacted_dag_name] due to api request failure: Team of type Organization can't own entities. Only Team of type Group can own entities.
[2024-06-07, 12:26:49 PDT] {status.py:77} DEBUG - Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 219, in _one_request
    resp.raise_for_status()
  File "/home/airflow/.local/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://openmetadata-prod.openmetadata.svc.cluster.local:8585/api/v1/pipelines

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/sink/metadata_rest.py", line 145, in _run
    return self._run_dispatch(record)
  File "/usr/local/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/sink/metadata_rest.py", line 136, in _run_dispatch
    return self.write_create_request(record)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/sink/metadata_rest.py", line 166, in write_create_request
    created = self.metadata.create_or_update(entity_request)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/ometa_api.py", line 276, in create_or_update
    return self._create(data=data, method="put")
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/ometa_api.py", line 267, in _create
    resp = fn(self.get_suffix(entity), data=data.json(encoder=show_secrets_encoder))
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/utils/execution_time_tracker.py", line 195, in inner
    result = func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 298, in put
    return self._request("PUT", path, data)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 193, in _request
    return self._one_request(method, url, opts, retry)
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/ometa/client.py", line 237, in _one_request
    raise APIError(error, http_error) from http_error
metadata.ingestion.ometa.client.APIError: Team of type Organization can't own entities. Only Team of type Group can own entities.
To Reproduce
1. Create an Airflow DAG whose owner is set to the email of a user that does not exist in OpenMetadata.
2. Run this DAG at least once.
3. Configure OpenMetadata to ingest Airflow pipelines with "Include Owners" set to true.
Expected behavior
A missing owner should not prevent the DAG from being ingested at all. Instead, the DAG should be ingested with no owner or with some default owner.
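A minimal sketch of the fallback I have in mind, not the ingestion framework's actual code. `OwnerRef`, `resolve_owner`, the `lookup` callable, and the toy `DIRECTORY` are all hypothetical names made up for illustration; the only rule taken from the logs above is that a team must be of type Group to own an entity.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class OwnerRef:
    """Hypothetical stand-in for an owner reference; not an OpenMetadata class."""

    name: str
    kind: str  # "user" or "team"
    team_type: Optional[str] = None  # e.g. "Group" or "Organization" for teams


def resolve_owner(
    dag_owner: Optional[str],
    lookup: Callable[[str], Optional[OwnerRef]],
    default_owner: Optional[OwnerRef] = None,
) -> Optional[OwnerRef]:
    """Return a usable owner, or fall back instead of failing the request.

    `lookup` is an assumed search function that returns None when the
    DAG owner is unknown. `default_owner` is returned as given.
    """
    if not dag_owner:
        return default_owner
    candidate = lookup(dag_owner)
    if candidate is None:
        # Unknown owner: ingest with the default (possibly no owner at all).
        return default_owner
    if candidate.kind == "team" and candidate.team_type != "Group":
        # Per the API error above, only teams of type Group can own entities.
        return default_owner
    return candidate


# Toy directory standing in for whatever user/team search the server offers.
DIRECTORY = {
    "alice@example.com": OwnerRef("alice", "user"),
    "data-eng": OwnerRef("data-eng", "team", "Group"),
    "Organization": OwnerRef("Organization", "team", "Organization"),
}

if __name__ == "__main__":
    for owner in ("alice@example.com", "missing@example.com", "Organization"):
        print(owner, "->", resolve_owner(owner, DIRECTORY.get))
```

With this shape, a DAG owned by an unknown email resolves to `None` (or to the configured default), so the create request can still be sent without an owner rather than being rejected.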
Affected module
Ingestion Framework
Version:
openmetadata-ingestion[all]==1.4.0.1