Good morning everybody! Before writing this post I came across https://github.com/open-metadata/OpenMetadata/issues/17751, where someone else seems to have hit the same issue, but nobody explained how to solve it.
I'm using OpenMetadata v1.5.6, Airflow v2.10.2, and Python 3.10.12.
On my Airflow instance I have installed the following packages:
openmetadata-ingestion 1.5.6.0
openmetadata_managed_apis 1.5.6.0
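To rule out a client/server version mismatch, I double-checked the installed packages from inside the Airflow container (a quick sketch; package names as published on PyPI):

```python
# Sanity check from inside the Airflow container: the client version
# should line up with the OpenMetadata server (1.5.6 / 1.5.6.0 here).
from importlib.metadata import version

for pkg in ("openmetadata-ingestion", "openmetadata-managed-apis"):
    print(pkg, version(pkg))
```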
Then I set up a connection in the OpenMetadata GUI, specifying the Airflow credentials and the database it is installed on.
I configured the metadata ingestion with lineage, debug logging, and owners enabled.
But when I run it manually, it reports success with no errors, yet no pipelines show up! I'm going mad over this.
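For context, the ingestion I deployed from the UI should be roughly equivalent to running this from Python (a sketch only: the hostPort values and the JWT token are placeholders, and I'm assuming the Backend connection type since the ingestion runs on the Airflow host itself; the service name matches the one in my logs below):

```python
# Rough equivalent of the UI-deployed ingestion workflow, runnable by hand
# to reproduce the behavior outside the scheduler.
import yaml
from metadata.workflow.metadata import MetadataWorkflow

CONFIG = """
source:
  type: airflow
  serviceName: Airflow_adm_conn
  serviceConnection:
    config:
      type: Airflow
      hostPort: http://localhost:8080   # placeholder
      connection:
        type: Backend                   # assumption: same-host backend DB
  sourceConfig:
    config:
      type: PipelineMetadata
      includeLineage: true
      includeOwners: true
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  loggerLevel: DEBUG
  openMetadataServerConfig:
    hostPort: http://openmetadata-server:8585/api   # placeholder
    authProvider: openmetadata
    securityConfig:
      jwtToken: "<bot-jwt-token>"                   # placeholder
"""

workflow = MetadataWorkflow.create(yaml.safe_load(CONFIG))
workflow.execute()
workflow.print_status()
workflow.stop()
```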
These are the logs associated with the pipeline:
0dd1c83348b6
Found local files:
* /opt/airflow/logs/dag_id=84735fc4-594b-4a34-b4b0-7110d46b9225/run_id=manual__2024-11-12T22:21:07+00:00/task_id=ingestion_task/attempt=1.log
[2024-11-12T22:21:08.426+0000] {local_task_job_runner.py:120} INFO - ::group::Pre task execution logs
[2024-11-12T22:21:08.478+0000] {taskinstance.py:2076} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: 84735fc4-594b-4a34-b4b0-7110d46b9225.ingestion_task manual__2024-11-12T22:21:07+00:00 [queued]>
[2024-11-12T22:21:08.504+0000] {taskinstance.py:2076} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: 84735fc4-594b-4a34-b4b0-7110d46b9225.ingestion_task manual__2024-11-12T22:21:07+00:00 [queued]>
[2024-11-12T22:21:08.505+0000] {taskinstance.py:2306} INFO - Starting attempt 1 of 1
[2024-11-12T22:21:08.545+0000] {taskinstance.py:2330} INFO - Executing <Task(CustomPythonOperator): ingestion_task> on 2024-11-12 22:21:07+00:00
[2024-11-12T22:21:08.563+0000] {standard_task_runner.py:63} INFO - Started process 2744 to run task
[2024-11-12T22:21:08.578+0000] {standard_task_runner.py:90} INFO - Running: ['airflow', 'tasks', 'run', '84735fc4-594b-4a34-b4b0-7110d46b9225', 'ingestion_task', 'manual__2024-11-12T22:21:07+00:00', '--job-id', '185', '--raw', '--subdir', 'DAGS_FOLDER/84735fc4-594b-4a34-b4b0-7110d46b9225.py', '--cfg-path', '/tmp/tmprk1ecm4a']
[2024-11-12T22:21:08.579+0000] {standard_task_runner.py:91} INFO - Job 185: Subtask ingestion_task
[2024-11-12T22:21:08.673+0000] {task_command.py:426} INFO - Running <TaskInstance: 84735fc4-594b-4a34-b4b0-7110d46b9225.ingestion_task manual__2024-11-12T22:21:07+00:00 [running]> on host 0dd1c83348b6
[2024-11-12T22:21:08.848+0000] {taskinstance.py:2648} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='admin' AIRFLOW_CTX_DAG_ID='84735fc4-594b-4a34-b4b0-7110d46b9225' AIRFLOW_CTX_TASK_ID='ingestion_task' AIRFLOW_CTX_EXECUTION_DATE='2024-11-12T22:21:07+00:00' AIRFLOW_CTX_TRY_NUMBER='1' AIRFLOW_CTX_DAG_RUN_ID='manual__2024-11-12T22:21:07+00:00'
[2024-11-12T22:21:08.849+0000] {taskinstance.py:430} INFO - ::endgroup::
[2024-11-12T22:21:08.907+0000] {server_mixin.py:74} INFO - OpenMetadata client running with Server version [1.5.6] and Client version [1.5.6.0]
[2024-11-12T22:21:09.138+0000] {ingestion_pipeline_mixin.py:52} DEBUG - Created Pipeline Status for pipeline Airflow_adm_conn.84735fc4-594b-4a34-b4b0-7110d46b9225: runId='0a96f4a7-ba98-49a1-b2c0-989a311356a7' pipelineState=<PipelineState.running: 'running'> startDate=Timestamp(root=1731450068886) timestamp=Timestamp(root=1731450068886) endDate=None status=None
[2024-11-12T22:21:12.001+0000] {test_connections.py:221} INFO - Test connection results:
[2024-11-12T22:21:12.001+0000] {test_connections.py:222} INFO - failed=[] success=["'CheckAccess': Pass"] warning=[]
[2024-11-12T22:21:12.002+0000] {metadata.py:57} DEBUG - Source type:airflow,<class 'metadata.ingestion.source.pipeline.airflow.metadata.AirflowSource'> configured
[2024-11-12T22:21:12.002+0000] {metadata.py:59} DEBUG - Source type:airflow,<class 'metadata.ingestion.source.pipeline.airflow.metadata.AirflowSource'> prepared
[2024-11-12T22:21:13.604+0000] {metadata.py:68} DEBUG - Sink type:metadata-rest, <class 'metadata.ingestion.sink.metadata_rest.MetadataRestSink'> configured
[2024-11-12T22:21:13.606+0000] {topology_runner.py:166} DEBUG - Processing node producer='get_services' stages=[NodeStage(type=<class 'metadata.generated.schema.entity.services.pipelineService.PipelineService'>, processor='yield_create_request_pipeline_service', nullable=False, must_return=True, overwrite=False, consumer=None, context='pipeline_service', store_all_in_context=False, clear_context=False, store_fqn=False, cache_entities=True, use_cache=False)] children=['pipeline'] post_process=['mark_pipelines_as_deleted'] threads=False
[2024-11-12T22:21:13.606+0000] {topology_runner.py:231} DEBUG - Processing stage: type=<class 'metadata.generated.schema.entity.services.pipelineService.PipelineService'> processor='yield_create_request_pipeline_service' nullable=False must_return=True overwrite=False consumer=None context='pipeline_service' store_all_in_context=False clear_context=False store_fqn=False cache_entities=True use_cache=False
[2024-11-12T22:21:13.681+0000] {topology_runner.py:166} DEBUG - Processing node producer='get_pipeline' stages=[NodeStage(type=<class 'metadata.ingestion.models.ometa_classification.OMetaTagAndClassification'>, processor='yield_tag', nullable=True, must_return=False, overwrite=True, consumer=None, context='tags', store_all_in_context=False, clear_context=False, store_fqn=False, cache_entities=False, use_cache=False), NodeStage(type=<class 'metadata.generated.schema.entity.data.pipeline.Pipeline'>, processor='yield_pipeline', nullable=False, must_return=False, overwrite=True, consumer=['pipeline_service'], context='pipeline', store_all_in_context=False, clear_context=False, store_fqn=False, cache_entities=False, use_cache=True), NodeStage(type=<class 'metadata.ingestion.models.pipeline_status.OMetaPipelineStatus'>, processor='yield_pipeline_status', nullable=True, must_return=False, overwrite=True, consumer=['pipeline_service'], context=None, store_all_in_context=False, clear_context=False, store_fqn=False, cache_entities=False, use_cache=False), NodeStage(type=<class 'metadata.generated.schema.api.lineage.addLineage.AddLineageRequest'>, processor='yield_pipeline_lineage', nullable=True, must_return=False, overwrite=True, consumer=['pipeline_service'], context=None, store_all_in_context=False, clear_context=False, store_fqn=False, cache_entities=False, use_cache=False)] children=None post_process=None threads=False
[2024-11-12T22:21:13.726+0000] {metadata.py:338} DEBUG - Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.10/site-packages/metadata/ingestion/source/pipeline/airflow/metadata.py", line 324, in get_pipelines_list
dag = AirflowDagDetails(
File "/home/airflow/.local/lib/python3.10/site-packages/pydantic/main.py", line 176, in init
self.pydantic_validator.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for AirflowDagDetails
tasks.0.task_id
Field required [type=missing, input_value={'__var': {'template_fiel...}, '__type': 'operator'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.7/v/missing
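If I'm reading the traceback right, Airflow 2.10 serializes each task wrapped as {'__var': {...}, '__type': 'operator'}, while AirflowDagDetails seems to expect the flat task dict with task_id at the top level. A minimal repro of that mismatch (model names simplified, purely illustrative, not the real OpenMetadata classes):

```python
# Simplified stand-ins for the OpenMetadata models (assumption: the real
# AirflowDagDetails validates each entry of `tasks` against a model that
# requires a task_id field).
from typing import List
from pydantic import BaseModel

class Task(BaseModel):
    task_id: str

class DagDetails(BaseModel):
    tasks: List[Task]

# Airflow 2.10-style serialized task: the payload is wrapped in __var/__type.
raw_task = {"__var": {"task_id": "ingestion_task"}, "__type": "operator"}

try:
    DagDetails(tasks=[raw_task])  # reproduces: tasks.0.task_id Field required
except Exception as exc:
    print(exc)

# Unwrapping "__var" before validating passes:
print(DagDetails(tasks=[raw_task["__var"]]))
```

So it looks to me like the ingestion side would need to unwrap __var before building AirflowDagDetails, unless I'm missing some configuration on my end.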
Can anybody help me figure out what the matter is? (This is my first post here, so please forgive me if I've written it the wrong way.)