apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

JdbcHook doesn't support OpenLineage #41878

Open hadanmarv opened 2 weeks ago

hadanmarv commented 2 weeks ago

Apache Airflow Provider(s)

common-sql

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==8.24.0
apache-airflow-providers-celery==3.7.2
apache-airflow-providers-cncf-kubernetes==8.3.1
apache-airflow-providers-common-compat==1.2.0
apache-airflow-providers-common-io==1.3.2
apache-airflow-providers-common-sql==1.14.0
apache-airflow-providers-docker==3.12.0
apache-airflow-providers-elasticsearch==5.4.1
apache-airflow-providers-fab==1.1.1
apache-airflow-providers-ftp==3.9.1
apache-airflow-providers-google==10.19.0
apache-airflow-providers-grpc==3.5.1
apache-airflow-providers-hashicorp==3.7.1
apache-airflow-providers-http==4.11.1
apache-airflow-providers-imap==3.6.1
apache-airflow-providers-jdbc==4.3.1
apache-airflow-providers-microsoft-azure==10.1.1
apache-airflow-providers-microsoft-mssql==3.7.1
apache-airflow-providers-microsoft-winrm==3.5.1
apache-airflow-providers-mongo==4.1.1
apache-airflow-providers-mysql==5.6.1
apache-airflow-providers-odbc==4.6.1
apache-airflow-providers-openlineage==1.11.0
apache-airflow-providers-postgres==5.11.1
apache-airflow-providers-redis==3.7.1
apache-airflow-providers-samba==4.7.1
apache-airflow-providers-sendgrid==3.5.1
apache-airflow-providers-sftp==4.10.1
apache-airflow-providers-slack==8.7.1
apache-airflow-providers-smtp==1.7.1
apache-airflow-providers-snowflake==5.5.1
apache-airflow-providers-sqlite==3.8.1
apache-airflow-providers-ssh==3.11.1

Apache Airflow version

2.9.2

Operating System

Debian GNU/Linux 12 (bookworm)

Deployment

Docker-Compose

Deployment details

No response

What happened

JdbcHook has no OpenLineage support, for example to extract metadata and create tables and columns in OpenLineage. We found https://github.com/OpenLineage/OpenLineage/blob/main/integration/airflow/openlineage/airflow/extractors/sql_extractor.py but unfortunately we can't use it.
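
To make the request concrete, here is a minimal, standalone sketch of the kind of metadata a JDBC-aware integration would need to derive: a dataset namespace and name from a JDBC connection URL and a table. It is an illustration only, not Airflow or OpenLineage code. `DatasetId`, `jdbc_url_to_namespace`, and `dataset_for_table` are hypothetical names, and the sketch assumes the simple `jdbc:<scheme>://host[:port]/database` URL shape; driver-specific formats (for example SQL Server's semicolon-separated properties) would need their own handling.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class DatasetId:
    """Hypothetical holder for a lineage dataset identifier."""

    namespace: str
    name: str


def jdbc_url_to_namespace(jdbc_url: str) -> str:
    """Map 'jdbc:<scheme>://host[:port]/db?...' to '<scheme>://host[:port]'."""
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError(f"not a JDBC URL: {jdbc_url!r}")
    parts = urlsplit(jdbc_url[len("jdbc:"):])
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"unsupported JDBC URL shape: {jdbc_url!r}")
    authority = parts.hostname
    if parts.port is not None:
        authority = f"{authority}:{parts.port}"
    return f"{parts.scheme}://{authority}"


def dataset_for_table(
    jdbc_url: str, table: str, default_schema: Optional[str] = None
) -> DatasetId:
    """Build a dataset id as '<database>.<schema>.<table>' under the URL's namespace."""
    database = urlsplit(jdbc_url[len("jdbc:"):]).path.strip("/")
    # Only prepend the default schema when the table is not already qualified.
    schema = default_schema if default_schema and "." not in table else None
    name = ".".join(part for part in (database, schema, table) if part)
    return DatasetId(namespace=jdbc_url_to_namespace(jdbc_url), name=name)


if __name__ == "__main__":
    ds = dataset_for_table(
        "jdbc:postgresql://db.example.com:5432/sales?ssl=true",
        "orders",
        default_schema="public",
    )
    print(ds.namespace, ds.name)
```

The URL, host, and table names in the usage block are made up. The sketch keeps whatever scheme appears in the JDBC URL; a real integration would likely need to normalize it to the naming the lineage backend expects.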

What you think should happen instead

No response

How to reproduce

Always the case

Anything else

No response

Are you willing to submit PR?

Code of Conduct

boring-cyborg[bot] commented 2 weeks ago

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

potiuk commented 2 weeks ago

Sure - marked it as a good first issue so someone can add the support. You could do it yourself if you want it faster; otherwise someone will have to pick it up and implement it (but ideally things like that are implemented by those who need them).

dangerdude237 commented 1 week ago

Hi @potiuk, I would like to handle this, could you give me some pointers on this issue?

potiuk commented 1 week ago

I'd say @mobuchowski and @kacpermuda are the best to help

mobuchowski commented 1 week ago

Actually @JDarDagran works on that now 🙂

potiuk commented 1 week ago

😱

potiuk commented 1 week ago

So maybe @JDarDagran can comment here and we will assign it to you :D

potiuk commented 1 week ago

(we can't do it without you commenting)

JDarDagran commented 1 week ago

tactical dot