tomasfarias / airflow-dbt-python

A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
https://airflow-dbt-python.readthedocs.io
MIT License
170 stars, 35 forks

Why are these operators not in Airflow as standards #105

Closed KarthikRajashekaran closed 1 year ago

KarthikRajashekaran commented 1 year ago

Why are these operators not in Airflow as standards?

Is there anything I am missing?

tomasfarias commented 1 year ago

There are some guidelines on contributing a new provider in the Airflow docs.

Most of the work involves chores, like adding a setup.cfg or a provider_info.schema.json, and version 1.0 already included a lot of clean-up to make these tasks easier. All in all, this is work we want to do to improve the discoverability of airflow-dbt-python's features and turn it into a proper "Airflow Provider".
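As a rough illustration of the kind of chore involved, Airflow discovers third-party providers through an `apache_airflow_provider` entry point that points at a function returning a metadata dictionary. The sketch below is not taken from airflow-dbt-python: the module path in the comment, the version string, and the `missing_keys` helper are hypothetical, and the set of required keys reflects my reading of Airflow's provider info schema, so it should be checked against the Airflow version in use.

```python
# Minimal sketch of provider metadata for Airflow's provider discovery.
# Assumed setup.cfg entry (the module path is hypothetical):
#
#   [options.entry_points]
#   apache_airflow_provider =
#       provider_info = airflow_dbt_python.provider:get_provider_info

# Keys assumed to be required by Airflow's provider info schema.
REQUIRED_KEYS = {"name", "package-name", "description", "versions"}


def get_provider_info() -> dict:
    """Return the metadata dictionary Airflow reads from the entry point."""
    return {
        "package-name": "airflow-dbt-python",
        "name": "airflow-dbt-python",
        "description": (
            "A collection of Airflow operators, hooks, and utilities "
            "to elevate dbt to a first-class citizen of Airflow."
        ),
        "versions": ["1.0.0"],  # placeholder version, not a real release list
    }


def missing_keys(info: dict) -> set:
    """Hypothetical helper: report assumed-required keys absent from info."""
    return REQUIRED_KEYS - info.keys()


if __name__ == "__main__":
    # An empty set means every assumed-required key is present.
    print(missing_keys(get_provider_info()))
```

Hooks and connection types can be declared in the same dictionary, but those fields are left out here because their exact names have changed between Airflow releases.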

If by "Airflow standards" you mean contributing airflow-dbt-python to the Airflow repo, that would require support from the community/Airflow devs:

Can I contribute my own provider to Apache Airflow?

Of course, but it’s better to check at developer’s mailing list whether such contribution will be accepted by the Community, before investing time to make the provider compliant with community requirements. The Community only accepts providers that are generic enough, are well documented, fully covered by tests and with capabilities of being tested by people in the community. So we might not always be in the position to accept such contributions.

IMO airflow-dbt-python should fit the bill, but I'm obviously biased.

KarthikRajashekaran commented 1 year ago

We have a requirement to clone the GitLab repo dbtproject to Airflow and run dbt operators.

Does airflow-dbt-python help in this case?

We are using Airflow with K8s.

KarthikRajashekaran commented 1 year ago

@tomasfarias Please check the Airflow Slack messages to see if someone can help you get these adopted as standards in Airflow.

tomasfarias commented 1 year ago

We have a requirement to clone the GitLab repo dbtproject to Airflow and run dbt operators. Does airflow-dbt-python help in this case?

The DbtGitRemoteHook was built exactly to support dbt projects stored in git repositories. Granted, it's new in version 1.0, and I haven't had the chance to test it with GitLab. I would really appreciate it if you could test it out. If not, I can run some tests over the next week or so. Happy to address any issues that crop up.

The documentation contains an example DAG for downloading a dbt project from GitHub: https://airflow-dbt-python.readthedocs.io/en/latest/getting_started.html#id7.

Please check the Airflow Slack messages to see if someone can help you get these adopted as standards in Airflow.

I'm currently more focused on fixing bugs. However, we do want to support the Airflow provider interface, so PRs are more than welcome.