Open AetherUnbound opened 2 years ago
I'd encourage us to think about how much of this could be handled with custom lint rules, templates, and things like that over a manual handbook. Happy to have some documentation about best practices, but the more we can automate the better!
Some of the syntax and name type requirements could be defined in a flake8 plugin: https://flake8.pycqa.org/en/latest/plugin-development/index.html
Filenames can be handled by https://pypi.org/project/flake8-filename/
A pylint custom checker might be easier to implement than a flake8 plugin though... https://pylint.pycqa.org/en/latest/development_guide/how_tos/custom_checkers.html
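To make the flake8 option a bit more concrete, here is a minimal sketch of an AST-based plugin class. The rule itself (flagging `DAG(...)` calls that pass no `start_date` keyword), the `DAG001` code, and the class and plugin names are all made up for illustration and are not an agreed convention. A real plugin would also need to be registered under the `flake8.extension` entry point in its packaging metadata.

```python
import ast


class DagStyleChecker:
    """Flake8-style AST plugin sketch: flags DAG(...) calls without start_date."""

    # Hypothetical plugin name and version, used only for this sketch.
    name = "flake8-dag-style"
    version = "0.0.1"
    # DAG001 is a made-up error code.
    message = "DAG001 DAG() call is missing an explicit start_date"

    def __init__(self, tree):
        # flake8 passes the parsed module when the first parameter is named `tree`.
        self.tree = tree

    def run(self):
        for node in ast.walk(self.tree):
            if not isinstance(node, ast.Call):
                continue
            # Match both `DAG(...)` and `airflow.DAG(...)`.
            func = node.func
            called = getattr(func, "id", None) or getattr(func, "attr", None)
            if called != "DAG":
                continue
            keywords = {kw.arg for kw in node.keywords}
            # A `**kwargs` argument appears as a keyword whose arg is None;
            # skip those calls because start_date may be supplied there.
            if "start_date" not in keywords and None not in keywords:
                yield node.lineno, node.col_offset, self.message, type(self)
```

The class can be exercised without flake8 by handing it a parsed module directly, e.g. `list(DagStyleChecker(ast.parse(source)).run())`, which makes rules like this easy to unit test.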
Description
Because DAGs are workflows that can be defined in pure Python, there are innumerable ways to set up and author them. Particularly as we get more contributors, it may be useful and prudent to define a set of patterns to use (and to avoid) when authoring DAGs. This document could include information on:
- `op_kwargs` with `PythonOperator`s
- `get_*_operator` functions (WordPress/openverse-catalog#238, WordPress/openverse-catalog#301)
- `start_date` for a DAG

It could even potentially include some boilerplate DAG templates (I know that I almost never write a DAG from scratch :sweat_smile:)!
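As one possible shape for such a boilerplate template, the sketch below renders a DAG file from a `string.Template`. Everything in it is an assumption for illustration: the placeholder names, the `render_dag` helper, the Airflow 2-style imports, and the choice of `schedule_interval=None` and `catchup=False` are not taken from the catalog's existing DAGs.

```python
import ast
from string import Template

# Hypothetical boilerplate; all names and defaults are placeholders.
# The generated file assumes Airflow 2-style imports.
DAG_TEMPLATE = Template('''\
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ${callable_name}(greeting):
    print(greeting)


with DAG(
    dag_id="${dag_id}",
    start_date=datetime(${start_year}, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    PythonOperator(
        task_id="${callable_name}",
        python_callable=${callable_name},
        op_kwargs={"greeting": "hello"},
    )
''')


def render_dag(dag_id, callable_name, start_year):
    """Fill in the template and check that the result parses as Python."""
    source = DAG_TEMPLATE.substitute(
        dag_id=dag_id,
        callable_name=callable_name,
        start_year=start_year,
    )
    # Fail early if the rendered file is not syntactically valid.
    ast.parse(source)
    return source
```

Calling `render_dag("example_dag", "say_hello", 2022)` returns the text of a new DAG file with a fixed `start_date` and an `op_kwargs` example already in place, which a contributor could then edit rather than starting from scratch.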
Alternatives
Additional context
Implementation