astronomer / dag-factory

Dynamically generate Apache Airflow DAGs from YAML configuration files
Apache License 2.0
1.21k stars 182 forks source link

[Feature] load_yaml_dags should have recursive=False option #289

Open wearpants opened 6 days ago

wearpants commented 6 days ago

Description

load_yaml_dags by default loads files from all subdirectories of the dags_folder - sometimes that's not what you want. An option to use glob instead of rglob here would be helpful

Use case/motivation

My project structure is complex, and uses mulitple dag folders with yaml files, and the recursive behavior has been a source of bugs. I'd like to be able to disable it.

Related issues

290, #291, #297

Are you willing to submit a PR?

cmarteepants commented 5 days ago

Hi @wearpants, that sounds reasonable, especially if recursive is the default. Do you think we also need an option to specify how deep to go when recursive=True?

Since you indicated that you are willing to submit a PR, we can assign this to you if you'd like?

cc @tatiana

wearpants commented 4 days ago

It's not possible to specify a depth limit with rglob, it's either no recursion (via glob ) or infinite. Also I think recursive by default is the wrong behavior (or at least unexpected), on basis of the "explicit is better than implicit" principle.

Don't know your policy on breaking backwards compatibility (as #290 would also do) - might be better to introduce new load_directory function covering both cases (also a better name) that allows for future API growth (see also #297)?

I'm not likely to have time for this until Q1

cmarteepants commented 4 days ago

That's not a bad idea. We're going to release a 1.0 in Q1, so there is an opportunity if we wanted to break backwards compat. Will get feedback from the team, but that's a fair callout re being explicit.