kedro-org / kedro-plugins

First-party plugins maintained by the Kedro team.
Apache License 2.0

How to add task_group argument? [kedro-airflow] [task_group] [group_id] #339

Open kevin-koga-mckinsey opened 11 months ago

kevin-koga-mckinsey commented 11 months ago

Description

I need to add a task_group to the kedro pipelines in order to create a better hierarchy of execution.

Context

I have an Airflow pipeline that needs a task_group (with group_id, etc.) to control the order of execution. In effect it is a cross-dependency: all inputs must be ready before the modeling process starts.

Possible Implementation

Bind each node's tags to task_group.group_id by sorting the tag list alphabetically and concatenating the tags to form the group_id. Alternatively, use the name of the referenced pipeline as the task_group.group_id.
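
A minimal sketch of the tag-based idea, in plain Python with no Kedro or Airflow imports. The function names, the "_" separator, the "ungrouped" fallback, and the example node names are all hypothetical and are not part of kedro-airflow; the node-to-tags mapping stands in for whatever the plugin would read from the pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_id_from_tags(tags: Iterable[str], sep: str = "_") -> str:
    """Sort tags alphabetically and join them into a single group_id string."""
    # set() drops duplicate tags; sorted() makes the id independent of tag order.
    return sep.join(sorted(set(tags)))


def group_nodes_by_tags(node_tags: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Map each derived group_id to the node names that share that tag set."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for node_name, tags in node_tags.items():
        gid = group_id_from_tags(tags)
        # Assumption: untagged nodes fall into a catch-all group.
        groups[gid or "ungrouped"].append(node_name)
    return dict(groups)


# Hypothetical node names and tags, for illustration only.
example = {
    "clean_orders": {"preprocessing", "inputs"},
    "clean_customers": {"inputs", "preprocessing"},
    "train_model": {"modeling"},
    "report": set(),
}
groups = group_nodes_by_tags(example)
```

Each key of `groups` could then serve as the group_id of one task group in the generated DAG, with dependencies between groups expressing the "all inputs ready before modeling" constraint. One design consequence worth noting: nodes only land in the same group when their tag sets match exactly, so a node with one extra tag ends up in a separate group.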

Possible Alternatives

Adapt the generated project_name_dag.py and add the groups manually.

Note: this is not a complaint, just an idea! : )

Happy to hear back from you folks, cheers : )

astrojuanlu commented 11 months ago

Thanks @kevin-koga-mckinsey for making this suggestion! I think this resonates with your research, @datajoely, in particular enabling users to manually group nodes.

@kevin-koga-mckinsey just FYI, there's an ongoing PR by @sbrugman that adds some grouping: https://github.com/kedro-org/kedro-plugins/pull/241. Note that the grouping there is automatic, not user-defined.