apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Rendering custom map index before task is run. #39118

Open tomrutter opened 3 months ago

tomrutter commented 3 months ago

Description

The map_index_template field on a task is used to render a custom name for each individual sub-task of a mapped task. The name is generated by rendering the provided Jinja template against the task context. This rendering happens only after the task has completed, so the custom name does not appear before or during the task run. The new feature would be to attempt to render and display this field before the task is run (using the context available at that time), with a suitable fallback if the available context values are not sufficient to render it.

Use case/motivation

It would be useful to see the custom map index during task runs, whenever the context information needed to render it is available. The custom map index provides human-readable information that distinguishes the mapped tasks from one another, which would make it easier to track task progress during a DAG run.

Related issues

Various issues and PRs have dealt with displaying the custom map index when the task fails.

#39065, #39092, #38902, #39087

Are you willing to submit a PR?

Code of Conduct

tomrutter commented 3 months ago

This is a follow-on from #39065 with a narrower scope (just rendering while the task is running), as the other items in that issue have been or will be resolved by the other issues/PRs referenced.

raphaelauv commented 3 months ago

related : https://github.com/apache/airflow/issues/39092

RNHTTR commented 2 months ago

@tomrutter Would you like to be assigned?

Given the current design (using a task's context object means we have to wait for the task to start running), I think this might be trickier than meets the eye.

raphaelauv commented 2 months ago

Since the field can be set at operator definition, e.g. map_index_template="{{ task.op_kwargs['date'] }}", it can be rendered by the webserver.

tomrutter commented 2 months ago

Happy to be assigned, but if someone has a clear idea, happy for that to go ahead too.

I’m guessing the big issue with rendering on the webserver is picking up the context that is only added by the task at run time. I’d start with webserver rendering for now though, falling back to the value rendered by the task if that exists.
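
To make the fallback order concrete, here is a rough sketch using plain Jinja2 rather than any Airflow internals. The function name render_map_index_early, its parameters, and the final fallback to the numeric map index are all assumptions for illustration, not existing Airflow API. It prefers the value already rendered by the task, otherwise tries the template against whatever context is available, and gives up cleanly when a variable is missing.

```python
from types import SimpleNamespace
from typing import Any, Mapping, Optional

from jinja2 import StrictUndefined
from jinja2.exceptions import UndefinedError
from jinja2.sandbox import SandboxedEnvironment


def render_map_index_early(
    template: str,
    context: Mapping[str, Any],
    map_index: int,
    rendered_by_task: Optional[str] = None,
) -> str:
    """Hypothetical helper: best-effort label for a mapped task instance."""
    # 1. If the task has already rendered its own value, trust that one.
    if rendered_by_task is not None:
        return rendered_by_task

    # 2. Otherwise try the template with the context known so far.
    #    StrictUndefined makes a missing variable raise instead of
    #    silently rendering as an empty string.
    env = SandboxedEnvironment(undefined=StrictUndefined)
    try:
        return env.from_string(template).render(**context)
    except UndefinedError:
        # 3. Not enough context yet: fall back to the numeric map index.
        return str(map_index)


template = "{{ task.op_kwargs['date'] }}"

# Stand-in for an operator whose op_kwargs are known at definition time.
fake_task = SimpleNamespace(op_kwargs={"date": "2024-01-01"})

early = render_map_index_early(template, {"task": fake_task}, map_index=0)
missing = render_map_index_early(template, {}, map_index=3)
final = render_map_index_early(template, {}, map_index=3, rendered_by_task="2024-01-04")
print(early, missing, final)
```

Templates that only reference values set at operator definition (like op_kwargs above) would render early; anything that depends on context added inside the task would show the numeric index until the task reports its own rendered value.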

raphaelauv commented 3 weeks ago

Hi @tomrutter, are you working on this?