Closed rinzool closed 1 year ago
Hmmm... thanks for finding this one! It's a real puzzler: the manual claims TaskGroups are a UI-only concept.
But the manual also hints that task groups are implemented via the task_id. Which makes me suspect that this line might be breaking things! After all, BaseOperator says something different:
    if task_group:
        task_id = task_group.child_id(task_id)
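To make the suspicion concrete, here is a minimal sketch of how a prefixed task_id can get out of sync with its group. It does not use Airflow at all: `SketchTaskGroup`, `SketchOperator` and `OverridingOperator` are hypothetical stand-ins, and whether the LakeFS operators actually do anything like `OverridingOperator` is an assumption, not something confirmed here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SketchTaskGroup:
    """Hypothetical stand-in for a task group (not Airflow's class)."""
    group_id: str
    children: Dict[str, "SketchOperator"] = field(default_factory=dict)

    def child_id(self, label: str) -> str:
        # Prefix the child's label with the group id.
        return f"{self.group_id}.{label}"


class SketchOperator:
    """Hypothetical base operator mirroring the two lines quoted above."""

    def __init__(self, task_id: str, task_group: Optional[SketchTaskGroup] = None):
        if task_group:
            task_id = task_group.child_id(task_id)
        self.task_id = task_id
        if task_group:
            # The group registers the child under its prefixed id.
            task_group.children[self.task_id] = self


class OverridingOperator(SketchOperator):
    """Assumed failure mode: the subclass resets task_id after the base init."""

    def __init__(self, task_id: str, task_group: Optional[SketchTaskGroup] = None):
        super().__init__(task_id, task_group)
        self.task_id = task_id  # drops the group prefix


group = SketchTaskGroup("my_group")
begin = SketchOperator("begin", group)
merge = OverridingOperator("merge", group)

print(begin.task_id)                # my_group.begin
print(merge.task_id)                # merge
print(sorted(group.children))       # ['my_group.begin', 'my_group.merge']
print(merge.task_id in group.children)  # False
```

In this sketch the group still lists the task as `my_group.merge` while the operator calls itself `merge`, so any lookup by task_id would miss it. If something similar happens in the real operator, that kind of mismatch could plausibly explain a DAG that fails to load without an obvious error.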
Plan:
Issue
It seems that LakeFS operators are not working in Airflow TaskGroups.
When using a LakeFS operator in a task group, the DAG disappears (it is not in error, but it is not listed by the scheduler). But if we remove the LakeFS operator from the task group and use it directly in a DAG, the DAG magically appears.
It is a very strange behaviour (I've never seen that with Airflow), and it can easily be reproduced locally with Docker (see below).
Versions
How to reproduce
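A dags folder with a file lakefs_issue_dag.py:

    import pendulum
    from airflow.decorators import dag, task_group
    from airflow.operators.empty import EmptyOperator
    from lakefs_provider.operators.merge_operator import LakeFSMergeOperator


    @task_group
    def my_group():
        begin = EmptyOperator(task_id="begin")
        # The LakeFSMergeOperator task goes here (its arguments are not shown in this report).


    @dag(
        dag_id="lakefs_task_group_issue",
        schedule=None,
        start_date=pendulum.today('UTC').add(days=-1),
    )
    def create_dag(arg1: str = "input"):
        ...  # body not shown in this report


    dag = create_dag()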
docker compose up airflow-init
docker compose up
docker exec -it {container_id} bash
pip install airflow-provider-lakefs