apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Priority weight upstream inconsistency for tasks between DAGs #41835

Open MatthewStrickland opened 2 weeks ago

MatthewStrickland commented 2 weeks ago

Apache Airflow version

Other Airflow 2 version (please specify below)

If "Other Airflow 2 version" selected, which one?

2.8.1

What happened?

A task with higher priority (with the priority weight rule set to upstream) is not always selected to run first when tasks from different DAGs are racing each other.

What you think should happen instead?

Once a task has finished, the executor should give the tasks of other DAGs a chance to queue up before choosing the next task to run. (I'm assuming this is why the executor sometimes chooses the wrong task to run.) And/or the order of tasks should be somewhat consistent/deterministic.

Fwiw, I don't ever see this issue inside a single DAG where dagruns are racing each other, just between dagruns of different DAGs. There's no mention in the docs that this feature is only consistent between dagruns of the same DAG. It's a feature coupled with pools I believe, which work across DAGs, so I assumed this would work across DAGs too.

How to reproduce

Create 2 dags (dag 1 & 2) and set priority weight upstream on both
Create a pool
Have 5 tasks each, where both use the same pool for tasks 3 & 4
Have task 3 take a long time (sleep)
Run both dags together

Eventually: dag 1 is running task 3 and dag 2 has queued task 3
Sometimes (and desired): dag 1 task 4 runs before dag 2 task 3
Sometimes (and undesired): dag 2 task 3 runs before dag 1 task 4
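
To make the expectation concrete, here is a toy model in plain Python. It is not Airflow code, and every name in it is hypothetical. It assumes each DAG is a linear chain of 5 tasks with the default priority weight of 1 (so under the upstream rule task N has an effective weight of N), and that the shared pool has a single slot. It only illustrates my assumption above: the higher-weight task can only win if it is already visible when the slot frees up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A task instance waiting for the pool slot (hypothetical toy type)."""

    dag: str
    task_index: int  # 1-based position in an assumed linear chain

    @property
    def upstream_weight(self) -> int:
        # Assumption: every task keeps the default priority weight of 1, so
        # with the upstream rule task N in a linear chain sums to N.
        return self.task_index


def pick_next(visible: list[Candidate]) -> Candidate:
    """Pick the highest-weight candidate among those visible right now."""
    return max(visible, key=lambda c: c.upstream_weight)


dag1_task4 = Candidate("dag_1", 4)
dag2_task3 = Candidate("dag_2", 3)

# Desired: both candidates are visible when the pool slot frees up.
desired = pick_next([dag1_task4, dag2_task3])

# Undesired: dag 1 task 4 is not yet schedulable when the slot frees up
# (dag 1 task 3 has only just finished), so it cannot compete at all.
undesired = pick_next([dag2_task3])

print(desired)
print(undesired)
```

In this model the priority ordering itself is never violated; the outcome depends only on which candidates are visible at the moment of the pick, which would explain why the result varies from run to run.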

Operating System

rhel 9

Versions of Apache Airflow Providers

No response

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else?

So this is very minor and might not even be considered a bug at all, but I found it rather unintuitive. If it's not classed as a bug I'm ok with that; it's just information I thought worth sharing.

Are you willing to submit PR?

Code of Conduct

jedcunningham commented 2 weeks ago

I have a feeling schedule_after_task_execution may be the reason. Could you try turning that off and confirming?
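
For reference, a minimal sketch of turning it off, assuming the option sits in the `[scheduler]` section of `airflow.cfg` as in the Airflow 2 configuration reference:

```ini
[scheduler]
schedule_after_task_execution = False
```

Following Airflow's usual environment variable convention, the equivalent should be `AIRFLOW__SCHEDULER__SCHEDULE_AFTER_TASK_EXECUTION=False`. How that gets injected into a Helm chart deployment depends on your values file.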

jscheffl commented 2 weeks ago

+1 on that assumption. After a task executes, its next tasks are scheduled. They might therefore get scheduled immediately, while the centrally running scheduler sees the limits reached early and does not schedule anything else.

Also can you please post an example DAG and check if the same applies to Airflow 2.10 as well?

My view is that scheduling mostly does not enforce priority: the scheduler does not try to hold tasks back, but loops and, on a best-effort basis, schedules whatever is schedulable. schedule_after_task_execution might be contributing to this best-effort behavior, with the benefit of reduced latency between tasks.

github-actions[bot] commented 2 days ago

This issue has been automatically marked as stale because it has been open for 14 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.