apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Support Serialized DAGs on CLI Commands #15306

Open john-jac opened 3 years ago

john-jac commented 3 years ago

Description

CLI commands such as backfill and list_dags currently parse DAGs before executing. This introduces two primary issues: 1) the parse process can be time-consuming, and 2) it does not work when run on a web server that lacks access to the DAGs and/or their respective Python libraries or Airflow plugins.

Use case / motivation

Giving users an option to use serialized DAGs with the Airflow CLI would let them opt for the more efficient method of executing commands from the information available in the metadata database, rather than relying solely on parsing the source DAGs.
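To illustrate the idea, here is a minimal sketch of listing DAGs from serialized rows instead of importing DAG files. It uses only the Python standard library and is not Airflow's implementation: the in-memory SQLite database stands in for the metadata database, the two-column `serialized_dag` table is a simplified assumption, and `list_dag_ids_from_db` and `load_serialized_dag` are hypothetical helper names.

```python
import json
import sqlite3
from typing import List, Optional


def list_dag_ids_from_db(conn: sqlite3.Connection) -> List[str]:
    """Return DAG ids from serialized rows, without importing any DAG file."""
    rows = conn.execute("SELECT dag_id FROM serialized_dag ORDER BY dag_id")
    return [row[0] for row in rows]


def load_serialized_dag(conn: sqlite3.Connection, dag_id: str) -> Optional[dict]:
    """Return the stored JSON document for one DAG, or None if it is absent."""
    row = conn.execute(
        "SELECT data FROM serialized_dag WHERE dag_id = ?", (dag_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# Stand-in for the metadata database (assumption: simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE serialized_dag (dag_id TEXT PRIMARY KEY, data TEXT)")
conn.executemany(
    "INSERT INTO serialized_dag VALUES (?, ?)",
    [
        ("example_etl", json.dumps({"schedule": "@daily", "tasks": ["extract", "load"]})),
        ("example_report", json.dumps({"schedule": "@weekly", "tasks": ["render"]})),
    ],
)

# A CLI "list" command could answer from the database alone.
dag_ids = list_dag_ids_from_db(conn)
print(dag_ids)
```

Because nothing here imports the DAG source, the machine running the command would not need the DAG files, their Python dependencies, or any plugins, which is the point of the request.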

Are you willing to submit a PR?

Yes

Related Issues

No

jhtimmins commented 3 years ago

@kaxil I'm not super familiar with DAG serialization, but I think this sounds reasonable. Are there any drawbacks to supporting this functionality?

uranusjr commented 3 years ago

Would it make more sense to change them to always use the serialised DAGs instead, since all DAGs are now serialised in 2.0? Or are there things that only parsing the real DAG can do?

kaxil commented 3 years ago

list_dags is fine -- backfill won't work on serialized dags.

john-jac commented 3 years ago

> list_dags is fine -- backfill won't work on serialized dags.

@kaxil could you provide details as to why backfill can't run on serialized dags?

fanaticjo commented 3 years ago

dags list-runs -d braavos_query_maker_1_20210614_09161623642378 -e 2021-06-17 --no-backfill -o json -s 2021-05-01

Even this command won't run if a DAG uses plugins. These commands only run inside the workers. I think this is a major issue with Airflow.

neontty commented 1 year ago

This problem is impacting my ability to use 'backfill' on AWS MWAA CLI effectively. If this feature could be added it would be much appreciated.

potiuk commented 1 year ago

I think the fastest way to get it done is.... to implement it @neontty. And I am afraid any implementation here will NOT solve your problem on MWAA for at least another 6 months, because even if it is implemented now, that's the minimum release cycle for MWAA (at least so far). So you should rather look for a better way of handling it in MWAA (and raise a support request there).

I marked it as "good first issue". Maybe someone will pick it up, but again, the fastest way to get it implemented is to pick it up and implement it.

eladkal commented 1 month ago

We now have a reparsing endpoint: https://github.com/apache/airflow/pull/39138. @utkarsharma2, is there any intention to support it for the CLI as well?