Open nullhack opened 4 years ago
I think back_run=True can work only if catchup=True but depends_on_past=False. catchup=False means past runs are not necessary. Can anyone correct me if I am wrong?
Yes, this is what I meant, thanks for spotting that. I'm changing the description
@eladkal I see that the "Enhancement" label has been removed. Is this feature still being considered? Is there a way to emulate this? It would be perfect for our workflow for the same reason the original post stated.
@ndawg - Labels are just "organisation".
Maybe you do not understand how OSS software works, but a feature is implemented when someone takes an interest in implementing it. Creating a feature request does not, in any way, mean that someone is working on it. This is an open-source project: if you implement it yourself and contribute it as a PR, that is the surest way to get it implemented. Convincing others to do it works as well, but otherwise someone will have to pick up the task and implement it. This is how things work here. There is no queue or planning. Things get implemented because someone makes a decision to implement them.
So if you really WANT to have something implemented, creating a feature issue is just the beginning: you need to either implement it or successfully advocate for implementing it.
Description of the problem
A common workflow we have is running daily DAGs that do not depend on past runs. When deployed, such a DAG needs to catch up for many years before the runs are up to date and can be used by analysts.
The problem is that, most of the time, recent runs are more used and more valuable than ETL for an execution time 20 years ago. In practice we deploy the DAG twice: once to fill recent data and once to fill everything since inception.
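As a rough illustration of the two-deployment workaround, the sketch below splits the history into a recent window and a historical window. The function name, the 30-day cutoff, and the dates are hypothetical and not taken from the original post. The resulting pairs are the kind of values one might pass as start_date and end_date to two separate DAG definitions.

```python
from datetime import date, timedelta


def split_backfill_windows(inception, today, recent_days):
    """Split [inception, today] into a recent window and a historical window.

    Hypothetical helper: returns two (start, end) pairs. The recent window is
    open-ended (end is None) so that it keeps running going forward; the
    historical window stops the day before the recent window starts.
    """
    recent_start = today - timedelta(days=recent_days)
    recent = (recent_start, None)
    historical = (inception, recent_start - timedelta(days=1))
    return recent, historical


# Assumed example values, for illustration only.
recent, historical = split_backfill_windows(
    inception=date(2004, 1, 1),
    today=date(2024, 1, 31),
    recent_days=30,
)
print("recent DAG window:    ", recent)
print("historical DAG window:", historical)
```

The two windows do not overlap, so no day is processed by both deployments.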
Feature requested
A flag (back_run=False) on the DAG class. If this flag is set to True, DAG runs are scheduled in reverse order. E.g. consider we're on day 10 and in the middle of the process a new run for day 11 happens: the DAG runs would be scheduled like 10 -> 9 -> 8 -> 11 -> 7 -> 6 -> 5 (assuming the run for 11 happened after running for day 8). This flag can only be used if depends_on_past=False
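
The ordering described above can be simulated in plain Python. This is only a sketch of the requested behaviour, not Airflow code: back_run does not exist in Airflow, and the function name and the arrivals mapping are hypothetical. It assumes one run executes at a time and that the newest pending day always goes first.

```python
import heapq


def newest_first_order(initial_days, arrivals):
    """Return the run order when the newest pending day is always picked first.

    initial_days: days that are pending when scheduling starts.
    arrivals: hypothetical mapping from "number of runs completed so far" to
        the list of new days that become pending at that moment. Arrivals
        keyed past the point where the queue empties are ignored here.
    """
    # heapq is a min-heap, so store negated days to pop the largest first.
    pending = [-day for day in initial_days]
    heapq.heapify(pending)

    order = []
    while pending:
        order.append(-heapq.heappop(pending))
        for day in arrivals.get(len(order), []):
            heapq.heappush(pending, -day)
    return order


# Days 5..10 are pending; day 11 shows up after three runs have completed.
order = newest_first_order(range(5, 11), {3: [11]})
print(" -> ".join(str(day) for day in order))
# 10 -> 9 -> 8 -> 11 -> 7 -> 6 -> 5
```

Because no run looks at the result of an earlier day, this ordering is only safe when depends_on_past=False, which matches the restriction in the request.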