kakoni opened this issue 7 years ago
@kakoni
Have you ever tried `require>`?
http://docs.digdag.io/operators/require.html#require-depends-on-another-workflow
I hope this is the operator you are looking for.
Aah right, I could do something like
```yaml
+require:
  require>: ..SELF..
  session_time: ${last_session_time}
```
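Spelled out a bit more, assuming the workflow is named `my_workflow` and runs on a daily schedule (both the name and the schedule are just illustrative), that could look like:

```yaml
# my_workflow.dig -- name and schedule are illustrative
timezone: UTC

schedule:
  daily>: 01:00:00

# Wait until the previous session of this same workflow has finished.
+require_previous_run:
  require>: my_workflow
  session_time: ${last_session_time}

+run:
  echo>: "Executing ${session_time}"
```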
Yes, that would work! One more question though: how do I handle the initial state/run (the one that has no last session to depend on)?
Hmm... `last_session_time` is calculated just based on the current timestamp: https://github.com/treasure-data/digdag/blob/master/digdag-standards/src/main/java/io/digdag/standards/scheduler/SecondsIntervalSchedulerFactory.java#L73
Indeed, you can't depend on it.
Maybe you need to use external persistent data (e.g. a local file) as a workaround, like this:
```yaml
+start:
  sh>: touch /tmp/${session_time}.lock

+check:
  sh>: if [ -f /tmp/${last_session_time}.lock ]; then exit 1; fi

+run:
  echo>: "Executing ${session_time}"

+end:
  sh>: rm /tmp/${session_time}.lock
```
It seems there is room to improve the above workflow in terms of robustness, though.
@hiroyuki-sato Does digdag have an interface to get the previous instance of a session (i.e. to get its status)?
Hello, @kakoni
Could you tell me more about your question? Are you looking for a CLI command like this? https://github.com/treasure-data/digdag/issues/603
Maybe there is no CLI interface yet.
I was thinking about creating a new operator, or extending `require>`, with a `depends_on_past` option (perhaps there is a better name, but I'm using it for now).
To get that to work, I would need to access the previous instance of the current session. So, in pseudo language:
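One possible sketch of the idea (the `depends_on_past` option and the workflow name here are hypothetical, not existing digdag features):

```yaml
# Hypothetical syntax: depends_on_past is not an existing require> option.
+wait_for_previous_session:
  require>: this_workflow              # this workflow's own name
  session_time: ${last_session_time}   # the previous scheduled session
  depends_on_past: true                # proposed: proceed only if that session succeeded

+run:
  echo>: "Executing ${session_time}"
```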
Hello, @kakoni
I have no idea yet. I'll let you know if I find a good solution. (Since I'm not a core developer, I have to read the source.) Let's hack digdag! :smile:
@kakoni Did you ever find a solution to this problem? I'm dealing with the same thing. See #929.
@jaymed Yes. I really wanted to use digdag for my use cases, but since depends-on-past behavior is so essential for my workflows, I had to go with Airflow.
@kakoni OK makes sense. Thanks for getting back to me.
@hiroyuki-sato There's definitely a major need for this feature.
Hello, @kakoni and @jaymed
Thank you for commenting on this feature request.
Compared with the Airflow project (677 contributors), Digdag is still developed by a very small team (58 contributors).
I will consider those requests.
By the way, I'm not familiar with Apache Airflow.
Do you know how `depends_on_past` is handled for the initial state in Airflow? (I mean the run that has no last session to depend on.)
https://github.com/treasure-data/digdag/issues/615#issuecomment-320591081
@muga Please take a look at this issue when you get a chance: https://github.com/treasure-data/digdag/issues/929#issuecomment-454270266
To solve #615 and #929, I would like to introduce new scheduler options, `wait_until_last_schedule` and `wait_until_last_schedule_succeed`, as follows.
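For illustration only, the option names below follow this proposal and are not part of a released digdag; they could sit in the `schedule` block like this:

```yaml
timezone: UTC

schedule:
  daily>: 01:00:00
  # Proposed: start this session only after the previous scheduled session has finished.
  wait_until_last_schedule: true
  # Or, stricter: start only after the previous scheduled session has succeeded.
  # wait_until_last_schedule_succeed: true
```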
How about these options?
@hiroyuki-sato

> Do you know how `depends_on_past` is handled for the initial state in Airflow? (I mean the run that has no last session to depend on.)
There's another configuration option called `start_date`. If your execution date is the same as `start_date`, then it doesn't depend on the last session (as this is the initial/first run).
Hello, @kakoni
Thank you for your reply!
@yoyama Do `wait_until_last_schedule` and `wait_until_last_schedule_succeed` support something like the `start_date` option in Airflow?
Here's the logic in Airflow, if you're interested: https://github.com/apache/airflow/blob/master/airflow/ti_deps/deps/prev_dagrun_dep.py#L47
I also want this feature. It is necessary for backfilling multiple sessions that have to proceed one by one. It is also needed for workflows that must run as a single job at a time, such as memory-consuming workflows.
Apache Airflow has a feature called `depends_on_past`, where "task instances will depend on the success of the preceding task instance".
I find this extremely useful in my use case, where I have daily recurring tasks, so the task running on 20170806 depends on the success of the 20170805 run.
I'm not sure; can you do something similar with digdag?