treasure-data / digdag

Workload Automation System
https://www.digdag.io/
Apache License 2.0

How to run for_each tasks independently? #1490

Open trendcloudservices opened 3 years ago

trendcloudservices commented 3 years ago

I have a simple workflow:

schedule:
  daily>: 17:00:00

+generate:
  +repeat:
    for_each>:
      ns: [year-2015, year-2016, year-2017, year-2018, year-2019, year-2020]
    _do:
      sh>: ${HOME}/scripts/generate.sh --ns ${ns}

I would like to run the 'generate.sh' tasks independently, i.e. if one of the for-loop years fails, execution should keep going for the rest of the tasks.

I tried _parallel: true, but that makes all the for-loop tasks run in parallel. I want them to run serially, but the failure of one task shouldn't halt execution; and at the end, if any of the tasks failed, the entire group should be marked as failed after all for_each values have completed. How do I make this work?

hiroyuki-sato commented 3 years ago

Hello, @trendcloudservices

Have you tried the require> operator with the session_time option?

timezone: UTC

schedule:
  monthly>: 1,09:00:00

+depend_on_all_daily_workflow_in_month:
  loop>: ${moment(last_session_time).daysInMonth()}
  _do:
    require>: daily_workflow
    session_time: ${moment(last_session_time).add(i, 'day')}

# some monthly tasks here
vnktsh commented 3 years ago

@hiroyuki-sato

Hello @hiroyuki-sato, this is my personal account; that was my work account. Same person.

I didn't understand your suggestion. I don't have multiple workflows. I have just one workflow where I run tasks with for_each: [a, b, c] sequentially.

However, what I want to achieve is: if task a fails, the workflow should keep running for b and c, and then fail at the end.

That way, once I get the failure notification, I can go fix task a, use the "retry failed tasks" button, and just re-run the failed part.

hiroyuki-sato commented 3 years ago

Hello, @vnktsh

I may not understand your requirements completely, but I think this is difficult to achieve without a custom locking mechanism.

You can skip a failing task with || exit 0 (or similar), but it is then treated as a successful task.

+test:
  for_each>:
    ns: [year-2015, year-2016, year-2017, year-2018, year-2019, year-2020]
  _do:
    sh>: sleep 1 ; /usr/bin/false || echo "force skip ${ns}"
2021-05-17 22:15:23 +0900 [INFO] (0017@[0:default]+fuga2+test^sub+for-0=ns=0=year-201): sh>: sleep 1 ; /usr/bin/false || echo "force skip year-2015"
force skip year-2015
2021-05-17 22:15:24 +0900 [INFO] (0017@[0:default]+fuga2+test^sub+for-0=ns=1=year-201): sh>: sleep 1 ; /usr/bin/false || echo "force skip year-2016"
force skip year-2016
2021-05-17 22:15:25 +0900 [INFO] (0017@[0:default]+fuga2+test^sub+for-0=ns=2=year-201): sh>: sleep 1 ; /usr/bin/false || echo "force skip year-2017"
force skip year-2017
2021-05-17 22:15:26 +0900 [INFO] (0017@[0:default]+fuga2+test^sub+for-0=ns=3=year-201): sh>: sleep 1 ; /usr/bin/false || echo "force skip year-2018"
force skip year-2018
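Building on this workaround, the questioner's "continue on failure, but fail the group at the end" behavior can be approximated by recording failures during the loop and only exiting non-zero afterwards. A minimal sketch in plain shell; run_step is a hypothetical stand-in for ${HOME}/scripts/generate.sh --ns "$ns" from the thread, and the function and variable names are illustrative, not part of digdag:

```shell
#!/bin/sh
# Run every iteration, remember which ones failed, and only fail the
# whole group after all iterations have been attempted.

run_step() {
  # Hypothetical stand-in for: ${HOME}/scripts/generate.sh --ns "$1"
  case "$1" in
    year-2016) return 1 ;;   # simulate one failing year
    *)         return 0 ;;
  esac
}

run_all() {
  failed=""
  for ns in "$@"; do
    if run_step "$ns"; then
      echo "ok $ns"
    else
      failed="$failed $ns"   # record the failure, keep going
    fi
  done
  # After all iterations, fail the group if anything failed.
  if [ -n "$failed" ]; then
    echo "failed:$failed"
    return 1
  fi
  return 0
}
```

In a digdag workflow the same idea could be expressed by having each _do task record its failure instead of propagating it (for example, sh>: generate.sh --ns ${ns} || touch failed_${ns}) and adding a final task after the for_each that exits non-zero if any failure marker exists.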
vnktsh commented 3 years ago

Thanks a lot @hiroyuki-sato, I now understand your workaround. It makes the task exit with status 0 and lets for_each continue with the other tasks.

I now believe what I'm looking for is not achievable with the existing digdag implementation. I'll look into using my own scripts to address this.

Thanks.

seiyab commented 2 years ago

I encountered a similar issue. I defined a Python task that runs a command passed as an argument, stores the exit status as a variable, and returns success.

It might be useful if an _independent: true option (or something similar) existed. However, it might make retrying more complex or raise other problems.
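A sketch of this idea in plain Python; the function names and the "exit_status_" key are illustrative, and in a real py> task the statuses would be saved via digdag's parameter store rather than a plain dict:

```python
import subprocess


def run_and_record(command, name, store=None):
    """Run `command`, record its exit status under `name`, never raise.

    `store` stands in for digdag's parameter store in this sketch.
    """
    if store is None:
        store = {}
    result = subprocess.run(command, shell=True)
    store["exit_status_" + name] = result.returncode
    return store


def fail_if_any_failed(store):
    """Final check task: raise if any recorded step had a non-zero status."""
    bad = [k for k, v in store.items()
           if k.startswith("exit_status_") and v != 0]
    if bad:
        raise RuntimeError("failed steps: " + ", ".join(bad))
```

Each loop iteration calls run_and_record and therefore always "succeeds", while a final task calls fail_if_any_failed so the group as a whole is still marked as failed when any step failed.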