apache / dolphinscheduler

Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code
https://dolphinscheduler.apache.org/
Apache License 2.0
12.73k stars 4.58k forks

[Feature][Master] Allows updating workflows and restarting failed Tasks while the workflow is running #16368

Closed lizc9 closed 3 weeks ago

lizc9 commented 2 months ago

Description

Allows updating workflows and restarting failed Tasks while the workflow is running

Use case

Scenario 1: A running workflow instance contains Task1, which is still running and needs 2 hours to finish, and Task2, which has failed. Today I must either kill the entire workflow or wait 2 hours for it to finish before I can recover Task2. If Task1 is already halfway done, killing the workflow and rerunning it costs another 2 hours, because Task1 starts over. The best solution is to allow the failed Task2 to be rerun directly.

Scenario 2: Setting a retry count for Task2 may help, but sometimes Task2 only succeeds after its parameters are modified, so modifying Task2's parameters in the running workflow also needs to be allowed.

[Screenshots: screenshot-20240725-131834, screenshot-20240725-131957]
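To make the requested behavior concrete, here is a minimal sketch of the two scenarios. It is only an illustration: `State`, `TaskInstance`, `WorkflowInstance`, and `rerun_failed_task` are hypothetical names invented for this example, not DolphinScheduler classes or APIs, and the `date` parameter is a made-up placeholder.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    RUNNING = "RUNNING"
    FAILED = "FAILED"
    SUCCESS = "SUCCESS"


@dataclass
class TaskInstance:
    # Hypothetical model of one task inside a workflow instance.
    name: str
    state: State
    params: dict = field(default_factory=dict)
    attempts: int = 1


@dataclass
class WorkflowInstance:
    # Hypothetical model: maps task name -> TaskInstance.
    tasks: dict

    def rerun_failed_task(self, name, new_params=None):
        """Restart one failed task without touching any other task."""
        task = self.tasks[name]
        if task.state is not State.FAILED:
            raise ValueError(
                f"{name} is {task.state.value}; only FAILED tasks can be rerun"
            )
        if new_params:
            # Scenario 2: adjust the parameters before the rerun.
            task.params.update(new_params)
        # Scenario 1: only this task is restarted.
        task.state = State.RUNNING
        task.attempts += 1
        return task


wf = WorkflowInstance(tasks={
    "Task1": TaskInstance("Task1", State.RUNNING),
    "Task2": TaskInstance("Task2", State.FAILED, {"date": "2024-07-24"}),
})
wf.rerun_failed_task("Task2", {"date": "2024-07-25"})
```

The point of the sketch is that Task1 keeps its `RUNNING` state and its progress while only Task2 is restarted, optionally with changed parameters.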

Related issues

No response

SbloodyS commented 2 months ago

I don't think this is part of the normal workflow. Modifying a workflow while it is running would cause execution disorder.

lizc9 commented 2 months ago

> I don't think this is part of the normal workflow. Modifying a workflow while it is running would cause execution disorder.

Even if modifying a running workflow is not allowed, recovering failed tasks in a running workflow should still be supported.

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.

github-actions[bot] commented 3 weeks ago

This issue has been closed because it has not received a response for too long. You can reopen it if you encounter similar problems in the future.