kestra-io / kestra

:zap: Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
https://kestra.io
Apache License 2.0

Bulk delete is too slow #2593

Open brian-mulier-p opened 10 months ago

brian-mulier-p commented 10 months ago

Explain the bug

Currently, as a quick win, both by-ids and by-query deletions first retrieve the executions:

Then, in both cases, we issue one repository delete per retrieved execution (nothing is done in bulk), which can be time- and resource-consuming :red_circle:

What I suggest is to run a bulk delete query in both cases:

This should greatly reduce deletion times, since the number of delete statements no longer grows with the number of executions.

Another solution would be to make this process asynchronous, but we might then need to provide a view showing the deletion progress, which might be hard :thinking:

In any case, I think the first solution is the go-to for now, as it will heavily reduce database load.

Environment Information

loicmathieu commented 10 months ago

Be careful: some databases limit the number of elements in an IN clause, so we may need to cap each bulk delete at, for example, 100 elements. This needs to be evaluated.
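
A minimal sketch of the batching idea, to make the trade-off concrete. It is not Kestra's repository code: the class name `BulkDeleteSketch`, the table name `executions`, the column name `id`, and the batch size of 100 are all assumptions for illustration. It only builds the parameterized statements; binding the ids and executing them would be left to whatever database layer is in use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BulkDeleteSketch {
    // Hypothetical batch size; the safe value depends on the database and needs evaluation.
    static final int BATCH_SIZE = 100;

    // Split the ids into consecutive batches of at most batchSize elements.
    static List<List<String>> partition(List<String> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    // Build one parameterized DELETE with a placeholder per id.
    // Table and column names are assumptions, not Kestra's actual schema.
    static String deleteSql(String table, int idCount) {
        if (idCount <= 0) {
            throw new IllegalArgumentException("idCount must be positive");
        }
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "DELETE FROM " + table + " WHERE id IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            ids.add("exec-" + i);
        }
        // 250 ids become 3 statements instead of 250 single-row deletes.
        for (List<String> batch : partition(ids, BATCH_SIZE)) {
            String sql = deleteSql("executions", batch.size());
            System.out.println(batch.size() + " ids, statement length " + sql.length());
        }
    }
}
```

With this shape, the number of statements is the id count divided by the batch size, rounded up, so the cap on IN-clause size only changes how many round trips are made, not whether the delete is done in bulk.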