kestra-io / kestra

:zap: Open-source workflow automation platform. Orchestrate any language using YAML, hundreds of integrations. Alternative to Airflow, Zapier, RunDeck, Camunda, ...
https://kestra.io
Apache License 2.0

Add ability to clean executions automatically #4206

Closed: aku closed this issue 2 months ago

aku commented 3 months ago

Feature description

I know there are Purge and PurgeExecution tasks available; however, it would be very useful to be able to specify a TTL for executions somewhere in the workflow. I've run into a problem running multiple Kafka-based workflows that produce gigabytes of intermediate data. It would be so much easier for me to set something like this at the flow level:

id: flowId
namespace: nsId

autoCleanup: 
  interval: PT48H # clean executions every 48 hrs or clean executions that are 48 hrs old
  states: # clean executions only in certain states
    - SUCCESS

In my experience, I rarely need to access the data of old executions. If I need to persist some data, I would rather put it in separate storage. Currently, I have to create a separate flow with Purge + Cron Trigger for each use case. Also, I've run into multiple problems with the Purge task (logs and executions are not cleaned properly).
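To make the request concrete, here is a minimal Python sketch of the TTL reading of the proposed `autoCleanup` setting: an execution is eligible for cleanup once it is in one of the listed states and ended at least `interval` ago. This is not Kestra code. The `Execution` record, its fields, and `select_expired` are hypothetical names made up for illustration, and the sketch assumes `PT48H` means "older than 48 hours" rather than "run the cleanup every 48 hours".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Execution:
    """Hypothetical stand-in for an execution record (not a Kestra class)."""
    id: str
    state: str
    ended_at: datetime


def select_expired(executions, ttl, states, now):
    """Return executions in one of `states` that ended at least `ttl` before `now`."""
    cutoff = now - ttl
    return [e for e in executions if e.state in states and e.ended_at <= cutoff]


# Example mirroring the proposal: interval PT48H, states [SUCCESS].
now = datetime(2024, 7, 10, 12, 0, tzinfo=timezone.utc)
executions = [
    Execution("a", "SUCCESS", now - timedelta(hours=72)),  # old and successful
    Execution("b", "SUCCESS", now - timedelta(hours=1)),   # too recent
    Execution("c", "FAILED", now - timedelta(hours=72)),   # old, but state not listed
]
expired = select_expired(executions, timedelta(hours=48), {"SUCCESS"}, now)
print([e.id for e in expired])  # prints ['a']
```

Under the other reading of `interval` (a cleanup schedule), the same filter would still need a separate age threshold, which is one reason the TTL interpretation is used here.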

anna-geller commented 3 months ago

I think your main issue will be addressed in https://github.com/kestra-io/kestra/issues/4207

the automatic cleanup seems like a nice additional idea!

anna-geller commented 2 months ago

we discussed this internally and concluded that we have no way of implementing it other than by using system flows - I linked the request to the bigger project so you can follow along