kestra-io / kestra

Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
https://kestra.io
Apache License 2.0
7.01k stars 407 forks

io.kestra.plugin.core.storage.Purge does not clean logs of deleted or disabled flows #3961

Open aku opened 3 weeks ago

aku commented 3 weeks ago

Issue description

I have the following flow:

id: clean_system
namespace: system

tasks:
  - id: clean_up_everything
    type: io.kestra.plugin.core.storage.Purge
    endDate: "{{ now() }}"
    purgeExecution: true
    purgeLog: true
    purgeMetric: true
    purgeStorage: true

I get the following output when I run it:

image

It deletes the executions but it does not delete all of the logs:

image

I've tried specifying the namespace/flowId mentioned in these logs, but it did not work. I'm using the latest Kestra 0.17.1 in standalone mode on Ubuntu VMs. I've noticed similar behaviour on 0.16.x as well. Am I missing something?

Another potential problem is that when I delete a flow, its logs are still available on the Logs page. It probably works as intended, but it would be nice to be able to purge logs that are no longer needed. Maybe you could add a button to do it on demand, or a config option to control log behaviour.

anna-geller commented 3 weeks ago

Can you say what you are trying to purge and for what time interval? Keep in mind we also have a PurgeExecution task

anna-geller commented 3 weeks ago

purge logs that are no longer needed.

We actually have that functionality. Right click on any logs and from the menu on the right you should see an option to delete that specific log 👍

aku commented 3 weeks ago

Not sure where to find the menu :( Here's what I see:

image

My problem is that I have triggers that produce thousands of executions. Each execution in turn produces several log entries. I'd like to be able to do a regular cleanup, either manually or using a time trigger. In this case I just execute the flow manually. Also, I'd like to be able to purge logs left over from deleted flows.

aku commented 3 weeks ago

Can you say what you are trying to purge and for what time interval? Keep in mind we also have a PurgeExecution task

I'm trying to delete all old entries ( endDate = now() ) for now. Everything (executions, storage) got deleted except logs. In a real-world scenario I would have several tasks to delete different objects with different time windows (e.g. delete triggered executions every 15 minutes, keep failed tasks info for 48 hours+ for analysis)
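The scenario described above could be sketched roughly as follows. This is only an illustration, not a tested configuration: the flow id, task ids, and time windows are hypothetical, and `states` is assumed to be the Purge property that filters by execution state. Please check the task documentation for your Kestra version before relying on it.

```yaml
id: scheduled_cleanup        # hypothetical flow id
namespace: system

tasks:
  # Short retention for successful executions (e.g. trigger-generated noise).
  - id: purge_successful_executions
    type: io.kestra.plugin.core.storage.Purge
    endDate: "{{ now() | dateAdd(-15, 'MINUTES') }}"
    states:                  # assumed property name, see note above
      - SUCCESS
    purgeExecution: true
    purgeLog: true
    purgeMetric: true
    purgeStorage: true

  # Failed executions are kept for 48 hours so they can be analysed.
  - id: purge_old_failures
    type: io.kestra.plugin.core.storage.Purge
    endDate: "{{ now() | dateAdd(-48, 'HOURS') }}"
    states:
      - FAILED
    purgeExecution: true
    purgeLog: true
    purgeMetric: true
    purgeStorage: true

triggers:
  # Run the cleanup every 15 minutes.
  - id: every_15_minutes
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "*/15 * * * *"
```

Splitting the cleanup into one task per state keeps each retention window explicit and easy to change independently.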

aku commented 3 weeks ago

Another problem is that if you go to the logs and click on an execution that was deleted, Kestra shows a spinner for a while and then an error pops up:

image

anna-geller commented 2 weeks ago

Interesting, thanks for that extra context. What if, instead of purging logs, you could not store them in the first place? If you add the property logLevel: WARN (set to WARN or ERROR level), only critical logs will be stored, while all INFO/DEBUG logs won't be captured on the backend.
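As a minimal sketch of that suggestion, the property can be set directly on a single task. The flow id, task id, and message below are hypothetical and only illustrate where the property goes:

```yaml
id: quiet_flow               # hypothetical flow id
namespace: company.myteam

tasks:
  - id: noisy_step           # hypothetical task id
    type: io.kestra.plugin.core.log.Log
    message: this INFO-level message should not be stored
    logLevel: WARN           # only WARN and ERROR logs of this task are kept
```

Other tasks in the same flow are unaffected, so this can be limited to the tasks that generate the most log volume.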

To manually purge specific logs:

image

aku commented 2 weeks ago

Interesting, thanks for that extra context. What if, instead of purging logs, you could not store them in the first place? If you add the property logLevel: WARN (set to WARN or ERROR level), only critical logs will be stored, while all INFO/DEBUG logs won't be captured on the backend.

To manually purge specific logs:

image

The thing is that executions are already deleted so I cannot access the UI page you mentioned. Also it would be quite daunting to delete logs one by one.

Can I use the logLevel option per task/flow, or is it a global setting that will affect the whole system?

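Ideally, I would like to have the following features:

- being able to delete logs
- a button in Logs page to delete logs manually
- per task/flow option to suppress logs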

anna-geller commented 2 weeks ago

Can I use the logLevel option per task/flow, or is it a global setting that will affect the whole system?

it's per task

being able to delete logs

the Purge task is intended to purge data in bulk (e.g. data older than 2 weeks) to avoid cluttering database space. for now, we don't offer selective delete of specific logs programmatically, so I'd recommend trying not to store the logs that you don't want to keep by using the logLevel option

a button in Logs page to delete logs manually

you're right that it's missing, I opened an issue here https://github.com/kestra-io/kestra/issues/3964

per task/flow option to suppress logs

Totally feasible, you can leverage plugin defaults to set it for a specific task or for all tasks in a flow. The setting below will store only WARN-level logs for all tasks in this flow:

id: myflow
namespace: company.myteam

tasks:
  - id: print_status
    type: io.kestra.plugin.core.log.Log
    message: hello

pluginDefaults:
  - type: io.kestra
    values:
      logLevel: WARN

anna-geller commented 2 weeks ago

using the above info, I think you know what to do, so I'll close the issue. feel free to ask via Slack if you have further questions: https://kestra.io/slack

loicmathieu commented 2 weeks ago

@aku, may I ask for additional information to be sure there isn't something we didn't understand?

Did you use the Purge task or did you delete executions manually? Because by default, the purge task will delete related logs; if not, it's a bug.

loicmathieu commented 2 weeks ago

And you purge very aggressively with endDate: "{{ now() }}" without specifying the list of states so executions currently running will also be purged.

I'd advise setting it to, for example, {{ now() | dateAdd(-1, 'HOURS') }} and restricting the states to avoid purging running executions.
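A sketch of what that advice might look like when applied to the flow from the issue description. The list of states is an assumption about which terminal states one would typically want to purge, and `states` is assumed to be the relevant property name, so treat this as a starting point rather than a verified configuration:

```yaml
id: clean_system
namespace: system

tasks:
  - id: clean_up_everything
    type: io.kestra.plugin.core.storage.Purge
    # Only touch executions that are at least one hour old.
    endDate: "{{ now() | dateAdd(-1, 'HOURS') }}"
    # Assumed property: restrict to finished executions so running ones are left alone.
    states:
      - SUCCESS
      - WARNING
      - FAILED
      - KILLED
    purgeExecution: true
    purgeLog: true
    purgeMetric: true
    purgeStorage: true
```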

aku commented 2 weeks ago

@aku, may I ask for additional information to be sure there isn't something we didn't understand?

Did you use the Purge task or did you delete executions manually? Because by default, the purge task will delete related logs; if not, it's a bug.

I've deleted data using the Purge task. What I was trying to say is that it does not delete logs for non-existing executions/tasks, which is probably a bug because there is no other way to delete these logs. Moreover, if you click on some of the links in these logs you get an error: the corresponding execution is missing.

aku commented 2 weeks ago

And you purge very aggressively with endDate: "{{ now() }}" without specifying the list of states so executions currently running will also be purged.

I'd advise setting it to, for example, {{ now() | dateAdd(-1, 'HOURS') }} and restricting the states to avoid purging running executions.

I used now() only as an example. In a real-world scenario I will of course use filters and a less aggressive time range. Since I'm adopting Kestra at the moment and doing a lot of experiments, I need a way to easily clean the system of hundreds of executions/logs/etc.

loicmathieu commented 2 weeks ago

it does not delete logs for non-existing executions/tasks

I don't understand what are those logs that are not linked to an execution/task?

aku commented 2 weeks ago

it does not delete logs for non-existing executions/tasks

I don't understand what are those logs that are not linked to an execution/task?

They are linked to executions/tasks that were deleted along with their corresponding flow. Also, it seems that the Purge task ignores logs of disabled flows.

Steps to reproduce:
1) Create some flow
2) Run this flow
3) Observe that there are some logs related to this flow's tasks/execution
4) Delete the flow manually
5) Run the Purge task (endDate == now()); it will delete logs of existing flows/executions as expected
6) Logs from step 3 are still there

See example below - 923 logs cleaned, more than 3k logs remain

image image

loicmathieu commented 2 weeks ago

I reopened it; we will have a look at correctly purging executions for deleted or disabled flows.