hjacobs / kube-janitor

Clean up (delete) Kubernetes resources after a configured TTL (time to live)
GNU General Public License v3.0

A scale question #80

Open Stono opened 4 years ago

Stono commented 4 years ago

Hey, apologies, this is more of a question than an issue. I tried to get the answer from the code, but Python isn't my strong suit :)

I'm thinking about kube-janitor at scale. We have 500+ namespaces, and I believe (please correct me if I'm wrong) the approach janitor takes is to iterate over them all, pulling all the resources and then inspecting their annotations, every minute.

That feels like an expensive operation, and I'm wondering if you've considered either:

or

Cheers, Karl

hjacobs commented 4 years ago

Polling makes no sense, I think, as the janitor mainly triggers based on time. Having an extra label just as performance optimization also does not make sense. I think you can easily increase the interval (--interval) to hours, or run kube-janitor with --once as a CronJob (e.g. once a day).

Stono commented 4 years ago

"Having an extra label just as performance optimization also does not make sense"

It does when you're dealing with thousands of resources and kube-janitor is only applicable to a handful of them. A label lets the Kubernetes API server filter the list request, which massively cuts down the amount of data returned from the API (and subsequently the amount of data that janitor needs to process).

Your suggestion of running it less frequently doesn't make sense to me - that's effectively saying "yes it is going to be slow and expensive so run it less" rather than exploring ways in which we can make it more performant at scale.
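
To make the label argument concrete, here is a minimal sketch of the difference between an unfiltered list request and one narrowed with the Kubernetes API's labelSelector query parameter. The label janitor/enabled=true, the helper list_url, and the example host are hypothetical and only for illustration; kube-janitor itself reads annotations such as janitor/ttl, and this is not how it builds its requests.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical opt-in label, assumed for this sketch only.
# kube-janitor itself works from annotations (e.g. janitor/ttl), not this label.
OPT_IN_SELECTOR = "janitor/enabled=true"


def list_url(api_server: str, path: str, label_selector: Optional[str] = None) -> str:
    """Build a Kubernetes list-request URL.

    Without a selector the server returns every object of that kind.
    With labelSelector, filtering happens on the server, so only
    matching objects are sent back to the client.
    """
    url = f"{api_server.rstrip('/')}/{path.lstrip('/')}"
    if label_selector:
        # urlencode percent-escapes "/" and "=" inside the selector value.
        url += "?" + urlencode({"labelSelector": label_selector})
    return url


if __name__ == "__main__":
    # Unfiltered: all pods in the cluster come back and are inspected client-side.
    print(list_url("https://k8s.example", "/api/v1/pods"))
    # Filtered: only pods carrying the (hypothetical) opt-in label are returned.
    print(list_url("https://k8s.example", "/api/v1/pods", OPT_IN_SELECTOR))
```

The trade-off is the one raised in this thread: a selector reduces the data transferred per run, but it requires every resource that should be cleaned up to carry the extra label in addition to its TTL annotation.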