hjacobs / kube-janitor

Clean up (delete) Kubernetes resources after a configured TTL (time to live)
GNU General Public License v3.0

Resources not being deleted #51

Closed. diogouchoas closed this issue 4 years ago

diogouchoas commented 4 years ago

I'm having a problem with kube-janitor: the debug log reports that a resource was deleted successfully:

2020-01-02 20:01:38,929 DEBUG: Rule backtest-configmaps with JMESPath "starts_with(metadata.name, 'backtest-params-')" evaluated for ConfigMap backtest/backtest-params-536746: True
2020-01-02 20:01:38,929 DEBUG: Rule backtest-configmaps applies 48h TTL to ConfigMap backtest/backtest-params-536746
2020-01-02 20:01:38,929 DEBUG: ConfigMap backtest-params-536746 with 48h TTL is 2d19h34m29s old
2020-01-02 20:01:38,929 INFO: ConfigMap backtest-params-536746 with 48h TTL is 2d19h34m29s old and will be deleted (rule backtest-configmaps matches)
2020-01-02 20:01:38,934 DEBUG: https://172.20.0.1:443 "POST /api/v1/namespaces/backtest/events HTTP/1.1" 201 867
2020-01-02 20:01:38,934 INFO: Deleting ConfigMap backtest/backtest-params-536746..
2020-01-02 20:01:38,941 DEBUG: https://172.20.0.1:443 "DELETE /api/v1/namespaces/backtest/configmaps/backtest-params-536746 HTTP/1.1" 200 None

but the resource doesn't actually get deleted. Any idea what might be happening?
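
The TTL comparison reported in the log lines above can be sketched roughly as follows. This is only an illustration, not kube-janitor's actual code: the helper names (parse_ttl, parse_age, is_expired) and the unit table are assumptions made for the example.

```python
import re

# Assumed unit table for this sketch; not taken from kube-janitor's source.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_ttl(ttl: str) -> int:
    """Convert a single-unit TTL such as '48h' to seconds."""
    match = re.fullmatch(r"(\d+)([smhdw])", ttl)
    if not match:
        raise ValueError(f"unsupported TTL: {ttl!r}")
    value, unit = match.groups()
    return int(value) * UNIT_SECONDS[unit]


def parse_age(age: str) -> int:
    """Convert a compound age such as '2d19h34m29s' to seconds."""
    parts = re.findall(r"(\d+)([smhdw])", age)
    return sum(int(value) * UNIT_SECONDS[unit] for value, unit in parts)


def is_expired(age: str, ttl: str) -> bool:
    """A resource is due for deletion once its age exceeds its TTL."""
    return parse_age(age) > parse_ttl(ttl)
```

With the values from the log, is_expired("2d19h34m29s", "48h") is true, which agrees with the "will be deleted" message.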

Rules file:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-janitor
  namespace: backtest
data:
  rules.yaml: |-
    rules:
      - id: backtest-jobs
        resources:
          - jobs
        jmespath: "starts_with(metadata.name, 'backtest-')"
        ttl: 48h
      - id: backtest-configmaps
        resources:
          - configmaps
        jmespath: "starts_with(metadata.name, 'backtest-params-')"
        ttl: 48h
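
To make clear which rule applies to which resource, here is a minimal sketch of the matching logic. It uses Python's str.startswith as a stand-in for the JMESPath starts_with expressions in the rules, so it is not how kube-janitor evaluates them. RULES, the "prefix" key, and matching_rule are hypothetical names used only for this illustration.

```python
# Simplified copy of the two rules above; "prefix" replaces the JMESPath
# expression starts_with(metadata.name, '<prefix>') for this sketch only.
RULES = [
    {"id": "backtest-jobs", "resources": ["jobs"],
     "prefix": "backtest-", "ttl": "48h"},
    {"id": "backtest-configmaps", "resources": ["configmaps"],
     "prefix": "backtest-params-", "ttl": "48h"},
]


def matching_rule(resource_type, resource):
    """Return the first rule whose resource type and name prefix both match."""
    name = resource["metadata"]["name"]
    for rule in RULES:
        if resource_type in rule["resources"] and name.startswith(rule["prefix"]):
            return rule
    return None
```

For the ConfigMap in the log, matching_rule("configmaps", {"metadata": {"name": "backtest-params-536746"}}) returns the backtest-configmaps rule with its 48h TTL.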

hjacobs commented 4 years ago

Looks good to me. Can you check what the ConfigMap status says (kubectl get cm backtest-params-.. -n backtest -o yaml)?
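
As a rough sketch of what to look for in that output: an object that the API server has accepted for deletion but not yet removed carries metadata.deletionTimestamp, and any entries in metadata.finalizers keep it around until they are cleared. The helper below is hypothetical (deletion_state is not part of kube-janitor or kubectl) and assumes the manifest was fetched with -o json instead of -o yaml, so that the standard library can parse it.

```python
import json


def deletion_state(manifest_json: str) -> str:
    """Summarize whether a Kubernetes object is pending deletion."""
    metadata = json.loads(manifest_json).get("metadata", {})
    if "deletionTimestamp" not in metadata:
        return "not marked for deletion"
    finalizers = metadata.get("finalizers", [])
    if finalizers:
        # The object stays visible until every finalizer has been removed.
        return "terminating, waiting on finalizers: " + ", ".join(finalizers)
    return "terminating"
```

If the object reports "not marked for deletion" even though the DELETE call returned 200, the object was probably recreated or the lookup targeted a different name or namespace.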

diogouchoas commented 4 years ago

@hjacobs, after a few hours the resources ended up being deleted correctly. I think that, because of the number of resources deleted in the first run (more than 30K ConfigMaps and Jobs), our Kubernetes cluster took a while to process all the API calls. Thank you very much for your help.