lwolf / kube-cleanup-operator

Kubernetes Operator to automatically delete completed Jobs and their Pods
MIT License

Environment variable --keep-successful doesn't work #20

Closed kurkop closed 5 years ago

kurkop commented 6 years ago

I'm testing this project with Kubernetes v1.10.6-gke.2. When I use --keep-successful=0 it works, but when I use --keep-successful=1 it only works on the first run.

Note: I changed this flag by editing the deployment with kubectl edit deploy cleanup-operator and adding --keep-successful=1 to the args.
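
For context, the change amounts to opening the deployment with kubectl edit deploy cleanup-operator and adjusting the container args, along these lines (a minimal sketch; the value shown is illustrative):

      containers:
      - args:
        - --keep-successful=1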

lwolf commented 6 years ago

Hi, I'm not sure I understand what you mean by "when I use --keep-successful=1 it only works on the first run". Could you please describe it in more detail? Something like:

gammore commented 5 years ago

Hi!

I'm having the same issue. I made a little change to the deployment YAML, since I'm more interested in a "Role" than a "ClusterRole". Let me explain how to reproduce:

deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    run: cleanup-operator
  name: cleanup-operator
  namespace: <NAMESPACE>
spec:
  replicas: 1
  selector:
    matchLabels:
      run: cleanup-operator
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        run: cleanup-operator
    spec:
      serviceAccountName: cleanup-operator
      containers:
      - args:
        - -namespace=<NAMESPACE>
        - -keep-failures=1
        - -keep-successful=1
        - -v=99
        image: quay.io/lwolf/kube-cleanup-operator
        imagePullPolicy: Always
        name: cleanup-operator
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      terminationGracePeriodSeconds: 30

RBAC yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: cleanup-operator
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: cleanup-operator
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - list
  - watch
  - delete
- apiGroups: ["batch", "extensions"]
  resources:
  - jobs
  verbs:
  - delete
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: cleanup-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cleanup-operator
subjects:
- kind: ServiceAccount
  name: cleanup-operator
  namespace: <NAMESPACE>
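
Both manifests can be applied into the target namespace with something like this (the file names are hypothetical):

kubectl apply -n <NAMESPACE> -f rbac.yaml -f deployment.yaml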

I deploy the RBAC and the deployment with no problem; there are no permission issues. The starting log trace is:

➜  ~ k logs cleanup-operator-5ddd886565-22mvh
2019/03/08 14:58:49 Configured namespace: '<NAMESPACE>', keepSuccessHours: 1, keepFailedHours: 1
2019/03/08 14:58:49 Starting controller...
2019/03/08 14:58:49 Listening for changes...
2019/03/08 14:58:49 Pod cleanup-operator-5ddd886565-22mvh was not created by a job... ignoring

Then I deploy the "pi" job as an example:

2019/03/08 15:01:07 Checking pod pi-jmvc2 with Pending status that was executed 0.000000 hours ago
2019/03/08 15:01:07 Checking pod pi-jmvc2 with Pending status that was executed 0.000000 hours ago
2019/03/08 15:01:07 Checking pod pi-jmvc2 with Pending status that was executed 0.000000 hours ago
2019/03/08 15:01:11 Checking pod pi-jmvc2 with Running status that was executed 0.000000 hours ago
2019/03/08 15:01:16 Checking pod pi-jmvc2 with Succeeded status that was executed 0.000000 hours ago
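
For reference, the "pi" job is presumably the standard example from the Kubernetes docs, roughly:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4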

Then I wait an hour (since I specified keep-successful=1), but nothing happens. The log doesn't show anything other than:

I0308 14:51:47.668392       1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:52:17.668673       1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:52:47.669095       1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:53:17.669310       1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:53:47.670445       1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:54:17.670712       1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:54:47.671161       1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:55:17.673313       1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:55:32.711115       1 reflector.go:405] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: Watch close - *v1.Pod total 5 items received
I0308 14:55:32.712147       1 round_trippers.go:386] curl -k -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kube-cleanup-operator/v1.7.3 (linux/amd64) kubernetes/$Format" -H "Authorization: Bearer <TOKEN>" https://<AKS_CLUSTER_URL>:443/api/v1/namespaces/<NAMESPACE>/pods?resourceVersion=3434496&timeoutSeconds=365&watch=true
I0308 14:55:32.721589       1 round_trippers.go:405] GET https://<AKS_CLUSTER_URL>:443/api/v1/namespaces/<NAMESPACE>/pods?resourceVersion=3434496&timeoutSeconds=365&watch=true 200 OK in 9 milliseconds
I0308 14:55:32.721606       1 round_trippers.go:411] Response Headers:
I0308 14:55:32.721610       1 round_trippers.go:414]     Content-Type: application/json
I0308 14:55:32.721613       1 round_trippers.go:414]     Date: Fri, 08 Mar 2019 14:55:32 GMT
I0308 14:55:47.673573       1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:56:17.674585       1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync

Over and over again.

If I recreate the operator pod after that, the jobs older than 1 hour are deleted with no problem.
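
Recreating the pod here just means deleting it and letting the Deployment spin up a replacement, using the run=cleanup-operator label from the manifest above:

kubectl delete pod -n <NAMESPACE> -l run=cleanup-operator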

If you need more info please tell me.

lwolf commented 5 years ago

@AugerC thanks for the detailed report. Could you also specify which k8s version you use?

gammore commented 5 years ago

@lwolf Of course, sorry for not including that:

➜  ~ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.6", GitTreeState:"clean", BuildDate:"2018-12-16T04:30:10Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

lwolf commented 5 years ago

thanks, I can reproduce it now.

lwolf commented 5 years ago

@AugerC this should be fixed in the latest 0.5.0 release. Could you please check?
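
To pick up the new release, updating the image tag on the existing deployment should be enough, e.g. (assuming the release is tagged v0.5.0 on quay.io):

kubectl set image -n <NAMESPACE> deploy/cleanup-operator cleanup-operator=quay.io/lwolf/kube-cleanup-operator:v0.5.0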

gammore commented 5 years ago

Works like a charm, thanks!