Hi, I'm not sure I understand what you mean by "when I use --keep-successful=1 it only works on the first run." Could you please describe it in more detail, with something like steps to reproduce?
Hi!
I'm having the same issue. I made a little change to the deployment yaml since I'm more interested in a "Role" than a "ClusterRole". Let me explain how to reproduce:
deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    run: cleanup-operator
  name: cleanup-operator
  namespace: <NAMESPACE>
spec:
  replicas: 1
  selector:
    matchLabels:
      run: cleanup-operator
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        run: cleanup-operator
    spec:
      serviceAccountName: cleanup-operator
      containers:
      - args:
        - -namespace=<NAMESPACE>
        - -keep-failures=1
        - -keep-successful=1
        - -v=99
        image: quay.io/lwolf/kube-cleanup-operator
        imagePullPolicy: Always
        name: cleanup-operator
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
RBAC yaml:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cleanup-operator
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: cleanup-operator
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - list
  - watch
  - delete
- apiGroups: ["batch", "extensions"]
  resources:
  - jobs
  verbs:
  - delete
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: cleanup-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cleanup-operator
subjects:
- kind: ServiceAccount
  name: cleanup-operator
  namespace: <NAMESPACE>
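For completeness, this is roughly how I apply both manifests; the file names are just my local names:

# file names are assumptions; apply the RBAC objects first, then the deployment
kubectl apply -f cleanup-operator-rbac.yaml -n <NAMESPACE>
kubectl apply -f cleanup-operator-deployment.yaml -n <NAMESPACE>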
I deploy the RBAC and the deployment without any problems; there are no permission issues. The startup log is:
➜ ~ k logs cleanup-operator-5ddd886565-22mvh
2019/03/08 14:58:49 Configured namespace: '<NAMESPACE>', keepSuccessHours: 1, keepFailedHours: 1
2019/03/08 14:58:49 Starting controller...
2019/03/08 14:58:49 Listening for changes...
2019/03/08 14:58:49 Pod cleanup-operator-5ddd886565-22mvh was not created by a job... ignoring
Then I deploy the "pi" job as an example; a sketch of that Job is shown just below.
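The manifest is essentially the standard pi example Job from the Kubernetes documentation (the exact Job I used may differ slightly, and the namespace placeholder is mine):

# assumed manifest: the standard pi example Job, with the namespace added
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
  namespace: <NAMESPACE>
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4

Once the Job is submitted, the operator logs: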
2019/03/08 15:01:07 Checking pod pi-jmvc2 with Pending status that was executed 0.000000 hours ago
2019/03/08 15:01:07 Checking pod pi-jmvc2 with Pending status that was executed 0.000000 hours ago
2019/03/08 15:01:07 Checking pod pi-jmvc2 with Pending status that was executed 0.000000 hours ago
2019/03/08 15:01:11 Checking pod pi-jmvc2 with Running status that was executed 0.000000 hours ago
2019/03/08 15:01:16 Checking pod pi-jmvc2 with Succeeded status that was executed 0.000000 hours ago
Then I wait an hour (since I specified keep-successful=1), but nothing happens. The log doesn't show anything other than:
I0308 14:51:47.668392 1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:52:17.668673 1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:52:47.669095 1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:53:17.669310 1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:53:47.670445 1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:54:17.670712 1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:54:47.671161 1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:55:17.673313 1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:55:32.711115 1 reflector.go:405] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: Watch close - *v1.Pod total 5 items received
I0308 14:55:32.712147 1 round_trippers.go:386] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kube-cleanup-operator/v1.7.3 (linux/amd64) kubernetes/$Format" -H "Authorization: Bearer <TOKEN>" https://<AKS_CLUSTER_URL>:443/api/v1/namespaces/<NAMESPACE>/pods?resourceVersion=3434496&timeoutSeconds=365&watch=true
I0308 14:55:32.721589 1 round_trippers.go:405] GET https://<AKS_CLUSTER_URL>:443/api/v1/namespaces/<NAMESPACE>/pods?resourceVersion=3434496&timeoutSeconds=365&watch=true 200 OK in 9 milliseconds
I0308 14:55:32.721606 1 round_trippers.go:411] Response Headers:
I0308 14:55:32.721610 1 round_trippers.go:414] Content-Type: application/json
I0308 14:55:32.721613 1 round_trippers.go:414] Date: Fri, 08 Mar 2019 14:55:32 GMT
I0308 14:55:47.673573 1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
I0308 14:56:17.674585 1 reflector.go:276] github.com/lwolf/kube-cleanup-operator/pkg/controller/controller.go:97: forcing resync
Over and over again.
If I recreate the operator pod after that, the jobs older than 1 hour get deleted without a problem.
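(By recreating the pod I mean deleting the operator pod and letting the Deployment bring it back up, roughly:)

# pod name taken from the logs above
kubectl delete pod cleanup-operator-5ddd886565-22mvh -n <NAMESPACE>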
If you need more info, please tell me.
@AugerC thanks for the detailed report. Could you also specify which k8s version you use?
@lwolf Of course, sorry for not including that:
➜ ~ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.6", GitTreeState:"clean", BuildDate:"2018-12-16T04:30:10Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Thanks, I can reproduce it now.
@AugerC this should be fixed in the latest 0.5.0 release. Could you please check?
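One way to pick up the new release would be something like the following; the image tag is an assumption on my part, so check the releases page for the exact tag:

# assumes the 0.5.0 image is tagged v0.5.0 on quay.io
kubectl set image deployment/cleanup-operator \
  cleanup-operator=quay.io/lwolf/kube-cleanup-operator:v0.5.0 -n <NAMESPACE>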
Works like a charm, thanks!
I'm testing this project with Kubernetes v1.10.6-gke.2. When I use --keep-successful=0 it works, but when I use --keep-successful=1 it only works on the first run.
Note: I changed this value by editing the deployment with
kubectl edit deploy cleanup-operator
and adding --keep-successful=1 to the container args.
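After the edit, the relevant part of the container spec looks roughly like this (trimmed to the lines I touched; other fields unchanged):

      containers:
      - name: cleanup-operator
        image: quay.io/lwolf/kube-cleanup-operator
        args:
        # flag I am testing; cleanup only happens on the first run
        - -keep-successful=1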