Closed ratnadeep007 closed 4 years ago
Need more context here: deploy manifests for the operator, and an example of the job and pod.
Deploy manifests: I ran the given kubectl command to deploy the operator in the cluster; the default deploy manifests are used.
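For reference, the deployment was done roughly like this (a sketch only; the exact manifest filenames under the repo's deploy directory are an assumption):
# illustrative commands - actual file names/paths in the repo may differ
kubectl apply -f deploy/rbac.yaml
kubectl apply -f deploy/deployment.yaml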
Job Manifest
apiVersion: batch/v1
kind: Job
metadata:
  name: django-migrate
  labels:
    app: django-migrate
spec:
  backoffLimit: 4
  template:
    metadata:
      labels:
        app: django-migrate
    spec:
      restartPolicy: Never
      containers:
      - name: django-migrate
        image: django-migrate-image # from private repo
        ports:
        - containerPort: 3000
        resources:
          requests:
            memory: 1800Mi
          limits:
            memory: 1800Mi
        args:
        - python
        - manage.py
        - migrate
        - --noinput
please attach
kubectl get job django-migrate -o yaml > job.yaml
kubectl get pod django-migrate-<POD_ID> -o yaml > pod.yaml
kubectl get job django-migrate -o yaml > job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  annotations: <annotations>
  creationTimestamp: "2020-05-18T13:11:29Z"
  labels:
    app: django-migrate
    app.kubernetes.io/managed-by: skaffold-v1.7.0
    skaffold.dev/builder: local
    skaffold.dev/cleanup: "true"
    skaffold.dev/deployer: kubectl
    skaffold.dev/docker-api-version: "1.40"
    skaffold.dev/profile.0: dev
    skaffold.dev/run-id: 685c60f0-25a3-4ad4-8b12-2e149b82cc0d
    skaffold.dev/tag-policy: git-commit
    skaffold.dev/tail: "true"
  name: django-migrate
  namespace: default
  resourceVersion: "442017"
  selfLink: /apis/batch/v1/namespaces/default/jobs/django-migrate
  uid: 80da5af6-8b5f-44ba-b285-ed87a1f87139
spec:
  backoffLimit: 4
  completions: 1
  parallelism: 1
  selector:
    matchLabels:
      controller-uid: 80da5af6-8b5f-44ba-b285-ed87a1f87139
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: django-migrate
        app.kubernetes.io/managed-by: skaffold-v1.7.0
        controller-uid: 80da5af6-8b5f-44ba-b285-ed87a1f87139
        job-name: django-migrate
        skaffold.dev/builder: local
        skaffold.dev/cleanup: "true"
        skaffold.dev/deployer: kubectl
        skaffold.dev/docker-api-version: "1.40"
        skaffold.dev/profile.0: dev
        skaffold.dev/run-id: 685c60f0-25a3-4ad4-8b12-2e149b82cc0d
        skaffold.dev/tag-policy: git-commit
        skaffold.dev/tail: "true"
    spec:
      containers:
      - args:
        - python
        - manage.py
        - migrate
        - --noinput
        env:
        - name: DATABASE_URL
          valueFrom:
            configMapKeyRef:
              key: database_url
              name: django-config
        image: <ecr_repo_link>/django_migrate:latest
        imagePullPolicy: IfNotPresent
        name: django
        ports:
        - containerPort: 3000
          protocol: TCP
        resources:
          limits:
            memory: 1000Mi
          requests:
            memory: 1000Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Never
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  completionTime: "2020-05-18T13:11:56Z"
  conditions:
  - lastProbeTime: "2020-05-18T13:11:56Z"
    lastTransitionTime: "2020-05-18T13:11:56Z"
    status: "True"
    type: Complete
  startTime: "2020-05-18T13:11:29Z"
  succeeded: 1
kubectl get pod django-migrate-<POD_ID> -o yaml > pod.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/psp: eks.privileged
  creationTimestamp: "2020-05-18T13:11:29Z"
  generateName: django-migrate-
  labels:
    app: django-migrate
    app.kubernetes.io/managed-by: skaffold-v1.7.0
    controller-uid: 80da5af6-8b5f-44ba-b285-ed87a1f87139
    job-name: django-migrate
    skaffold.dev/builder: local
    skaffold.dev/cleanup: "true"
    skaffold.dev/deployer: kubectl
    skaffold.dev/docker-api-version: "1.40"
    skaffold.dev/profile.0: dev
    skaffold.dev/run-id: 685c60f0-25a3-4ad4-8b12-2e149b82cc0d
    skaffold.dev/tag-policy: git-commit
    skaffold.dev/tail: "true"
  name: django-migrate-ppw78
  namespace: default
  ownerReferences:
  - apiVersion: batch/v1
    blockOwnerDeletion: true
    controller: true
    kind: Job
    name: django-migrate
    uid: 80da5af6-8b5f-44ba-b285-ed87a1f87139
  resourceVersion: "442016"
  selfLink: /api/v1/namespaces/default/pods/django-migrate-ppw78
  uid: 25ae3126-a16c-462a-aae8-4b47f88ec8ac
spec:
  containers:
  - args:
    - python
    - manage.py
    - migrate
    - --noinput
    env:
    - name: DATABASE_URL
      valueFrom:
        configMapKeyRef:
          key: database_url
          name: django-config
    image: <ecr_repo_link>/django_migrate:latest
    imagePullPolicy: IfNotPresent
    name: django
    ports:
    - containerPort: 3000
      protocol: TCP
    resources:
      limits:
        memory: 1000Mi
      requests:
        memory: 1000Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-sc5hv
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: ip-192-168-17-65.ap-south-1.compute.internal
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: default-token-sc5hv
    secret:
      defaultMode: 420
      secretName: default-token-sc5hv
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-05-18T13:11:29Z"
    reason: PodCompleted
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-05-18T13:11:56Z"
    reason: PodCompleted
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-05-18T13:11:56Z"
    reason: PodCompleted
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-05-18T13:11:29Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://5123b39ea7484d3c9c648dfa84dde30bec092b595eaeabc1104ae16a0d245d6f
    image: <ecr_repo_link>/django_migrate:latest
    imageID: docker-pullable://<ecr_repo_link>/django_migrate:latest
    lastState: {}
    name: django
    ready: false
    restartCount: 0
    state:
      terminated:
        containerID: docker://5123b39ea7484d3c9c648dfa84dde30bec092b595eaeabc1104ae16a0d245d6f
        exitCode: 0
        finishedAt: "2020-05-18T13:11:56Z"
        reason: Completed
        startedAt: "2020-05-18T13:11:50Z"
  hostIP: 192.168.17.65
  phase: Succeeded
  podIP: 192.168.15.73
  qosClass: Burstable
  startTime: "2020-05-18T13:11:29Z"
Could you try the new 0.7 version and let me know if you still experience this issue? https://github.com/lwolf/kube-cleanup-operator/releases/tag/v0.7.0
It now has separate loops for jobs and pods to make sure that everything gets cleaned properly.
Make sure to run it with -legacy-mode=false to be able to use this feature.
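A minimal sketch of the flags, assuming the operator binary is invoked directly (the binary name and the values other than -legacy-mode=false are only illustrative):
# sketch: -legacy-mode=false enables the separate cleanup loops for jobs and pods
kube-cleanup-operator -legacy-mode=false -namespace=default -delete-successful-after=0s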
I tried this on an EKS cluster where I have to deploy jobs, using the example job from the documentation. It is still not working.
rbac.yml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cleanup-operator
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cleanup-operator
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - list
  - watch
  - delete
- apiGroups: ["batch", "extensions"]
  resources:
  - jobs
  verbs:
  - delete
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cleanup-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cleanup-operator
subjects:
- kind: ServiceAccount
  name: cleanup-operator
  namespace: default
deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: cleanup-operator
  name: cleanup-operator
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      run: cleanup-operator
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        run: cleanup-operator
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "7000"
    spec:
      serviceAccountName: cleanup-operator
      containers:
      - args:
        - --namespace=default
        - --legacy-mode=false
        - --delete-successful-after=0s
        image: quay.io/lwolf/kube-cleanup-operator
        imagePullPolicy: Always
        name: cleanup-operator
        ports:
        - containerPort: 7000
        resources:
          requests:
            cpu: 50m
            memory: 50Mi
          limits:
            cpu: 50m
            memory: 50Mi
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
Logs from pod
2020/06/05 12:43:21 Starting the application.
2020/06/05 12:43:21 Provided options:
namespace: default
dry-run: false
delete-successful-after: 0s
delete-failed-after: 0s
delete-pending-after: 0s
delete-orphaned-after: 1h0m0s
delete-evicted-after: 15m0s
legacy-mode: false
keep-successful: 0
keep-failures: -1
keep-pending: -1
W0605 12:43:21.975384 1 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2020/06/05 12:43:21 Controller started...
2020/06/05 12:43:21 Listening at 0.0.0.0:7000
2020/06/05 12:43:21 Listening for changes...
Brief
kube-cleanup-operator deletes the pods created by jobs but is unable to delete the job itself. Logs show:

Expected Behavior
Delete both the job and its pods.

More context
Managed Kubernetes: Yes (EKS on AWS)
Kubernetes Version: 1.15
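A quick way to check the observed behaviour (commands are illustrative; the namespace and label come from the manifests above):
# after waiting past delete-successful-after, the pod is gone but the job object remains
kubectl get pods -n default -l job-name=django-migrate
kubectl get jobs -n default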