Open eamansour opened 1 month ago
Looked into the etcd pod for one of the above runs and there are some DSS properties still set for the run, which is preventing the pod from being deleted. Finished runs should have all of their properties removed from the DSS - possibly something going wrong in the test runner or resource monitor?
/ # etcdctl get --prefix dss.framework.run.C11032
dss.framework.run.C11032.allocate.timeout
2024-09-20T06:15:18.046583935Z
dss.framework.run.C11032.allocated
2024-09-20T06:00:18.046583935Z
dss.framework.run.C11032.controller
k8s-controller
dss.framework.run.C11032.finished
2024-09-20T06:00:45.317826407Z
dss.framework.run.C11032.group
e92948b1-5237-4a2a-bd0d-054b2d22dc76
dss.framework.run.C11032.local
false
dss.framework.run.C11032.queued
2024-09-20T06:00:10.495191201Z
dss.framework.run.C11032.rasrunid
cdb-88279a5fdef52dd33f48f5f350c33263
dss.framework.run.C11032.request.type
CLI
dss.framework.run.C11032.requestor
galasadelivery@ibm.com
dss.framework.run.C11032.result
EnvFail
dss.framework.run.C11032.started
2024-09-20T06:00:35.036521921Z
dss.framework.run.C11032.status
finished
dss.framework.run.C11032.stream
inttests
dss.framework.run.C11032.test
dev.galasa.inttests/dev.galasa.inttests.sdv.local.isolated.SDVLocalJava11UbuntuIsolated
dss.framework.run.C11032.testbundle
dev.galasa.inttests
dss.framework.run.C11032.testclass
dev.galasa.inttests.sdv.local.isolated.SDVLocalJava11UbuntuIsolated
dss.framework.run.C11032.trace
true
In the meantime, I've cleared the properties hanging around from old runs from the DSS. Test pods are now being cleaned up properly and nothing is being left behind in the DSS. Will keep monitoring in case this happens again.
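If this happens again, a quick way to spot leftover properties is to parse the etcdctl dump and list the runs that are already marked as finished. The following is only a rough sketch, not part of Galasa: the function names are made up for illustration. It assumes the input is the raw output of etcdctl get --prefix dss.framework.run (key and value on alternating lines, without the shell prompt line) and that no property has an empty value.

```python
from collections import defaultdict

# Key prefix taken from the dump above.
RUN_PREFIX = "dss.framework.run."


def parse_etcdctl_output(text):
    """Group alternating key/value lines into {run_name: {property: value}}.

    Assumes every key is followed by exactly one non-empty value line.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    runs = defaultdict(dict)
    for key, value in zip(lines[0::2], lines[1::2]):
        if not key.startswith(RUN_PREFIX):
            continue
        # "dss.framework.run.C11032.allocate.timeout" -> ("C11032", "allocate.timeout")
        run_name, _, prop = key[len(RUN_PREFIX):].partition(".")
        runs[run_name][prop] = value
    return dict(runs)


def finished_runs(runs):
    """Return the names of runs whose status property is still 'finished'."""
    return sorted(
        name for name, props in runs.items() if props.get("status") == "finished"
    )


# A few of the properties from the dump above, used as sample input.
SAMPLE = """\
dss.framework.run.C11032.allocate.timeout
2024-09-20T06:15:18.046583935Z
dss.framework.run.C11032.result
EnvFail
dss.framework.run.C11032.status
finished
"""

if __name__ == "__main__":
    for run in finished_runs(parse_etcdctl_output(SAMPLE)):
        print(f"{run} is finished but still has properties in the DSS")
```

Any run reported here has finished but not yet been cleaned out of the DSS. A run that finished only moments ago may show up briefly before the normal cleanup gets to it, so it is the ones that persist that are worth a look.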
Describe the bug
Running kubectl get pods in the galasa-dev k8s namespace where prod1 lives shows a lot of test pods that are in the Completed state but aren't being cleaned up by the resource monitor. Restarting the resource monitor doesn't seem to help.
Steps to reproduce
kubectl get pods -n galasa-dev
Expected behavior
Finished tests should be cleaned up and the pods should not exist.
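To check the expected behavior, the output of the kubectl get pods command above can be filtered for pods still in the Completed state. This is a minimal sketch that assumes the default table output (NAME, READY, STATUS, RESTARTS, AGE columns). The function name and the pod names in the sample are invented for illustration and are not taken from the real namespace.

```python
def completed_pods(kubectl_output):
    """Return the names of pods whose STATUS column reads 'Completed'.

    Expects the default table printed by `kubectl get pods`, header included.
    """
    names = []
    for line in kubectl_output.splitlines()[1:]:  # skip the header row
        columns = line.split()
        # Column order in the default output: NAME READY STATUS RESTARTS AGE
        if len(columns) >= 3 and columns[2] == "Completed":
            names.append(columns[0])
    return names


# Hypothetical sample output; these pod names are made up.
SAMPLE_PODS = """\
NAME                 READY   STATUS      RESTARTS   AGE
example-test-pod-a   0/1     Completed   0          3h
example-test-pod-b   1/1     Running     0          2m
"""

if __name__ == "__main__":
    leftover = completed_pods(SAMPLE_PODS)
    print(f"{len(leftover)} completed pod(s) not yet cleaned up: {leftover}")
```

With cleanup working, the list should stay empty or drain shortly after each run finishes.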