Closed tiborsimko closed 5 years ago
See also:
Once we have these, we should be able to remove `Completed` workflow run pods as in https://github.com/reanahub/reana-job-controller/issues/101.
What about using https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/#ttl-mechanism-for-finished-jobs ?
To use this feature we would need to upgrade Kubernetes to v1.12
The CERN cluster is on Kubernetes v1.12 since December 2018 so it should be possible to upgrade indeed...
It would also be good to check whether `restartPolicy` is `Never` for workflow run jobs while we are at it. A few days ago on my box, when I wanted to fully stop the cluster components and kill any running workflows, I saw that some workflow runtime pods were surviving and restarting themselves...
`ttl_seconds_after_finished=200` works well, but as this feature is still alpha, it needs to be enabled manually by adding the `--feature-gates` flag:
minikube start --kubernetes-version="v1.12.0" --vm-driver=hyperkit --feature-gates="TTLAfterFinished=true"
I can confirm `restartPolicy` is set to `Never` and seems to work fine on my local cluster.
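For reference, here is a minimal sketch of how the two settings discussed above fit together in a Job manifest. It builds the manifest as a plain dict with only the standard library. The function name, job name, image, and command are hypothetical placeholders, not what REANA actually submits; `ttlSecondsAfterFinished` is the manifest-level spelling of the Python client's `ttl_seconds_after_finished`.

```python
import json


def build_workflow_job_manifest(name, image, command, ttl_seconds=200):
    """Return a batch/v1 Job manifest as a plain dict.

    Hypothetical helper for illustration only; name, image and command
    are placeholders chosen by the caller.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Alpha in Kubernetes v1.12: only honoured when the cluster
            # runs with the TTLAfterFinished feature gate enabled.
            "ttlSecondsAfterFinished": ttl_seconds,
            "template": {
                "spec": {
                    # Do not restart containers of a finished or failed pod.
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "command": list(command),
                        }
                    ],
                }
            },
        },
    }


manifest = build_workflow_job_manifest(
    "workflow-run-example", "busybox", ["echo", "done"]
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be piped to `kubectl apply -f -` on a cluster started with the feature gate; without the gate, the TTL field is expected to be ignored.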
Adding a single required flag is OK; if we need more flags, so that the command line becomes too long, we can provide a helper wrapper like:
$ reana-dev minikube-start
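A rough sketch of what such a wrapper might do internally, assuming it simply assembles the `minikube start` invocation quoted earlier in this thread. The function name and its defaults are hypothetical and do not reflect an existing `reana-dev` implementation.

```python
import shlex


def build_minikube_start_command(
    kubernetes_version="v1.12.0", vm_driver="hyperkit", feature_gates=None
):
    """Assemble the argument list for `minikube start`.

    Hypothetical helper: defaults mirror the command quoted in this thread.
    """
    if feature_gates is None:
        feature_gates = {"TTLAfterFinished": True}
    command = [
        "minikube",
        "start",
        f"--kubernetes-version={kubernetes_version}",
        f"--vm-driver={vm_driver}",
    ]
    if feature_gates:
        # Feature gates are passed as a comma-separated key=value list.
        gates = ",".join(
            f"{key}={str(value).lower()}"
            for key, value in sorted(feature_gates.items())
        )
        command.append(f"--feature-gates={gates}")
    return command


command = build_minikube_start_command()
# Print a copy-pasteable shell line; subprocess.run(command) would execute it.
print(" ".join(shlex.quote(arg) for arg in command))
```

Keeping the flags in one function means new required flags can be added in a single place without making the documented command line longer.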
Seeing `Completed` pods again, as in https://github.com/reanahub/reana-job-controller/issues/101. This is good for debugging, but we should remember to clean this up later. (Stemmed from https://github.com/reanahub/reana-workflow-engine-serial/pull/56#issuecomment-449398979)
Note that killing pods requires storing and exposing logs from workflow pods (and job pods) in an accessible place...