Open jkleinlercher opened 7 months ago
Next step: compare how openshift-gitops and VirtualMachine pods behave on OpenShift.
Same situation on OpenShift when the VM is applied via GitOps.
investigating...
OpenShift: because of the live migration messages in the condition, changing the HCO settings evictionStrategy and workloadUpdateStrategy ( https://github.com/kubevirt/hyperconverged-cluster-operator/blob/main/docs/cluster-configuration.md )
still progressing
https://github.com/argoproj/argo-cd/issues/7175 -> code merged, no solution
looks like we have the same behaviour described in https://github.com/argoproj/argo-cd/issues/15317
Also, pods created from a KubeVirt VirtualMachineInstance have "restartPolicy: Never" defined. I don't know whether we can change that or what the consequences would be. It seems to be hardcoded in https://github.com/kubevirt/kubevirt/blob/ea53cc9d444227a033c55d521979e6ccc688456f/pkg/virt-controller/services/template.go#L583
kubectl get pods -n qa-demo-kubevirt -o yaml | grep restart
restartPolicy: Never
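The grep above drops the pod names, so it does not show which pods carry the policy. As a small sketch (not part of the original investigation), the same check could be done on the JSON output of `kubectl get pods -n qa-demo-kubevirt -o json`. The function name `restart_policies` is hypothetical, and the sample document below is hand-written for illustration, not real cluster output.

```python
import json
from typing import Dict


def restart_policies(pod_list_json: str) -> Dict[str, str]:
    """Map pod name -> restartPolicy from `kubectl get pods -o json` output."""
    items = json.loads(pod_list_json).get("items", [])
    return {
        # The API server defaults restartPolicy to Always when it is unset.
        item["metadata"]["name"]: item["spec"].get("restartPolicy", "Always")
        for item in items
    }


# Hand-written sample, shaped like a kubectl List document (illustrative only).
sample = json.dumps({
    "kind": "List",
    "items": [
        {"metadata": {"name": "virt-launcher-m-zktvn"},
         "spec": {"restartPolicy": "Never"}},
    ],
})

for name, policy in restart_policies(sample).items():
    print(f"{name}: {policy}")
```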
With that said, as long as the application state is healthy, maybe this issue is not that important?
opened kubevirt issue https://github.com/kubevirt/kubevirt/issues/11813
In the meantime we can try to create a Lua health script, but one that is very specific to virt-launcher pods with restartPolicy Never, to prevent any side effects for other pods and jobs.
The problem is that the health logic for pods is quite complex: https://github.com/argoproj/gitops-engine/blob/fbecbb86e41254a75a59943b5eb43ed55d21cdb9/pkg/health/health_pod.go#L29 On Slack I found someone who also tried to add some health logic for a Deployment without rewriting the whole health logic, see https://cloud-native.slack.com/archives/D0720GKMCS1/p1714978483287869 Maybe he has some tips on how to write it.
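To make the idea concrete, here is a rough Python sketch of the Running-phase decision as I read the linked health_pod.go, plus the narrow override for launcher pods suggested above. It is neither the gitops-engine code nor a finished Lua health script. The function names are hypothetical, the Degraded branch and all non-Running phases are left out, and the label `kubevirt.io: virt-launcher` is an assumption about how launcher pods are labelled that should be verified on the cluster.

```python
from typing import Any, Dict

# Assumption: virt-launcher pods carry this label; verify on your cluster.
VIRT_LAUNCHER_LABEL = ("kubevirt.io", "virt-launcher")


def is_ready(pod: Dict[str, Any]) -> bool:
    """True if the pod reports a Ready condition with status "True"."""
    conditions = pod.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


def pod_health(pod: Dict[str, Any]) -> str:
    """Simplified pod health; only the Running phase is modelled."""
    phase = pod.get("status", {}).get("phase")
    policy = pod.get("spec", {}).get("restartPolicy", "Always")
    labels = pod.get("metadata", {}).get("labels", {})

    if phase != "Running":
        # Pending, Succeeded, Failed etc. are out of scope for this sketch.
        return "Unknown"

    key, value = VIRT_LAUNCHER_LABEL
    if policy == "Never" and labels.get(key) == value:
        # Narrow override: a running, ready launcher pod counts as Healthy.
        return "Healthy" if is_ready(pod) else "Progressing"

    if policy == "Always":
        return "Healthy" if is_ready(pod) else "Progressing"

    # Default for OnFailure/Never: running finite-life pods stay Progressing.
    return "Progressing"
```

Keying the override on both the restart policy and the label is meant to leave hook pods and job pods on the default path, which is the side effect the note above wants to avoid.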
virt-launcher pods stay in Progressing: https://argocd-metalstack.platform-engineer.cloud/applications/argocd/m-qa?view=tree&resource=&node=%2FPod%2Fqa-demo-kubevirt%2Fvirt-launcher-m-zktvn%2F0