Open alexec opened 3 years ago
Here is the cause:
func getCorev1PodHealth(pod *corev1.Pod) (*HealthStatus, error) {
// This logic cannot be applied when the pod.Spec.RestartPolicy is: corev1.RestartPolicyOnFailure,
// corev1.RestartPolicyNever, otherwise it breaks the resource hook logic.
// The issue is, if we mark a pod with ImagePullBackOff as Degraded, and the pod is used as a resource hook,
// then we will prematurely fail the PreSync/PostSync hook. Meanwhile, when that error condition is resolved
// (e.g. the image is available), the resource hook pod will unexpectedly be executed even though the sync has
// completed.
Here is the code: https://github.com/argoproj/gitops-engine/blob/master/pkg/health/health_pod.go#L119. The return value is hardcoded there.
A potential fix would be to support an annotation that overrides the default behaviour, so that the intended functionality for jobs and hooks is not broken. Any thoughts?
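To make the annotation idea concrete, here is a rough sketch of what an opt-in override might look like. It is not the gitops-engine implementation. The annotation key `example.com/treat-as-long-running`, the `podInfo` struct (a simplified stand-in for `corev1.Pod`), and the `podHealth` function are all hypothetical names made up for illustration. It also assumes that the hardcoded value for pods with a restart policy of `OnFailure` or `Never` is "Progressing".

```go
package main

// Status strings as they appear in this thread; plain strings here, not the real gitops-engine type.
const (
	HealthHealthy     = "Healthy"
	HealthProgressing = "Progressing"
)

// Hypothetical annotation key; no such annotation is known to exist upstream.
const overrideAnnotation = "example.com/treat-as-long-running"

// podInfo is a simplified, hypothetical stand-in for the corev1.Pod fields that matter here.
type podInfo struct {
	Annotations   map[string]string
	RestartPolicy string // "Always", "OnFailure" or "Never"
	Ready         bool   // true when the pod is running and ready
}

// podHealth sketches the proposed behaviour: pods with a finite life
// (OnFailure/Never) keep the default "Progressing" status, so hooks and jobs
// behave as before, unless the pod explicitly opts in through the annotation.
func podHealth(p podInfo) string {
	finite := p.RestartPolicy == "OnFailure" || p.RestartPolicy == "Never"
	if finite && p.Annotations[overrideAnnotation] != "true" {
		// Assumed default: the hardcoded value for hook-like pods.
		return HealthProgressing
	}
	if p.Ready {
		return HealthHealthy
	}
	return HealthProgressing
}
```

Because the override is opt-in, pods used as PreSync/PostSync hooks would keep their current status unless someone annotates them deliberately.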
My pods stay in Progressing, even though they are ready/running:
Any chance you can take 2 mins to let me know why that might be? It is confusing for me.