Open strowi opened 2 years ago
Hey @strowi 👋 I don't think this is an argocd bug. What version of Prometheus operator are you using? The status was only added in version 0.56 of the operator. If you use an earlier version, the status won't be in the CRD and the health will always be Progressing.
After upgrading to 2.5.0 I've had the same issue. Also noticed the same behaviour on the demo site https://cd.apps.argoproj.io/applications/prometheus-operator?resource=
@joaosilva15 hey, and thank you for the info. That could indeed be the cause: rancher is still using 0.50.x. @drogbeer I think that is also an older version of the prometheus-operator; the code suggests 0.46 to me.
I also faced the issue but after upgrading the monitoring stack from Chart version 34.9.0 to 41.7.4, everything is fine again.
GKE clusters
ArgoCD version v2.5.4
Hey @mkilchhofer, I'm seeing this Progressing health status on the kind: Prometheus object, with this Prometheus image:
quay.io/prometheus/prometheus:v2.40.0
My ArgoCD manages different versions of kube-prometheus-stack; some expose the status, others still do not. I applied the following patch, which lets ArgoCD evaluate their health correctly:
resource.customizations.health.monitoring.coreos.com_Prometheus: |
  if obj.metadata.annotations ~= nil and obj.metadata.annotations["argocd.argoproj.io/skip-health-check"] ~= nil then
    hs = {}
    hs.status = "Healthy"
    hs.message = "Ignoring Prometheus Health Check"
    return hs
  end
  hs={ status = "Progressing", message = "Waiting for initialization" }
  if obj.status ~= nil then
    if obj.status.conditions ~= nil then
      for i, condition in ipairs(obj.status.conditions) do
        if condition.type == "Available" and condition.status ~= "True" then
          if condition.reason == "SomePodsNotReady" then
            hs.status = "Progressing"
          else
            hs.status = "Degraded"
          end
          hs.message = condition.message or condition.reason
        end
        if condition.type == "Available" and condition.status == "True" then
          hs.status = "Healthy"
          hs.message = "All instances are available"
        end
      end
    end
  end
  return hs
I then added the argocd.argoproj.io/skip-health-check annotation to my old Prometheus objects.
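To see which branch a given Prometheus object hits, the health logic from the patch can be exercised outside Argo CD with a plain Lua interpreter. The following is only a sketch under assumptions: the function name `prometheus_health`, the helper `with_available`, and the sample objects (including the reason string "ExampleReason") are hypothetical fixtures, not something taken from Argo CD or the operator.

```lua
-- Sketch only: the health logic from the patch, wrapped in a function so it
-- can be run with a standalone `lua` interpreter instead of inside Argo CD.
-- `prometheus_health` and the sample objects below are hypothetical fixtures.
local SKIP = "argocd.argoproj.io/skip-health-check"

local function prometheus_health(obj)
  -- Annotated objects (e.g. managed by an operator that never writes a
  -- status) are reported Healthy without looking at conditions.
  if obj.metadata.annotations ~= nil and obj.metadata.annotations[SKIP] ~= nil then
    return { status = "Healthy", message = "Ignoring Prometheus Health Check" }
  end
  -- Default result when no Available condition is found.
  local hs = { status = "Progressing", message = "Waiting for initialization" }
  if obj.status ~= nil and obj.status.conditions ~= nil then
    for _, condition in ipairs(obj.status.conditions) do
      if condition.type == "Available" and condition.status ~= "True" then
        if condition.reason == "SomePodsNotReady" then
          hs.status = "Progressing"
        else
          hs.status = "Degraded"
        end
        hs.message = condition.message or condition.reason
      end
      if condition.type == "Available" and condition.status == "True" then
        hs.status = "Healthy"
        hs.message = "All instances are available"
      end
    end
  end
  return hs
end

-- Build a minimal object carrying a single Available condition.
local function with_available(status, reason)
  return {
    metadata = {},
    status = { conditions = { { type = "Available", status = status, reason = reason } } },
  }
end

local samples = {
  { "no status, no annotation", { metadata = {} } },
  { "no status, annotated", { metadata = { annotations = { [SKIP] = "true" } } } },
  { "Available=True", with_available("True", nil) },
  { "Available=False, SomePodsNotReady", with_available("False", "SomePodsNotReady") },
  { "Available=False, other reason", with_available("False", "ExampleReason") },
}

for _, sample in ipairs(samples) do
  local hs = prometheus_health(sample[2])
  print(sample[1] .. " -> " .. hs.status .. " (" .. hs.message .. ")")
end
```

The first sample is the case discussed in this thread: an object without a status never leaves the default Progressing result unless it carries the annotation.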
Related discussion: https://cloud-native.slack.com/archives/C01TSERG0KZ/p1671558024083149
same issue after upgrading to 2.6.0
The workaround shown by aslafy-z did not work for me as-is with ArgoCD 2.6.3 and the Rancher monitoring helm chart 100.1.3+up19.0.3.
I needed to configure argocd-cm like this (note the obj.metadata.annotations check):
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  resource.customizations: |
    monitoring.coreos.com/Prometheus:
      health.lua: |
        if obj.metadata.annotations ~= nil and obj.metadata.annotations["argocd.argoproj.io/skip-health-check"] ~= nil then
          hs = {}
          hs.status = "Healthy"
          hs.message = "Ignoring Prometheus Health Check"
          return hs
        end
        hs={ status = "Progressing", message = "Waiting for initialization" }
        if obj.status ~= nil then
          if obj.status.conditions ~= nil then
            for i, condition in ipairs(obj.status.conditions) do
              if condition.type == "Available" and condition.status ~= "True" then
                if condition.reason == "SomePodsNotReady" then
                  hs.status = "Progressing"
                else
                  hs.status = "Degraded"
                end
                hs.message = condition.message or condition.reason
              end
              if condition.type == "Available" and condition.status == "True" then
                hs.status = "Healthy"
                hs.message = "All instances are available"
              end
            end
          end
        end
        return hs
  resource.customizations.useOpenLibs.monitoring.coreos.com_Prometheus: 'true'
And pass the Prometheus annotation in the values.yaml file to the rancher-monitoring helm chart:
prometheus:
  annotations:
    argocd.argoproj.io/skip-health-check: 'true'
Still happening with kube-prometheus-stack 45.27.2 (prometheus v0.65.1).
Argo 2.7.3, rancher-monitoring 102.0.0+up40.1.2: the application would deploy but never complete. Adding this workaround 'fixed' the issue after a few syncs. Thanks @murand78. Ace work!
Hi! Same problem here with the latest kube-prometheus-stack, version 58.1.3. Not sure if this should be fixed on the argocd side... Although the proposed patch is quite an ad-hoc solution, it might be interesting to add to argocd. We could implement the argocd.argoproj.io/skip-health-check annotation in argocd, to allow skipping health checks for problematic resources like this one.
Edit: I think this should be considered https://github.com/argoproj/argo-cd/issues/11782
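As a rough illustration of the idea of a generic skip annotation, the check could be factored into a wrapper that guards any health function. This is a sketch of the proposal only, not an existing Argo CD feature: `with_skip_annotation`, `always_progressing`, and the message text are hypothetical names chosen for the example.

```lua
-- Hypothetical helper, NOT an existing Argo CD feature: wraps any health
-- function so that resources carrying the annotation short-circuit to Healthy.
local SKIP_ANNOTATION = "argocd.argoproj.io/skip-health-check"

local function with_skip_annotation(check)
  return function(obj)
    local metadata = obj.metadata or {}
    local annotations = metadata.annotations or {}
    if annotations[SKIP_ANNOTATION] ~= nil then
      return { status = "Healthy", message = "Health check skipped by annotation" }
    end
    -- No annotation: defer to the wrapped, resource-specific check.
    return check(obj)
  end
end

-- Stand-in for a resource-specific check that never becomes Healthy,
-- like a Prometheus object whose operator does not write a status.
local function always_progressing(_)
  return { status = "Progressing", message = "Waiting for initialization" }
end

local guarded = with_skip_annotation(always_progressing)

print(guarded({ metadata = { annotations = { [SKIP_ANNOTATION] = "true" } } }).status)
print(guarded({ metadata = {} }).status)
```

The two calls at the end print Healthy for the annotated object and Progressing for the unannotated one, which is the behaviour the per-resource patch earlier in the thread implements by hand.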
Checklist:
argocd version
Describe the bug
Trying to deploy Prometheus as a CRD, the resource is synced OK but stuck Progressing:
waiting for healthy state of monitoring.coreos.com/Prometheus/rancheer-monitoring-prometheus
To Reproduce
Deploy the rancher-monitoring-crd + rancher-monitoring via helm
Create repo with
Expected behavior
Rancher-Monitoring Chart should be fully synced and healthy.
Screenshots
Version
Logs
Might be related to https://github.com/argoproj/argo-cd/pull/10508
regards, strowi