Open garyd2 opened 4 months ago
From which version did the upgrade happen? Are you unable to view the resources at all, or can you view them but the NOAUTH error message keeps popping up? Also, can you share the application controller logs?
It was previously on 1.12.4, I believe.
I am able to see the application tiles fine and they show as synced and healthy, but when I click into one and try to look at the pods, it throws the error and I can't see any further pod details.
The application controller logs look like this (some details masked):
time="2024-07-22T07:49:00Z" level=info msg="Loading TLS configuration from secret xxxxx/argocd-server-tls"
time="2024-07-22T07:49:00Z" level=warning msg="Failed to save cluster info: NOAUTH Authentication required."
time="2024-07-22T07:49:04Z" level=info msg="Refreshing app status (controller refresh requested), level (0)" application=xxx
time="2024-07-22T07:49:04Z" level=warning msg="Failed to get cached managed resources for tree reconciliation, fall back to full reconciliation" application=xxxxxxx dest-name= dest-namespace=development dest-server="https://kubernetes.default.svc" fields.level=0
time="2024-07-22T07:49:04Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: development)" application=xxxxxxx
time="2024-07-22T07:49:04Z" level=info msg="GetRepoObjs stats" application=xxxxxxx build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=54 unmarshal_ms=54 version_ms=0
time="2024-07-22T07:49:04Z" level=error msg="DiffFromCache error: error getting managed resources for app xxxx: NOAUTH Authentication required."
time="2024-07-22T07:49:04Z" level=error msg="Failed to cache app resources: error setting app resource tree: NOAUTH Authentication required." application=xxxxxxx dedup_ms=0 dest-name= dest-namespace=development dest-server="https://kubernetes.default.svc" diff_ms=13 fields.level=0 git_ms=54 health_ms=0 live_ms=0 settings_ms=0 sync_ms=0
time="2024-07-22T07:49:04Z" level=info msg="Skipping auto-sync: application status is Synced" application=xxxxxxx
time="2024-07-22T07:49:04Z" level=info msg="Update successful" application=xxxxxxx
time="2024-07-22T07:49:04Z" level=info msg="Reconciliation completed" application=xxxxxxx dedup_ms=0 dest-name= dest-namespace=development dest-server="https://kubernetes.default.svc" diff_ms=13 fields.level=0 git_ms=54 health_ms=0 live_ms=0 patch_ms=33 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=132
time="2024-07-22T07:49:12Z" level=info msg="Refreshing app status (controller refresh requested), level (0)" application=xxxxxxx
time="2024-07-22T07:49:12Z" level=warning msg="Failed to get cached managed resources for tree reconciliation, fall back to full reconciliation" application=xxxxxxx dest-name= dest-namespace=development dest-server="https://kubernetes.default.svc" fields.level=0
time="2024-07-22T07:49:12Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: development)" application=xxxxxxx
time="2024-07-22T07:49:12Z" level=info msg="GetRepoObjs stats" application=xxxxxxx build_options_ms=0 helm_ms=0 plugins_ms=0 repo_ms=0 time_ms=58 unmarshal_ms=58 version_ms=0
time="2024-07-22T07:49:12Z" level=error msg="DiffFromCache error: error getting managed resources for app xxx: NOAUTH Authentication required."
time="2024-07-22T07:49:12Z" level=error msg="Failed to cache app resources: error setting app resource tree: NOAUTH Authentication required." application=xxxxxxx dedup_ms=0 dest-name= dest-namespace=development dest-server="https://kubernetes.default.svc" diff_ms=15 fields.level=0 git_ms=59 health_ms=1 live_ms=0 settings_ms=0 sync_ms=0
time="2024-07-22T07:49:12Z" level=info msg="Skipping auto-sync: application status is Synced" application=xxxxxxx
time="2024-07-22T07:49:13Z" level=info msg="Update successful" application=xxxxxxx
Strange. 1.12.4 has the Redis authentication change, so it shouldn't cause upgrade issues. Is it happening on only one cluster, or are you seeing similar behavior on others as well? Can you also check that the redis-initial-password secret is present and not empty in the ArgoCD instance namespace, and that the REDIS_PASSWORD env var references this secret correctly in the repo-server, application-controller and argocd-server deployments?
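For anyone who wants to script the first half of that check, here is a minimal sketch in Python. It assumes the secret has been exported with something like `oc get secret <name> -o json` and loaded into a dict. The function name `secret_value_is_set` and the sample data are made up for illustration, and `admin.password` as the key holding the Redis password is an assumption.

```python
import base64


def secret_value_is_set(secret: dict, key: str = "admin.password") -> bool:
    """Return True if `key` exists in the secret's data and decodes to a non-empty value."""
    encoded = (secret.get("data") or {}).get(key) or ""
    try:
        return len(base64.b64decode(encoded, validate=True)) > 0
    except ValueError:  # binascii.Error (invalid base64) is a subclass of ValueError
        return False


# Made-up secret for illustration; real values come from the cluster.
example_secret = {
    "kind": "Secret",
    "data": {"admin.password": base64.b64encode(b"not-a-real-password").decode()},
}

print(secret_value_is_set(example_secret))
print(secret_value_is_set({"kind": "Secret", "data": {"admin.password": ""}}))
```

This only confirms that the value is present and non-empty. It says nothing about whether the workloads actually consume it.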
I see an argocd-redis-initial-password secret, but its 2 data values are admin.password and immutable. Both values are set, but I don't have a REDIS_PASSWORD data value in it.
Oh sorry, I should have framed it better 😅. Check the REDIS_PASSWORD env var in the repo-server, application-controller and argocd-server deployments, not in the redis secret. The secret contains only 2 values.
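As a rough illustration of that check, the sketch below looks for a REDIS_PASSWORD env entry in a workload manifest and returns the secret reference behind it. It assumes the manifest was exported as JSON (for example with `oc get deployment <name> -o json`) and parsed into a dict. The function name and the trimmed-down sample manifests are hypothetical, and the secret name and key are the ones mentioned in this thread.

```python
from typing import Optional


def find_redis_password_ref(workload: dict) -> Optional[dict]:
    """Return the secretKeyRef behind REDIS_PASSWORD, or None if it is not set that way."""
    containers = workload["spec"]["template"]["spec"].get("containers", [])
    for container in containers:
        for env in container.get("env", []):
            if env.get("name") == "REDIS_PASSWORD":
                return env.get("valueFrom", {}).get("secretKeyRef")
    return None


# Hypothetical, trimmed-down manifests for illustration only.
repo_server = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [{
        "name": "argocd-repo-server",
        "env": [{
            "name": "REDIS_PASSWORD",
            "valueFrom": {"secretKeyRef": {
                "name": "argocd-redis-initial-password",
                "key": "admin.password",
            }},
        }],
    }]}}},
}
application_controller = {
    "kind": "StatefulSet",
    "spec": {"template": {"spec": {"containers": [{
        "name": "argocd-application-controller",
        "env": [],
    }]}}},
}

for workload in (repo_server, application_controller):
    print(workload["kind"], find_redis_password_ref(workload))
```

A result of `None` means the workload either has no REDIS_PASSWORD entry or sets it without a secret reference.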
Thanks
repo-server deployment: looks good, it has a REDIS_PASSWORD environment variable and it points to the secret.
application-controller statefulset: has NO REDIS_PASSWORD environment variable.
argocd-server deployment: has NO REDIS_PASSWORD environment variable either.
Would it be OK to just edit the statefulset and deployment to add this in if they are missing?
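To make the proposed edit concrete, here is a hedged sketch of what adding the missing entry could look like when done against an exported manifest in Python. The function name `with_redis_password_env` and the sample statefulset are hypothetical. The secret name and the `admin.password` key are taken from this thread and should be checked against the working repo-server deployment. The sketch also assumes every container in the pod template should receive the variable, which may not hold if sidecars are present.

```python
import copy


def with_redis_password_env(workload: dict, secret_name: str,
                            key: str = "admin.password") -> dict:
    """Return a copy of the manifest with REDIS_PASSWORD added where it is missing."""
    patched = copy.deepcopy(workload)  # leave the caller's manifest untouched
    entry = {
        "name": "REDIS_PASSWORD",
        "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
    }
    for container in patched["spec"]["template"]["spec"]["containers"]:
        env = container.setdefault("env", [])
        # Skip containers that already define the variable, so re-running is harmless.
        if not any(e.get("name") == "REDIS_PASSWORD" for e in env):
            env.append(copy.deepcopy(entry))
    return patched


# Hypothetical, trimmed-down statefulset for illustration only.
statefulset = {
    "kind": "StatefulSet",
    "spec": {"template": {"spec": {"containers": [
        {"name": "argocd-application-controller"},
    ]}}},
}

patched = with_redis_password_env(statefulset, "argocd-redis-initial-password")
print(patched["spec"]["template"]["spec"]["containers"][0]["env"])
```

The function only builds the modified manifest in memory. Applying it to the cluster is a separate step.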
Yes, we can try that. But the operator should have handled this automatically. Could be a bug...
Sorry for the delay in getting back. I updated the deployments and statefulset with the REDIS_PASSWORD environment variable and everything is working again; I can get the resources of apps again. Thanks a lot for your help.
Feel free to close this, or if you want to keep it open to investigate a bug, work away.
Great. I will keep this issue open until the bug is triaged. Thanks for reporting it.
@garyd2 - Just to rule out the possibility of broken operator reconciliation, can you confirm if there are any unusual error messages in the operator manager pod logs?
I have reviewed the openshift-gitops-operator-controller-manager logs and no errors are thrown.
The GitOps operator in one env updated yesterday to 1.13.0. Since then I cannot get to the resources of any apps without hitting a NOAUTH error.
Restarted all pods in the openshift-gitops namespace.
Anyone seen this before? Is it something with Redis?