andrea-avanzi opened this issue 2 months ago
What is the error you're getting? Also, have you tried adding the EKS cluster to Argo CD using the argocd cluster add command?
The error is "failed to sync cluster https://10.100.0.1:443: failed to load initial state of resource AWSManagedControlPlane.controlplane.cluster.x-k8s.io: Internal error occurred: error resolving resource". I get the same error with a cluster added via the argocd cluster add command.
I have recreated the sequence; these are the argocd-application-controller-0 logs:
time="2024-09-11T15:33:16Z" level=info msg="Processing all cluster shards"
time="2024-09-11T15:33:16Z" level=info msg="Processing all cluster shards"
time="2024-09-11T15:33:16Z" level=info msg="appResyncPeriod=3m0s, appHardResyncPeriod=0s, appResyncJitter=0s"
time="2024-09-11T15:33:16Z" level=info msg="Starting configmap/secret informers"
time="2024-09-11T15:33:17Z" level=info msg="Configmap/secret informer synced"
time="2024-09-11T15:33:17Z" level=warning msg="Cannot init sharding. Error while querying clusters list from database: server.secretkey is missing"
time="2024-09-11T15:33:17Z" level=warning msg="Failed to save clusters info: server.secretkey is missing"
time="2024-09-11T15:33:17Z" level=info msg="0xc000e209c0 subscribed to settings updates"
time="2024-09-11T15:33:17Z" level=info msg="Cluster https://kubernetes.default.svc has been assigned to shard 0"
time="2024-09-11T15:33:17Z" level=info msg="Starting secretInformer forcluster"
time="2024-09-11T15:33:17Z" level=warning msg="Unable to parse updated settings: server.secretkey is missing"
time="2024-09-11T15:33:17Z" level=info msg="Notifying 1 settings subscribers: [0xc000e209c0]"
time="2024-09-11T15:35:28Z" level=info msg="Refreshing app status (spec.source differs), level (3)" application=argocd/guestbook
time="2024-09-11T15:35:28Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: default)" application=argocd/guestbook
time="2024-09-11T15:35:28Z" level=info msg="Start syncing cluster" server="https://kubernetes.default.svc"
time="2024-09-11T15:35:30Z" level=error msg="Failed to sync cluster" error="failed to load initial state of resource AWSManagedControlPlane.controlplane.cluster.x-k8s.io: Internal error occurred: error resolving resource" server="https://kubernetes.default.svc"
time="2024-09-11T15:35:30Z" level=info msg="Normalized app spec: {\"status\":{\"conditions\":[{\"lastTransitionTime\":\"2024-09-11T15:35:28Z\",\"message\":\"Failed to load live state: failed to get cluster info for \\\"https://kubernetes.default.svc\\\": error synchronizing cache state : failed to sync cluster https://10.100.0.1:443: failed to load initial state of resource AWSManagedControlPlane.controlplane.cluster.x-k8s.io: Internal error occurred: error resolving resource\",\"type\":\"ComparisonError\"},{\"lastTransitionTime\":\"2024-09-11T15:35:28Z\",\"message\":\"Failed to load target state: failed to get cluster version for cluster \\\"https://kubernetes.default.svc\\\": failed to get cluster info for \\\"https://kubernetes.default.svc\\\": error synchronizing cache state : failed to sync cluster https://10.100.0.1:443: failed to load initial state of resource AWSManagedControlPlane.controlplane.cluster.x-k8s.io: Internal error occurred: error resolving resource\",\"type\":\"ComparisonError\"},{\"lastTransitionTime\":\"2024-09-11T15:35:28Z\",\"message\":\"error synchronizing cache state : failed to sync cluster https://10.100.0.1:443: failed to load initial state of resource AWSManagedControlPlane.controlplane.cluster.x-k8s.io: Internal error occurred: error resolving resource\",\"type\":\"UnknownError\"}],\"sync\":{\"comparedTo\":{\"destination\":{},\"source\":{\"repoURL\":\"\"}}}}}" application=argocd/guestbook
time="2024-09-11T15:35:30Z" level=error msg="Failed to cache app resources: error getting resource tree: failed to get app hosts: error synchronizing cache state : failed to sync cluster https://10.100.0.1:443: failed to load initial state of resource AWSManagedControlPlane.controlplane.cluster.x-k8s.io: Internal error occurred: error resolving resource" application=argocd/guestbook dedup_ms=0 dest-name= dest-namespace=default dest-server="https://kubernetes.default.svc" diff_ms=0 fields.level=3 git_ms=2406 health_ms=0 live_ms=0 settings_ms=0 sync_ms=0
time="2024-09-11T15:35:30Z" level=info msg="Updated sync status: -> Unknown" application=guestbook dest-namespace=default dest-server="https://kubernetes.default.svc" reason=ResourceUpdated type=Normal
time="2024-09-11T15:35:30Z" level=info msg="Updated health status: -> Healthy" application=guestbook dest-namespace=default dest-server="https://kubernetes.default.svc" reason=ResourceUpdated type=Normal
time="2024-09-11T15:35:30Z" level=info msg="Update successful" application=argocd/guestbook
time="2024-09-11T15:35:30Z" level=info msg="Reconciliation completed" application=argocd/guestbook dedup_ms=0 dest-name= dest-namespace=default dest-server="https://kubernetes.default.svc" diff_ms=0 fields.level=3 git_ms=2406 health_ms=0 live_ms=0 patch_ms=9 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=2447
Did you configure sharding, and if so, which algorithm? Is the cluster Argo CD tries to connect to a local or a remote one? And did the issue occur after upgrading the EKS cluster, after upgrading Argo CD, or were both upgraded together? What versions was the upgrade from and to? Does restarting the controller solve the issue (not recommending this as a workaround, of course; asking to understand the problem better)?
The cluster is local; I use https://kubernetes.default.svc. The issue occurred after I updated both: EKS first, then Argo CD. On EKS I cannot restart the controller. I haven't modified the sharding config on Argo; I used the default configuration during installation.
It may be a caching issue. Can you connect to ArgoCD's Redis and clear cluster info?
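In case it helps, a rough sketch of one way to clear that cache on a default install (the argocd namespace and argocd-redis service name are assumptions; adjust to your setup):

```shell
# Port-forward to Argo CD's Redis (default install: service argocd-redis
# in namespace argocd).
kubectl -n argocd port-forward svc/argocd-redis 6379:6379 &

# FLUSHALL drops all cached data (cluster state, manifests); the
# application controller rebuilds it on the next reconciliation.
redis-cli -p 6379 FLUSHALL
```

If Redis is password-protected in your install, pass the secret to redis-cli with -a.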
I also just ran into this with v2.10.5+335875d. Remote cluster, default sharding algo (not any of the new ones). One of my engineers installed a custom-developed CRD, which threw the application controller into this error. Turns out the CRD contained Helm templating that made it invalid and error out. Once the CRD was corrected, the error cleared for Argo CD.
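For anyone else hitting this, one way to hunt for that kind of broken CRD is to grep the stored definitions for unrendered templating; the CRD name below is just the one from the error in this thread, substitute your own suspect:

```shell
# List all CRDs registered on the cluster.
kubectl get crds -o name

# Inspect a suspect CRD for unrendered Helm templating; a literal '{{'
# surviving into the stored object suggests the chart was applied
# without being rendered.
kubectl get crd awsmanagedcontrolplanes.controlplane.cluster.x-k8s.io -o yaml | grep -n '{{'
```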
I have the same problem. Argo CD v2.12; Argo CD's own cluster is eks-v1.29, the remote cluster is eks-v1.30 with Istio v1.22 (API v1alpha3 is supported). The app's sync status is Unknown, and the last sync errored with ComparisonError: Failed to load live state: Get "https://remote-cluster/apis/networking.istio.io/v1alpha3?timeout=32s", although other apps with similar Istio resources deployed fine.
EDIT: it seems that k8s-v1.30 with istio-v1.22 no longer works consistently with networking.istio.io/v1alpha3 for DR, VS, SE, and GW, though EnvoyFilter still works with v1alpha3, even though no deprecation or replacement of v1alpha3 by v1 is mentioned in https://istio.io/latest/blog/2024/v1-apis/
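One way to confirm which Istio API versions the remote cluster actually serves (commands assume kubectl's current context points at the remote cluster):

```shell
# List every group/version the API server advertises for Istio networking.
kubectl api-versions | grep networking.istio.io

# Per-CRD, show each registered version and whether it is still served;
# a version with served=false explains a 404 on that API path.
kubectl get crd virtualservices.networking.istio.io \
  -o jsonpath='{range .spec.versions[*]}{.name}{" served="}{.served}{"\n"}{end}'
```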
I couldn't find Argo-specific code for "error resolving resource". Do you have any EKS logs that can help?
It seemed like the issue in my case was the control plane/Kubernetes SDK puking on a bad CRD. Once the CRD was corrected, all was well again (so I'd almost say this is out of scope for Argo CD).
Checklist:
argocd version
Describe the bug
Argo CD cannot sync the demo app and cannot connect to the cluster.
To Reproduce
I've already checked https://bit.ly/argocd-faq, but I didn't try "Argo CD is unable to connect to my cluster, how do I troubleshoot it?" because kubectl is missing from the argocd-server pod.
Expected behavior
Screenshots
Version
Logs
Thanks