Open amarjayr opened 2 years ago
This issue is not resolved. After deploying 76 apps and configuring Resource Exclusion/Inclusion, the problem appears.
According to the manual, resource inclusions/exclusions are configured by manually editing the ConfigMap (cm). Am I correct? Can they not be configured declaratively?
Yes, you can with the Argo CD operator.
Bump on this, facing similar pains when using resource exclusion to stop argocd from tracking cilium.
same here
We are also hitting this issue. Please prioritize.
Same here. Bump
Same here. Bump. Are there any workarounds for this issue?
Same error here: "error getting cached app managed resources: error getting application by query: application refresh deadline exceeded"
Same error here, and applications are still refreshing; some resources cannot be displayed (such as endpoints for a service or pods for a deployment):
msg="finished unary call with code Unknown" error="error getting cached app resource tree: error getting application by query: application refresh deadline exceeded" grpc.code=Unknown grpc.method=ResourceTree grpc.service=application.ApplicationService
Same error here, trying to understand why this happens
Same here. After testing a bit and changing timeout values in values.yaml, my solution was to move the applications to a new "light" repo. This light repo has fewer files, so maybe the problem in my case was that the "heavy" repo was too heavy.
Checklist:
argocd version
Describe the bug
I recently ran into a case where Argo CD was getting stuck "refreshing" an app. It would never finish refreshing the app and would never show the resource tree in the UI. The application controller also ran out of memory (OOM) frequently, which impacted other apps.
To Reproduce
Create an app which includes a resource that generates >100,000 child resources. In my case, I think this was a cert-manager.io/v1 Certificate generating thousands of CertificateRequests (this seems to be something that happens? https://github.com/cert-manager/cert-manager/issues/4846#issue-1132714441).
I suspect #10009 has a similar root cause, but they resolved it with Resource Exclusion/Inclusion. Also maybe #4863, #3864
Expected behavior
A log warning or some sort of error surfaced in the UI. There should also be a limit on the number of resources so the application-controller doesn't OOM (which will impact other apps syncing). Ideally the warning includes the trouble resource.
Would this have been surfaced in any way through the metrics?
Obviously the more important resolution is cleaning up the child resources (which eventually would have overwhelmed etcd), but ideally Argo is able to identify edge cases like this.
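As a rough illustration of how the trouble resource could be identified outside of Argo CD, here is a minimal Python sketch that counts child resources per owner using metadata.ownerReferences. It is not Argo CD's implementation. The function names and the limit value are hypothetical, and the input is assumed to be a list of resource objects shaped like the "items" array of a Kubernetes list in JSON form.

```python
from collections import Counter


def count_children_by_owner(resources):
    """Count resources per (owner kind, namespace, owner name).

    Owners are read from metadata.ownerReferences; resources without
    owner references are ignored.
    """
    counts = Counter()
    for res in resources:
        meta = res.get("metadata", {})
        namespace = meta.get("namespace", "")
        for ref in meta.get("ownerReferences", []):
            owner = (ref.get("kind", ""), namespace, ref.get("name", ""))
            counts[owner] += 1
    return counts


def owners_over_limit(resources, limit):
    """Return (owner, child_count) pairs above `limit`, largest first.

    `limit` is a hypothetical threshold chosen by the caller; Argo CD
    does not define such a value in this issue.
    """
    counts = count_children_by_owner(resources)
    offenders = [(owner, n) for owner, n in counts.items() if n > limit]
    return sorted(offenders, key=lambda item: item[1], reverse=True)


# Small in-memory example: one Certificate owning five CertificateRequests.
sample = [
    {
        "kind": "CertificateRequest",
        "metadata": {
            "namespace": "demo",
            "name": f"web-{i}",
            "ownerReferences": [{"kind": "Certificate", "name": "web"}],
        },
    }
    for i in range(5)
]

for (kind, ns, name), n in owners_over_limit(sample, limit=3):
    print(f"{kind} {ns}/{name}: {n} children")
```

In practice the list could come from the JSON output of the cluster (for example the "items" of a resource listing), and any owner reported here would be a candidate for cleanup or for a resource exclusion.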
Version
Logs
No logs on the argocd-application-controller that were relevant.
If you can give any guidance on what the (new) child resource limit should be or how it should be set, I'm happy to make a PR.