Open bo0ts opened 2 years ago
Hi
Since when have you seen this slowness? Does it happen daily for Argo CD? What about other users, are they also experiencing the same slowness?
@Maxcoder-net Like I said, since the upgrade to 2.3.3 from 2.2.5. The problem appears when doing a full reload, but not while navigating in the app. It happens for every user on Firefox and Chrome (we haven't tested other browsers).
Happens to me too!
Maybe it's a different issue but I'm seeing the same behavior with v2.2.2+03b17e0. If I haven't hit the website in a bit it can take up to a minute to load. Once I'm there it's fast doing what I need.
The difference in the timings is surprising. @bo0ts sees 9s, and @viggin543 sees 3min.
This is gonna be pretty difficult to diagnose. It involves the API server, potentially an ingress, the network between that and your client, the client itself... A packet trace and/or logs from each of those components would be a start.
It appears to be main.js for me as well. I can try to look into the logs...
This is reproducible on every refresh (or maybe hard refresh)? Do you happen to know if the API server is under heavy CPU and/or network load?
I thought it was only when logged out but it does seem to be the same amount of time on a page refresh (normal or hard).
I didn't set up Argo and am not too familiar with its internals but if this helps:
It's also strange as some members of our team aren't seeing it. They're on the west coast, I'm on east, deployed to us-east-1
A log file if it helps: argocd-server-69fdcc9dc8-cjmkv.log. Not sure what other component logs or steps would help.
Low load. Looks like JS requests aren't logged, so that's not really gonna help us tell whether the API server is to blame. The fact that it's different in different locations makes me want to blame the network, but I don't want to jump to conclusions.
I think my next step would be running a packet trace on the client and on the API server. If the clocks are reasonably well synced, you can compare when the data packets arrive on the client vs. when the ACK packets make it back to the server to tell whether the network is to blame.
Thinking back to my web perf days, I guess it's also possible that a lot of packets are being dropped, forcing the TCP congestion window to stay super low.
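The packet-trace comparison suggested above could be sketched roughly as follows. This is only an illustration: it assumes the timestamps have already been extracted from the two captures into dictionaries keyed by TCP sequence number, that the clocks are reasonably well synced, and that there are no retransmissions. The function name and the sample values are hypothetical.

```python
def compare_captures(server_sent, client_received):
    """Compare two captures keyed by TCP sequence number.

    Both arguments map sequence number -> capture timestamp in seconds.
    Returns (delays, missing): the one-way delay of every segment seen on
    both sides, and the sequence numbers the client capture never saw.
    """
    matched = sorted(set(server_sent) & set(client_received))
    delays = [client_received[seq] - server_sent[seq] for seq in matched]
    missing = sorted(set(server_sent) - set(client_received))
    return delays, missing


# Hypothetical sample data, not taken from any real trace.
server_sent = {1: 0.000, 1449: 0.001, 2897: 0.002}
client_received = {1: 0.040, 2897: 0.950}

delays, missing = compare_captures(server_sent, client_received)
print("one-way delays (s):", delays)
print("segments not seen by client:", missing)
```

Large or growing delays, or many segments missing on the client side, would point toward the network rather than the API server.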
@crenshaw-dev I would be happy to help but this issue randomly disappeared in our instances. API Timings across all involved clusters did not change during that time. Networking in all other deployed components was healthy meanwhile. Sorry :(
This was mostly fixed for me by turning on gzip compression (not sure why it wasn't on by default): https://github.com/argoproj/argo-cd/discussions/10238#discussioncomment-3942411
It can still take 6-9 seconds or so to initially load the page but much better than a minute.
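To get a feel for why gzip can matter so much for a text asset like a JS bundle, here is a small sketch using Python's standard `gzip` module. The payload is synthetic, highly repetitive text chosen for illustration, so the ratio it prints will be far better than what a real minified bundle achieves.

```python
import gzip


def gzip_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(gzip.compress(payload)) / len(payload)


# Synthetic, repetitive stand-in for JavaScript source (an assumption,
# not the actual Argo CD bundle).
sample = b"function render(props) { return props.items.map(renderItem); }\n" * 10_000

print(f"original:   {len(sample)} bytes")
print(f"compressed: {len(gzip.compress(sample))} bytes")
print(f"ratio:      {gzip_ratio(sample):.4f}")
```

On a slow link, sending fewer bytes shortens the transfer directly, which fits the improvement described above.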
Yeah, that's still absurdly slow. But the fact that gzip helped is interesting.
Any updates on this one? I'm able to reproduce it on v2.4.14
@jlongo-encora I think we need more details about the problem. So far, I think it's unclear whether Argo CD is misbehaving (sending stuff over the network too slowly) or if network conditions are the problem (or some combination of both).
One helpful piece of information would be seeing which assets are taking so long to transfer and at what rate they're being transferred. We could compare transfer rate of, say, the main JS bundle to some other network response.
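The comparison described above comes down to dividing size by duration for each response, using the numbers shown in the browser's network tab. A minimal sketch, where the response names, sizes, and durations are hypothetical placeholders:

```python
def transfer_rate_kib_per_s(size_bytes: int, duration_s: float) -> float:
    """Return the average transfer rate in KiB/s."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return size_bytes / 1024 / duration_s


# Hypothetical (name, size in bytes, duration in seconds) entries,
# as one might copy them from the browser's network tab.
responses = [
    ("main JS bundle", 600 * 1024, 50.0),
    ("small API response", 2 * 1024, 0.2),
]

for name, size, duration in responses:
    rate = transfer_rate_kib_per_s(size, duration)
    print(f"{name}: {rate:.1f} KiB/s")
```

If every response shows a similarly low rate, the connection as a whole is probably slow. If only some responses are slow, the server's handling of those requests is the more likely suspect.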
@crenshaw-dev
My internet connection is fast when opening other sites
This is wild. The tiny extensions.js, a static asset, takes 9s. Two different userinfo responses have very different response times.
Is this API server pod under a lot of load? Is it possible CPU throttling is monkeying with response times?
@crenshaw-dev we have ~350 applications. I'm not sure about the CPU thing. I'm using Chrome. Let me try with another browser
Actually, extensions.js isn't quite static. So it could be a victim of throttling.
@jlongo-encora throttling won't be a direct function of the number of applications. It happens server-side rather than in your browser. When a Kubernetes Pod uses more CPU than its configured limits, Kubernetes will "pause" CPU activity to avoid letting the Pod take too much time from other Pods on the node.
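On Linux, CPU limits are enforced through the kernel's CFS quota, and the container's cgroup exposes counters for it in a `cpu.stat` file with `nr_periods` and `nr_throttled` fields. One rough way to check for throttling is to compute the fraction of periods that were throttled. The sketch below parses text in that format. The sample numbers are made up, and the location of `cpu.stat` inside the container depends on the cgroup version, so the path is left out.

```python
def throttled_fraction(cpu_stat_text: str) -> float:
    """Return nr_throttled / nr_periods from the contents of a cpu.stat file."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0  # No enforcement periods recorded, so nothing was throttled.
    return stats.get("nr_throttled", 0) / periods


# Made-up sample in the cpu.stat format, for illustration only.
sample = "nr_periods 1000\nnr_throttled 250\nthrottled_time 123456789\n"
print(f"throttled in {throttled_fraction(sample):.0%} of periods")
```

A consistently high fraction on the API server pod would support the throttling theory. A value near zero would point back at the network.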
@crenshaw-dev ok thx for the explanation
I think kubectl has the ability to show CPU usage. I always use Grafana, which my team set up to monitor our stuff. Unfortunately I don't know the details of that setup.
I had the same issue where a 600-ish KB main.cxxx....js file was taking over 50 seconds to load and it was fixed after I cleared the cache.
Describe the bug
After the upgrade from 2.2.5 to 2.3.3 the ArgoCD web interface takes a very long time to load. We have two ArgoCD instances with a small number of apps (~50 per instance) and both exhibit this behavior. When looking at the network traffic in a developer console the main problem seems to be main.c7ea22e999b3805bc676.js. I could reproduce the problem on Chrome and Firefox.
None of the pods running ArgoCD show signs of resource starvation.
Screenshots
Version
We deploy ArgoCD using the ArgoCD Operator on OKD 4.9. The ArgoCD 2.3.3 Update was triggered by the Operator Update to version 0.3.0.