Open moio opened 8 months ago
When we set a short timer, a similar situation occurred for us (although our cluster size is not yet large).
What I wonder is: could this be solved by using watch handlers to fetch the current state once, and then watchers that just update a local cache?
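The watch-and-cache pattern asked about above can be sketched without any Kubernetes dependencies. This is a minimal, self-contained illustration (not the exporter's actual code): a goroutine consumes watch-style events and folds them into an in-memory store, so each scrape reads from the cache instead of issuing list calls against the API server. The `event` and `cache` types here are hypothetical stand-ins for what client-go informers provide.

```go
package main

import (
	"fmt"
	"sync"
)

// event mimics a Kubernetes watch event for a cluster object.
type event struct {
	typ  string // "ADDED", "MODIFIED", "DELETED"
	name string
	data string
}

// cache is a local store kept up to date by watch events, so scrapes
// read from memory instead of repeatedly listing objects via the API.
type cache struct {
	mu    sync.RWMutex
	items map[string]string
}

func newCache() *cache { return &cache{items: map[string]string{}} }

// apply folds one watch event into the cache.
func (c *cache) apply(e event) {
	c.mu.Lock()
	defer c.mu.Unlock()
	switch e.typ {
	case "ADDED", "MODIFIED":
		c.items[e.name] = e.data
	case "DELETED":
		delete(c.items, e.name)
	}
}

// get serves a read from the local cache (what a scrape handler would use).
func (c *cache) get(name string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.items[name]
	return v, ok
}

func main() {
	c := newCache()
	events := make(chan event)
	done := make(chan struct{})

	// Watcher goroutine: in a real exporter this would consume the
	// result channel of a Kubernetes watch; here events are simulated.
	go func() {
		for e := range events {
			c.apply(e)
		}
		close(done)
	}()

	events <- event{"ADDED", "cluster-1", "ready"}
	events <- event{"MODIFIED", "cluster-1", "updating"}
	events <- event{"ADDED", "cluster-2", "ready"}
	events <- event{"DELETED", "cluster-2", ""}
	close(events)
	<-done

	v, _ := c.get("cluster-1")
	fmt.Println("cluster-1:", v)
	_, ok := c.get("cluster-2")
	fmt.Println("cluster-2 present:", ok)
}
```

In client-go terms this is roughly what a `SharedInformer` does for you: one initial list, then a watch that keeps the local store current, with reads never touching the API server.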
I am looking at a use case with ~1.4k one-node clusters managed by Rancher, and I see `prometheus-rancher-exporter` generating considerable Kubernetes API load, especially to retrieve cluster and node information. Here is an excerpt of the 10 slowest API calls within 8 minutes:

All are due to `prometheus-rancher-exporter` (actually, all calls down to the top ~250 in the sample I observed are). Unfortunately, I do not know enough about the exporter's internals to suggest any solutions yet.