Closed by tringuyen-yw 3 years ago
Good point. In the next versions, I will try to add some protection against those limitations, though I am not sure what the best approach would be. So far, I have been leaning toward implementing a cache: in the case of fast consecutive calls, the exporter would return the cached result instead of querying the API.
We could also implement a retry or a wait in the exporter, but there is a risk of hitting the Prometheus scrape timeout.
I added a cache mechanism with a configurable delay in #93; with the default configuration, this should protect the exporter from getting 429 responses.
By the way, there is now a dedicated liveness/readiness endpoint, so probes no longer need to trigger API queries :)
@Dabz does the docker image dabz/ccloudexporter:latest have this fix as well as the liveness/readiness endpoints?
Yes it is @tringuyen-yw :)
It is a contribution of @raytung and it has been merged as part of #91
The endpoint is http://xx:2112/health by default.
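Assuming the exporter listens on its default port 2112, Kubernetes probes could then target this endpoint instead of triggering metric collection (a sketch; adjust port and timings to your deployment):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 2112
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 2112
  periodSeconds: 10
```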
Description
When running the ccloud-exporter container within Kubernetes, the CCloud Metrics API rate limit of 50 requests/second is easily hit, even when the pod has livenessProbe and readinessProbe disabled. Example of such an error:

Once the API rate limit is triggered, it tends to sustain itself in an infinite loop, probably because ccloud-exporter retries in quick succession without enough wait time between metrics collections.

Proposed solution
Add a new config option secondsBetweenRetry as the waiting time between retries when a request to the CCloud Metrics API has failed. Ideally this pause should apply to each individual API request, not to a batch of requests (e.g. 9 at a time as in config.simple.yaml). Even better, the pause duration should follow an "exponential backoff": for example, start at 5 seconds, double on each retry, and cap at, let's say, 5 minutes.
as waiting time between a retries when an access to CCloud Metrics API had failed. Ideally this pause should be at the individual API request, and not at the batch of requests, like 9 at a time as in config.simple.yaml.Even better, this pause duration should follow an "exponential backoff" for example, starts at 5 seconds, then doubles at each new retry and be capped off at, let's say 5 minutes.