Open AM-Dani opened 1 year ago
Was there a reset of the exporter in that time?
No, it is very stable. The last reboot was to test version 22.12.0-beta0, we wanted to check if we were able to solve the gaps with this version. We couldn't, but we kept using it.
Do you see metric gaps also in the Azure console?
I hit something similar in my usage of the library - sometimes the metrics are missing. I believe our timeout for Prometheus scraping (20 seconds) might be too short in cases when Service Discovery is needed.
@mblaschke - I was considering contributing some extra logging and/or other ways of understanding what happens under the hood (is service discovery slow, is the metrics fetching slow, etc.), maybe restricted to runs with --log.debug. Before I do anything, do you have any thoughts, guidelines, or ideas regarding this area?
Thanks!
@cdavid
If a scrape exceeds the timeout duration, you can look up the metric scrape_duration_seconds in Prometheus. If it is at your limit, the scrape took too long.
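A minimal sketch of that check, assuming a Prometheus server reachable over its HTTP API and the 20-second scrape timeout mentioned earlier in the thread. The server URL, the job name `azure-metrics`, and the 0.9 threshold are hypothetical values for illustration, not something the exporter prescribes.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(prometheus_url, job):
    """Build an instant-query URL for scrape_duration_seconds of one job."""
    query = 'scrape_duration_seconds{job="%s"}' % job
    params = urllib.parse.urlencode({"query": query})
    return prometheus_url.rstrip("/") + "/api/v1/query?" + params


def slow_scrapes(response, timeout_seconds, threshold=0.9):
    """Return (instance, duration) pairs whose latest scrape used at
    least threshold * timeout_seconds (threshold is an assumed margin)."""
    slow = []
    for series in response["data"]["result"]:
        # Instant-query values arrive as [timestamp, "value-as-string"].
        duration = float(series["value"][1])
        if duration >= threshold * timeout_seconds:
            slow.append((series["metric"].get("instance", ""), duration))
    return slow


def fetch(url):
    """Fetch and decode a Prometheus HTTP API response."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

For example, `slow_scrapes(fetch(build_query_url("http://localhost:9090", "azure-metrics")), timeout_seconds=20)` would list the targets whose last scrape came within 10% of the timeout.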
With the latest version you can now switch to subscription-scoped metrics (path /probe/metrics), which requests all metrics for the whole subscription instead of from each resource.
This doesn't cover all use cases, but it reduces the number of API calls and is much faster.
So I suggest trying the subscription-scoped metrics first. If that's not enough, you can still increase concurrency so that more requests are triggered at the same time.
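To see whether the subscription-scoped path is actually faster in a given setup, one option is to time both probe paths by hand. This is only a rough sketch: the exporter address is hypothetical, and the query string is whatever your Prometheus job already sends (it is deliberately left as a parameter here).

```python
import time
import urllib.request


def time_probe(url, timeout_seconds=20.0, opener=urllib.request.urlopen):
    """Fetch one probe URL and return (elapsed_seconds, ok).

    ok is False when the request fails or exceeds timeout_seconds.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout_seconds) as resp:
            resp.read()
        ok = True
    except OSError:  # URLError and socket timeouts are OSError subclasses
        ok = False
    return time.monotonic() - start, ok


def compare_paths(base_url, query_string, paths, timeout_seconds=20.0):
    """Time the same query against several probe paths."""
    results = {}
    for path in paths:
        url = base_url.rstrip("/") + path + "?" + query_string
        results[path] = time_probe(url, timeout_seconds)
    return results
```

Calling `compare_paths(exporter_url, your_query_string, ["/probe/metrics", "/probe/metrics/list"])` returns the elapsed time and success flag per path, which can be compared against the Prometheus scrape timeout.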
Hello Team,
We are trying to extract metrics from various resources (service bus, storage accounts, and VMs) and we are experiencing gaps several times a day in all of them. For some of these gaps, we clearly see that the problem is on Azure's side (QueryThrottledException), but for others, we don't see anything in the log entries or in the exporter's metrics. This is an example from today, for the ServiceBus:
azurerm_api_request_count (rate):
azurerm_api_request_bucket (30s):
With the following configuration:
We found no failures in the metrics exporter logs, and we see the same problem when using '/probe/metrics/list' for other resources.
Can you please help me with this?