webdevops / azure-metrics-exporter

Azure Monitor metrics exporter for Prometheus with dimension support, template engine and ServiceDiscovery
MIT License

Clarification required on request count, rate limit metric calculation #73

Open tharun2258 opened 6 months ago

tharun2258 commented 6 months ago

@mblaschke We have an observability requirement to fetch metrics from Azure resources and ingest the data into our centralized platform.

Endpoint used: /probe/metrics

Input parameters: subscription, resourcetype and region.

Number of jobs: 3

  1. Job1 - PT1M metrics
  2. Job2 - PT1H metrics
  3. Job3 - rate limit and request count metrics

Scrape interval - Every 1 minute

Please find the exporter configuration file attached: exporter-configuration.txt

Lately we have been facing throttling errors. As per the references, the Azure rate limit and request count metrics should give insight into the throttling behavior.

We added a separate job to collect the rate limit and request count metrics using the /metrics endpoint.

ISSUE: We are having trouble understanding the relationship between the API hits, the request count metric and the rate limit metric.

We are considering the request count metrics at the following 2 levels:

  1. Request count per TenantID, SubscriptionID & RP

  2. Request count per SubscriptionID

As per MS documentation, throttling limits: subscription reads - 12000 per hour

Since 2 of our jobs (PT1M and PT1H) query Azure metrics, 2 API calls should be triggered per scrape interval.

With this context, our assumptions are as follows:

  1. For every scrape interval (1 minute), the request count should increase by 2 and the API rate limit per subscription should decrease by 2.
  2. There should be a correlation between the request count per subscription and the rate limit.
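The arithmetic behind assumption 1 can be sketched as follows. This is only an illustration of the expectation stated above, assuming a limit of 12000 subscription reads per hour and exactly 2 API calls per 1-minute scrape; the constant and function names are hypothetical and are not part of the exporter.

```python
# Values taken from the description above; all names are illustrative only.
SUBSCRIPTION_READ_LIMIT = 12000   # assumed reads per hour per subscription
CALLS_PER_SCRAPE = 2              # one API call per metrics job (assumption 1)
SCRAPE_INTERVAL_SECONDS = 60      # scrape interval of 1 minute


def expected_after(minutes: int) -> tuple[int, int]:
    """Return (request_count, remaining_reads) expected under assumption 1."""
    scrapes = minutes * 60 // SCRAPE_INTERVAL_SECONDS
    request_count = scrapes * CALLS_PER_SCRAPE
    remaining = SUBSCRIPTION_READ_LIMIT - request_count
    return request_count, remaining


for minutes in (1, 30, 60):
    count, remaining = expected_after(minutes)
    print(f"after {minutes:>2} min: request count = {count}, remaining reads = {remaining}")
```

Under these assumptions, one full hour of scraping would account for 120 reads, leaving 11880 of the 12000.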

Observation 1:

Once the exporter is hosted, for the first hour we observe the pattern described in assumption 1. From the next hour onward the pattern disappears: the request count per Tenant, Sub & RP keeps increasing, but the rate limit stays at roughly 11850. Please refer to the screenshot below.

[Screenshot: Rate limit subscription id vs Request count Resource provider]
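For comparison, here is a minimal sketch of one model that would produce a decline during the first hour followed by a plateau. It assumes, without confirmation from the exporter or from Azure, that the limit is evaluated over a rolling 60-minute window and that exactly 2 calls are made per minute; all names are hypothetical.

```python
from collections import deque

LIMIT = 12000          # assumed hourly read limit
WINDOW_MINUTES = 60    # assumed rolling window length (not confirmed)
CALLS_PER_MINUTE = 2   # per assumption 1 in the text


def simulate(total_minutes: int) -> list[int]:
    """Remaining reads after each minute under a rolling-window model."""
    window = deque()   # calls made in each of the most recent minutes
    in_window = 0      # total calls currently inside the window
    remaining = []
    for _ in range(total_minutes):
        window.append(CALLS_PER_MINUTE)
        in_window += CALLS_PER_MINUTE
        # Calls older than the window no longer count against the limit.
        if len(window) > WINDOW_MINUTES:
            in_window -= window.popleft()
        remaining.append(LIMIT - in_window)
    return remaining


series = simulate(180)
print(series[0], series[59], series[119], series[179])
```

In this model the remaining count falls for the first 60 minutes and then stays flat, while a cumulative request counter would keep growing. Whether Azure actually applies the limit this way is an open question here.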

Observation 2: Every hour, the request count per subscription increases by 21. Although we see a dip in the rate limit count, it is restored to ~11850 again.

Please refer to the screenshot below.

[Screenshot: Rate limit subscription id vs Request count subscription id]

Kindly help us to understand this behavior and pattern observed from rate limit and request count metrics.