webdevops / azure-metrics-exporter

Azure Monitor metrics exporter for Prometheus with dimension support, template engine and ServiceDiscovery
MIT License

/query endpoint is giving zero metrics but with status 200 #98

Open nohupped opened 3 weeks ago

nohupped commented 3 weeks ago

Hi, I am trying to get azure-metrics-exporter running in our Kubernetes cluster with workload identity. I created a local copy of the Helm chart using

helm pull webdevops/azure-metrics-exporter --untar

and I have created a federated identity with the following permissions.

        permissions = {
          actions = [
            "Microsoft.Insights/AlertRules/*"
          ]
          not_actions = []
          data_actions = [
            "Microsoft.Insights/Metrics/*",
          ]
          not_data_actions = []
        }

Since I was unsure about the permission scope, I also added the Monitoring Reader role from the Azure portal just to be sure.

The pod is up and running, and these are its logs:

{"L":"\u001b[34mINFO\u001b[0m","C":"azure-metrics-exporter/main.go:57","M":"starting azure-metrics-exporter v24.2.0 (8ba3def; go1.22.0; by webdevops.io)"}
{"L":"\u001b[34mINFO\u001b[0m","C":"azure-metrics-exporter/main.go:58","M":"{\"Logger\":{\"Debug\":true,\"Development\":true,\"Json\":true},\"Azure\":{\"Environment\":\"AZUREPUBLICCLOUD\",\"AdResourceUrl\":null,\"ServiceDiscovery\":{\"CacheDuration\":0},\"ResourceTags\":[\"owner\"]},\"Metrics\":{\"Template\":\"{name}\",\"Help\":\"Azure monitor insight metric\"},\"Prober\":{\"ConcurrencySubscription\":5,\"ConcurrencySubscriptionResource\":10,\"Cache\":false},\"Server\":{\"Bind\":\":8080\",\"ReadTimeout\":5000000000,\"WriteTimeout\":10000000000}}"}
{"L":"\u001b[34mINFO\u001b[0m","C":"azure-metrics-exporter/main.go:62","M":"init Azure connection"}
{"L":"\u001b[34mINFO\u001b[0m","C":"armclient/client.go:137","M":"connecting to Azure Environment \"AzurePublicCloud\" (AzureAD:https://login.microsoftonline.com/ ResourceManager:https://management.azure.com)"}
{"L":"\u001b[34mINFO\u001b[0m","C":"armclient/client.go:152","M":"using Azure client: appid=d3f72ce8-1d49-4f62-9156-3caf9c0399e9, oid=413a0d31-5f0b-4799-9449-8d45cffc4d88","client":{"appid":"d3f72ce8-1d49-4f62-9156-3caf9c0399e9","aud":"https://management.azure.com","oid":"413a0d31-5f0b-4799-9449-8d45cffc4d88","tid":"3d7995a7-a475-466f-ab08-ed4bcfdca0f8"}}
{"L":"\u001b[34mINFO\u001b[0m","C":"armclient/client.go:162","M":"found 1 Azure Subscriptions"}
{"L":"\u001b[35mDEBUG\u001b[0m","C":"armclient/client.go:164","M":"found Azure Subscription \"<redacted>\" (redacted)"}
{"L":"\u001b[34mINFO\u001b[0m","C":"azure-metrics-exporter/main.go:66","M":"starting http server on :8080"}

The logs look fine, and I can see the exporter's own metrics at the :8080/metrics endpoint.

When I use the /query endpoint, I get status 200 but an empty result for every metric I try to discover.

For example, I am trying to get a single metric, connections_succeeded, from Microsoft.DBforPostgreSQL/flexibleServers using my subscription ID. Both /probe/metrics and /probe/metrics/list return zero metrics with status 200. However, when I check the monitor on the Azure PostgreSQL database itself, connections_succeeded is not empty and has values. I didn't see any errors in the pod logs either. I am attaching a screenshot of what I see at the /query endpoint.

Screenshot 2024-08-24 at 2 55 56 PM

Kindly advise.

Thanks.

nohupped commented 2 weeks ago

Managed to figure this out. It was a scope issue with the Azure role assignment for the managed identity. I understand that the frontend is supposed to return status 200 if the cache is empty, but is the exporter backend expected not to log any errors when it lacks the right scope/role for querying metrics?

antoniovalenzuela commented 2 weeks ago

Hi,

azure-metrics-exporter v24.2.0, Debian 12 x64, Go v1.23.0

Please help. I need to get metrics from Virtual Machine Insights and KV, but I get zero results.

{"Logger":{"Debug":true,"Development":false,"Json":false},"Azure":{"Environment":"AZUREPUBLICCLOUD","AdResourceUrl":null,"ServiceDiscovery":{"CacheDuration":1800000000000},"ResourceTags":["owner"]},"Metrics":{"Template":"{name}","Help":"Azure monitor insight metric"},"Prober":{"ConcurrencySubscription":5,"ConcurrencySubscriptionResource":10,"Cache":false},"Server":{"Bind":":8080","ReadTimeout":5000000000,"WriteTimeout":10000000000}}

found 1 Azure Subscriptions

RG tested permissions:

Is there something missing? I don't see any errors in the debug output.

image

image

image

nohupped commented 2 weeks ago

@antoniovalenzuela I had the same issue: zero metrics and no error. To make it work, I had to assign the Monitoring Reader role to the Azure managed identity at both the resource group and the subscription scope. In the UI it looks something like this.

Screenshot 2024-08-29 at 5 56 55 PM

For Terraform, I used

data "azurerm_resource_group" "this" {
  name = var.resource_group_name
}

data "azurerm_subscription" "this" {
}

to look up both resources so that the role can be assigned on each of them.
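A minimal sketch of what the two role assignments could look like against the data sources above. This is an illustration, not the exact code from this setup: the variable `identity_principal_id` and the resource names are hypothetical, and the principal ID is assumed to be the object ID of the managed identity the exporter runs as.

```hcl
# Hypothetical input: object (principal) ID of the managed identity
# used by azure-metrics-exporter.
variable "identity_principal_id" {
  type = string
}

# Monitoring Reader at resource group scope.
resource "azurerm_role_assignment" "monitoring_reader_rg" {
  scope                = data.azurerm_resource_group.this.id
  role_definition_name = "Monitoring Reader"
  principal_id         = var.identity_principal_id
}

# Monitoring Reader at subscription scope.
resource "azurerm_role_assignment" "monitoring_reader_subscription" {
  scope                = data.azurerm_subscription.this.id
  role_definition_name = "Monitoring Reader"
  principal_id         = var.identity_principal_id
}
```

Role assignments can take a few minutes to propagate, so an empty result immediately after `terraform apply` does not necessarily mean the assignment failed.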

antoniovalenzuela commented 2 weeks ago

Thanks for the information.

In the end I tested with the Prometheus Windows Exporter. With it I can replace Insights with Grafana for both metrics and alerts.

It also means I could migrate the VMs to another cloud without being tied to the Azure architecture.