tomkerkhove / promitor

Bringing Azure Monitor metrics where you need them.
https://promitor.io
MIT License

Timestamps in promitor-agent-resource-discovery metrics produce TargetDown alert in Prometheus #2528

Open schmidt-i opened 3 days ago

schmidt-i commented 3 days ago

Report

Some discovery jobs run on a schedule, so the metrics they produce can be up to 1h old. Every gauge produced by the Promitor resource discovery agent includes a timestamp on the metric, and this causes trouble for our Prometheus instance, which reports "too old samples" during scraping.

Expected Behavior

Timestamps on gauges are optional and can be turned off, as in promitor-agent-scraper.

Actual Behavior

Old timestamps are reported for each gauge metric. Example:

Current time is 17:49. Extract from the metrics:

# HELP promitor_azure_landscape_resource_group_info Provides information concerning the Azure resource groups in the landscape that Promitor has access to.
# TYPE promitor_azure_landscape_resource_group_info gauge
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxx",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxx-rg",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxx",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxx-rg",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxx-rg",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxxrg",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxxressources",region="westeurope",provisioning_state="Succeeded",managed_by="/subscriptions/xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx/resourcegroups/xxxxaks-rg/providers/Microsoft.ContainerService/managedClusters/xxxxaks"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxxrg",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxxrg",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxxrg",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxx",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
promitor_azure_landscape_resource_group_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",resource_group_name="xxxxarg",region="westeurope",provisioning_state="Succeeded",managed_by="n/a"} 1 1719588601162
# HELP promitor_azure_landscape_subscription_info Provides information concerning the Azure subscriptions in the landscape that Promitor has access to.
# TYPE promitor_azure_landscape_subscription_info gauge
promitor_azure_landscape_subscription_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_name="YYYYYYYY-DEV",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",state="Enabled",spending_limit="Off",quota_id="EnterpriseAgreement_2014-09-01",authorization="n/a"} 1 1719586800463
promitor_azure_landscape_subscription_info{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",subscription_name="YYYYYYYY-PROD",subscription_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",state="Enabled",spending_limit="Off",quota_id="EnterpriseAgreement_2014-09-01",authorization="n/a"} 1 1719586800463
# HELP promitor_ratelimit_resource_graph_remaining Indication how many calls are still available before Azure Resource Graph is going to throttle us.
# TYPE promitor_ratelimit_resource_graph_remaining gauge
promitor_ratelimit_resource_graph_remaining{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",cloud="Global",auth_mode="UserAssignedManagedIdentity",app_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx"} 4 1719588909324
# HELP promitor_ratelimit_resource_graph_throttled Indication concerning Azure Resource Graph are being throttled. (1 = yes, 0 = no).
# TYPE promitor_ratelimit_resource_graph_throttled gauge
promitor_ratelimit_resource_graph_throttled{tenant_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx",cloud="Global",auth_mode="UserAssignedManagedIdentity",app_id="xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxxxxx"} 0 1719588909324
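To make the staleness concrete, the age of each sample can be worked out from the trailing millisecond timestamp on the metric line. The following is a minimal sketch; the function name and regular expression are illustrative, and it assumes that the "current time 17:49" above is CEST (UTC+2), i.e. 15:49 UTC on 2024-06-28.

```python
import re
from datetime import datetime, timezone

# Matches a sample line that ends in a millisecond timestamp:
#   metric_name{labels...} value timestamp
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{.*\})?\s+(?P<value>\S+)\s+(?P<ts>-?\d+)$'
)


def sample_age_seconds(line, now_ms):
    """Return the sample's age in seconds, or None if the line has no timestamp."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # HELP/TYPE comments and blank lines carry no sample
    match = SAMPLE_RE.match(line)
    if match is None:
        return None  # sample without a trailing timestamp
    return (now_ms - int(match.group("ts"))) / 1000


# Assumption: 17:49 local time is CEST, so "now" is 15:49 UTC.
now = datetime(2024, 6, 28, 15, 49, tzinfo=timezone.utc)
now_ms = int(now.timestamp()) * 1000

# Shortened sample lines using the timestamps from the extract above.
resource_group = 'promitor_azure_landscape_resource_group_info{region="westeurope"} 1 1719588601162'
subscription = 'promitor_azure_landscape_subscription_info{state="Enabled"} 1 1719586800463'

for sample in (resource_group, subscription):
    age = sample_age_seconds(sample, now_ms)
    print(f"{sample.split('{')[0]}: {age / 60:.1f} minutes old")
```

Under that time zone assumption, the resource group samples are roughly 19 minutes old and the subscription samples roughly 49 minutes old at scrape time.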

Steps to Reproduce the Problem

  1. Start the Promitor resource discovery agent
  2. Look at the metrics endpoint: timestamps are reported on the gauges and can't be turned off

includeTimestamp is hard-coded to "true" in DiscoveryBackgroundJob.cs: https://github.com/tomkerkhove/promitor/blob/a457a98b6e2920ea2751f4d07d0d8e085946eeec/src/Promitor.Agents.ResourceDiscovery/Scheduling/DiscoveryBackgroundJob.cs#L29
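The requested behavior could look roughly like the sketch below. It is an illustration only and not Promitor's code (which is C#): the function `render_gauge_sample` and its parameters are hypothetical, and it simply shows a sample line being written with or without the trailing timestamp depending on a flag.

```python
def render_gauge_sample(name, labels, value, timestamp_ms=None, include_timestamp=True):
    """Render one sample line, appending the timestamp only when requested.

    Hypothetical helper for illustration; label values are not escaped here.
    """
    label_text = ",".join(f'{key}="{val}"' for key, val in labels.items())
    line = f"{name}{{{label_text}}} {value}" if labels else f"{name} {value}"
    if include_timestamp and timestamp_ms is not None:
        line += f" {timestamp_ms}"
    return line


labels = {"region": "westeurope", "provisioning_state": "Succeeded"}

# Current behavior: the timestamp of the scheduled discovery run is always emitted.
with_ts = render_gauge_sample(
    "promitor_azure_landscape_resource_group_info", labels, 1, 1719588601162
)

# Requested behavior: the timestamp can be switched off, so Prometheus
# assigns the scrape time to the sample instead.
without_ts = render_gauge_sample(
    "promitor_azure_landscape_resource_group_info", labels, 1, 1719588601162,
    include_timestamp=False,
)

print(with_ts)
print(without_ts)
```

With the flag off, the sample no longer carries the time of the last discovery run, so it cannot be rejected for being too old.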

Component

Resource Discovery

Version

0.8.0

Configuration

Configuration:

# Add your scraping configuration here

Logs

example

Platform

Microsoft Azure

Contact Details

No response

github-actions[bot] commented 3 days ago

Thank you for opening an issue! We rely on the community to maintain Promitor. (Learn more)

Is this something you want to contribute?