After running for weeks, metric collection quite often stops with the error below. Restarting the scraper helps. The resource discovery agent seems to be fine and logs no errors.
[10:54:01 ERR] Failed to scrape metric azure_postgres_memory_percent for resource files-Flexible.
System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at Promitor.Agents.Scraper.Scheduling.ResourcesScrapingJob.ScrapeMetric(ScrapeDefinition`1 scrapeDefinition) in /src/Promitor.Agents.Scraper/ResourcesScrapingJob.cs:line 294
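To check whether the failure hits every metric or only some of them once it starts, the scraper log can be summarized per metric and resource. The snippet below is only a rough sketch: it assumes every failure line has the same shape as the one quoted above, and the function name `summarize_scrape_failures` is made up for illustration (it is not part of Promitor).

```python
import re
from collections import Counter

# Matches lines shaped like the error quoted above (assumed format), e.g.
# "[10:54:01 ERR] Failed to scrape metric <metric> for resource <resource>."
FAILURE_PATTERN = re.compile(
    r"^\[(?P<time>\d{2}:\d{2}:\d{2}) ERR\] "
    r"Failed to scrape metric (?P<metric>\S+) for resource (?P<resource>.+?)\.?$"
)


def summarize_scrape_failures(log_lines):
    """Count failures per (metric, resource) and record the first timestamp seen."""
    counts = Counter()
    first_seen = {}
    for line in log_lines:
        match = FAILURE_PATTERN.match(line.strip())
        if match is None:
            continue  # stack trace lines and unrelated log entries are skipped
        key = (match["metric"], match["resource"])
        counts[key] += 1
        first_seen.setdefault(key, match["time"])
    return counts, first_seen


if __name__ == "__main__":
    sample = [
        "[10:54:01 ERR] Failed to scrape metric azure_postgres_memory_percent for resource files-Flexible.",
        "System.IndexOutOfRangeException: Index was outside the bounds of the array.",
    ]
    counts, first_seen = summarize_scrape_failures(sample)
    for (metric, resource), count in counts.most_common():
        print(f"{metric} on {resource}: {count} failure(s), first at {first_seen[(metric, resource)]}")
```

Feeding it the full scraper log (for example, the lines of a saved log file) would show when the first failure occurred and whether all configured metrics fail from that point on.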
azureMetadata:
  cloud: Global
  resourceGroupName: promitor
  subscriptionId: xyz
  tenantId: xyz
metricDefaults:
  aggregation:
    interval: "00:05:00"
  scraping:
    schedule: 0 * * ? * *
metrics:
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: UsedCapacity
  description: The average capacity in bytes used in the storage account
  name: azure_storage_account_used_capacity_bytes
  resourceDiscoveryGroups:
  - name: storage-accounts
  resourceType: StorageAccount
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: active_connections
  description: Average active connection used by an Azure Postgre instance
  name: azure_postgres_active_connections
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: backup_storage_used
  description: Average backup storage used in bytes used by an Azure Postgre instance
  name: azure_postgres_backup_storage_used
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Count
    metricName: connections_failed
  description: Average failed active connection used by an Azure Postgre instance
  name: azure_postgres_connections_failed
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: cpu_percent
  description: Average CPU used by an Azure Postgre instance
  name: azure_postgres_cpu_percent
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: memory_percent
  description: Average memory used by an Azure Postgre instance
  name: azure_postgres_memory_percent
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: network_bytes_egress
  description: Average outgoing trafic in bytes used by an Azure Postgre instance
  name: azure_postgres_network_bytes_egress
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: network_bytes_ingress
  description: Average incoming trafic in bytes used by an Azure Postgre instance
  name: azure_postgres_network_bytes_ingress
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Maximum
    metricName: storage_percent
  description: Average storage percent used by an Azure Postgre instance
  name: azure_postgres_storage_percent
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Maximum
    metricName: storage_used
  description: Average storage used in bytes used by an Azure Postgre instance
  name: azure_postgres_storage_used
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: iops
  description: Average IOPS used by an Azure Postgre instance
  name: azure_postgres_iops
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: disk_iops_consumed_percentage
  description: Average IOPS used percentage
  name: azure_postgres_iops_consumed_percentage
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: read_iops
  description: Average Read IOPS used by an Azure Postgre instance
  name: azure_postgres_read_iops
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: read_throughput
  description: Average bytes read per second from disk.
  name: azure_postgres_read_throughput
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Minimum
    metricName: storage_free
  description: Minimum amount of storage space that's available.
  name: azure_postgres_storage_free
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: write_iops
  description: Average Write IOPS used by an Azure Postgre instance
  name: azure_postgres_write_iops
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: txlogs_storage_used
  description: Average WAL files used by an Azure Postgre instance
  name: azure_postgres_txlogs_storage_used
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: write_throughput
  description: Average bytes written to disk per second.
  name: azure_postgres_write_throughput
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: is_db_alive
  description: Status database alive Azure Postgre instance
  name: azure_postgres_is_db_alive
  resourceDiscoveryGroups:
  - name: postgres-databases
  resourceType: PostgreSql
version: v1
Report
After running for weeks, metric collection quite often stops with the error below. Restarting the scraper helps. The resource discovery agent seems to be fine and logs no errors.
Expected Behavior
No errors; metrics keep being collected.
Actual Behavior
Metric collection stops and errors are logged.
Steps to Reproduce the Problem
...
Component
Scraper
Version
v2.11.0
Configuration
We're fetching almost all of the DB metrics listed at https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-monitoring
metrics-declaration.yaml
runtime.yaml
Logs
Platform
Microsoft Azure
Contact Details
No response