justwatchcom / sql_exporter

Flexible SQL Exporter for Prometheus.
MIT License
402 stars 109 forks source link

Duplicate Metrics Collection for MSSQL with sql_exporter #135

Closed YuriiBudnyi closed 2 months ago

YuriiBudnyi commented 3 months ago

Issue: Duplicate Metrics Collection for MSSQL with sql_exporter

Hi Team,

Thank you for your continuous efforts on improving sql_exporter.

I have encountered an issue when using sql_exporter with an MSSQL server. The application returns the following error related to duplicate collection of metrics:

An error has occurred while serving metrics:

8 error(s) occurred:
* collected metric "sql_mssql_connections" { label:{name:"test_test"  value:""}  label:{name:"col"  value:"gauge"}  label:{name:"database"  value:""}  label:{name:"driver"  value:"sqlserver"}  label:{name:"host"  value:"mssql:1433"}  label:{name:"sql_job"  value:"mssql"}  label:{name:"user"  value:"some_user"}  gauge:{value:0}} was collected before with the same name and label values
* collected metric "sql_mssql_log_growths" { label:{name:"test_test"  value:""}  label:{name:"col"  value:"cntr_value"}  label:{name:"database"  value:""}  label:{name:"driver"  value:"sqlserver"}  label:{name:"host"  value:"mssql:1433"}  label:{name:"sql_job"  value:"mssql"}  label:{name:"user"  value:"some_user"}  gauge:{value:0}} was collected before with the same name and label values
...

Logs from sql_exporter:

{"caller":"query.go:134","column":"gauge","job":"mssql","level":"warn","msg":"Column not found in query result","query":"mssql_connections","resultColumns":{"count":35,"db":"master"},"ts":"2024-08-01T10:19:33.780645111Z"}
{"caller":"job.go:350","job":"mssql","level":"debug","msg":"Query finished","query":"mssql_connections","ts":"2024-08-01T10:19:33.780755752Z"}
{"caller":"job.go:344","job":"mssql","level":"debug","msg":"Running Query","query":"mssql_deadlocks","ts":"2024-08-01T10:19:33.780846301Z"}
{"caller":"job.go:350","job":"mssql","level":"debug","msg":"Query finished","query":"mssql_deadlocks","ts":"2024-08-01T10:19:33.782088725Z"}
{"caller":"job.go:344","job":"mssql","level":"debug","msg":"Running Query","query":"mssql_user_errors","ts":"2024-08-01T10:19:33.782110619Z"}
{"caller":"job.go:350","job":"mssql","level":"debug","msg":"Query finished","query":"mssql_user_errors","ts":"2024-08-01T10:19:33.7833227Z"}
{"caller":"job.go:344","job":"mssql","level":"debug","msg":"Running Query","query":"mssql_kill_connection_errors","ts":"2024-08-01T10:19:33.78334444Z"}
{"caller":"job.go:350","job":"mssql","level":"debug","msg":"Query finished","query":"mssql_kill_connection_errors","ts":"2024-08-01T10:19:33.784647723Z"}
{"caller":"job.go:344","job":"mssql","level":"debug","msg":"Running Query","query":"mssql_page_life_expectancy_seconds","ts":"2024-08-01T10:19:33.785328451Z"}
{"caller":"job.go:350","job":"mssql","level":"debug","msg":"Query finished","query":"mssql_page_life_expectancy_seconds","ts":"2024-08-01T10:19:33.786681559Z"}
{"caller":"job.go:344","job":"mssql","level":"debug","msg":"Running Query","query":"mssql_batch_requests","ts":"2024-08-01T10:19:33.786703039Z"}
{"caller":"job.go:350","job":"mssql","level":"debug","msg":"Query finished","query":"mssql_batch_requests","ts":"2024-08-01T10:19:33.787993815Z"}
{"caller":"job.go:344","job":"mssql","level":"debug","msg":"Running Query","query":"mssql_log_growths","ts":"2024-08-01T10:19:33.788017095Z"}
{"caller":"job.go:350","job":"mssql","level":"debug","msg":"Query finished","query":"mssql_log_growths","ts":"2024-08-01T10:19:33.789353335Z"}
{"caller":"job.go:344","job":"mssql","level":"debug","msg":"Running Query","query":"mssql_buffer_cache_hit_ratio","ts":"2024-08-01T10:19:33.789378578Z"}
{"caller":"job.go:350","job":"mssql","level":"debug","msg":"Query finished","query":"mssql_buffer_cache_hit_ratio","ts":"2024-08-01T10:19:33.790670914Z"}
{"caller":"job.go:344","job":"mssql","level":"debug","msg":"Running Query","query":"mssql_checkpoint_pages_sec","ts":"2024-08-01T10:19:33.790691015Z"}
{"caller":"job.go:350","job":"mssql","level":"debug","msg":"Query finished","query":"mssql_checkpoint_pages_sec","ts":"2024-08-01T10:19:33.791974813Z"}
{"caller":"job.go:317","job":"mssql","level":"debug","msg":"Sleeping until next run","sleep":"5m0s","ts":"2024-08-01T10:19:33.794482352Z"}

Configuration

Here is my config.yml:

jobs:
  - connections:
    - sqlserver://{{SQL_USER}}:{{SQL_PASS}}@{{SQL_HOST}}:1433?database={{SQL_DB}}
    interval: 5m
    name: mssql
    queries:
      - help: Number of active connections.
        labels:
          - test_test
        name: mssql_connections
        query: |
          SELECT DB_NAME(sp.dbid) AS db, COUNT(sp.spid) AS count
          FROM sys.sysprocesses sp
          GROUP BY DB_NAME(sp.dbid)
        values:
          - gauge
      - help: Number of times the transaction log has been expanded, per database.
        labels:
          - test_test
        name: mssql_log_growths
        query: |
          SELECT rtrim(instance_name) AS db, cntr_value
          FROM sys.dm_os_performance_counters WITH (NOLOCK)
          WHERE counter_name = 'Log Growths' AND instance_name <> '_Total'
        values:
          - cntr_value

Issue Description

The error suggests that the metrics "sql_mssql_connections" and "sql_mssql_log_growths" are being collected with identical name and label values more than once. This could be due to how the query results are processed and labeled.

Request for Assistance

Could you please help me resolve this issue? I would appreciate any guidance on how to modify the configuration or the queries to prevent the duplication of metrics.

Thank you in advance for your help.

Best regards, Yurii

marevers commented 3 months ago

This is most likely because your test_test label does not actually exist in the columns (result of the query) and therefore gets added with an empty value. Try removing that label from the configuration or use a column name that exists.

YuriiBudnyi commented 2 months ago

Hello @marevers,

I'm sorry for not getting back to you sooner. Yes, it helped.

Thank you)