SUSE / hanadb_exporter

Prometheus exporter for SAP HANA databases
Apache License 2.0

custom HANA query does not report the expected result #102

Open pirat013 opened 2 years ago

pirat013 commented 2 years ago

Running a specific custom HANA query and transforming its result into Prometheus metric format with a dedicated hanadb_exporter instance reports an error.

The query which was used is:

"Select host, CPU, data_read_time, data_read_size, data_write_time, data_write_size, log_write_time, log_write_size from M_LOAD_HISTORY_SERVICE;"

The exporter was started manually with the new profile and the custom metrics file. After querying the exporter via curl from an external host, the following message was shown:

/usr/etc/hanadb_exporter # ----------------------------------------
Exception happened during processing of request from ('192.168.144.1', 49712)
Traceback (most recent call last):
  File "/usr/lib64/python3.6/socketserver.py", line 654, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib64/python3.6/socketserver.py", line 364, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib64/python3.6/socketserver.py", line 724, in __init__
    self.handle()
  File "/usr/lib64/python3.6/http/server.py", line 418, in handle
    self.handle_one_request()
  File "/usr/lib64/python3.6/http/server.py", line 406, in handle_one_request
    method()
  File "/usr/lib/python3.6/site-packages/prometheus_client/exposition.py", line 152, in do_GET
    output = encoder(registry)
  File "/usr/lib/python3.6/site-packages/prometheus_client/exposition.py", line 121, in generate_latest
    output.append(sample_line(s))
  File "/usr/lib/python3.6/site-packages/prometheus_client/exposition.py", line 79, in sample_line
    for k, v in sorted(line.labels.items())]))
  File "/usr/lib/python3.6/site-packages/prometheus_client/exposition.py", line 79, in <listcomp>
    for k, v in sorted(line.labels.items())]))
AttributeError: ("'int' object has no attribute 'replace'", Metric(test_data_read_time_ms, Hana Data Read time, gauge, ms, [Sample(name='test_data_read_time_ms', labels={'sid': 'ETU', 'insnr': '00', 'database_name': 'SYSTEMDB', 'host': 'hana02', 'cpu': 0}, value=0, timestamp=None, exemplar=None), Sample(name='test_data_read_time_ms', labels={'sid': 'ETU'
....

The metric file looks like this:

# cat newmetric.json
{
 "Select host, CPU, data_read_time, data_read_size, data_write_time, data_write_size, log_write_time, log_write_size from M_LOAD_HISTORY_SERVICE;":
  {
    "enabled": true,
    "hana_version_range": ["1.0.0", "3.0.0"],
    "metrics": [
      {
        "name": "test_data_read_time",
        "description": "Hana Data Read time",
        "labels": ["HOST", "CPU"],
        "value": "DATA_READ_TIME",
        "unit": "ms",
        "type": "gauge"
      }
    ]
  }
}

The host that triggered the request receives this response:

# curl 192.168.144.11:9667/metrics
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
        <title>Error response</title>
    </head>
    <body>
        <h1>Error response</h1>
        <p>Error code: 500</p>
        <p>Message: error generating metric output.</p>
        <p>Error code explanation: 500 - Server got itself in trouble.</p>
    </body>
</html>
diegoakechi commented 2 years ago

The problem occurs because the CPU column in the database returns an integer value and this column is used as a label for the metric. The code expects strings and does not check the result set value type, which raises the exception shown above. The code could be adjusted to be more resilient in this scenario (for example, by converting the result set field to a string before manipulating it) and to provide a better error message.
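
As a rough illustration only (not the exporter's actual code), a minimal Python sketch of that idea with prometheus_client could cast every label value coming from the result set to a string before building the metric. The function and variable names here are made up for the example:

    # Sketch of the suggested fix: stringify label values from the HANA
    # result set so integer columns such as CPU no longer break label
    # serialization in prometheus_client.
    from prometheus_client.core import GaugeMetricFamily

    def build_gauge(name, description, label_names, rows):
        """rows: iterable of (label_values, value) tuples from the result set."""
        gauge = GaugeMetricFamily(name, description, labels=label_names)
        for label_values, value in rows:
            # str() makes int/decimal labels safe; None becomes an empty label.
            safe_labels = ["" if v is None else str(v) for v in label_values]
            gauge.add_metric(safe_labels, float(value))
        return gauge

    # Hypothetical usage with the row that triggered the exception above:
    metric = build_gauge(
        "test_data_read_time_ms",
        "Hana Data Read time",
        ["sid", "insnr", "database_name", "host", "cpu"],
        [(("ETU", "00", "SYSTEMDB", "hana02", 0), 0)],
    )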

For reference, converting the CPU field explicitly to a string (varchar) in the SQL query avoids the problem:

Select host, to_varchar(CPU) CPU, data_read_time, data_read_size, data_write_time, data_write_size, log_write_time, log_write_size from M_LOAD_HISTORY_SERVICE;
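
Since the metrics file uses the full SQL statement as its JSON key, applying this workaround also means updating the query in newmetric.json. Assuming the same file shown above, it would look like this:

# cat newmetric.json
{
 "Select host, to_varchar(CPU) CPU, data_read_time, data_read_size, data_write_time, data_write_size, log_write_time, log_write_size from M_LOAD_HISTORY_SERVICE;":
  {
    "enabled": true,
    "hana_version_range": ["1.0.0", "3.0.0"],
    "metrics": [
      {
        "name": "test_data_read_time",
        "description": "Hana Data Read time",
        "labels": ["HOST", "CPU"],
        "value": "DATA_READ_TIME",
        "unit": "ms",
        "type": "gauge"
      }
    ]
  }
}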