centreon / centreon-plugins

Collection of standard plugins to discover and gather cloud-to-edge metrics and status across your whole IT infrastructure.
https://www.centreon.com
Apache License 2.0

Random "Use of uninitialized value" with database::influxdb::plugin with multiple queries #4572

Closed (wp-perc closed this issue 1 year ago)

wp-perc commented 1 year ago

Hi all,

I am facing a really strange behavior. I need to monitor systems based on Telegraf metrics. These metrics are stored in InfluxDB, making database::influxdb::plugin the obvious choice.

If I use the query mode with a single query, everything works. When I add two or more queries to the same invocation, things go south. As you can see in the next screenshot, I embedded the plugin in a simple script and launched it several times in a row: sometimes it works, other times I get several errors, and I cannot understand why.

image

Telegraf delivers data every 30 seconds without issue, and InfluxDB is always up and running with no errors, so I can rule out anything related to the data or the data flow.

Can you help me?

To provide more data, I tried to run the plugin using --debug, but I don't get any useful information...

[root@neteye4testn1 ~]# bash test_influxdb_query.sh
Use of uninitialized value $options{"column"} in array element at /usr/lib/centreon/plugins/centreon_influxdb.pl line 8722.
Use of uninitialized value $column_index in array element at /usr/lib/centreon/plugins/centreon_influxdb.pl line 9475.
Use of uninitialized value $options{"column"} in array element at /usr/lib/centreon/plugins/centreon_influxdb.pl line 8722.
Use of uninitialized value $column_index in array element at /usr/lib/centreon/plugins/centreon_influxdb.pl line 9475.
Use of uninitialized value $options{"column"} in array element at /usr/lib/centreon/plugins/centreon_influxdb.pl line 8722.
Use of uninitialized value $column_index in array element at /usr/lib/centreon/plugins/centreon_influxdb.pl line 9475.
OK: status : skipped (no value(s))
URL: 'https://influxdb.neteyelocal:8086/query?epoch=s'
Parameters: 'q=SELECT max(load5) AS load5 FROM "telegraf_master"."autogen"."system" WHERE host=~/neteye4n1.wp.lan/ AND time > (now() - 10m) GROUP BY host;'
======> request send
POST https://influxdb.neteyelocal:8086/query?epoch=s
Authorization: Basic dGVsZWdyYWZfbWFzdGVyX3JlYWRlcjo0bmY3MDkzbm1mODczNGpubWc1
User-Agent: centreon::plugins::backend::http::useragent
Content-Type: application/x-www-form-urlencoded

q=SELECT+max(load5)+AS+load5+FROM+%22telegraf_master%22.%22autogen%22.%22system%22+WHERE+host%3D~%2Fneteye4n1.wp.lan%2F+AND+time+%3E+(now()+-+10m)+GROUP+BY+host%3B
======> response done
HTTP/1.1 200 OK
Date: Mon, 24 Jul 2023 07:54:59 GMT
Content-Type: application/json
Client-Date: Mon, 24 Jul 2023 07:54:59 GMT
Client-Peer: 192.168.232.34:8086
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /DC=neteyelocal/DC=*/O=NetEye/OU=NetEye Root CA/CN=*.neteyelocal
Client-SSL-Cert-Subject: /C=IT/ST=Bolzano/L=Bolzano/O=GlobalSecurity/OU=NeteyeInfluxDB/CN=influxdb.neteyelocal
Client-SSL-Cipher: TLS_AES_256_GCM_SHA384
Client-SSL-Socket-Class: IO::Socket::SSL
Client-Transfer-Encoding: chunked
Request-Id: 633b84b6-29f7-11ee-928d-0050569e3eff
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.10
X-Request-Id: 633b84b6-29f7-11ee-928d-0050569e3eff

{"results":[{"statement_id":0,"series":[{"name":"system","tags":{"host":"neteye4n1.wp.lan"},"columns":["time","load5"],"values":[[1690185290,22.97]]}]}]}
URL: 'https://influxdb.neteyelocal:8086/query?epoch=s'
Parameters: 'q=SELECT max(load15) AS load15 FROM "telegraf_master"."autogen"."system" WHERE host=~/neteye4n1.wp.lan/ AND time > (now() - 10m) GROUP BY host;'
======> request send
POST https://influxdb.neteyelocal:8086/query?epoch=s
Authorization: Basic dGVsZWdyYWZfbWFzdGVyX3JlYWRlcjo0bmY3MDkzbm1mODczNGpubWc1
User-Agent: centreon::plugins::backend::http::useragent
Content-Type: application/x-www-form-urlencoded

q=SELECT+max(load15)+AS+load15+FROM+%22telegraf_master%22.%22autogen%22.%22system%22+WHERE+host%3D~%2Fneteye4n1.wp.lan%2F+AND+time+%3E+(now()+-+10m)+GROUP+BY+host%3B
======> response done
HTTP/1.1 200 OK
Date: Mon, 24 Jul 2023 07:54:59 GMT
Content-Type: application/json
Client-Date: Mon, 24 Jul 2023 07:54:59 GMT
Client-Peer: 192.168.232.34:8086
Client-Response-Num: 2
Client-SSL-Cert-Issuer: /DC=neteyelocal/DC=*/O=NetEye/OU=NetEye Root CA/CN=*.neteyelocal
Client-SSL-Cert-Subject: /C=IT/ST=Bolzano/L=Bolzano/O=GlobalSecurity/OU=NeteyeInfluxDB/CN=influxdb.neteyelocal
Client-SSL-Cipher: TLS_AES_256_GCM_SHA384
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
Client-Transfer-Encoding: chunked
Request-Id: 633e8a16-29f7-11ee-928e-0050569e3eff
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.10
X-Request-Id: 633e8a16-29f7-11ee-928e-0050569e3eff

{"results":[{"statement_id":0,"series":[{"name":"system","tags":{"host":"neteye4n1.wp.lan"},"columns":["time","load15"],"values":[[1690185290,21.24]]}]}]}
URL: 'https://influxdb.neteyelocal:8086/query?epoch=s'
Parameters: 'q=SELECT max(load1) AS load1 FROM "telegraf_master"."autogen"."system" WHERE host=~/neteye4n1.wp.lan/ AND time > (now() - 10m) GROUP BY host;'
======> request send
POST https://influxdb.neteyelocal:8086/query?epoch=s
Authorization: Basic dGVsZWdyYWZfbWFzdGVyX3JlYWRlcjo0bmY3MDkzbm1mODczNGpubWc1
User-Agent: centreon::plugins::backend::http::useragent
Content-Type: application/x-www-form-urlencoded

q=SELECT+max(load1)+AS+load1+FROM+%22telegraf_master%22.%22autogen%22.%22system%22+WHERE+host%3D~%2Fneteye4n1.wp.lan%2F+AND+time+%3E+(now()+-+10m)+GROUP+BY+host%3B
======> response done
HTTP/1.1 200 OK
Date: Mon, 24 Jul 2023 07:54:59 GMT
Content-Type: application/json
Client-Date: Mon, 24 Jul 2023 07:54:59 GMT
Client-Peer: 192.168.232.34:8086
Client-Response-Num: 3
Client-SSL-Cert-Issuer: /DC=neteyelocal/DC=*/O=NetEye/OU=NetEye Root CA/CN=*.neteyelocal
Client-SSL-Cert-Subject: /C=IT/ST=Bolzano/L=Bolzano/O=GlobalSecurity/OU=NeteyeInfluxDB/CN=influxdb.neteyelocal
Client-SSL-Cipher: TLS_AES_256_GCM_SHA384
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
Client-Transfer-Encoding: chunked
Request-Id: 633eeb17-29f7-11ee-928f-0050569e3eff
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.10
X-Request-Id: 633eeb17-29f7-11ee-928f-0050569e3eff

{"results":[{"statement_id":0,"series":[{"name":"system","tags":{"host":"neteye4n1.wp.lan"},"columns":["time","load1"],"values":[[1690185270,26.63]]}]}]}
status : skipped (no value(s))
[root@neteye4testn1 ~]#
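
For reference, each response in the debug output above contains the column named in its query's AS clause, while the warnings point at a column index that was undefined at lookup time. The following is a minimal Python sketch of a by-name column lookup on one of the logged responses. It is only an illustration: `extract_value` is a hypothetical helper, not the plugin's code (the plugin is written in Perl), and it returns None instead of indexing with an undefined position when the column is absent.

```python
import json

# Response body copied from the debug output above (load5 query).
RESPONSE = (
    '{"results":[{"statement_id":0,"series":[{"name":"system",'
    '"tags":{"host":"neteye4n1.wp.lan"},"columns":["time","load5"],'
    '"values":[[1690185290,22.97]]}]}]}'
)


def extract_value(body, column):
    """Return the last value of `column` from the first matching series, or None."""
    data = json.loads(body)
    for result in data.get("results", []):
        for series in result.get("series", []):
            columns = series.get("columns", [])
            if column not in columns:
                continue  # no such column: skip rather than use an undefined index
            index = columns.index(column)
            values = series.get("values", [])
            if values:
                return values[-1][index]
    return None


print(extract_value(RESPONSE, "load5"))   # column present in this response
print(extract_value(RESPONSE, "load15"))  # column absent in this response
```

Looking the column up by name in each response, rather than relying on a position computed elsewhere, keeps one query's result from depending on state left over from another query.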

garnier-quentin commented 1 year ago

Could you provide the output (as text) with the --debug option, please?

wp-perc commented 1 year ago

Yes, I updated the issue with the --debug output. Thanks.

garnier-quentin commented 1 year ago

Thanks, this should fix it: https://github.com/centreon/centreon-plugins/pull/4581

wp-perc commented 1 year ago

Uhm... I think there is still something wrong. I checked out branch fix-influxdb-query to test.

[root@neteye4testn1 ~]# bash test_influxdb_query.sh
UNKNOWN: 400 Bad Request
URL: 'https://influxdb.neteyelocal:8086/query?epoch=s'
Parameters: 'q=load1,SELECT max(load1) AS load1 FROM "telegraf_master"."autogen"."system" WHERE host=~/neteye4n1.wp.lan/ AND time > (now() - 10m) GROUP BY host;'
======> request send
POST https://influxdb.neteyelocal:8086/query?epoch=s
Authorization: Basic dGVsZWdyYWZfbWFzdGVyX3JlYWRlcjo0bmY3MDkzbm1mODczNGpubWc1
User-Agent: centreon::plugins::backend::http::useragent
Content-Type: application/x-www-form-urlencoded

q=load1%2CSELECT+max(load1)+AS+load1+FROM+%22telegraf_master%22.%22autogen%22.%22system%22+WHERE+host%3D~%2Fneteye4n1.wp.lan%2F+AND+time+%3E+(now()+-+10m)+GROUP+BY+host%3B
======> response done
HTTP/1.1 400 Bad Request
Date: Fri, 28 Jul 2023 15:30:28 GMT
Content-Length: 150
Content-Type: application/json
Client-Date: Fri, 28 Jul 2023 15:30:28 GMT
Client-Peer: 192.168.232.34:8086
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /DC=neteyelocal/DC=*/O=NetEye/OU=NetEye Root CA/CN=*.neteyelocal
Client-SSL-Cert-Subject: /C=IT/ST=Bolzano/L=Bolzano/O=GlobalSecurity/OU=NeteyeInfluxDB/CN=influxdb.neteyelocal
Client-SSL-Cipher: TLS_AES_256_GCM_SHA384
Client-SSL-Socket-Class: IO::Socket::SSL
Request-Id: ae4a69c2-2d5b-11ee-95d0-0050569e3eff
X-Influxdb-Build: OSS
X-Influxdb-Error: error parsing query: found load1, expected SELECT, DELETE, SHOW, CREATE, DROP, EXPLAIN, GRANT, REVOKE, ALTER, SET, KILL at line 1, char 1
X-Influxdb-Version: 1.8.10
X-Request-Id: ae4a69c2-2d5b-11ee-95d0-0050569e3eff

{"error":"error parsing query: found load1, expected SELECT, DELETE, SHOW, CREATE, DROP, EXPLAIN, GRANT, REVOKE, ALTER, SET, KILL at line 1, char 1"}
[root@neteye4testn1 ~]#

I think the error is visible in the HTTP request: 'q=load1,SELECT max(load1) AS load1 FROM "telegraf_master"."autogen"."system" WHERE host=~/neteye4n1.wp.lan/ AND time > (now() - 10m) GROUP BY host;' The label used by the Centreon plugin seems to end up in the payload sent to InfluxDB, which then fails to parse the query because it starts with load1 instead of SELECT.
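
To illustrate the point, here is a minimal Python sketch of separating the label from the InfluxQL statement before building the request body. It assumes the option value has the form label,query, which is what the logged payload suggests. `split_labeled_query` is a hypothetical name, and this is not the plugin's actual Perl code.

```python
from urllib.parse import urlencode


def split_labeled_query(raw):
    """Split 'label,query' at the first comma only; the query may contain commas."""
    label, sep, query = raw.partition(",")
    label, query = label.strip(), query.strip()
    if not sep or not label or not query:
        raise ValueError("expected 'label,query'")
    return label, query


# Option value reconstructed from the logged payload above.
raw = (
    'load1,SELECT max(load1) AS load1 FROM "telegraf_master"."autogen"."system" '
    "WHERE host=~/neteye4n1.wp.lan/ AND time > (now() - 10m) GROUP BY host;"
)
label, query = split_labeled_query(raw)

# Only the statement goes into the form body; the label stays on the client side.
body = urlencode({"q": query})
print(label)
print(body)
```

Splitting on the first comma only matters because an InfluxQL statement can itself contain commas, for example when selecting several fields.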

garnier-quentin commented 1 year ago

My bad. I have fixed the PR. You can test it again (it should be ok). Thanks for the feedback!

wp-perc commented 1 year ago

OK, now it works fine (at least, my tests completed successfully). Can I ask when this will be released through the RPM channel?

omercier commented 1 year ago

Hi @wp-perc, it should be released on Thursday if everything is OK.