Cacti / plugin_thold

Thold Plugin for Cacti
GNU General Public License v2.0

Thresholds are not applying CDEF's correctly #361

Closed. nicolapiazzi closed this issue 4 years ago.

nicolapiazzi commented 5 years ago

Create a thold template for Interface_traffic (traffic_in or traffic_out, the result is the same).

Under Alert, set High Threshold to 10000000 (ten million) and save. The list then shows High as 10.00M. I intended this value to be in bits, as in past versions, but it is now interpreted as 10 Mbytes! If I apply this template to a line and open Thold Status, I see 80M, not 10M.
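The 80M figure is exactly eight times the stored 10,000,000, which is what a bytes-to-bits CDEF would produce if it were applied to a value already entered in bits. The thread does not confirm that this is what the plugin does; the following is only a minimal Python sketch of the arithmetic. The helpers `bytes_to_bits` and `si_format` are hypothetical names and are not part of thold:

```python
def bytes_to_bits(value):
    """Hypothetical stand-in for a 'Turn Bytes into Bits' CDEF (value * 8)."""
    return value * 8


def si_format(value):
    """Format a number with an SI suffix (illustrative, not thold's formatter)."""
    for factor, suffix in ((1e9, "G"), (1e6, "M"), (1e3, "K")):
        if abs(value) >= factor:
            return f"{value / factor:.2f}{suffix}"
    return f"{value:.2f}"


entered_high = 10_000_000  # threshold typed into the template, meant as bits

shown_in_template = si_format(entered_high)               # value as entered
shown_in_status = si_format(bytes_to_bits(entered_high))  # value scaled once more

print(shown_in_template, shown_in_status)
```

The same factor of 8 would turn a 150 M threshold into 1.2 G.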

k4y53r commented 5 years ago

Same here. I apply a CDEF to turn bytes into bits, but some tholds show values 8x too high (100 Mbps on the graph and system monitor, but 800 M on the thold data alert).

I guess it could be related to this fix:

https://github.com/Cacti/plugin_thold/issues/78

Anyway, if I create a new thold template with a 150 Mbps high alert, it shows the correct value in thold template management, as you can see here:

[screenshot: thold error mbps 00]

But the thold management page shows an incorrect high alert value, and so does the thold alert email subject:

[screenshot: thold error mbps 01]

netniV commented 5 years ago

I can confirm that this is definitely a bug introduced during an upgrade. Existing templates that are defined as having a hi/low value of 15/8 are showing as 142 despite the CDEF being divide by 10 on the template.

Looking at the thold_modify_values_by_cdef function, it appears to grab the maximum value for a CDEF, and in my case there was more than one CDEF.

An example threshold:

mysql> select id,name,local_data_id,local_graph_id,data_template_rrd_id rrd_id,data_source_name from thold_data where id = 1152;                                                                                     
+------+------+---------------+----------------+--------+------------------+
| id   | name | local_data_id | local_graph_id | rrd_id | data_source_name |
+------+------+---------------+----------------+--------+------------------+
| 1152 | Test |          9735 |           6512 |  15024 | apc_total_load   |
+------+------+---------------+----------------+--------+------------------+
1 row in set (0.00 sec)

Listing of the CDEF's used by the above threshold:

mysql> SELECT cdef_id, COUNT(cdef_id) FROM graph_templates_item AS gti INNER JOIN data_template_rrd AS dtr ON gti.task_item_id = dtr.id WHERE local_graph_id = 6512 and dtr.id = 15024 and gti.graph_type_id IN (4,5,6,7,8,20) AND dtr.data_Source_name = 'apc_total_load' GROUP BY cdef_id;
+---------+----------------+
| cdef_id | COUNT(cdef_id) |
+---------+----------------+
|      16 |              1 |
|      18 |              1 |
|      33 |              2 |
+---------+----------------+
3 rows in set (0.00 sec)

Listing of the CDEF definitions:

mysql> select * from cdef where id in (16,18,33);
+----+----------------------------------+--------+--------------+
| id | hash                             | system | name         |
+----+----------------------------------+--------+--------------+
| 16 | 1a1ef565dca00df47c2be2498d4c01e7 |      0 | Divide by 10 |
| 18 | 73af255b6b679f796be0d4d475aba67b |      0 | Divide By 10 |
| 33 | a10d100efb7c5bc2927ebe72184081a7 |      0 | kW for 120   |
+----+----------------------------------+--------+--------------+
3 rows in set (0.00 sec)

Spot the obvious issue here: the kW CDEF isn't going to work when this threshold needs the Divide by 10 one (of which there happen to be two, for some strange reason!).
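To make the problem concrete, here is a minimal Python sketch of what happens when a single CDEF is chosen from the grouped result above. The selection rule (highest usage count) is an assumption based on the description of thold_modify_values_by_cdef, and `pick_cdef_by_max_count` is a hypothetical name, not the plugin's code:

```python
# Rows from the grouped query above: (cdef_id, count)
cdef_usage = [(16, 1), (18, 1), (33, 2)]

# Names from the cdef table above
cdef_names = {16: "Divide by 10", 18: "Divide By 10", 33: "kW for 120"}


def pick_cdef_by_max_count(rows):
    """Return the cdef_id used by the most graph items (assumed selection rule)."""
    return max(rows, key=lambda row: row[1])[0]


chosen = pick_cdef_by_max_count(cdef_usage)
print(chosen, cdef_names[chosen])
```

Picking by the highest cdef_id would give the same result for these rows, so under either rule the kW CDEF is selected instead of a Divide by 10 one.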

k4y53r commented 5 years ago

I think it also happens with the CDEF "Turn bytes into bits": it is applied to graphs showing traffic in bits, but the tholds on these graphs report incorrect values.

netniV commented 5 years ago

If you get the graph ID, try running the above queries, replacing the ID values as appropriate, to see what you get. If you see the same multiple CDEFs, it would corroborate what I found.
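For anyone repeating this on their own threshold, the substitution can be sketched in Python as below. The SQL text is taken from the queries earlier in this thread; the function names `thold_lookup_query` and `cdef_usage_query` are hypothetical helpers that only build strings to paste into the mysql client:

```python
import re


def thold_lookup_query(thold_id):
    """Build the first query: look up the graph and data source for a thold."""
    return (
        "SELECT id, name, local_data_id, local_graph_id, "
        "data_template_rrd_id rrd_id, data_source_name "
        f"FROM thold_data WHERE id = {int(thold_id)};"
    )


def cdef_usage_query(local_graph_id, rrd_id, data_source_name):
    """Build the second query: list the CDEFs used for that data source."""
    # Only plain identifiers are accepted, since the name is pasted into SQL text.
    if not re.fullmatch(r"[A-Za-z0-9_]+", data_source_name):
        raise ValueError("unexpected data source name")
    return (
        "SELECT cdef_id, COUNT(cdef_id) "
        "FROM graph_templates_item AS gti "
        "INNER JOIN data_template_rrd AS dtr ON gti.task_item_id = dtr.id "
        f"WHERE local_graph_id = {int(local_graph_id)} AND dtr.id = {int(rrd_id)} "
        "AND gti.graph_type_id IN (4,5,6,7,8,20) "
        f"AND dtr.data_source_name = '{data_source_name}' "
        "GROUP BY cdef_id;"
    )


lookup_sql = thold_lookup_query(1152)
usage_sql = cdef_usage_query(6512, 15024, "apc_total_load")
print(lookup_sql)
print(usage_sql)
```

Run the first statement, read local_graph_id, rrd_id and data_source_name from its result, and pass them to the second helper.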

netniV commented 5 years ago

So, I believe the latest develop version should now correct this issue, certainly for type 1 (hi/low) thresholds. I do have a sneaky feeling that this should also be applied to the RPN type but for now I'll assume that if you are using an RPN you should be applying the CDEF too!

k4y53r commented 5 years ago

> If you get the graph ID, try running the above queries, replacing the ID values as appropriate, to see what you get. If you see the same multiple CDEFs, it would corroborate what I found.

I get only one CDEF...

mysql> select id,name,local_data_id,local_graph_id,data_template_rrd_id rrd_id,data_source_name from thold_data where id = 659;
+-----+-----------------------------+---------------+----------------+--------+------------------+
| id  | name                        | local_data_id | local_graph_id | rrd_id | data_source_name |
+-----+-----------------------------+---------------+----------------+--------+------------------+
| 659 | |graph_title| - In_150 Mbps |           995 |            904 |   2267 | traffic_in       |
+-----+-----------------------------+---------------+----------------+--------+------------------+

mysql> SELECT cdef_id, COUNT(cdef_id) FROM graph_templates_item AS gti INNER JOIN data_template_rrd AS dtr ON gti.task_item_id = dtr.id WHERE local_graph_id = 904 and dtr.id = 2267 and gti.graph_type_id IN (4,5,6,7,8,20) AND dtr.data_Source_name = 'traffic_in' GROUP BY cdef_id;
+---------+----------------+
| cdef_id | COUNT(cdef_id) |
+---------+----------------+
|       2 |              2 |
+---------+----------------+

mysql> select * from cdef where id in (2);
+----+----------------------------------+--------+----------------------+
| id | hash                             | system | name                 |
+----+----------------------------------+--------+----------------------+
|  2 | 73f95f8b77b5508157d64047342c421e |      0 | Turn Bytes into Bits |
+----+----------------------------------+--------+----------------------+

I'm not sure whether I ran the queries the way you need.

You can see that the current value differs between the bytes and bits graphs (it should satisfy bits = bytes * 8). The first tholds refer to the bits graph and the second tholds to the bytes graph for the same interface:

[screenshot: tholds_bits_error 00]

--- EDITED:

After deselecting the CDEF "Turn bytes into bits" and setting the thold to "exact value", it seems to work fine on the device thold:

[screenshot: tholds_bits_error 02]

But after modifying and propagating the templates, it shows weird values for high again:

[screenshot: tholds_bits_error 03]

[screenshot: tholds_bits_error 04]

[screenshot: tholds_bits_error 05]

UPDATE:

After updating to the latest develop version, it still shows 1.2 G as high on a 150 M thold. I also tried deleting and recreating the tholds, with the same result.

[screenshot: tholds_bits_error 06]

Checked in the database, where high = 150 M:

mysql> SELECT id,name,name_cache,local_data_id,data_template_rrd_id,local_graph_id,graph_template_id,data_template_hash,data_template_id,data_source_name,thold_hi,thold_low,thold_fail_trigger,thold_fail_count,time_hi,time_low,time_fail_trigger,time_fail_length,thold_warning_hi,thold_warning_low,thold_warning_fail_trigger,thold_warning_fail_count,time_warning_hi,time_warning_low,time_warning_fail_trigger,time_warning_fail_length,thold_alert,prev_thold_alert,thold_enabled,thold_type FROM cactidb.thold_data where id = 1787;
+------+-----------------------------+-----------------------------------------------------------------+---------------+----------------------+----------------+-------------------+----------------------------------+------------------+------------------+-----------+-----------+--------------------+------------------+---------+----------+-------------------+------------------+------------------+-------------------+----------------------------+--------------------------+-----------------+------------------+---------------------------+--------------------------+-------------+------------------+---------------+------------+
| id   | name                        | name_cache                                                      | local_data_id | data_template_rrd_id | local_graph_id | graph_template_id | data_template_hash               | data_template_id | data_source_name | thold_hi  | thold_low | thold_fail_trigger | thold_fail_count | time_hi | time_low | time_fail_trigger | time_fail_length | thold_warning_hi | thold_warning_low | thold_warning_fail_trigger | thold_warning_fail_count | time_warning_hi | time_warning_low | time_warning_fail_trigger | time_warning_fail_length | thold_alert | prev_thold_alert | thold_enabled | thold_type |
+------+-----------------------------+-----------------------------------------------------------------+---------------+----------------------+----------------+-------------------+----------------------------------+------------------+------------------+-----------+-----------+--------------------+------------------+---------+----------+-------------------+------------------+------------------+-------------------+----------------------------+--------------------------+-----------------+------------------+---------------------------+--------------------------+-------------+------------------+---------------+------------+
| 1787 | |graph_title| - In_150 Mbps | 4.XXXXXXX - Traffic - XXXXXXX.37 (Ethernet)  - In_150 Mbps |           995 |                 2267 |            904 |                 2 | 6632e1e0b58a565c135d7ff90440c335 |                2 | traffic_in       | 150000000 |           |                  2 |                0 |         |          |                 1 |                1 |                  |                   |                          1 |                        0 |                 |                  |                         1 |                        1 |           0 |                0 | on            |          0 |
+------+-----------------------------+-----------------------------------------------------------------+---------------+----------------------+----------------+-------------------+----------------------------------+------------------+------------------+-----------+-----------+--------------------+------------------+---------+----------+-------------------+------------------+------------------+-------------------+----------------------------+--------------------------+-----------------+------------------+---------------------------+--------------------------+-------------+------------------+---------------+------------+

netniV commented 5 years ago

Grab the latest develop version and see how it fares. I also found it was double-calculating values in certain places.

k4y53r commented 5 years ago

Hi,

I tested the latest develop version: it fixes some things but breaks others.

After updating to the latest version and changing the template to use the CDEF "Turn bytes into bits", the high value shows fine but the current value goes wild, as you can see below:

[screenshot: tholds_bits_error 07]

If I set the template not to use the CDEF, high still shows 1.2 G, not 150 M:

[screenshot: tholds_bits_error 08]

Here is what happens if I use the CDEF on the threshold (thold 1795 uses bits and thold 1740 uses bytes):

mysql> SELECT id,data_source_name,thold_hi,thold_enabled,thold_type,lastread,oldvalue,cdef FROM cactidb.thold_data where id = 1795 or id = 1740;
+------+------------------+-----------+---------------+------------+-----------------+-------------+------+
| id   | data_source_name | thold_hi  | thold_enabled | thold_type | lastread        | oldvalue    | cdef |
+------+------------------+-----------+---------------+------------+-----------------+-------------+------+
| 1740 | traffic_in       | 18750000  | on            |          0 | 177.5           | 2291371039  |    0 |
| 1795 | traffic_in       | 150000000 | on            |          0 | 654677439.71429 | 18330968312 |    2 |
+------+------------------+-----------+---------------+------------+-----------------+-------------+------+
2 rows in set (0.00 sec)

[screenshot: tholds_bits_error 09]

The bytes graph thold shows a current value of 203.38 bytes and the bits graph thold shows 632.08 M. If I disable the CDEF, current works fine but high still fails, as you can check with the traffic_out values: bytes = 171.03, bits = 1.37 K.

After changing thold 1795 to not use the CDEF "Turn bytes into bits":

mysql> SELECT id,data_source_name,thold_hi,thold_enabled,thold_type,lastread,oldvalue,cdef FROM cactidb.thold_data where id = 1795 or id = 1740;
+------+------------------+-----------+---------------+------------+-----------------+------------+------+
| id   | data_source_name | thold_hi  | thold_enabled | thold_type | lastread        | oldvalue   | cdef |
+------+------------------+-----------+---------------+------------+-----------------+------------+------+
| 1740 | traffic_in       | 18750000  | on            |          0 | 248.51851851852 | 2291408238 |    0 |
| 1795 | traffic_in       | 150000000 | on            |          0 | 248.51851851852 | 2291408238 |    2 |
+------+------------------+-----------+---------------+------------+-----------------+------------+------+
2 rows in set (0.00 sec)

It looks like the first query after I enable the CDEF is fine, but the second query goes weird.

netniV commented 5 years ago

Unfortunately, the site where I was having this issue isn't actually monitoring network interfaces, so I'm going to have to replicate it with my own setup later tonight. Do you have the data_type for each of these thresholds?

k4y53r commented 5 years ago

Sure...

mysql> SELECT id,data_source_name,thold_hi,thold_enabled,thold_type,lastread,oldvalue,data_type,cdef FROM cactidb.thold_data where id = 1795 or id = 1740;
+------+------------------+-----------+---------------+------------+-----------------+------------+-----------+------+
| id   | data_source_name | thold_hi  | thold_enabled | thold_type | lastread        | oldvalue   | data_type | cdef |
+------+------------------+-----------+---------------+------------+-----------------+------------+-----------+------+
| 1740 | traffic_in       | 18750000  | on            |          0 | 134.13793103448 | 2292101845 |         0 |    0 |
| 1795 | traffic_in       | 150000000 | on            |          0 | 134.13793103448 | 2292101845 |         0 |    2 |
+------+------------------+-----------+---------------+------------+-----------------+------------+-----------+------+
2 rows in set (0.00 sec)

Also, as I saw before, the first query after enabling the CDEF works fine:

mysql> SELECT id,data_source_name,thold_hi,thold_enabled,thold_type,lastread,oldvalue,data_type,cdef FROM cactidb.thold_data where id = 1795 or id = 1740;
+------+------------------+-----------+---------------+------------+-----------------+-------------+-----------+------+
| id   | data_source_name | thold_hi  | thold_enabled | thold_type | lastread        | oldvalue    | data_type | cdef |
+------+------------------+-----------+---------------+------------+-----------------+-------------+-----------+------+
| 1740 | traffic_in       | 18750000  | on            |          0 | 94.829268292683 | 2292119238  |         0 |    0 |
| 1795 | traffic_in       | 150000000 | on            |          0 | 758.63414634146 | 18336953904 |         1 |    2 |
+------+------------------+-----------+---------------+------------+-----------------+-------------+-----------+------+
2 rows in set (0.00 sec)

The second query goes crazy:

mysql> SELECT id,data_source_name,thold_hi,thold_enabled,thold_type,lastread,oldvalue,data_type,cdef FROM cactidb.thold_data where id = 1795 or id = 1740;
+------+------------------+-----------+---------------+------------+-------------+-------------+-----------+------+
| id   | data_source_name | thold_hi  | thold_enabled | thold_type | lastread    | oldvalue    | data_type | cdef |
+------+------------------+-----------+---------------+------------+-------------+-------------+-----------+------+
| 1740 | traffic_in       | 18750000  | on            |          0 | 214.3       | 2292125667  |         0 |    0 |
| 1795 | traffic_in       | 150000000 | on            |          0 | 611233511.2 | 18337005336 |         1 |    2 |
+------+------------------+-----------+---------------+------------+-------------+-------------+-----------+------+
2 rows in set (0.00 sec)
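A quick arithmetic check on the two result sets above, as a Python sketch. The 30-second interval is inferred from the figures and is not stated anywhere in the thread, and the interpretation is a guess rather than a confirmed diagnosis:

```python
# oldvalue columns from the last two result sets above
prev_1740, curr_1740 = 2292119238, 2292125667  # thold 1740, no CDEF
curr_1795 = 18337005336                        # thold 1795, CDEF 2

interval = 30  # seconds between the two polls (inferred, see lead-in)

# Thold 1740: ordinary counter rate, (current - previous) / interval
rate_1740 = (curr_1740 - prev_1740) / interval

# Thold 1795 stores the same counter multiplied by 8 ...
scaled_matches = curr_1795 == curr_1740 * 8

# ... and its lastread equals that whole scaled counter divided by the
# interval, as if no previous value had been subtracted.
rate_1795 = curr_1795 / interval

print(rate_1740, scaled_matches, rate_1795)
```

Under that assumption the computed rates match the lastread column for both rows (214.3 and 611233511.2), which would be consistent with the scaled counter being used as a rate without taking the difference from the previous poll.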

And it is back to normal after disabling the CDEF:

mysql> SELECT id,data_source_name,thold_hi,thold_enabled,thold_type,lastread,oldvalue,data_type,cdef FROM cactidb.thold_data where id = 1795 or id = 1740;
+------+------------------+-----------+---------------+------------+-----------------+------------+-----------+------+
| id   | data_source_name | thold_hi  | thold_enabled | thold_type | lastread        | oldvalue   | data_type | cdef |
+------+------------------+-----------+---------------+------------+-----------------+------------+-----------+------+
| 1740 | traffic_in       | 18750000  | on            |          0 | 105.85714285714 | 2292149737 |         0 |    0 |
| 1795 | traffic_in       | 150000000 | on            |          0 | 105.85714285714 | 2292149737 |         0 |    2 |
+------+------------------+-----------+---------------+------------+-----------------+------------+-----------+------+
2 rows in set (0.00 sec)

nicolapiazzi commented 5 years ago

Hi, why is this issue tagged as SOLVED? I downloaded the latest version but have the same problem!

netniV commented 5 years ago

Your solution would be to create a CDEF that does nothing and apply it to the template. Otherwise, the threshold template values are assumed to be the same as that graph.

nicolapiazzi commented 5 years ago

Hi netniV, I created a CDEF that adds 0 to the value and applied it to the thold. Now the thold limit is correct, but the collected data is divided by 8 (or 10). If I choose Exact Value, the data is correct, but with the CDEF it is divided by 8.

[screenshot: Cattura]

netniV commented 5 years ago

Is the data_type = 1 and the cdef > 0 on that thold_data entry?

nicolapiazzi commented 5 years ago

All traffic tholds have data_type = 1. I don't understand what you mean by cdef > 0.

MariaDB [cacti]> SELECT id,data_source_name,thold_hi,thold_enabled,thold_type,lastread,oldvalue,data_type,cdef FROM cacti.thold_data WHERE data_source_name LIKE "traff%" LIMIT 1;
+-----+------------------+----------+---------------+------------+-----------------+-----------+-----------+------+
| id  | data_source_name | thold_hi | thold_enabled | thold_type | lastread        | oldvalue  | data_type | cdef |
+-----+------------------+----------+---------------+------------+-----------------+-----------+-----------+------+
| 446 | traffic_in       | 80000000 | on            |          0 | 4082803.5810811 | 983171227 |         1 |   22 |
+-----+------------------+----------+---------------+------------+-----------------+-----------+-----------+------+

netniV commented 5 years ago

I edited your post for better formatting. Your CDEF value is 22. However, your lastread and oldvalue are not on the same scale, so can you check that you definitely have the latest thold code from the develop branch?

nicolapiazzi commented 5 years ago

[attachment: 361.again.19.09.2019.docx]

Sorry, but I still have the problem. Please read the docx.

k4y53r commented 5 years ago

Any update about this issue?

k4y53r commented 5 years ago

I've checked: the last working version is 1.2, and all 1.3 versions fail with this issue.