Closed tddcacti closed 5 months ago
@xmacan , what do you think of this suggestion?
@TheWitness I missed this issue, sorry. I will look at it asap.
Thanks @xmacan!
It seems reasonable. I will test it and prepare PR if I will not find any other issue
Describe the bug See https://forums.cacti.net/viewtopic.php?t=62916
Recently enabled thold version 1.5.2 on a Cacti server running version 1.2.23. Subsequently tried thold 1.8 with Cacti 1.2.25, and observed the same behavior.
References to line numbers have been updated below to reflect current 1.8 develop branch as of this writing.
We have graphs for temperature sensors and want to enable both high and low thresholds on these. Some of these sensors occasionally experience connectivity issues, which results in NaNs in the graphs. When the NaNs occur, thold is considering the value to be zero.
I am aware of the thold setting "Consider Unknown Data as Zero", and have verified this option is disabled.
Dove into the code a bit and I believe I've found what is happening. In thold_functions.php, the function thold_get_currentval starting on line 737:
The variable currentval is set to zero on line 754. That variable will be changed within the function based on the logic of the switch statement starting in line 759, however that switch statement is only evaluated when the if statement on line 758 is true.
That if statement checks the value of the item to see whether or not it is a numeric value. If it is, then we proceed to the switch statement. If it isn't, then we skip all of that and simply return currentval. This means that if the current value of the item is non-numeric, we are always returning zero.
In polling.php, we use this function on line 157, and use the result to populate the value of the postvalue variable, which is fed into another "currentval" variable in the next line.
The logic related to considering unknown as zero appears shortly thereafter within the same function, starting lines 190 to 201. However the if statement on line 190 is never true:
if (!is_numeric($currentval))
because per the above explanation, whenever that value is non-numeric, we've already set it to zero.
In my deployment I made a simple change to thold_functions.php on line 754....changed it from this:
$currentval = 0;
To this:
$currentval = '';
With that change, any thresholds where the graph has a NaN as the current value, the threshold shows current value as a dash, instead of a zero, and we no longer get low threshold breach alerts. Hopefully this was the correct change to make, and doesn't create any problems elsewhere that I'm not aware of.
To Reproduce To reproduce the issue, take a graph that occasionally has NaNs, or perhaps easier, intentionally create a graph that is always NaNs, such as a device which is offline or failing SNMP.
Expected behavior With the setting "Consider Unknown Data as Zero" disabled, the current value of the threshold in this situation should not show zero. It should show no value at all, and should not trigger the alert state.
To Reproduce the proposed fix Starting from where the above steps leave off:
Edit thold_functions.php line 754, change it from this:
$currentval = 0;
To this:
Plugin (please complete the following information):
Desktop (please complete the following information):
Additional context Add any other context about the problem here.