domoticz / domoticz

Open source Home Automation System
http://www.domoticz.com
GNU General Public License v3.0

Sanity checking of Z-Wave power meter counter values is faulty #4541

Closed: tih closed this issue 3 years ago

tih commented 3 years ago

I'm running a fresh build of the development branch on NetBSD/amd64-current:

Version: 2020.2 (build 12785)
Build Hash: c3a1521d0-modified
Compile Date: 2020-12-26 15:03:31
dzVents Version: 3.0.19
Python Version: None

I've discovered a problem with commit 7933f659d668a24d6e8eaf8a54998672f5ca63d9, added on December 6th, to do some sanity checking of Z-Wave power meter counter values before using them to update the database. (This sort of checking is welcome, by the way, as I've had to clean up bad entries from time to time.)

The problem is that some power meters will stop working once they have counted up a certain amount of power used, because code in the above commit mistakenly compares a counter (the total accumulated consumption, before scaling) to a limit on the maximum acceptable power consumption per interval. Further, data judged to be erroneous is ignored silently, without anything being logged. In my case, I have a NorthQ Q-Power NQ-92021 meter monitoring the main power consumption meter in my home, and reporting the accumulated consumption as a counter (i.e. in Wh, which Domoticz then scales to kWh) every 15 minutes.
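To illustrate the kind of mistake described above, here is a minimal Python sketch. It is not the Domoticz code from the commit: the limit value, the function names and the log message are all hypothetical, and it assumes a counter that only ever increases (a real check would also have to decide what to do on a counter reset).

```python
# Hypothetical limit: largest plausible consumption in one reporting interval, in Wh.
MAX_WH_PER_INTERVAL = 10_000


def faulty_check(counter_wh: int) -> bool:
    """The mistake: the accumulated counter itself is compared to a per-interval limit."""
    return counter_wh <= MAX_WH_PER_INTERVAL


def delta_check(previous_wh: int, counter_wh: int) -> bool:
    """Compare the increase since the previous reading to the limit instead."""
    delta = counter_wh - previous_wh
    return 0 <= delta <= MAX_WH_PER_INTERVAL


def accept_reading(previous_wh: int, counter_wh: int, log=print) -> bool:
    """Apply the delta check and log every rejected value instead of dropping it silently."""
    ok = delta_check(previous_wh, counter_wh)
    if not ok:
        log(f"Ignoring meter counter {counter_wh} Wh (previous {previous_wh} Wh)")
    return ok
```

With these made-up numbers, a meter that has accumulated 12,000,500 Wh fails `faulty_check` forever, even though it only advanced 500 Wh since the last report, which `delta_check` accepts.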

I have a working patch to correct the code in question, and will create a pull request.

rrozema commented 3 years ago

That sanity check shouldn't be in there at all: I've run a few queries over the data collected in my domoticz.db and found that these extremely low negative and extremely high values occur in several types of Z-Wave devices. My setup consists almost entirely of Z-Wave devices, and several of them sometimes show this erratic behavior. It is not limited to power devices: I see it in power devices, but also in voltage devices and illuminance devices. Also, it is not limited to 'cheap Chinese products', as gets suggested time and time again. I see the same behavior, for example, in values collected from a Fibaro FGS-223. Both these observations lead me to believe that these 'spikes' aren't caused by the Z-Wave device at all. Instead, I think they are caused by some bug in the software, in either OpenZWave or Domoticz. I haven't located a cause yet, but I'm looking for it, and once it is found I'd like to have this 'sanitizing code' sanitized itself: it does more harm than good to our data...
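A query of the sort mentioned above might look like the following Python sketch. The table `readings` and its columns are hypothetical and do not reflect the actual domoticz.db schema, and the sample rows and thresholds are made up purely for illustration.

```python
import sqlite3


def find_outliers(conn, low, high):
    """Return rows whose value lies outside the range [low, high]."""
    return conn.execute(
        "SELECT device_id, value, date FROM readings "
        "WHERE value < ? OR value > ? ORDER BY device_id, date",
        (low, high),
    ).fetchall()


# Hypothetical in-memory table standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device_id INTEGER, value REAL, date TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1, 230.1, "2020-12-26 10:00:00"),
        (1, -1.0e9, "2020-12-26 10:05:00"),  # made-up negative spike
        (2, 512.0, "2020-12-26 10:00:00"),
        (2, 4.0e9, "2020-12-26 10:05:00"),   # made-up positive spike
    ],
)
outliers = find_outliers(conn, 0, 100_000)
```

Grouping the result by device type would then show whether the spikes are confined to one kind of device or, as described above, spread across several.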

kiddigital commented 3 years ago

... these observations lead me to believe that these 'spikes' aren't caused by the z-wave device at all. Instead I think they are caused by some bug in the software in either openzwave or domoticz. I don't have a cause located yet, but I'm looking for it, and once found I'd like to have this 'sanitizing-code' sanitized itself: It does more damage than good to our data...

Interesting observations and line of thought. Maybe we could start with some (debug) logging in the OpenZWave part, or at the place where Domoticz reads the data from OpenZWave? This would give an indication of where to search. Once we see in which direction to debug further, additional debug logging can be added.

Very interested to see what we learn from this.

gizmocuz commented 3 years ago

Well, this is what is reported, and the sanity checks are needed. If these values are wrong (and they are consistent at a certain value for NeoCoolcam), it needs to be fixed upstream. Just add some log lines where the meter counter values are received, and log this to disk for a few days (before the checks), with Node ID, Instance and Index IDs. Then we can say with 100% certainty that it comes from the driver (OZW in this case), and then we can debug OZW as well.
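The logging suggested above could be sketched roughly as follows, in Python rather than in Domoticz's own code. All names here (`on_meter_value`, `format_raw`, `enable_file_logging`, the logger name) are hypothetical. The point is only that the raw value is written out, with its Node ID, Instance and Index, before any sanity check gets a chance to discard it.

```python
import logging

logger = logging.getLogger("zwave.raw_meter")  # hypothetical logger name


def enable_file_logging(path):
    """Send raw meter values to a file on disk so they can be collected for a few days."""
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)


def format_raw(node_id, instance, index, raw_value):
    """Build one log line that identifies exactly where the value came from."""
    return f"node={node_id} instance={instance} index={index} raw={raw_value!r}"


def on_meter_value(node_id, instance, index, raw_value, sanity_check):
    """Log the value as received, then apply the check; return None if it is rejected."""
    logger.info(format_raw(node_id, instance, index, raw_value))
    return raw_value if sanity_check(raw_value) else None
```

Because the log line is emitted before `sanity_check` runs, a spike that appears in the file must already have been present in what the driver delivered.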