Closed. Spirituss closed this issue 4 years ago.
+1
Hey there @afaucogney, mind taking a look at this issue, as it's been labeled with an integration (derivative) you are listed as a codeowner for? Thanks!
Having the same issue on my server.
Any news regarding the issue? The component is in production but doesn't work.
same problem here
Does your sensor update its value even if it doesn't change? I mean, do you have an actual values table with several entries of the same value, or just a single entry whose state stays constant?
Could you please reproduce the issue and send the values table? I can add it to a test and see what happens.
I need to understand whether this is an issue in the "derivative component" algorithm or in how it is integrated.
BTW, have you tried adding a "time_window"?
More info: the issue: https://github.com/home-assistant/core/issues/31395 and the PR that adds the time_window attribute: https://github.com/home-assistant/core/pull/31397
If someone has any idea, feel free to comment! @basnijholt @dgomes
The statistics sensor runs periodically, regardless of whether there are any changes in the source sensor. This also means it doesn't track changes during the period.
The derivative sensor (and the integration sensor on which it is based) tracks changes in the source sensor. That means that if the source sensor doesn't change, the derivative sensor will keep its value for long periods of time.
One possible solution is to combine both methods: track changes and periodically read the source sensor to detect "no changes".
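The combined approach described above can be sketched outside Home Assistant as plain Python. This is only an illustration of the idea; the class and method names (`DerivativeTracker`, `update`, `poll`) are hypothetical and not part of any HA API:

```python
from datetime import datetime, timedelta


class DerivativeTracker:
    """Sketch: combine change tracking with a periodic staleness check.

    update() is the event-driven path (source sensor changed);
    poll() is the periodic path that detects "no changes".
    Hypothetical names, not Home Assistant API.
    """

    def __init__(self, time_window: timedelta):
        self.time_window = time_window
        self.last_time = None
        self.last_value = None
        self.derivative = 0.0

    def update(self, now: datetime, value: float) -> float:
        """Recompute the derivative from the change just received."""
        if self.last_time is not None:
            elapsed = (now - self.last_time).total_seconds()
            if elapsed > 0:
                self.derivative = (value - self.last_value) / elapsed
        self.last_time, self.last_value = now, value
        return self.derivative

    def poll(self, now: datetime) -> float:
        """If no change arrived within the window, report a rate of zero."""
        if self.last_time is not None and now - self.last_time > self.time_window:
            self.derivative = 0.0
        return self.derivative
```

With this split, a source sensor that stops changing would cause `poll()` to drive the derivative back to zero instead of freezing on the last computed slope.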
The derivative sensor (and the integration sensor on which it is based) tracks changes in the source sensor. That means that if the source sensor doesn't change, the derivative sensor will keep its value for long periods of time.
It DOES NOT work that way, as you can see in my initial screenshots. When the water meter keeps its value (on the screenshot: 7 February, 1:18 AM and later), which means the first-order derivative is zero, HA continues to show 0.12 l/min. When you say that the "derivative sensor will keep its value" you possibly mean that the source sensor keeps its value, not the derivative sensor. Otherwise it does not work as a derivative.
@Spirituss Are you sure that "HA continues to show 0.12"? Maybe it's just the chart drawing a line between two points. This is why I asked about the values table.
Otherwise it does not work as a derivative.
There are plenty of ways to implement a derivative in the digital domain. Unfortunately.
@Spirituss Are you sure that "HA continues to show 0.12"? Maybe it's just the chart drawing a line between two points. This is why I asked about the values table.
How can I get that from HA? I physically switched off the source sensor for my water flow derivative, but HA still shows 0.16 l/min in the states list, no matter what the chart shows.
Otherwise it does not work as a derivative.
There are plenty of ways to implement a derivative in the digital domain. Unfortunately.
That does not explain the issue. Digital calculation of a derivative can make it inaccurate, but when there is no change it must show zero.
@Spirituss Are you sure that "HA continues to show 0.12"? Maybe it's just the chart drawing a line between two points. This is why I asked about the values table.
How can I get that from HA? I physically switched off the source sensor for my water flow derivative, but HA still shows 0.16 l/min in the states list, no matter what the chart shows.
This is the point: if you switch off the sensor, the value is not updated, so it keeps the same value, but its timestamp is not updated either. So everything is normal from my side.
Did you try the time_window? I'm sure this is what you are looking for!
Otherwise it does not work as a derivative.
There are plenty of ways to implement a derivative in the digital domain. Unfortunately.
That does not explain the issue. Digital calculation of a derivative can make it inaccurate, but when there is no change it must show zero.
When you say 'there is no change': what is the difference between "no change" and "waiting for the next value"? How can the component tell, before getting the new value?
In your context it is maybe obvious, but I designed the component to provide derivative values indexed on sensor values, nothing else, because my sensor does not have any update frequency (or I do not want to care about one). @basnijholt added the time_window to mitigate some of the issues caused by sampled sensors.
If your case doesn't work with time_window, feel free to open a PR; we can look at that.
Did you try the time_window? I'm sure this is what you are looking for!
Possibly it is what I need. I read the manual, but it's not clear how it works. What value should I use for time_window? Is it the time delta that the derivative uses to calculate the increment? In that case I think it's best to use the interval at which my sensor is updated (15 sec).
When you say 'there is no change': what is the difference between "no change" and "waiting for the next value"? How can the component tell, before getting the new value?
I don't agree with you, since we are talking about the physical concept of a 'derivative', which means the rate of change of a value over time, no matter what the reason for the change is, whether "no change" or "waiting for the next value". This is the nature of any derivative. If you start to care about the reason for the changes, you are describing statistics, not a derivative. Home Assistant already has a statistics sensor which works exactly the way you describe.
The irony is that after one of the issues filed against the statistics sensor its behaviour was updated, and it can now work just like a derivative, while the derivative component started to work like statistics.
If your case doesn't work with time_window, feel free to open a PR; we can look at that.
I added time_window to my sensors and nothing changed. The derivatives show the same values as before.
There has been no news for about a month. Do you still support the component?
Hi @Spirituss, I still support the component, and of course PRs are also welcome. IMO there is no issue. You are looking for a perfect derivative mechanism in a sampled world; that's not possible. Every derivative of a sampled signal is an approximation, because the sampled signal is itself an approximation. Why don't you use the statistics component if it offers the expected behavior? Because from your words, this is what you expect!
If your case doesn't work with time_window, feel free to open a PR; we can look at that.
I added time_window to my sensors and nothing changed. The derivatives show the same values as before.
Maybe you misconfigured it; please post your configuration and the output. An extract of the data table would also be helpful.
Maybe you misconfigured it; please post your configuration and the output. An extract of the data table would also be helpful.
Config:
sensor:
  - name: raw_water_drink_filter_kitchen_flow
    platform: derivative
    source: sensor.raw_water_drink_filter_kitchen
    round: 2
    unit_time: min
    time_window: "00:00:15"
The sensor sensor.raw_water shows nothing for a long time:
But the derivative sensor, which physically means flow, still shows a non-zero value:
It is definitely not a problem of approximation, but an obvious mistake in the implementation of the calculation algorithm.
This happens because the derivative is calculated from the last known values, and older values (outside the time window) are only discarded when new data comes in. In your case your sensor didn't emit any data for over a day, so the derivative is based on data from a day ago.
I am not sure whether we want to change this logic. If we did, the following would happen: with a time window of 15 seconds and data coming in every (let's say) 20 seconds, the derivative could never be calculated, because you would only ever have one point.
This happens because the derivative is calculated from the last known values, and older values (outside the time window) are only discarded when new data comes in. In your case your sensor didn't emit any data for over a day, so the derivative is based on data from a day ago.
What is the point of the time_window parameter in that case? Used this way it looks ridiculous.
I am not sure whether we want to change this logic. If we did, the following would happen: with a time window of 15 seconds and data coming in every (let's say) 20 seconds, the derivative could never be calculated, because you would only ever have one point.
This is the point! Of course, receiving no data during the last 20 seconds, with a 15-second update interval, in terms of approximation definitely means that the flow is zero!
Otherwise, it is just another implementation of the old, well-known statistics sensor.
@basnijholt @afaucogney I believe the problem lies in the plotting. Since people usually use such sensors for plotting, it's highly desirable to see when the change stops. That means that if the time window has been exceeded and no new values are present, the sensor should report 0, null, undef, whatever, but not the last value. The statistics integration does this by starting a timer for the time window and resetting the sensor's value when it expires.
I fully agree with @divanikus: the problem is that the function shows a value for the current moment, but the value is calculated from past sensor values. To detect whether the value is out of date, the integration can use the time window parameter.
Seriously? Without a solution?
@afaucogney already posted :
IMO there is no issue. You are looking for a perfect derivative mechanism in a sampled world; that's not possible. Every derivative of a sampled signal is an approximation, because the sampled signal is itself an approximation.
@dgomes I don't know how to explain it any better, but the max_age setting isn't working at all. That is what this issue is all about. It should reset the sensor value after max_age of no new data; instead it just freezes on the last value, which is simply obsolete after the max_age time.
Hi, I'm also seeing weird behavior when the sensor value doesn't change by much.
The sensor value changes a little here and there, and that causes a large change in the derivative value. Home Assistant has not been restarted during that time window.
Here's my config:
- platform: derivative
  source: sensor.plancher_chauffant_temp_retour
  name: Variation de température du retour du plancher
  round: 2
  time_window: "00:10:00"
  unit_time: h
  unit: "°C/h"
After looking at the raw data, I see what is going on.
First, we have to know that if a value doesn't change, there is no state change event for the sensor, so no data is stored since the last change. This graph shows the actual data points that differ from the ones beside them (green = sensor, yellow = derivative):
If you set no time_window, or a very short one, the graph will be noisy, because even a small variation in value over a small amount of time can result in a large derivative. That's why we usually use the time_window to take the compared values a little further apart in time, which makes the derivative less sensitive to small changes.
The problem we face seems to be related to the second value we get after the time_window period.
The first value obtained after the time_window is calculated against the preceding value (even if it's older than the time_window). This is fine.
The second value after that seems to be calculated against the first value, even though they have very close timestamps.
This doesn't make sense. The minimum time_window is not respected in that case.
I think the logic should discard any points that are closer together than the time_window, at any time.
So for these examples, the first point after the time_window is still calculated against the latest known point (the best we can do with this data), and the point just after it, since its time difference from the first point is less than the time_window, should again be calculated against the same "old" data.
If that logic is too complicated to implement, I think that just returning "0.0" in the cases where the last value is inside the time_window would do the trick. Those missing points can be treated as a 0 derivative, since they normally represent no change in the sensor value.
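The "report 0.0 on stale or single-point data" idea can be sketched as a standalone helper. This is a hypothetical illustration, not the integration's actual code:

```python
from datetime import datetime, timedelta


def windowed_derivative(points, now, window_s=600.0):
    """Derivative over [(datetime, value), ...] pairs, or 0.0 when there
    are fewer than two points inside the time window.

    Hypothetical helper for illustration, not the integration's code.
    """
    # Keep only the points that fall inside the time window.
    recent = [(t, v) for t, v in points if (now - t).total_seconds() <= window_s]
    if len(recent) < 2:
        # No data (or only one point) inside the window: report zero,
        # since an unchanged value means an unchanged signal.
        return 0.0
    (t0, v0), (t1, v1) = recent[0], recent[-1]
    return (v1 - v0) / (t1 - t0).total_seconds()
```

The key behavioral difference from the current integration is the early return: a quiet source sensor yields 0.0 rather than the last computed slope.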
I have this issue as well; sadly #45822 is closed, although I think @popperx's report has a better description of the issue and the solution.
The discussion centers, I think, around this comment:
IMO there is no issue. You are looking for a perfect derivative mechanism in a sampled world; that's not possible. Every derivative of a sampled signal is an approximation, because the sampled signal is itself an approximation.
Because even though this is true, the approach in the code seems to use the wrong approximation (disclaimer: this is based on a quick read, so I could be misunderstanding the code). To calculate the derivative, the code appears to assume a linear increase between data points, which I think is reasonable. However, when there is no data point, this assumption is dropped, so the code suddenly uses different logic. Put differently, the time_window appears to be a maximum, not a constant.
# It can happen that the list is now empty, in that case
# we use the old_state, because we cannot do anything better.
if len(self._state_list) == 0:
    self._state_list.append((old_state.last_updated, old_state.state))
self._state_list.append((new_state.last_updated, new_state.state))

if self._unit_of_measurement is None:
    unit = new_state.attributes.get(ATTR_UNIT_OF_MEASUREMENT)
    self._unit_of_measurement = self._unit_template.format(
        "" if unit is None else unit
    )

try:
    # derivative of previous measures.
    last_time, last_value = self._state_list[-1]
    first_time, first_value = self._state_list[0]
    elapsed_time = (last_time - first_time).total_seconds()
    delta_value = Decimal(last_value) - Decimal(first_value)
    derivative = (
        delta_value
        / Decimal(elapsed_time)
        / Decimal(self._unit_prefix)
        * Decimal(self._unit_time)
    )
I think we can do better. The old state to use within the time window is rarely, if ever, the actual old state; it should be the interpolated state at, say, ten minutes in the past. In essence, the list should always contain at least one value outside of the time window, and that value should be used to interpolate the starting value. I assume this will 'dampen' all values, not just these spikes, but it will make the result much more predictable and the window more meaningful.
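A minimal sketch of that interpolation, assuming linear change between the sample just outside the window and the one just inside it (the helper name `interpolated_start` is made up for illustration):

```python
from datetime import datetime, timedelta


def interpolated_start(older, newer, window_start):
    """Linearly interpolate the source value at `window_start`.

    `older` is a (datetime, value) sample at or before the window start,
    `newer` is a sample after it. Hypothetical helper sketching the idea
    from the comment above; not the integration's actual code.
    """
    (t0, v0), (t1, v1) = older, newer
    span = (t1 - t0).total_seconds()
    if span <= 0:
        return v0
    # Fraction of the way from the older sample to the newer one.
    frac = (window_start - t0).total_seconds() / span
    return v0 + frac * (v1 - v0)
```

The derivative would then be taken between this synthetic starting point and the newest sample, so the effective window length stays constant.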
After my last comment a year ago, I copied the integration as a custom_component to try to solve those spikes.
In short, I made sure that the actual window respects the minimum time_window.
At the start, nothing better can be done than using the values as they arrive to build the data set. (Producing no output until the window is filled doesn't seem like a good alternative.)
Once the actual window reaches the time_window, we can evaluate whether it is better to keep or discard the oldest value given the new one.
That way, if there's a long time between two state changes and then we quickly get two values, the derivative won't be calculated only from the two newest values, which don't respect the time_window at all.
Here's what I've done, based on this version from a year ago:
The main differences are:
- The time_window is stored so it can be managed later.
- While the data set is shorter than the time_window, new values are added to grow the data set.
- Once the data set spans the time_window (the usual case), the window moves normally.
- Otherwise, we check whether dropping the oldest value would make the data set shorter than the time_window. If it would, we expand the actual window with the latest value; the actual window stays larger until we can use the second-oldest value and move the window.
[...]
now = new_state.last_updated
last = old_state.last_updated

# If it's the first valid data (empty list) or if the last data received exceeds
# the `time_window`, the `_state_list` gets (re)initialized
if (
    len(self._state_list) == 0
    or (now - last).total_seconds() > self._time_window
):
    self._state_list = [
        (old_state.last_updated, old_state.state),
        (new_state.last_updated, new_state.state),
    ]
# If the new value makes the data set too short to respect the `time_window`,
# it's added to the `_state_list`
elif (now - self._state_list[0][0]).total_seconds() < self._time_window:
    self._state_list.append((new_state.last_updated, new_state.state))
# If the new data makes the data set larger than the `time_window`, the same
# check is made with the second data point. This confirms whether we still need
# another data point to respect the `time_window` or whether the window can be moved
elif (now - self._state_list[1][0]).total_seconds() < self._time_window:
    self._state_list.append((new_state.last_updated, new_state.state))
# Moving the window and adding the new value
else:
    self._state_list.pop(0)
    self._state_list.append((new_state.last_updated, new_state.state))
[...]
I went back in time with Grafana, and I think this is about the time when I made the changes. A lot fewer spikes...
I wanted to submit those changes to GitHub at the time, but I don't really know how to do it. Since then the code has evolved, but I think my logic can be migrated over without much work. If someone is interested, I could try to migrate it and learn how to make a PR, or feel free to use/modify my logic and submit a PR for me. 😄
I'm a bit surprised you have any spikes at all; have you looked at those data points to see what is going on?
In the meantime I've been thinking about this problem a bit more, and I think I have an easier/better solution. We have three issues:
A solution to all of them, IMHO, is to average all measured derivatives, weighted by time. It is also easy to implement, because in practice we only need to keep the previously calculated derivative in order to compute the total. I plan to code this soon, since I don't expect it to be too hard. An additional advantage of this approach is that we could use different weightings, such as an exponential moving average. That way new measurements would have more weight, but you would still get smoothing. That would require keeping all states or averages, though.
Here's some quick pseudo-Python code which I think should work:
# derivative is initialized at 0
[...]
# if it is the first data point, return
if old_state is None:
    return

# calculate the linear derivative between the new and old state
# (so this **always** happens)
delta_t = (new_state.last_updated - old_state.last_updated).total_seconds()
delta_y = new_state.state - old_state.state
new_derivative = delta_y / delta_t

# if delta_t is larger than the time window, just use the new derivative,
# otherwise calculate a weighted average with the old value
if delta_t > self._time_window:
    derivative = new_derivative
else:
    time_left = self._time_window - delta_t
    derivative = (new_derivative * delta_t + self._state * time_left) / self._time_window
self._state = derivative
The biggest advantage of this method is that, after the first time window has passed since init, the time window is always applied as a constant smoothing factor.
[edit] Now that I think more about it, I think the most correct implementation would be a least-squares linear regression. I'll have to think a bit more about it :P
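For reference, the least-squares slope over the points currently in the window could look like this. This is a plain-Python sketch of the idea floated above, not a proposed patch, and `regression_slope` is a made-up name:

```python
def regression_slope(points):
    """Least-squares slope through (seconds, value) points.

    A plain-Python sketch of the regression idea, not a proposed patch.
    With fewer than two points there is no trend, so return 0.0.
    """
    n = len(points)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    # slope = cov(t, v) / var(t)
    num = sum((t - mean_t) * (v - mean_v) for t, v in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    return num / den if den else 0.0
```

Compared to the two-endpoint slope, a regression uses every sample in the window, so a single noisy point pulls the result around much less.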
Just a post to signify I'm working on this.
I've been playing with some of the data and the logic, but I can't accurately replicate the issue. Strangely enough, it shows some of the spikes, but not all. Even worse, I can't reproduce the correct measurements either. Until I can replicate the original, I don't feel confident working on any alternatives.
This is what I currently have:
It might be hard to see due to the overlap, but I can simulate the positive spikes, while the negative spikes are completely absent and the signal is slightly off.
The code I use to simulate the sensor is as follows (it is a bit 'hacky', since I'm not entirely used to pandas yet, but it's mostly a copy of the original):
from xmlrpc.client import DateTime

import pandas as pd
from decimal import Decimal


def d_hass(
    times: DateTime,
    values: float,
    window=600,
    unit_time=60,
    type_name="simulated derivative",
) -> float:
    output = pd.DataFrame({"last_changed": [], "state": [], "type": []})
    state_list = []  # temp variable that holds the values in the window
    for i, new_state in values.items():
        new_time = times[i]
        if i == 0:
            output = output.append(
                {
                    "last_changed": new_time,
                    "state": 0.0,
                    "type": type_name,
                },
                ignore_index=True,
            )
            continue
        old_state = values[i - 1]
        old_time = times[i - 1]
        now = times[i]
        state_list = [
            (timestamp, state)
            for timestamp, state in state_list
            if (now - timestamp).total_seconds() < window
        ]
        # It can happen that the list is now empty, in that case
        # we use the old_state, because we cannot do anything better.
        if len(state_list) == 0:
            state_list.append((old_time, old_state))
        state_list.append((new_time, new_state))

        last_time, last_value = state_list[-1]
        first_time, first_value = state_list[0]
        elapsed_time = (last_time - first_time).total_seconds()
        delta_value = Decimal(last_value) - Decimal(first_value)
        derivative = float(delta_value / Decimal(elapsed_time) * Decimal(unit_time))

        output = output.append(
            {
                "last_changed": new_time,
                "state": derivative,
                "type": type_name,
            },
            ignore_index=True,
        )
    return output
Since I don't have the tools installed to make a proper pull request, I copied the code and made a custom component with my changes. This is what I ended up with, and it works pretty well:
def calc_derivative(event):
    """Handle the sensor state changes."""
    old_state = event.data.get("old_state")
    new_state = event.data.get("new_state")
    if (
        old_state is None
        or old_state.state in [STATE_UNKNOWN, STATE_UNAVAILABLE]
        or new_state.state in [STATE_UNKNOWN, STATE_UNAVAILABLE]
    ):
        return

    now = new_state.last_updated
    if len(self._state_list) == 0:
        self._state_list.append((old_state.last_updated, old_state.state))
    self._state_list.append((new_state.last_updated, new_state.state))

    # Keep one value older than the window, i.e. delete the oldest if the next one is too old:
    while (now - self._state_list[1][0]).total_seconds() > self._time_window:
        del self._state_list[0]

    if self._unit_of_measurement is None:
        unit = new_state.attributes.get(ATTR_UNIT_OF_MEASUREMENT)
        self._unit_of_measurement = self._unit_template.format(
            "" if unit is None else unit
        )

    try:
        # derivative of previous measures.
        last_time, last_value = self._state_list[-1]
        first_time, first_value = self._state_list[0]
        elapsed_time = (last_time - first_time).total_seconds()
        delta_value = Decimal(last_value) - Decimal(first_value)
        myderivative = (
            delta_value
            / Decimal(elapsed_time)
            / Decimal(self._unit_prefix)
            * Decimal(self._unit_time)
        )
        if self._maximum is not None:
            myderivative = min(myderivative, Decimal(self._maximum))
        if self._minimum is not None:
            myderivative = max(myderivative, Decimal(self._minimum))
        if elapsed_time < self._time_window:
            _LOGGER.warning("Derivative time is smaller than window: %d s", elapsed_time)
        assert isinstance(myderivative, Decimal)
    except ValueError as err:
        _LOGGER.warning("While calculating derivative: %s", err)
    except DecimalException as err:
        _LOGGER.warning(
            "Invalid state (%s > %s): %s", old_state.state, new_state.state, err
        )
    except AssertionError as err:
        _LOGGER.error(
            "Could not calculate derivative (min %s, max %s): %s",
            self._minimum,
            self._maximum,
            err,
        )
    else:
        self._state = myderivative
        self.async_write_ha_state()


async_track_state_change_event(
    self.hass, [self._sensor_source_id], calc_derivative
)
I hope this helps.
I had just achieved success as well :D. I added your code to my test as well, and it achieves similar results but introduces some 'lag' in the response. Below I have plotted several approaches on top of each other; from this test at least, the weighted average approach appears to work best.
The current code looks like this:
Your code solves the spikes (I think the first one could still happen on a start, but it isn't very likely, just a result of my data selection):
The weighted average achieves basically the same:
But if I overlap them and zoom in on the signal, you can see the weighted average reacts slightly quicker:
All in all only a tiny difference, but since I think my method is more consistent, I'll prepare a pull request with that.
Regarding the original issue @Spirituss described: I had the same problem that the derivative integration didn't update values when my source sensor values were constant. I use this integration to calculate power (in kW) from the energy (kWh) my energy meters provide. The latter is collected using the RESTful Sensor integration:
rest:
  - resource: http://192.168.60.11/cm?cmnd=status%2010
    scan_interval: 30
    sensor:
      - name: "Verbrauch Normalstrom"
        state_class: total_increasing
        device_class: energy
        unit_of_measurement: kWh
        value_template: >
          {% set v = value_json.StatusSNS.normal.bezug_kwh %}
          {% if float(v) > 0 -%}
            {{ v }}
          {%- endif %}

sensor:
  - platform: derivative
    source: sensor.verbrauch_normalstrom
    name: "Verbrauch Normalstrom Leistung"
    time_window: "00:03:00"
    unit_time: h
    unit: kW
I guess the value updates didn't take place because hass didn't write values of the source sensor to the database. I haven't verified this; I just took a look at the Prometheus metric hass_last_updated_time_seconds of the source sensor, which I collect. As you can see, the source sensor didn't update for quite some time:
I could fix it by adding force_update: true to the sensor specification of the rest integration. Now the source sensor (sensor.verbrauch_normalstrom) values seem to be updated regularly, even when the value doesn't change (which can't be seen in my screenshots):
Just wanted to quickly post this solution in case someone else finds this issue and uses the RESTful integration. Maybe other integrations provide similar functionality.
Pretty amazing that this issue goes back two years. It seems pretty obvious that because the source sensor doesn't update when the value no longer changes, the derivative sensor, which needs more than one data point, doesn't update either until another value gets sent from the source. It seems most people who "fix" this issue do so like the above, with some means to force an update. I'm no different.
In my case, I created a template sensor of the source and added an attribute that updates every minute. Then I based my derivative sensor off of this.
- sensor:
    - name: "bathroom humidity"
      unit_of_measurement: "%"
      state: "{{ state_attr('sensor.wiser_roomstat_bathroom', 'humidity') }}"
      attributes:
        attribute: "{{ now().minute }}"
Hope that helps someone.
I assumed this issue was the same as the one I was having, but apparently it is not fixed? (Because my issue has definitely been fixed by my pull request.)
So, to be clear, the issue is that with no change in the source sensor the derivative is not updated, although of course it should trend to 0 in reality? I just had a look at a few derivatives I use and noticed that although they are almost zero, they aren't exactly zero and indeed never updated to that. For my applications it doesn't matter, because the last value is always very close to zero, but I can see how this might be problematic.
I'm a bit surprised this happens as well, since the derivative sensor essentially keeps its own history list and doesn't depend on the database. So it must indeed be that a non-change is not communicated. I'd have to check, but I expect this is because we use the 'changed' signal where we could/should use the 'updated' signal (I thought that was already the case, tbh, but I will check).
Yeah, it gets to that last value and doesn't calculate that the derivative is zero until one more value is updated. So it gets close to 0 and trends that way, but that final step can take hours (however long it takes for the source sensor to update one more time).
It's the same behavior as with the trend integration, which is where I stole the workaround above.
https://community.home-assistant.io/t/add-force-update-support-to-template-sensor/106901/2
I presume that to get a derivative of 0 you need two consecutive values of the same number, but the way HA works, unless told otherwise, is that it won't send a second value from a device until the value changes. So the more I think about it, the more it seems the fault of HA as a whole and of how derivatives work, and not of the code or the integration.
In my case, I created a template sensor of the source and added an attribute that updates every minute. Then I based my derivative sensor off of this.
- sensor:
    - name: "bathroom humidity"
      unit_of_measurement: "%"
      state: "{{ state_attr('sensor.wiser_roomstat_bathroom', 'humidity') }}"
      attributes:
        attribute: "{{ now().minute }}"
Hope that helps someone.
@zSprawl, thanks a lot. That fixed the problem with my DIY gas counter sensor.
Hi, I also faced an issue with the derivative never changing to 0 when the value does not change; surprisingly, it works fine with "raw" sensors but not with templates.
The solution with the now() attribute does not work for me, as I get 0 almost all the time; only the initial minute shows a proper derivative. Hard to explain, but after looking at the graph you will see.
I believe the time window will work around the issue of never going to 0, but something seems to be wrong.
# Loads default set of integrations. Do not remove.
default_config:

# Load frontend themes from the themes folder
frontend:
  themes: !include_dir_merge_named themes

# Text to speech
tts:
  - platform: google_translate

automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml

mqtt:
  sensor:
    # derivative works fine here
    - name: Depth
      unique_id: "depth"
      state_topic: "rtl_433/43/depth_cm"
      device_class: distance
      unit_of_measurement: "cm"
      force_update: true
      expire_after: 1830

# derivative does not work here
template:
  - sensor:
      - name: "Sounding"
        unique_id: "sounding_calculated"
        device_class: distance
        unit_of_measurement: "cm"
        state: >
          {{ 138.2 - states('sensor.depth') | int }}

# this works
sensor:
  - platform: derivative
    source: sensor.sounding
    name: Flow rate
    round: 1
    unit_time: min
    time_window: "00:20:00"
I'm astounded that so many are focusing on "how would I implement it" and not starting with the obvious...
A "Derivative" measures the rate of change occurring.
My humidity sensor sends a reading every 60 seconds.
HA DISCARDS repetitive readings.
The "15 minute window" derivative of an hour of, say, 45% humidity, is 0. No question.
The "15 minute window" derivative of "no readings" is 0. No question.
I understand the concern about "non-reporting sensors". But since HA drops repeated values, it seems best to work with what we have. No data within the window? Report the derivative as '0.0'. Only one reading within the window? Report the derivative as '0.0'.
It almost feels like the derivative helper was not written with HA in mind, since HA by default discards repeated readings, yet a derivative sensor by its nature needs multiple readings, INCLUDING repeated ones.
Related to "how do I work around the current state of things": do I just need to use "some" technique to get HA to log values that haven't changed, either the 'now()' hack, the force_update config, or some other approach?
I'm astounded that so many are focusing on "how would I implement it" and not starting with the obvious... A "Derivative" measures the rate of change occurring. My humidity sensor sends a reading every 60 seconds. HA DISCARDS repetitive readings. The "15 minute window" derivative of an hour of, say, 45% humidity is 0. No question. The "15 minute window" derivative of "no readings" is 0. No question.
I understand the concern about "non-reporting sensors". But since HA drops repeated values, it seems best to work with what we have. No data within the window? Report the derivative as '0.0'. Only one reading within the window? Report the derivative as '0.0'.
It almost feels like the derivative helper was not written with HA in mind, since HA by default discards repeated readings, yet a derivative sensor by its nature needs multiple readings, INCLUDING repeated ones.
I absolutely agree with this. If there is no signal within the window, you have to assume a rate of zero and the derivative sensor has to return a state of 0 until it gets a change in signal.
Once it gets the new signal, the slope is equal to the change in signal divided by the derivative window.
So say you have the following:
```
Time     Signal
0:00:00  0.0
0:00:30  0.5
0:01:00  1.0
0:01:30  --
0:02:00  --
0:02:30  --
0:03:00  --
0:03:30  1.5
0:04:00  2.0
0:04:30  2.5
0:05:00  3.0
```
The derivative at 0:03:30 for a window of one minute is equal to (1.5 - 1.0)/(0:03:30 - 0:02:30), because regardless of whether you got a repeated signal of 1.0 at 0:01:30 - 0:03:00 and HA discarded it, or you just didn't get a signal at all, you can't know when that last signal changed. The answer is NOT (1.5 - 1.0)/(0:03:30 - 0:01:00), or whatever the slope was the last time there were multiple signals within a window.
Also, it seems to me that calculating slopes between every pair of signals is fairly CPU-intensive and unnecessary. Wouldn't it be better to take the slope between the oldest signal and the newest signal, and discard signals as they age out of the window? You could keep a pseudo "last" signal equal to the most recent reading, with a "null" timestamp, until a new signal is received. Then you'd assign a time of "now() minus window" to that signal value and take the slope between it and your current signal value. That's far, far more efficient than weighted averages of slopes, especially if you have hundreds of signals within your window.
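The endpoint-slope idea above can be sketched in a few lines (a minimal sketch, assuming a list of `(timestamp, value)` pairs and a window in seconds; the function name and data layout are illustrative, not HA's actual API):

```python
def window_slope(readings, window, now):
    """Slope between the oldest and newest reading inside the window.

    readings: list of (timestamp, value) pairs, oldest first.
    Returns 0.0 when fewer than two readings fall inside the window,
    i.e. "no data" and "single reading" both report a rate of 0.
    """
    # Keep only readings that have not aged out of the window.
    recent = [(t, v) for t, v in readings if t >= now - window]
    if len(recent) < 2:
        return 0.0
    (t_old, v_old), (t_new, v_new) = recent[0], recent[-1]
    return (v_new - v_old) / (t_new - t_old)

# Readings every 60 s; humidity stuck at 45, then rising.
readings = [(0, 45.0), (60, 45.0), (120, 46.0)]
print(window_slope(readings, window=900, now=120))   # (46 - 45) / (120 - 0)
print(window_slope(readings, window=900, now=3600))  # all aged out -> 0.0
```

Only the two endpoints are touched per update, so cost stays constant no matter how many signals the window holds.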
Any update on this? I see a few workarounds, but they appear to be mostly YAML-based. I've started transitioning away from YAML, since that appears to be the suggested direction going forward.
TLDR: I propose to update this integration as described in the final chapter of this loooong post.
I would like to give my view on the topic, which is based on mathematical insight:
Let me first start with a bit of explanation of how I approached this topic.
Sensor values are non-uniform (not equidistant, i.e. not always with equal time between them) samples of the value of a real-world "function" `f(t)` at a specific time. Since sample values are digital numbers, these are an approximation of the real value of the function `f(t)`. Furthermore, consecutive equal values are ignored by HA, meaning we can interpret our samples as a list of pairs `(t1,v1), (t2,v2), (t3,v3), ...`, where the sequence `t` is increasing, and any value of `v` is always different from the previous one.
This means that even if the method by which new values are obtained is a polling method with a regular interval, the samples may still be non-uniform because HA throws out equal values. Thus, any method we think of to process data should always assume the data is non-uniform, because even though we might know that data is checked regularly, we never know when the next different value comes in.
Now, based on the sample list `(t1,v1), (t2,v2), (t3,v3), ...` we want to create a "best representation" function `g(t)` that resembles `f(t)` as closely as possible.
So, how do we obtain a best representation `g(t)` of the function `f(t)` from its unique samples? There are multiple ways to do this:
1. `g(t)` is based only on sample values that are in the past, i.e. only those samples `(ti,vi)` with `ti <= t`. We call this a "causal" relation, because every change is caused by, and can be computed from, only those things that have already happened. For instance:
   - `last`: `g(t)` is equal to the value of the last recorded (unique) sample.
2. `g(t)` is based on previous and future values of `f(t)`. For instance:
   - `linear`: `g(t)` is a straight line between the previous and next (unique) samples.

The problem with any method that would fall in the category of item 2 is that it uses future values of `f(t)`, which means that at time `t` we cannot compute `g(t)` yet, because we also need at least 1 future sample `f(t')` for some `t' > t`, which we cannot know yet.
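For illustration, the causal `last` reconstruction can be sketched in a few lines (the helper name `g_last` is mine, not something from HA):

```python
from bisect import bisect_right

def g_last(samples, t):
    """'last' reconstruction: value of the most recent sample at or before t.

    samples: list of (ti, vi) pairs with increasing ti.
    Returns None for t before the first sample, where g(t) is undefined.
    """
    times = [ti for ti, _ in samples]
    i = bisect_right(times, t) - 1  # index of last ti <= t
    return samples[i][1] if i >= 0 else None

samples = [(0, 0), (2, 1), (5, 4)]
print(g_last(samples, 3))   # -> 1 (sample (2,1) is the latest at t=3)
print(g_last(samples, 5))   # -> 4
print(g_last(samples, -1))  # -> None
```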
Let me pose an assumption, without trying to clarify too much why I think it is true:

> The best causal representation is `g(t)` from the method `last`. This basically means it is a weighted average of the previous values, where the weights depend on how long each sample was active, and how far it is in the past.

So back to this integration. In my opinion, there is 1 property that would need to be satisfied for this integration to be useful, and that is:

> If you integrate from `0` to `x` the derivative `f'(t)` of a function `f(t)`, then for all `x` the outcome will be exactly `f(x) - f(0)`, i.e. the original function will come out, although possibly with an offset.

Since `g(t)` is based on the reconstruction method `last`, the only Riemann sum integral method that makes sense is `left`-integration, which is what I will assume from here on.

So, based on all of the above, I can now properly explain why I personally have a couple of "issues" with the current implementation of the Derivative integration:
1. When no `time_window` is configured, the integration calculates a new derivative value at the moment the next value comes in, from the previous value `(t1,v1)` and the new value `(t2,v2)`, simply by calculating the rate of change `(v2 - v1) / (t2 - t1)`. Though this is a good way to calculate the average derivative over the time window `[t1, t2]`, the problem I have with it is that the integration applies this value at time `t2`, whereas (assuming we use the reconstruction method `last`) it actually makes more sense to apply this value at time `t1`. But that would mean that at time `t1` the value `v2` at time `t2` would need to be known, and thus we would not have a "causal" sensor!
2. When no `time_window` is configured, the following example shows that the property is not guaranteed: assume a cumulative sensor that has values `(0,0), (2,1), (5,4)`. Then the derivative integration will calculate the following derivative sensor values: `(0,0), (2,0.5), (5,1)`, which, using `left`-integration, results in the integrated values `(0,0), (2,0), (5,1.5)`, which is not equal to the original. (Note: `right`-integration would actually result in the original sensor values, but that does not match the assumption that our reconstruction function is based on method `last`.)
3. Whether a `time_window` is given or not, if no sensor values arrive for a long time, the derivative never resets to 0 (which is the main problem of this issue, of course). Taking the example sensor values above: if the next value `(1005, 5)` comes in, the derivative integration will add sample `(1005, 0.001)`, and the Riemann sum integral integration will add sample `(1005, 1.5 + 1 * (1005 - 5)) = (1005, 1001.5)`, which is not even close!

Let me give an example of what the derivative samples should look like in my opinion. Assume the samples are `(0,0), (2,1), (5,4), (12,5)` and `time_window=4`. Then the derivative samples should be:
- Initially (before any sample), the derivative is 0: sample `(-inf, 0)`.
- `(0,0)` enters the `time_window`. New derivative sample: `(0,0)`.
- `(2,1)` enters the `time_window`. New derivative sample: `(2,0.25)`.
  - The sensor value increased by `1` w.r.t. the sensor value `time_window` seconds ago, so the average derivative is `1/time_window = 1/4 = 0.25`.
- `(0,0)` leaves the `time_window`. New derivative sample: `(4,0.25)`.
  - Because of the `last`-interpretation, its value is still applicable in the first 2 seconds of the window. This might seem like a strange thing to do, because it is the same value as the previous sample, but below we shall see that it makes sense to update the derivative also once a sample goes outside the `time_window`, not just when it enters it.
- `(5,4)` enters the `time_window`. New derivative sample: `(5,1)`.
  - The sensor value increased by `4` w.r.t. the sensor value `time_window` seconds ago, i.e. at time `1`, when sample `(0,0)` was active; thus its value was `0`, so the derivative should be `(4-0)/4 = 1`.
- `(2,1)` leaves the `time_window`. New derivative sample: `(6, 0.75)`.
  - The derivative changes from `(4-0)/4` to `(4-1)/4`.
- `(5,4)` leaves the `time_window`. New derivative sample: `(9, 0)`.
  - Sample `(5,4)` is active throughout the entire window, so the derivative should become 0.
- `(12,5)` enters the `time_window`. New derivative sample: `(12, 0.25)`.
- `(12,5)` leaves the `time_window`. New derivative sample: `(16, 0)`.

So our derivative samples become: `(0,0), (2,0.25), (5,1), (6,0.75), (9,0), (12,0.25), (16,0)` (we dropped the duplicate `(4,0.25)`, as HA would do).
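The event list above can be reproduced by evaluating `(g(t) - g(t - time_window)) / time_window` (with `g` the `last` reconstruction) both when a sample enters the window and when it leaves it. This is my reading of the proposal as a sketch, not existing HA code; before the first sample the signal is assumed equal to the first sample's value:

```python
from bisect import bisect_right

def window_derivative(samples, window):
    """Proposed derivative: (g(t) - g(t - window)) / window, evaluated
    both when a sample enters the window (at t_i) and when it leaves
    it (at t_i + window)."""
    times = [t for t, _ in samples]
    values = [v for _, v in samples]

    def g(t):  # 'last' reconstruction; first value assumed before t_0
        return values[max(bisect_right(times, t) - 1, 0)]

    events = sorted(set(times) | {t + window for t in times})
    out = []
    for t in events:
        d = (g(t) - g(t - window)) / window
        if not out or out[-1][1] != d:  # HA drops repeated values
            out.append((t, d))
    return out

print(window_derivative([(0, 0), (2, 1), (5, 4), (12, 5)], 4))
# -> [(0, 0.0), (2, 0.25), (5, 1.0), (6, 0.75), (9, 0.0), (12, 0.25), (16, 0.0)]
```

This matches the hand-derived list, including the dropped duplicate at `t=4`.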
Now let's try to `left`-integrate this using the Riemann sum integral integration:

- `(0,0)`: Integral = `0` => sample `(0,0)`
- `(2,0.25)`: Integral += `2 * 0` => sample `(2,0)`
- `(5,1)`: Integral += `3 * 0.25` => sample `(5,0.75)`
- `(6,0.75)`: Integral += `1 * 1` => sample `(6,1.75)`
- `(9,0)`: Integral += `3 * 0.75` => sample `(9,4)`
- `(12,0.25)`: Integral += `3 * 0` => sample `(12,4)`
- `(16,0)`: Integral += `4 * 0.25` => sample `(16,5)`
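These left-integration steps can be checked mechanically (a sketch; `left_riemann` is an illustrative helper, not the actual integration component):

```python
def left_riemann(samples):
    """Left Riemann sum: between two samples the earlier value is held,
    matching the 'last' reconstruction method."""
    total = 0.0
    out = [(samples[0][0], total)]
    for (t0, v0), (t1, _) in zip(samples, samples[1:]):
        total += v0 * (t1 - t0)
        out.append((t1, total))
    return out

deriv = [(0, 0), (2, 0.25), (5, 1), (6, 0.75), (9, 0), (12, 0.25), (16, 0)]
print(left_riemann(deriv))
# -> [(0, 0.0), (2, 0.0), (5, 0.75), (6, 1.75), (9, 4.0), (12, 4.0), (16, 5.0)]
```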
Wait, what?!? That is not even close to the original list of samples! Indeed, but due to the `time_window` we apply, what we are actually doing is applying a 4-second moving average to the original sensor before calculating its derivative. Suppose we would calculate a 4-second moving average of the original sample list every second:

- `(0,0)` => sample `(0,0)`
- `(1,0)`
- `(2,1)` => sample `(2,0)`
- `(3,0.25)`
- `(4,0.5)`
- `(5,4)` => sample `(5,0.75)`
  - `(0*1 + 1*3)/4 = 0.75`
- `(6,1.75)`
  - `(3*1 + 1*4)/4 = 7/4 = 1.75`
- `(7,2.5)`
- `(8,3.25)`
- `(9,4)`
- `(10,4)`
- `(11,4)`
- `(12,5)` => sample `(12,4)`
- `(13,4.25)`
- `(14,4.5)`
- `(15,4.75)`
- `(16,5)`
And if we cross-check the times at which the riemann sum integral integration calculated a new value, it matches 100% with the above list.
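To make the cross-check reproducible, the 4-second trailing average of the `last`-reconstructed signal can be computed like this (a sketch; names are mine, and before the first sample the signal is assumed equal to the first sample's value):

```python
from bisect import bisect_right

def trailing_average(samples, window, t):
    """Average of the 'last'-reconstructed signal over [t - window, t]."""
    times = [ti for ti, _ in samples]
    values = [vi for _, vi in samples]

    def f(x):  # 'last' reconstruction; first value assumed before t_0
        return values[max(bisect_right(times, x) - 1, 0)]

    lo = t - window
    # Breakpoints of the piecewise-constant signal inside the window.
    pts = sorted({lo, t} | {ti for ti in times if lo < ti < t})
    area = sum(f(a) * (b - a) for a, b in zip(pts, pts[1:]))
    return area / window

samples = [(0, 0), (2, 1), (5, 4), (12, 5)]
print([trailing_average(samples, 4, t) for t in range(17)])
# -> [0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.75, 2.5, 3.25,
#     4.0, 4.0, 4.0, 4.0, 4.25, 4.5, 4.75, 5.0]
```

This reproduces the per-second moving-average list above, e.g. `0.75` at `t=5` and `1.75` at `t=6`.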
So, basically I would say that this integration needs an update as follows:

1. If no `time_window` is given, it should default to 1, as otherwise we are creating a non-derivative sensor.
2. The derivative should be updated not only when a sample enters the window, but also when it leaves it, i.e. after the `time_window` interval. This means the derivative will have 2 times as many state changes as the original sensor, but this comes with the added benefit that it actually makes sense as a derivative!
   - This can be implemented with the `async_call_later` function, with a delay of exactly `time_window`.
.@afaucogney Would you be so kind as to read the above rationale and comment on whether you think this is a good improvement of this integration? If so, let me know, I can start working on implementing it on relatively short notice.
The problem
I use a derivative sensor to measure the water flow through water counters. When the flow is changing, the derivative shows realistic values. But when the flow becomes zero, the derivative keeps showing the last measured value for a very long period (several hours). Meanwhile, the 'change' attribute of the statistics sensor becomes zero with zero flow since HA update 0.105 (the value-keeping logic was changed). Below are the measurements of the derivative and statistics sensors, and the historical values of the water meter itself as proof.
Environment
Home Assistant 0.105.1 (ex. Hass.io) arch: x86_64
Problem-relevant
configuration.yaml
Traceback/Error logs
No error, but incorrect behaviour.
Additional information
Here are the water meter values, which became zero at 01:18
The derivative sensor (sensor.raw_water_flow) was still showing a non-zero value (0.12 l/min) after 01:18
The statistics sensor (sensor.raw_water_flow_stat) showed zero at 01:18