Closed grigi closed 6 months ago
I have no problem about implementing this, however we probably should do it configurable by users, as in some cases this might not be desired. Please take a look in the comment from @andythomas on https://github.com/Blackymas/NSPanel_HA_Blueprint/discussions/1857#discussioncomment-872433. Perhaps we can have here some discussion around averaging.
Reducing the interval between ADC measurements may be an option to compensate averaging responsiveness.
I read that entire issue now, Highly noisy reading like I have there is one of the things that make a simple bang-bang algorithm for floor heating unreliable. But I see wanting to reduce latency.
So multiple samples a minute could give us the best of both worlds?
It looks like the thermistor is always powered, so taking more readings won't result in any extra self-heating above what's already happening.
This can be relevant also: https://github.com/esphome/esphome/pull/6330
So multiple samples a minute could give us the best of both worlds?
I believe so.
It looks like the thermistor is always powered, so taking more readings won't result in any extra self-heating above what's already happening.
That's right. Disabling sleep may have a much bigger impact, as the CPU and display will increase temp.
Yes, Marlin (the 3D printing firmware) takes 16 samples in a row, every second. And this gives it reliable readings to about 0.05°C on most thermistors.
Until that lands, doing a reading every 20 seconds and emitting them once a minute will probably help significantly.
My concern is then on the CPU needed for these additional readings and averaging. I've seen messages related to ADC taking longer than expected, so this will probably happen more often.
That's right. Disabling sleep may have a much bigger impact, as the CPU and display will increase temp.
I, um tried to measure it, and temp readings get raised by about 0.4°C when doing the screen firmware update, but then drops to ~ 0.1°C elevated with screensaver for me only. Much less than I expected.
My concern is then on the CPU needed for these additional readings and averaging. I've seen messages related to ADC taking longer than expected, so this will probably happen more often.
You could minimise CPU load increase by doing the rolling window over the ADC reading only, and not the post-processing steps?
Surprising that reading an ADC value is "slow" as it shouldn't block, it's not a bit-banging interface where you might have to get timing right with something external like the DHT sensors.
Until that lands, doing a reading every 20 seconds and emitting them once a minute will probably help significantly.
We don't have to wait. We can try it now... just add this customization to your panel's yaml and please let us know the results:
sensor:
- id: !extend ntc_source
samples: 60
external_components:
- source: github://pr#6330
components:
- adc
I've tried that... still have to wait more to say anything about the improvements, but I can say the ADC related warnings aren't there.
Thanks, I tried it as well, I can say that the temperature result is definitely less scattered. Deviation is about ⅓ of before. Though it still isn't "smooth". Deviation is spread across ~0.2C now instead of the prior ~0.7C
Still not as clean as the Sonoff sensors, but those may have aggressive smoothing applied. (but not as aggressive as the Aqara devices which seem to lag by over 10 minutes compared to the Sonoff ones)
A very definite improvement :+1:
FYI: I reduced the samples from 60 to 8 to see what it does, and there isn't any noticeable difference that I can see.
What if you try a much bigger number? Let's say 10000 samples?
value must be at most 255.
So 255 it is
I honestly can't differentiate between 8 and 255 samples. But either looks notably better than 1.
Maybe the ±0.2°C deviation is "accurate" :man_shrugging:
And what about adding the filter you suggested?
Your customization should be like this:
sensor:
- id: !extend ntc_source
samples: 255
- id: !extend temp_nspanel
filters:
- sliding_window_moving_average:
window_size: 5
send_every: 1
- lambda: |-
return x + temperature_correction->state;
external_components:
- source: github://pr#6330
components:
- adc
I tried this (on purpose to keep the lag low):
sensor:
- id: !extend ntc_source
samples: 8
update_interval: 30s
filters:
- sliding_window_moving_average:
window_size: 4
send_every: 2
And it reduced the spread to about 0.1°C.
I'm trying a more agressive config now.
I've tried samples: 255
but then I get the Component adc took a long time for an operation
quite often.
It's completely smooth with this: (255,10/2,30s)
sensor:
- id: !extend ntc_source
samples: 255
update_interval: 30s
filters:
- sliding_window_moving_average:
window_size: 10
send_every: 2
But, yeah, that delays response times by ~2 minutes and is a bit overkill ito resource usage.
I also tried a few other configurations to reduce the delay, but I basically settled again on what I said in my previous message. (8,4/2,30s)
(8,2/1,60s) → doesn't help (8,3/3,20s) → doesn't help So I'm thinking that just the multi-sampling fixes up the inaccuracies, and anything above that is smoothing.
I feel we should at least do: (8,1/1,60s)
sensor:
- id: !extend ntc_source
samples: 8
As that improves the situation but doesn't introduce any delay, and this above if you want it smooth.
I'm leaving it on the intermediate (8,4/2,30s) from the previous message for now as that seems to hit the balance of smoothness and delay for possibly most people, and still only updates the result once every 60s.
It's completely smooth with this: (255,10/2,30s)
sensor: - id: !extend ntc_source samples: 255 update_interval: 30s filters: - sliding_window_moving_average: window_size: 10 send_every: 2
But, yeah, that delays response times by ~2 minutes and is a bit overkill ito resource usage.
I also tried a few other configurations to reduce the delay, but I basically settled again on what I said in my previous message. (8,4/2,30s)
(8,2/1,60s) → doesn't help (8,3/3,20s) → doesn't help So I'm thinking that just the multi-sampling fixes up the inaccuracies, and anything above that is smoothing.
I feel we should at least do: (8,1/1,60s)
sensor: - id: !extend ntc_source samples: 8
As that improves the situation but doesn't introduce any delay, and this above if you want it smooth.
I'm leaving it on the intermediate (8,4/2,30s) from the previous message for now as that seems to hit the balance of smoothness and delay for possibly most people, and still only updates the result once every 60s.
You mean 8 samples, 30s update interval, window size 4 and send every 2, right?
Yes, it was getting too much to type out so I lazily shortened it.
I'm planning to leave it like that tonight so I can post a picture comparing the different configurations.
I'm doing a few more experiments, trying to see what is it that makes it "smooth".
(8,9/3,20s):
(8,12/12,5s):
So, many samples seem quite important. Once you're at ~10 samples in a moving window, things look good.
So now the question is, how often are you prepared to sample...
(8,8/4,15s)
8 samples doesn't really help. It needs to be >10 to be smooth.
I'm doing a (12,12/3,20s) test now overnight. It should be pretty much perfectly smooth. And have an average delay of 1½min.
What is your goal? I mean, what is your threshold to consider it good?
I want the impossible. it should be smooth, but have no artificial delay added on. But now I'm just trying to optimise for as smooth as possible with acceptably small delay.
It does depend on what is considered a good enough delay. And how many samples/minute is fine for low cpu.
I kinda feel the optimal is either (8,1/1,60s)
(just multisampling) or (8,9/3,20s)
multisampling and smoothing of 9 samples spread over 3 minutes. (average delay of 1 minute)
I feel that more samples/minute is probably a bit too high load on the panel.
So choose between either of those two I think.
I will add this sampling to the next release, so hopefully we can get feedback from more people. I'm planning to use samples: 12
.
I wouldn't be so much afraid of reducing collection interval and increasing samples/average. We have critical memory limitation, but I don't think CPU would be an issue.
Thanks samples: 12 will be a notable improvement already.
First, I would also vouch against an averaging window, because it would delay the response. Then, I tried to 'simulate' the expected results for samples: 12
and came to the following results. First, I plotted my living room temperature:
Then, I made a histogram, indicating that most values are between 20.9°C and 21.5°C, i.e. within +/-0.3°C. The Gaussian fit is depicted in red.
Finally, I 'simulated' a histogram assuming a textbook noise reduction of 1/sqrt(12):
I would say that this would suffice. I will compare that to real measurements once I get to it. However, I would expect diminishing returns at some point and would like to stay within the 0.1°C granularity range for the temperature measurements and control, i.e. I would not sample much more than 12 readings.
I would also vouch against an averaging window, because it would delay the response.
We probably can compensate it by reducing the measurement interval. We are currently collecting temperature every minute, and I believe we can collect more without a significant impact on CPU, so maybe collect every 15s (or maybe even smaller interval) and doing some averaging in a way that it keeps the reporting every minute.
The multisampling doesn't smoothen out the graph, it reduces inaccuracies of the ESP32's ADC. I did do tests and there was no noticeable difference between 8 and 255 samples.
Note that these samples all happen in a tight loop, and the purpose is just to improve the ADC accuracy.
It helps a lot, as it is. But doesn't eliminate all the noise. It can filter out noise in the MHz range I suppose. Likely some other part of the thermistor system introduces noise at a lower frequency.
So ALSO doing multiple spread out sample intervals helps more. Doing a measurement every 5 seconds and then averaging over the 12 samples for each minute gave a really smooth graph. It also should introduce virtually no real delay.
Something strange is going on when using ADC multisampling. I added
sensor:
- id: !extend ntc_source
samples: 8
external_components:
- source: github://pr#6330
components:
- adc
to my yaml and wanted to do some data analysis. However, there was way less data available than anticipated. I looked into it and saw 'missing' data. It was there before, but not as obvious, so I did not notice. Please have a look at the following data, I added the multisampling at 8.10pm.
HA does not collect data every minute, but there are some gaps. If I use the multisampling with 8 samples, these gaps increase in frequency leading to gaps of more than 1 minute. Interestingly, when the panel was updated HA collects 'unavailable,' but the other data simply is not there...
First, could someone confirm this issue? @grigi Do you also have fewer datapoints? It would also look smoother... @edwardtfn Edward, if you could explain to me how the data is reported to HA, I will look deeper into it.
While it might be caused by the lack of datapoints, the data also looks too regular and there are very few measurements at .1
compared to .2
and .0
But this could also indicate a weak bit in the ADC.
A new value is sent only when it is different than the last one sent, so this is probably the reason you see these gaps and this is probably the reason you have more gaps when sampling, as the values should be more stable and therefore will repeat more.
You can force the sensor to report all values with this:
sensor:
- id: !extend temp_nspanel
force_update: true
I updated to v4.3.1
yesterday and tested some more. I added:
sensor:
- id: !extend ntc_source
samples: 2
- id: !extend temp_nspanel
force_update: true
to my bedroom panel and
sensor:
- id: !extend temp_nspanel
force_update: true
to my living room panel. My understanding is that this leads to 12 samples for the living room and 2 for the bedroom. However, the improvement is the same and was further the same for 4 samples as well. No samples led to 0.2°C width in the Gaussian fit and all the others lead to 0.04 [sic!]. It is not really noise causing the fluctuations, but something else. Please see the following histogram data to illustrate. Top is living room (12 samples) and bottom is bedroom (2 samples):
EDIT: uploaded correct plot
It is even more obvious in the raw data:
It rarely takes 21.1°C as a value and does not scatter around the average value, but rather only jumps up from time to time. While I can only guess the reason for this (weak bit in ADC, nonlinearity in the temperature calculation, internal warming of the chip when doing stuff, decimal cutoff in integer operations...) I am not sure anyone but me is interested in this ;)
My bottom line would be: Sampling improves the accuracy independent [sic!] of the value (2, 4 or 12). I could test
So ALSO doing multiple spread out sample intervals helps more. Doing a measurement every 5 seconds and then averaging over the 12 samples for each minute gave a really smooth graph. It also should introduce virtually no real delay.
next with the floor heating.
I apologize, but I cannot let it go. The temperature increase is (with one exception) equidistantly showing every 15/16 minutes. I am almost convinced that
internal warming of the chip when doing stuff
is causing it. @edwardtfn Does the 15min interval ring a bell with any of the internal operations of the panel? Could also be aliasing of a different interval and the 1min temperature measurements. I would probably test
sensor:
- id: !extend ntc_source
update_interval: 13s
- id: !extend temp_nspanel
force_update: true
Unfortunately, the sun only allows reliable measurements at night...
Interesting findings. I'm in the process of packing to move, so will probably only be able to spend time on this in a few weeks again.
@edwardtfn Does the 15min interval ring a bell with any of the internal operations of the panel?
We don't have any routine in the panel other than the time update every minute (which also updated the Wi-Fi icon).
But some things in the sensor are updated even when the display is sleeping, so depending on what happens on your Home Assistant entities, it could drive to some load to the panel.
And I'm sure the CPU and screen activity will affect the temperature measurement. In other discussion this was already reported.
I will close this just to make my backlog easier to manage, but let's keep the discussion here. The system is currently using 12 samples, but I'm totally open to revert that back or to implement a different approach. Just don't wanna to handle specific needs on the main code. We have customizations for that. 😉
I tried Nickolas' (@grigi) suggestion the make a moving average filter measuring several times a minute, but only reporting the value once per minute. This would address my concern, because it would be as fast as before but supposedly more accurate and, therefore, smoother. I chose 6 times, because we probably cannot overcome the occasional warm up leading to the 0.2°C spikes. I want to keep the 0.1°C accuracy and need at least 5 measurements for that (0.2/5 = 0.04 < 0.05). 6 should be on the safe side.
sensor:
- id: !extend ntc_source
# samples: 1
update_interval: 10s
filters:
- sliding_window_moving_average:
window_size: 6
send_every: 6
- id: !extend temp_nspanel
accuracy_decimals: 2
force_update: true
That worked perfectly. It is amazing, how accurate the measurement can be done with such a simple device. Just for the testing, I set accuracy_decimals: 2
and force_update: true
to allow a better/easier data analysis. Please look at the raw data:
It is possible to track the slow temperature cooldown without heating during the night. If I correct for this decrease and only look the the scattering of the data, I get the following:
Textbook behavior only limited by the resolution :) Finally, I checked for Edward's concern about the blocking of the device.
Component adc took a long time for an operation
I get this every other temperature measurement, but I also get it with vanilla v4.3.1
so my changes did not increase that. I will finally use
sensor:
- id: !extend ntc_source
# samples: 12 already used in v4.3.1
update_interval: 10s
filters:
- sliding_window_moving_average:
window_size: 6
send_every: 6
@edwardtfn It could be used at the default setting, I cannot see any disadvantage other than the possible readjustment of the heater setpoints from (e.g. in my case from cold_tolerance: "0.3"
and hot_tolerance: "0.1"
to values such as cold_tolerance: "0.2"
and hot_tolerance: "0.1"
) If you want to change the main code and you think it is taking load from the panel, I could test for samples: 4
instead of samples: 12
I think we are fine with this Component adc took a long time for an operation
. I don't have it with v4.3.2d1 (which have no changes on the ADC compared to v4.3.1), but ESPHome is using a deprecated ADC driver and they will probably change that soon when migrating to IDF v5, so I won't look at that for now.
If you have some time to try samples: 4
, I think it would be nice, as in some messages above was mentioned this has no impact to the results. Or even samples: 1
(which means no sampling as before).
I will try to summarize my results. The following code is added to the panels' yaml:
sensor:
- id: !extend ntc_source
samples: A
update_interval: B
filters:
- sliding_window_moving_average:
window_size: C
send_every: C
Width denotes the width of the Gaussian fit as depicted in previous posts.
A | B | C | width (°C) | comments | |
---|---|---|---|---|---|
v.4.3.0 | 1 | 60s | 1 | 0.2 | starting point |
v.4.3.1 | 12 | 60s | 1 | <=0.04 | 0.2°C spikes every 15min -> bad for control loops |
(1) | 12 | 10s | 6 | <=0.04 | reference, perfect for accuracy_decimals: 1 |
(2) | 1 | 10s | 6 | 0.05 | slightly worse than (1), pr#6330 not required |
(3) | 4 | 10s | 6 | <=0.04 | as good as (1), but uses less resources |
(4) | 1 | 5s | 12 | <=0.04 | as good as (1), slightly more resources than (3), pr#6330 not required |
I have a couple of side questions here:
accuracy_decimals: 2
? Will that add any real value? Plots will probably look a bit better, but other than this, any value added? The cost may be on the network usage.send_every
and if the value isn't changing, it won't affect the network usage as it isn't sent, but it will be much faster to react to a window opened or some other event.Any thoughts?
No, accuracy of 0.1C is perfectly good. I think that was used as a way to accurately calculate the deviation on his system.
On Sat, 23 Mar 2024, 16:34 Edward Firmo, @.***> wrote:
I have a couple of side questions here:
- Should we start using accuracy_decimals: 2? Will that add any real value? Plots will probably look a bit better, but other than this, any value added? The cost may be on the network usage.
- Should we send more frequently than 1min? This is just about playing with send_every and if the value isn't changing, it won't affect the network usage as it isn't sent, but it will be much faster to react to a window opened or some other event.
Any thoughts?
— Reply to this email directly, view it on GitHub https://github.com/Blackymas/NSPanel_HA_Blueprint/issues/1918#issuecomment-2016541380, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJ7T2AMEUV4UK5H2RDZBRLYZWVKHAVCNFSM6AAAAABEXSNH3GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMJWGU2DCMZYGA . You are receiving this because you were mentioned.Message ID: @.***>
accuracy_decimals: 2
. As Nickolas already pointed out, I only did it for the data analysis, i.e. to be able to plot the histograms and I would switch to accuracy_decimals: 1
during normal use. I do not think we can really measure temperature differences and certainly not temperature with that accuracy using the panel's thermistor. Two decimals would suggest a significance that is not there. In addition, I believe the HA thermostat card always uses 1 decimal anyway. Finally, accuracy_decimals: 2
would require the Gaussian width to be <=0.004 and much more averaging. update_interval: B
and keep window_size: C
and send_every: C
the same.However, for the same measurement accuracy we would have to change update_interval: B and keep window_size: C and send_every: C the same.
Are you sure about that? It shoukd be able to do averaging and still having a send_every < window_size.
hmm, I am not sure for this use case. I wanted the response to a step function also be a step function. But now thinking about it, it does not make much sense for a temperature measurement. If T was 20°C and 19.8°C shortly after, it was certainly 19.9°C in between these measurements.
I updated the table to its final (?!) form. I think (3) is the best solution and the most robust against, e.g., variations of the ADC quality between panels. If the samples
do not make it into ESPHome, (4) is as good but probably a little bit less robust and consumes slightly more resources. I will advertise the PR and share the data here. And I do think send_every
can pretty much take any value deemed useful in line with @edwardtfn comment and as opposed to what I said before.
In a quick test, I compared for 2h each
sensor:
- id: !extend ntc_source
samples: 4
update_interval: 10s
filters:
- sliding_window_moving_average:
window_size: 6
send_every: 6
with the same but send_every: 1
and both without force_update: true
. The 1
led to only 10% more values being reported, so it might be worthwhile to be
much faster to react to a window opened
Enhancement Summary
Filter the temperature reading to make it smoother to make it useful for automations
Detailed Description
The internal temp sensor is noisy, consider smoothening it by adding a
sliding_window_moving_average
filter. The current reading looks like this:If you could e.g. add a simple filter to
id: temp_nspanel
it could help make that smoother. e.g.:Additional Context
Mostly the variation is minute to minute, so I don't think a large window size would be needed. If you really want faster response times you could set
window_size
to 2 and half the read interval to 30s, but I don't need the temp meter to be that responsive, just a little smoother.