Open Mariusthvdb opened 9 months ago
Hey there @thomdietrich, mind taking a look at this issue as it has been labeled with an integration (statistics
) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
statistics documentation statistics source (message by IssueLinks)
The problem is almost certainly in https://github.com/home-assistant/core/blob/92842c28d212479d9a977b989cbe8b5ca6aee50e/homeassistant/components/statistics/sensor.py#L548
Its calling _add_state_to_queue
which for a power sensor that updates every 5s in a 24h period will be 17280 calls.
It likely needs an optimized version for restoring data
The whole integration is doing a lot of datetime math could likely be all floats instead since we don't need local time
forgot to post the yaml configs, sorry, do you need these too, or is it a generic issue
btw, the power sensor updates way more frequently than every 5 seconds... it is per state change of the smart meter, and that can be more than once per second so it seems. guess you can safely multiply by 5 there
so, this is using a template sensor (UI) based on 2 integrations (reading smart meter (mqtt) and solar(mqtt)),
state: >
{{expand('sensor.zp_actuele_opbrengst','sensor.netto_verbruik')
|selectattr('state','is_number')
|map(attribute='state')
|map('float')|sum}}
then fed into a filter:
- unique_id: bruto_stroom_verbruik_filter
platform: filter
entity_id: sensor.bruto_stroom_verbruik
filters:
- filter: outlier
window_size: 5
radius: 1.5
- filter: time_simple_moving_average
window_size: '00:00:30'
precision: 3
- filter: range
lower_bound: 0
and finally the statistics:
- unique_id: standby_power_laatste_24_uur
platform: statistics
entity_id: sensor.bruto_stroom_verbruik_filter
state_characteristic: percentile
percentile: 30
max_age:
hours: 24
precision: 3
also, maybe consider, add a polling/update frequency option. seems unnecessary to update this frequently, once a minute should suffice as absolute minimum, maybe even lower.
think trigger template, or scan_interval:
Aware this has been discussed before, maybe revisit that?
This one is still a problem even after https://github.com/home-assistant/core/pull/112127 in 2024.4
Hey guys, this is a difficult issue. Generally the statistics integration is NOT optimized for huge sampling settings. I did always try to highlight that to users, however I can see that the warning did not make it into docs. Of course we can agree that this is not a valid justification or solution.
Recently I didn't have the time to add further improvements but I had planned to implement optimizations per characteristic (e.g. for the max
value we need to store only one value, the largest one so far). For the percentile
an optimization is not trivial.
( If I may go a bit on a tangent. Some time ago I did discuss the nature of the statistics component with one of the core developers. The component provides timeseries data processing capabilities the main HA database does not provide. Because of that the statistics component needs to maintain its own data storage and implements many typical arithmetic functions. It is my humble opinion that the HA database (a traditional relational database) should be replaced by one of the amazing open source timeseries databases out there. This would solve many core issues (and feature requests) and performance challenges, and the statistics component would be nearly unnecessary. My recommendation was InfluxDB but there might be an even better fit out there. Working on the statistics component always felt like building a house on muddy soil. )
Anyhow. @bdraco back when you worked on https://github.com/home-assistant/core/pull/110438 did you compare performance/memory of an extreme use case?
Anyhow. @bdraco back when you worked on #110438 did you compare performance/memory of an extreme use case?
I was only looking at improving the performance of the 1 day of date with a sample rate of 5s ~17280 samples
The problem
running debugpy reveals several remarkable timings during setup. writing this up per request.
What version of Home Assistant Core has the issue?
2024.2.x
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
statistics
Link to integration documentation on our website
https://www.home-assistant.io/integrations/statistics/
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
No response