BottlecapDave / HomeAssistant-OctopusEnergy

Unofficial Home Assistant integration for interacting with Octopus Energy
https://bottlecapdave.github.io/HomeAssistant-OctopusEnergy/
MIT License

Excessive number of target rate sensor states written to the HA database #997

Closed · gcoan closed this issue 1 month ago

gcoan commented 1 month ago

Describe the bug

There is a broader discussion in #957 about the amount of data stored in the HA database, but this is a specific example for target rate sensors.

I have two target rate sensors, one for a cheap rate 3 hour block and another for a 1.5 hour block. By default I have 14 days of state history stored in HA, with the 3 hour binary sensor resulting in 15,800 records in the states table and the 1.5 hour sensor in 13,500.

Focussing on just the 3 hour sensor, the number of records in the states table does not correspond to the actual state changes HA is recording, far from it. In the last 7 days the history and logfile show just a small number of state changes (see attached screenshots).

Yet in the last 7 days there are 9,566 states table records written (see attached screenshot).

I am no expert on this table, but I suspect that a new states record and state_attributes record is written each time the sensor is evaluated, every minute, because the target times last evaluated attribute changes even though the state and most of the other attributes do not.
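That suspicion can be illustrated with a minimal sketch. The schema below is a deliberately simplified stand-in for the real recorder schema, but it shows the mechanism: because the recorder deduplicates attributes on the full attribute payload, any attribute that ticks every minute (such as a last-evaluated timestamp) forces a fresh row in both tables on every evaluation, even when the state itself never changes.

```python
# Simplified sketch (NOT the real Home Assistant recorder schema)
# showing why a once-a-minute attribute change forces a new row pair
# per evaluation: attributes are deduplicated on the whole payload,
# so a changing timestamp defeats the deduplication entirely.
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE state_attributes (
    attributes_id INTEGER PRIMARY KEY,
    shared_attrs  TEXT UNIQUE
);
CREATE TABLE states (
    state_id      INTEGER PRIMARY KEY,
    state         TEXT,
    attributes_id INTEGER
);
""")

def record(state, attrs):
    """Mimic the recorder: reuse an attributes row only if the JSON matches."""
    blob = json.dumps(attrs, sort_keys=True)
    conn.execute(
        "INSERT OR IGNORE INTO state_attributes (shared_attrs) VALUES (?)", (blob,)
    )
    (attr_id,) = conn.execute(
        "SELECT attributes_id FROM state_attributes WHERE shared_attrs = ?", (blob,)
    ).fetchone()
    conn.execute(
        "INSERT INTO states (state, attributes_id) VALUES (?, ?)", (state, attr_id)
    )

# The state never changes, but a timestamp attribute ticks every minute...
for minute in range(10):
    record("off", {"target_times": [],
                   "last_evaluated": f"2024-09-25T07:{minute:02d}:00"})

states_rows = conn.execute("SELECT COUNT(*) FROM states").fetchone()[0]
attr_rows = conn.execute("SELECT COUNT(*) FROM state_attributes").fetchone()[0]
print(states_rows, attr_rows)  # 10 10: one new row pair per evaluation
```

If the timestamp attribute were held constant, every `record` call after the first would reuse the same `state_attributes` row, which is the behaviour the rest of this issue argues for.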

Reproduction steps

Create a target rate sensor, continuous evaluation type

Leave running for a few days

In SQLite web, execute the following query to see the number of state table records in the HA database:

-- Updated query Dec 2023
SELECT
  COUNT(*) AS cnt,
  COUNT(*) * 100 / (SELECT COUNT(*) FROM states) AS cnt_pct,
  states_meta.entity_id
FROM states
INNER JOIN states_meta ON states.metadata_id=states_meta.metadata_id
WHERE states_meta.entity_id LIKE '%binary_sensor.octopus_energy_target_%'
GROUP BY states_meta.entity_id
ORDER BY cnt DESC

Expected behaviour

In the short term I can purge this sensor to reduce the unnecessary history, but whilst the target times last evaluated attribute is useful to know, it is not worth the volume of history it generates.

Maybe only update this attribute when the target times, next times or sensor state change? i.e. only update the time last evaluated when a meaningful change actually occurs.

Or suppress writing some of the attribute state changes to the recorder database (if possible).
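The first suggestion could be sketched as a guard in the sensor's update path. All names below are hypothetical, they do not reflect the integration's actual internals, but the idea is simply: keep the old timestamp unless something meaningful changed, so the published attribute dict is identical and the recorder has nothing new to write.

```python
# Hypothetical sketch of the suggested guard: only bump the
# last-evaluated attribute when the state, target times or next
# times have actually changed, so quiet evaluations publish an
# identical attribute dict and produce no new database rows.
def evaluate(previous: dict, state: str, target_times: list,
             next_times: list, now: str) -> dict:
    """Return the attribute dict to publish for this evaluation."""
    meaningful_change = (
        previous.get("state") != state
        or previous.get("target_times") != target_times
        or previous.get("next_times") != next_times
    )
    return {
        "state": state,
        "target_times": target_times,
        "next_times": next_times,
        # Keep the old timestamp unless something actually changed.
        "last_evaluated": now if meaningful_change
                          else previous.get("last_evaluated", now),
    }

first = evaluate({}, "off", [], [], "07:00")
second = evaluate(first, "off", [], [], "07:01")
print(first == second)  # True: identical dict, so nothing new to record
```

The trade-off is that "last evaluated" then reports the last meaningful evaluation rather than the last tick, which is what this issue suggests is acceptable.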

Tariff Code

E-1R-AGILE-FLEX-22-11-25-A

Integration Version

12.2.1

Home Assistant Version

2024.9.2

Fresh Install?

Not specified

Home Assistant Logs

See screenshots above

Confirmation

BottlecapDave commented 1 month ago

Hello and sorry for the late response.

I'm not really in control of how often HA writes to the database. It will write whenever any of the attributes change (including omitted attributes, for some reason) or the state changes; that is fundamentally how HA works. If nothing changes (e.g. the target rates stay the same), then it should update at most roughly every 30 minutes, causing around 48 records a day. This is far less than other sensors you probably have within your HA instance, such as energy monitoring sensors, and can't be much lower. HA will also only keep these records for a configurable amount of time (the default is 10 days, I believe).

If the non-omitted attributes don't change, then the record will point to the same attribute table record. It was this that previously caused large database sizes: a new record would be produced every time one attribute changed, which, depending on the volume of attributes, caused the majority of the unnecessary data. That part is more in my control.

gcoan commented 1 month ago

I have identified a very similar issue with sensor.octopus_energy_electricity_[MPAN_serial no]_current_total_consumption

which in my database currently has 20,390 records, or 1 every minute for the last 14 days of history I retain. This one sensor is approximately 1% of the rows in my entire HA states table and 5% of the state_attributes table!

Same problem: the sensor state and attributes are being written every time the 'data last retrieved' attribute is updated, regardless of whether the sensor value has actually changed. Looking at the sensor history, I have for example only imported 0.04 kWh since 7am this morning, but a record has been written every minute (see attached screenshot).

For this particular sensor the value of my import meter isn't of any interest to me, and it only goes back to early June 2024 in the long term stats, so I will reduce the data volume by purging the sensor history, or more likely just disable the entity because I don't need it. But the issue remains for others.
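For reference, the one-off purge mentioned above can be done with Home Assistant's built-in recorder.purge_entities service (Developer Tools → Services). The entity_id below is a placeholder; substitute the real MPAN/serial-specific name:

```yaml
# Delete all recorded history for one entity (placeholder entity_id).
service: recorder.purge_entities
target:
  entity_id: sensor.octopus_energy_electricity_example_current_total_consumption
```

This removes existing rows but does not stop new ones being written; for that, the entity can be disabled or excluded via the recorder's configuration filters.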

gcoan commented 1 month ago

Hi Dave,

Thanks for your reply, and no problem about the time taken to respond; I am sure you have lots of things to deal with.

My subsequent message came after you closed the issue. It drilled into another sensor I have found an excessive number of records for: the total consumption energy sensor. For this one I did look at the state_attributes table, and there were over 20,000 state_attributes records, caused, I believe, by the data last retrieved being a logged attribute, so this sensor has been captured every minute for the past 14 days (my history setting).

The target rate binary sensor is also logging 1 record to state_attributes every minute.

Here's the SQL to show how many records there are in the state_attributes table:

SELECT
  COUNT(*) AS cnt,
  COUNT(*) * 100 / (SELECT COUNT(*) FROM state_attributes) AS cnt_pct,
  states_meta.entity_id
FROM states
INNER JOIN states_meta ON states.metadata_id=states_meta.metadata_id
INNER JOIN state_attributes ON states.attributes_id=state_attributes.attributes_id
WHERE states_meta.entity_id LIKE '%octopus%'
GROUP BY states_meta.entity_id
ORDER BY cnt DESC

Omit the WHERE clause to see all records ordered by size. I am working my way through addressing the noisy power monitoring sensors, which were a big issue, but the Octopus sensors are now almost top of the list.

Could you take a further look please? I don't seem to be able to reopen the issue.

BottlecapDave commented 1 month ago

Thanks for raising this scenario. I'll look at pulling out the data last updated attribute into separate sensors to try and reduce the impact of the consumption sensor.

The target rate binary sensor is also logging 1 record to state_attributes every minute.

This shouldn't be happening. I'm guessing it is because there are no discovered target rates during the times when it's saving every minute (and it is therefore updating the target times last evaluated attribute).

BottlecapDave commented 1 month ago

The above fixes/changes for this issue are now available in v13.0.0.