Closed HertogArjan closed 4 months ago
@bdraco I tested your addon. Unfortunately I don't have enough space on the device to make it rebuild the entire DB. Is there any way we can do this "offline"? I'm more than willing to just shoot a bunch of SQL statements into sqlite on another box and give it a try.
Here are the SQL statements for an offline migration. Be sure to make a backup first. This code is only safe to run with schema 43 or 44. Do not run it on older databases (from before May 2024).
BEGIN;
PRAGMA foreign_keys=OFF;
COMMIT;
BEGIN;
CREATE TABLE states_temp_1720542205 (
state_id INTEGER NOT NULL,
entity_id CHAR(0),
state VARCHAR(255),
attributes CHAR(0),
event_id SMALLINT,
last_changed CHAR(0),
last_changed_ts FLOAT,
last_reported_ts FLOAT,
last_updated CHAR(0),
last_updated_ts FLOAT,
old_state_id INTEGER,
attributes_id INTEGER,
context_id CHAR(0),
context_user_id CHAR(0),
context_parent_id CHAR(0),
origin_idx SMALLINT,
context_id_bin BLOB,
context_user_id_bin BLOB,
context_parent_id_bin BLOB,
metadata_id INTEGER,
PRIMARY KEY (state_id),
FOREIGN KEY(old_state_id) REFERENCES states (state_id),
FOREIGN KEY(attributes_id) REFERENCES state_attributes (attributes_id),
FOREIGN KEY(metadata_id) REFERENCES states_meta (metadata_id)
);
INSERT INTO states_temp_1720542205 SELECT state_id,entity_id,state,attributes,event_id,last_changed,last_changed_ts,last_reported_ts,last_updated,last_updated_ts,old_state_id,attributes_id,context_id,context_user_id,context_parent_id,origin_idx,context_id_bin,context_user_id_bin,context_parent_id_bin,metadata_id FROM states;
DROP TABLE states;
ALTER TABLE states_temp_1720542205 RENAME TO states;
CREATE INDEX ix_states_metadata_id_last_updated_ts ON states (metadata_id, last_updated_ts);
CREATE INDEX ix_states_last_updated_ts ON states (last_updated_ts);
CREATE INDEX ix_states_context_id_bin ON states (context_id_bin);
CREATE INDEX ix_states_attributes_id ON states (attributes_id);
CREATE INDEX ix_states_old_state_id ON states (old_state_id);
PRAGMA foreign_key_check;
COMMIT;
BEGIN;
PRAGMA foreign_keys=ON;
COMMIT;
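For reference, a minimal sketch (mine, not from the comment above) of how one might apply these statements offline with the sqlite3 CLI; the file names are placeholders, and Home Assistant should be stopped with a backup kept before touching the database:
# Work on a copy so the original stays untouched if anything goes wrong.
cp home-assistant_v2.db home-assistant_v2.db.bak
# Save the statements above to migration.sql, then run them in one session.
sqlite3 home-assistant_v2.db < migration.sql
# Optionally confirm the rebuilt table now only lists the three remaining foreign keys.
sqlite3 home-assistant_v2.db '.schema states'
The PRAGMA foreign_key_check step should print nothing if the copied data is consistent.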
Could not get it to work. I saw nothing in the logs, but the DB still does not record data: I see the sensors update, but as soon as I restart or reboot, all that data is lost. Also, after a period of time the sensors stop updating history.
Same here, very weird. I do have issue117263 in my configuration.yaml and did run the command previously.
The only thing in the log is:
cat home-assistant.log | grep issue117263
2024-07-09 18:34:16.064 WARNING (SyncWorker_1) [homeassistant.loader] We found a custom integration issue117263 which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
For now I have no idea how to properly apply the patch.
Guessing the DB is already "patched", but then manually purging locks up the CPU at 100%, so I'm really not sure.
@bdraco I seem to have a different issue; mine is not to do with the purge. On 2024.7.x it does not record ANY history at all; it seems to store it in memory for a while, but as soon as I restart HA all the current history is lost.
Is this a known issue with older databases (I started using HA around September last year)? Should I open a new issue?
I tried running the integration you created but I did not see anything other than
config # cat home-assistant.log | grep issue117263
2024-07-09 18:18:06.211 WARNING (SyncWorker_0) [homeassistant.loader] We found a custom integration issue117263 which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant
So I take it I do not have the foreign_keys issue?
@bdraco I seem to have different issue
please open a fresh issue report. Thanks
done 121614
Have the same problem. Is there a way to save the history values for the energy dashboard? I could then delete the Home Assistant DB, let it create a new one by doing a restart, and afterwards import the energy data into this new empty DB. But how? I've never tried that.
With all these database and SQL solutions to the problem of the recorder not purging and stopping recording, can I just disable purging, wait for the August version, re-enable purging and it will all sort itself out?
(I have oodles of disc space)
With all these database and SQL solutions to the problem of the recorder not purging and stopping recording, can I just disable purging, wait for the August version, re-enable purging and it will all sort itself out?
(I have oodles of disc space)
I guess you can, I will do the same. Just make sure you have enough space inside Home Assistant. I run HAOS on KVM and the image is set to 32 GB disk size by default, and my DB is around 10 GB now. So I just extended the disk by 30 GB.
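For anyone else on KVM, a hedged sketch of extending the disk (the image path is a placeholder; HAOS grows its data partition on the next boot after the disk is enlarged):
# With the VM shut down, grow the qcow2 image by 30 GB.
qemu-img resize /var/lib/libvirt/images/haos_ova.qcow2 +30G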
Yeah that's what I was hoping too. As much as I'd love to get hands dirty, I'm beyond time poor (yes this is an oxymoron for HA lovers but nevertheless .... :))
Maybe HA is one of the reasons you are so time poor :-)
With all these database and SQL solutions to the problem of the recorder not purging and stopping recording, can I just disable purging, wait for the August version, re-enable purging and it will all sort itself out?
(I have oodles of disc space)
yes, you can add the following to your configuration.yaml to disable the automatic purge:
recorder:
  auto_purge: false
But don't forget to remove this again later ;)
Yeah that's what I was hoping too. As much as I'd love to get hands dirty, I'm beyond time poor (yes this is an oxymoron for HA lovers but nevertheless .... :))
That's not an oxymoron, that's cause and effect 😏
And keep in mind that the next purge may then take quite a while to complete.
See PR #120779; the fix has been moved up to 2024.7.2.
Just to give my two cents regarding disk usage, now that it will come into 2024.7.2 ...
I converted using the custom_component from above. It took ~2 hours. The ~28 GB database grew to ~72 GB during the conversion:
Before (~28.5 GB):
-rw-r--r-- 1 root root 28531658752 Jul 9 08:31 home-assistant_v2.db
-rw-r--r-- 1 root root 163840 Jul 9 08:31 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 9076392 Jul 9 08:31 home-assistant_v2.db-wal
Right after the conversion (~72.8GB):
-rw-r--r-- 1 root root 38595018752 Jul 9 12:17 home-assistant_v2.db
-rw-r--r-- 1 root root 66453504 Jul 9 12:45 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 34212661312 Jul 9 12:45 home-assistant_v2.db-wal
After a manual purge with repack (25.8 GB):
-rw-r--r-- 1 root root 25794158592 Jul 9 12:59 home-assistant_v2.db
-rw-r--r-- 1 root root 66453504 Jul 9 12:59 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 22289232 Jul 9 13:00 home-assistant_v2.db-wal
This is on a 10 year old Core i5 with SATA-SSD.
br Christian
I tried using the custom module, but after a while my Home Assistant stops responding. Now I tried using the queries on a standalone machine, but when running the insert into the temp table query I get the following error: Runtime error near line 1: UNIQUE constraint failed: states_temp_1720542205.state_id (19)
https://github.com/home-assistant/core/releases/tag/2024.7.2
which addresses this issue.
I tried using the custom module, but after a while my Home Assistant stops responding. Now I tried using the queries on a standalone machine, but when running the insert into the temp table query I get the following error: Runtime error near line 1: UNIQUE constraint failed: states_temp_1720542205.state_id (19)
That looks like you have some corruption in your states table. You’ll need to manually remove the duplicate state_id
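Not part of the advice above, but as a hedged sketch, a read-only query like this could confirm whether duplicate state_id values are really present (run it against a copy of the database):
# List state_id values that appear more than once in the states table.
sqlite3 home-assistant_v2.db \
  "SELECT state_id, COUNT(*) FROM states GROUP BY state_id HAVING COUNT(*) > 1 LIMIT 20;"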
That looks like you have some corruption in your states table. You’ll need to manually remove the duplicate state_id
What is the easiest way check/fix this?
That looks like you have some corruption in your states table. You’ll need to manually remove the duplicate state_id
What is the easiest way check/fix this?
That’s not something that can be solved in this issue. Please jump on discord and ask for help in #general
as it will require a much more interactive process and something everyone subscribed to this issue would likely not want to see unfold.
2024.7.2 resolved issue for me.
2024.7.2 resolved issue for me.
same here. I saw that last night there was no gap in my history
For what it is worth, 2024.7.2 seems to be working now and fixed the issue (unless it was something else I did).
I was one of the people who had the issue without any of the known integrations that cause it. I worked around it by rolling back to 2024.6.4, as my database would have grown too big otherwise.
While normally I have the recorder set to keep 4 days of records, I did a manual recorder.purge to keep only 1 day and repacked the DB, which brought its size down from 8 GB to about 3 GB (in fact the recorder was probably only storing about 3 days of data at this time). This took a bit less than half an hour. Of course this was all done on 2024.6.4.
Then I upgraded to 2024.7.2 and again manually ran recorder.purge keeping only 1 day, which seemed to complete without hanging the recorder. So I assume the issue is fixed. It finished very quickly, presumably because not much needed purging since I had only just done it. In a few days, when it is running overnight and purging more, I will know for sure, but it looks good.
Hope that is useful.
For me this issue started with 2024.7.2. Now two nights in a row the recorder hangs at 4:12. No useful information in the logs, except that when I restart HA I see the following:
2024-07-11 07:17:07.219 WARNING (Recorder) [homeassistant.components.recorder.util] The system could not validate that the sqlite3 database at //config/home-assistant_v2.db was shutdown cleanly
2024-07-11 07:17:07.226 WARNING (Recorder) [homeassistant.components.recorder.util] Ended unfinished session (id=232 from 2024-07-10 13:04:46.547373)
Anything in particular I can take a look at?
Looks like the database states table rebuild failed (which might have been due to not having enough disk space for it), and it was never restarted. I executed the migration manually now and ran the recorder purge service manually afterwards, which did complete correctly this time 🎉 Now running a repack, as the database file grew a couple of GBs because of the rebuild 😄
And repack removed those additional GBs and some extra 👍🏻
Great work. Updating to Home Assistant 2024.7.2 fixed the purge problem. Now I am happy with a tiny DB of 300 MB instead of a 3 GB monster. This also reduced CPU load from 10% to 5% 👍
This is on 2024.7.2; it seems to be getting stuck still. I manually restarted at 4:32 to get it to resume.
Confirm: the update from 2024.7.1 to 2024.7.2 did fix the issue for me. With the config setting recorder: auto_purge: true, recording no longer broke at about 4 o'clock. An additional service call recorder.purge finished correctly this morning and shrank my DB by more than 60%. Thanks!
This is on 2024.7.2; it seems to be getting stuck still. I manually restarted at 4:32 to get it to resume.
If you still have a problem, please verify you have enough disk space (you need ~2x the size of the database free -- maybe a bit less but 2x is safest), and restart again.
If that doesn't solve the problem, please download and post your logs https://my.home-assistant.io/redirect/logs
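Regarding the disk-space check above, a hedged sketch of verifying both the free space and the database size (paths assume the default SQLite setup under /config):
# Free space on the volume holding the database.
df -h /config
# Size of the database plus its WAL/SHM side files.
ls -lh /config/home-assistant_v2.db*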
If you still have a problem, please verify you have enough disk space (you need ~2x the size of the database free -- maybe a bit less but 2x is safest), and restart again.
So.. my db file is 16.0 GB and even after removing all the local backup files, I only have 24 GB free. This doesn't seem to have been a problem with previous monthly purges though, so why is it suddenly a problem now?
I manually ran:
service: recorder.purge
data:
  repack: true
And it seems to be the cause of the graph flatlining. It seems to be doing a lot of disk reads:
# iostat -dm sda 60
Linux 6.6.33-haos (homeassistant) 07/11/24 _x86_64_ (8 CPU)
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 181.56 1.17 0.23 2112071 418545
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 8176.10 60.04 0.01 3602 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 7823.22 56.10 0.01 3366 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 7287.07 45.77 0.01 2746 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 8298.15 63.96 0.01 3837 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 7228.28 44.36 0.01 2661 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 7961.37 57.30 0.01 3437 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 7914.95 57.49 0.01 3449 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 7458.41 45.69 0.01 2741 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 8111.10 63.14 0.01 3788 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 7194.98 44.78 0.02 2686 1
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 8069.58 57.09 0.01 3425 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 7678.67 56.21 0.01 3372 0
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 7325.13 44.97 0.01 2698 0
The average appears to be over 50 MB/s, which should take less than 6 minutes to go through 16 GB, but it's been doing this for over 5.5 hours now. Any idea what's going on?
If that doesn't solve the problem, please download and post your logs https://my.home-assistant.io/redirect/logs
I'm not seeing anything in the logs related to the recorder, but here it is:
So.. my db file is 16.0 GB and even after removing all the local backup files, I only have 24 GB free. This doesn't seem to have been a problem with previous monthly purges though, so why is it suddenly a problem now?
24 GB is likely enough. It should not be a problem, but it is always better to have 2x the database size free in case a table rebuild fails.
I don't see any table rebuilds in your logs so either the table rebuild already happened, or there is something wrong with your database that prevents it from getting to that point.
You can check to see if the table rebuild already happened by manually running the following
% sqlite3 home-assistant_v2.db
SQLite version 3.43.2 2023-10-10 13:08:14
Enter ".help" for usage hints.
sqlite> .schema states
CREATE TABLE IF NOT EXISTS "states" (
"state_id" INTEGER NOT NULL ,
"entity_id" CHARACTER(0) NULL ,
"state" VARCHAR(255) NULL ,
"attributes" CHARACTER(0) NULL ,
"event_id" SMALLINT NULL ,
"last_changed" CHARACTER(0) NULL ,
"last_changed_ts" DOUBLE NULL ,
"last_updated" CHARACTER(0) NULL ,
"last_updated_ts" DOUBLE NULL ,
"old_state_id" INTEGER NULL ,
"attributes_id" INTEGER NULL ,
"context_id" CHARACTER(0) NULL ,
"context_user_id" CHARACTER(0) NULL ,
"context_parent_id" CHARACTER(0) NULL ,
"origin_idx" SMALLINT NULL ,
"context_id_bin" BLOB NULL ,
"context_user_id_bin" BLOB NULL ,
"context_parent_id_bin" BLOB NULL ,
"metadata_id" INTEGER NULL , last_reported_ts FLOAT,
PRIMARY KEY ("state_id"),
FOREIGN KEY("old_state_id") REFERENCES "states" ("state_id") ON UPDATE RESTRICT ON DELETE RESTRICT,
FOREIGN KEY("attributes_id") REFERENCES "state_attributes" ("attributes_id") ON UPDATE RESTRICT ON DELETE RESTRICT,
FOREIGN KEY("metadata_id") REFERENCES "states_meta" ("metadata_id") ON UPDATE RESTRICT ON DELETE RESTRICT
);
CREATE INDEX "ix_states_attributes_id" ON "states" ("attributes_id");
CREATE INDEX "ix_states_context_id_bin" ON "states" ("context_id_bin");
CREATE INDEX "ix_states_last_updated_ts" ON "states" ("last_updated_ts");
CREATE INDEX "ix_states_metadata_id_last_updated_ts" ON "states" ("metadata_id", "last_updated_ts");
CREATE INDEX "ix_states_old_state_id" ON "states" ("old_state_id");
sqlite>
If you only see the 3 FOREIGN KEYs, the problem one has been removed and the rebuild is complete.
If it's still there, something else is wrong, and you'll need to enable debug logging for sqlalchemy and restart to get an idea of what is going on.
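A non-interactive variant of the same schema check (my own shorthand, not from the comment above):
# Count the FOREIGN KEY clauses in the current states schema; per the comment
# above, 3 means the problematic events foreign key has already been dropped.
sqlite3 home-assistant_v2.db '.schema states' | grep -c 'FOREIGN KEY'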
I only see 2 FOREIGN KEYs:
# sqlite3 home-assistant_v2.db
SQLite version 3.44.2 2023-11-24 11:41:44
Enter ".help" for usage hints.
sqlite> .schema states
CREATE TABLE states (
state_id INTEGER NOT NULL,
domain VARCHAR(64),
entity_id VARCHAR(255),
state VARCHAR(255),
attributes TEXT,
event_id INTEGER,
last_changed DATETIME,
last_updated DATETIME,
created DATETIME,
old_state_id INTEGER, attributes_id INTEGER, origin_idx INTEGER, context_id VARCHAR(36), context_user_id VARCHAR(36), context_parent_id VARCHAR(36), last_updated_ts FLOAT, last_changed_ts FLOAT, context_id_bin BLOB, context_user_id_bin BLOB, context_parent_id_bin BLOB, metadata_id INTEGER, last_reported_ts FLOAT,
PRIMARY KEY (state_id),
FOREIGN KEY(event_id) REFERENCES events (event_id) ON DELETE CASCADE,
FOREIGN KEY(old_state_id) REFERENCES states (state_id)
);
CREATE INDEX ix_states_old_state_id ON states (old_state_id);
CREATE INDEX ix_states_attributes_id ON states (attributes_id);
CREATE INDEX ix_states_last_updated_ts ON states (last_updated_ts);
CREATE INDEX ix_states_context_id_bin ON states (context_id_bin);
CREATE INDEX ix_states_metadata_id_last_updated_ts ON states (metadata_id, last_updated_ts);
I only see 2 FOREIGN KEYs:
It looks like you don't have this issue. There should be a FK on metadata_id, but it shouldn't cause any issues if it's missing.
It looks like you don't have this issue. There should be a FK on metadata_id, but it shouldn't cause any issues if it's missing.
Then why does the recorder purge cause my graphs to flatline?
It looks like you don't have this issue. There should be a FK on metadata_id, but it shouldn't cause any issues if it's missing.
Then why does the recorder purge cause my graphs to flatline?
While the symptoms are the same, since the problematic foreign key is not there, it's a different problem which will require a different solution. Please start a fresh issue report with debug logs for sqlalchemy turned on.
# Example configuration.yaml entry
logger:
  default: info
  logs:
    sqlalchemy: debug
2024.7.2 has certainly solved the "recorder locking-up" problem, but something is still not right in my system.
The home-assistant_v2.db file grew from 3.2 GB to 3.8 GB after the upgrade, but has been static in size over the last 5 days. Despite having purge_keep_days: 10, I can select data that is 20 days old.
I have the following in configuration.yaml: recorder: purge_keep_days: 10
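To double-check retention, a hedged one-liner (sqlite3 CLI, default database name assumed) that prints the oldest state row still in the database:
# Oldest last_updated_ts converted to a readable UTC timestamp.
sqlite3 home-assistant_v2.db \
  "SELECT datetime(MIN(last_updated_ts), 'unixepoch') FROM states;"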
Hey, I don't really know what to do; I'm having the same problem as above. The recorder stops every day at 4:30 and I have to reboot. I updated to 2024.7.2 and it's the same. When I look in SQLite, the temp table for events exists, so it seems the migration did not complete. When I save the DB and try to do the migration externally, it fails copying the data from the old table to the new one because of "unique constraints". I really do not want to lose all of my energy data. I'm super annoyed that it was all working fine until a recent update and now I'm spending hours trying to fix it. Any ideas of where I might go for help? I tried the custom component but it didn't do anything (I assume it thinks the migration was completed).
I also have the problem, just like jasonwfrancis1
As an added datapoint, I have been having this issue on MariaDB. Upgrading didn't fix it.
I managed to fix it myself; I basically migrated to MariaDB and used a good tutorial to migrate all of the data. A few gaps while I was doing the whole process, but I think I'm back up and running properly. Just kicked off a purge, and it's running... fingers crossed.
Hi @bdraco
After some time, my database keeps increasing and I don't know how to reduce it. I don't know if your fix is working correctly for me.
Hey, I don't really know what to do; I'm having the same problem as above. The recorder stops every day at 4:30 and I have to reboot. I updated to 2024.7.2 and it's the same. When I look in SQLite, the temp table for events exists, so it seems the migration did not complete. When I save the DB and try to do the migration externally, it fails copying the data from the old table to the new one because of "unique constraints". I really do not want to lose all of my energy data. I'm super annoyed that it was all working fine until a recent update and now I'm spending hours trying to fix it. Any ideas of where I might go for help? I tried the custom component but it didn't do anything (I assume it thinks the migration was completed).
Also having this problem, even with 2024.7.2:
# sqlite3 home-assistant_v2.db
SQLite version 3.44.2 2023-11-24 11:41:44
Enter ".help" for usage hints.
sqlite> .schema event_data
CREATE TABLE event_data (
data_id INTEGER NOT NULL,
hash BIGINT,
shared_data TEXT,
PRIMARY KEY (data_id)
);
CREATE INDEX ix_event_data_hash ON event_data (hash);
sqlite> .schema events
CREATE TABLE events (
event_id INTEGER NOT NULL,
event_type VARCHAR(64),
event_data TEXT,
origin VARCHAR(32),
time_fired DATETIME,
created DATETIME,
context_id VARCHAR(36),
context_user_id VARCHAR(36),
context_parent_id VARCHAR(36), data_id INTEGER, origin_idx INTEGER, time_fired_ts FLOAT, context_id_bin BLOB, context_user_id_bin BLOB, context_parent_id_bin BLOB, event_type_id INTEGER,
PRIMARY KEY (event_id)
);
CREATE INDEX ix_events_data_id ON events (data_id);
CREATE INDEX ix_events_time_fired_ts ON events (time_fired_ts);
CREATE INDEX ix_events_context_id_bin ON events (context_id_bin);
CREATE INDEX ix_events_event_type_id_time_fired_ts ON events (event_type_id, time_fired_ts);
sqlite> .tables
event_data schema_changes states_temp_1720627841
event_types state_attributes statistics
events states statistics_meta
migration_changes states_meta statistics_runs
recorder_runs states_temp_1720625711 statistics_short_term
It happens some time after the purge at 4 AM. However, events are no longer recorded after 4 AM.
2024-07-20 08:15:58.622 ERROR (MainThread) [homeassistant.components.recorder.core] The recorder backlog queue reached the maximum size of 255442 events; usually, the system is CPU bound, I/O bound, or the database is corrupt due to a disk problem; The recorder will stop recording events to avoid running out of memory
Until 04:12:xx there are log messages like this:
2024-07-20 04:10:10.268 INFO (Recorder) [sqlalchemy.engine.Engine] [insertmanyvalues 9/11 (ordered; batch not supported)] (None, '90.9', None, None, None, None, None, None, 1721441409.5084128, 1225118537, 12273833, None, None, None, 0, <memory at 0x7f942b3d71c0>, None, None, 185170)
2024-07-20 04:10:10.269 INFO (Recorder) [sqlalchemy.engine.Engine] INSERT INTO states (entity_id, state, attributes, event_id, last_changed, last_changed_ts, last_reported_ts, last_updated, last_updated_ts, old_state_id, attributes_id, context_id, context_user_id, context_parent_id, origin_idx, context_id_bin, context_user_id_bin, context_parent_id_bin, metadata_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) RETURNING state_id
2024-07-20 04:10:10.269 INFO (Recorder) [sqlalchemy.engine.Engine] [insertmanyvalues 10/11 (ordered; batch not supported)] (None, '5.309', None, None, None, None, None, None, 1721441410.0638864, 1225118451, 338851, None, None, None, 0, <memory at 0x7f942b3d5600>, None, None, 4)
2024-07-20 04:10:10.269 INFO (Recorder) [sqlalchemy.engine.Engine] INSERT INTO states (entity_id, state, attributes, event_id, last_changed, last_changed_ts, last_reported_ts, last_updated, last_updated_ts, old_state_id, attributes_id, context_id, context_user_id, context_parent_id, origin_idx, context_id_bin, context_user_id_bin, context_parent_id_bin, metadata_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) RETURNING state_id
2024-07-20 04:10:10.270 INFO (Recorder) [sqlalchemy.engine.Engine] [insertmanyvalues 11/11 (ordered; batch not supported)] (None, '5.309', None, None, None, None, None, None, 1721441410.0642517, 1225118452, 338852, None, None, None, 0, <memory at 0x7f942b3d7640>, None, None, 39)
2024-07-20 04:10:10.271 INFO (Recorder) [sqlalchemy.engine.Engine] COMMIT
Then:
2024-07-20 04:12:00.160 INFO (Recorder) [sqlalchemy.engine.Engine] BEGIN (implicit)
2024-07-20 04:12:00.162 INFO (Recorder) [sqlalchemy.engine.Engine] SELECT states.state_id, states.attributes_id
FROM states
WHERE states.last_updated_ts < ?
LIMIT ? OFFSET ?
2024-07-20 04:12:00.162 INFO (Recorder) [sqlalchemy.engine.Engine] [generated in 0.00029s] (1718849520.132695, 4000, 0)
2024-07-20 04:12:00.285 INFO (Recorder) [sqlalchemy.engine.Engine] UPDATE states SET old_state_id=? WHERE states.old_state_id IN ([many])
2024-07-20 04:12:00.285 INFO (Recorder) [sqlalchemy.engine.Engine] [generated in 0.01305s] (None, 1139875840, 1139875841, 1139875842, 1139875843, 1139875844, 1139875845, 1139875846, 1139875847, 1139875848, 1139875849, 1139875850, 1139875851, 1139875852, 1139875853, 1139875854, 1139875855, 1139875856, 1139875857, 1139875858, 1139875859, 1139875860, 1139875861, 1139875862, 1139875863, 1139875864, 1139875865, 1139875866, 1139875867, 1139875868, 1139875869, 1139875870, 1139875871, 1139875872, 1139875873, 1139875874, 1139875875, 1139875876, 1139875877, 1139875878, 1139875879, 1139875880, 1139875881, 1139875882, 1139875883, 1139875884, 1139875885, 1139875886, 1139875887, 1139875888 ... 3901 parameters truncated ... 1139875790, 1139875791, 1139875792, 1139875793, 1139875794, 1139875795, 1139875796, 1139875797, 1139875798, 1139875799, 1139875800, 1139875801, 1139875802, 1139875803, 1139875804, 1139875805, 1139875806, 1139875807, 1139875808, 1139875809, 1139875810, 1139875811, 1139875812, 1139875813, 1139875814, 1139875815, 1139875816, 1139875817, 1139875818, 1139875819, 1139875820, 1139875821, 1139875822, 1139875823, 1139875824, 1139875825, 1139875826, 1139875827, 1139875828, 1139875829, 1139875830, 1139875831, 1139875832, 1139875833, 1139875834, 1139875835, 1139875836, 1139875837, 1139875838, 1139875839)
2024-07-20 04:12:00.402 INFO (Recorder) [sqlalchemy.engine.Engine] DELETE FROM states WHERE states.state_id IN ([many])
2024-07-20 04:12:00.402 INFO (Recorder) [sqlalchemy.engine.Engine] [generated in 0.00377s] (1139875840, 1139875841, 1139875842, 1139875843, 1139875844, 1139875845, 1139875846, 1139875847, 1139875848, 1139875849, 1139875850, 1139875851, 1139875852, 1139875853, 1139875854, 1139875855, 1139875856, 1139875857, 1139875858, 1139875859, 1139875860, 1139875861, 1139875862, 1139875863, 1139875864, 1139875865, 1139875866, 1139875867, 1139875868, 1139875869, 1139875870, 1139875871, 1139875872, 1139875873, 1139875874, 1139875875, 1139875876, 1139875877, 1139875878, 1139875879, 1139875880, 1139875881, 1139875882, 1139875883, 1139875884, 1139875885, 1139875886, 1139875887, 1139875888, 1139875889 ... 3900 parameters truncated ... 1139875790, 1139875791, 1139875792, 1139875793, 1139875794, 1139875795, 1139875796, 1139875797, 1139875798, 1139875799, 1139875800, 1139875801, 1139875802, 1139875803, 1139875804, 1139875805, 1139875806, 1139875807, 1139875808, 1139875809, 1139875810, 1139875811, 1139875812, 1139875813, 1139875814, 1139875815, 1139875816, 1139875817, 1139875818, 1139875819, 1139875820, 1139875821, 1139875822, 1139875823, 1139875824, 1139875825, 1139875826, 1139875827, 1139875828, 1139875829, 1139875830, 1139875831, 1139875832, 1139875833, 1139875834, 1139875835, 1139875836, 1139875837, 1139875838, 1139875839)
2024-07-20 04:12:00.433 INFO (Recorder) [sqlalchemy.engine.Engine] SELECT states.state_id, states.attributes_id
FROM states
WHERE states.last_updated_ts < ?
And then nothing from the recorder anymore.
I am also still having this issue with 2024.7.4; the recorder hangs after 4 in the morning and a reboot gets it going again. I do however have a longer keep value since some months back: recorder: purge_keep_days: 30 commit_interval: 0
It was still working fine until I updated to 2024.7
Same over here; I'm on 2024.7.4 but the recorder reliably stops at 4am every day until a restart. What debug info can I provide to the effort?
Still having this issue here as well, running 2024.8. I do have the recorder set to keep data for 1 year, since I wanted the history, but as mentioned above (https://github.com/home-assistant/core/issues/117263#issuecomment-2264623334) this was working fine up until last month (07/19 for me, but I can't recall when or what version I may have updated to at that time).
My recorder settings in configuration.yaml: recorder: purge_keep_days: 365
I have the same configuration, with 365 days of retention, but after upgrading to 2024.8 yesterday it seems my recorder didn't die last night at 4am!
Same issue with 2024.8
I have the default configuration and still have the issue on 2024.8
After upgrading to 2024.8 and re-enabling auto_purge, today I see my graphs have stopped updating at 4 am again. The SSD is at 100% and crunching away. After restarting HA, the graphs are back, but with no data from 4 am until the restart.
My recorder settings:
recorder:
  purge_keep_days: 90
  auto_purge: true
  exclude:
    domains:
      - updater
      - media_player
      - device_tracker
      - weather
      - sun
@bdraco tagging you as you provided the fix and closed the issue. Could it be that the issue was not fixed in certain cases, and can it be reopened?
EDIT: added some info
SSD is at 100% and crunching away
Run a manual purge with your retention time (1 year) and wait for it to stop crunching before restarting. Mine has made it 3 days since doing that, still running 2024.07
I wish there was a way to do a manual migration. That would be fine for me.
This problem is solved in 2024.7.2. If the system ran out of disk space and the table rebuild failed, it will try again in 2024.8.1+; see issue https://github.com/home-assistant/core/issues/123348 and solution https://github.com/home-assistant/core/pull/123388
Workaround: Disabling nightly auto purge will prevent the issue from occurring (this is not a long term solution)
Be sure to re-enable auto-purge after installing 2024.7.2 or your database will grow without bounds, and your system will eventually run out of disk space or become sluggish.
Cause: https://github.com/home-assistant/core/issues/117263#issuecomment-2197311144 Solution: https://github.com/home-assistant/core/pull/120779
The problem
Every night at around ~4:10 the histories for all entities stop. This has been happening since at least April 9th. I updated Home Assistant to 2024.4.1 on April 5th, but I can't say for sure if this issue started directly afterwards. A restart of Home Assistant allows recording again but does not restore the history missed since ~4:10. I suspect it has something to do with the Recorder auto purge at 4:12 because the same symptoms happen when the purge is run manually.
I don't think the manual or automatic purge is currently able to finish because the (SQLite) database seems way too large (>6 GB) for my configured purge_keep_days of 7. If I run recorder.purge from the web UI, the same symptoms happen as during the night. By looking at the mtime it is clear home-assistant_v2.db does not get written to anymore. htop shows HA using 100% of one CPU core continuously, and iotop shows HA reading from disk at ~400 MB/s continuously. This went on for at least 25 minutes before I stopped the process.
The logs show nothing unusual happening around 4:12. When I run recorder.purge from the web UI with verbose logging enabled, the logs just show:
When HA is stopped using SIGTERM, the shutdown takes a long time and it is clear from the logs it is waiting for a Recorder task:
See the rest of the relevant messages during shutdown below.
What version of Home Assistant Core has the issue?
core-2024.5.2
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant Core
Integration causing the issue
Recorder
Link to integration documentation on our website
https://www.home-assistant.io/integrations/recorder/#service-purge
Diagnostics information
No response
Example YAML snippet
Anything in the logs that might be useful for us?
Additional information
I thought maybe my database could be corrupted, so with HA shut down I ran
mv home-assistant_v2.db home-assistant_v2_old.db; sqlite3 home-assistant_v2_old.db ".recover" | sqlite3 home-assistant_v2.db
and then tried to run a purge again. Unfortunately the problem was not resolved. My database did shrink by about 1.5 GB.
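For completeness, a hedged sketch of a basic consistency check one could run before or after such a recover (sqlite3 CLI assumed):
# Prints "ok" if SQLite finds no structural corruption in the file.
sqlite3 home-assistant_v2.db "PRAGMA integrity_check;"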