Error executing query: (sqlite3.DatabaseError) database disk image is malformed

The problem

After some time home assistant UI is not working correctly and there are multiple errors in logs. To recover from this state, I have to manually restart the server. After restart, everything works correctly.

What version of Home Assistant Core has the issue?

core-2024.2.5

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

No response

Link to integration documentation on our website

No response

Diagnostics information

home-assistant.log.1.txt

Example YAML snippet

No response

Anything in the logs that might be useful for us?

2024-06-28 04:15:10.200 ERROR (Recorder) [homeassistant.components.recorder.util] Error executing query: (sqlite3.DatabaseError) database disk image is malformed
[SQL: SELECT anon_1.metadata_id, anon_1.state, anon_1.last_updated_ts, anon_1.attributes 
FROM (SELECT anon_2.metadata_id AS metadata_id, anon_2.state AS state, anon_2.last_updated_ts AS last_updated_ts, anon_2.attributes AS attributes 
FROM (SELECT states.metadata_id AS metadata_id, states.state AS state, ? AS last_updated_ts, CASE WHEN (state_attributes.shared_attrs IS NULL) THEN states.attributes ELSE state_attributes.shared_attrs END AS attributes 
FROM states JOIN (SELECT states.metadata_id AS max_metadata_id, max(states.last_updated_ts) AS max_last_updated 
FROM states 
WHERE states.last_updated_ts >= ? AND states.last_updated_ts < ? AND states.metadata_id IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) GROUP BY states.metadata_id) AS anon_3 ON states.metadata_id = anon_3.max_metadata_id AND states.last_updated_ts = anon_3.max_last_updated LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_updated_ts >= ? AND states.last_updated_ts < ? AND states.metadata_id IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)) AS anon_2 UNION ALL SELECT anon_4.metadata_id AS metadata_id, anon_4.state AS state, anon_4.last_updated_ts AS last_updated_ts, anon_4.attributes AS attributes 
FROM (SELECT states.metadata_id AS metadata_id, states.state AS state, states.last_updated_ts AS last_updated_ts, CASE WHEN (state_attributes.shared_attrs IS NULL) THEN states.attributes ELSE state_attributes.shared_attrs END AS attributes 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE (states.last_changed_ts = states.last_updated_ts OR states.last_changed_ts IS NULL) AND states.metadata_id IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) AND states.last_updated_ts > ? AND states.last_updated_ts < ?) AS anon_4) AS anon_1 ORDER BY anon_1.metadata_id, anon_1.last_updated_ts]
[parameters: (0, 1718302302.213129, 1719536999.999999, 247, 267, 55, 54, 238, 395, 48, 50, 51, 52, 53, 18, 19, 20, 21, 22, 23, 235, 241, 244, 24, 340, 341, 132, 342, 136, 137, 343, 139, 141, 140, 344, 345, 346, 146, 147, 347, 348, 150, 349, 350, 153, 154, 155, 156, 351, 159 ... 120 parameters truncated ... 132, 342, 136, 137, 343, 139, 141, 140, 344, 345, 346, 146, 147, 347, 348, 150, 349, 350, 153, 154, 155, 156, 351, 159, 158, 352, 161, 162, 164, 354, 356, 355, 254, 358, 170, 171, 172, 39, 359, 360, 174, 362, 363, 260, 391, 392, 393, 394, 1719536999.999999, 1719537300.0)]
(Background on this error at: https://sqlalche.me/e/20/4xp6)

Additional information

System Information

version	core-2024.2.5
installation_type	Home Assistant OS
dev	false
hassio	true
docker	true
user	root
virtualenv	false
python_version	3.12.1
os_name	Linux
os_version	6.6.16-haos
arch	x86_64
timezone	Europe/Bucharest
config_dir	/config

Home Assistant Cloud

Home Assistant Supervisor

Dashboards

Recorder

Done some investigation, possible hardware ssd issue. (Note the following commands are run by ssh into HAOS host, eg ssh root@IP -p 22222)

After some time /mnt/data becomes read-only (drive /dev/sdb8)

Seems there are some file system errors:

# tune2fs -l /dev/sdb8
tune2fs 1.46.5 (30-Dec-2021)
Filesystem volume name:   hassos-data
Last mounted on:          /mnt/data
Filesystem UUID:          b432dc1b-9783-4c7c-ae92-7a7327f55dc8
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean with errors
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              7282688
Block count:              29126769
Reserved block count:     1170285
Overhead clusters:        467553
Free blocks:              25275891
Free inodes:              6920345
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      146
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Mon Dec  4 15:33:59 2023
Last mount time:          Fri Jul 12 18:35:23 2024
Last write time:          Fri Jul 12 20:14:04 2024
Mount count:              7
Maximum mount count:      -1
Last checked:             Wed Jul 10 19:52:57 2024
Check interval:           0 (<none>)
Lifetime writes:          974 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      69ce757a-4283-4998-bbf0-fda383f55458
Journal backup:           inode blocks
FS Error count:           8
First error time:         Fri Jul 12 18:35:23 2024
First error function:     ext4_validate_block_bitmap
First error line #:       421
First error err:          EFSCORRUPTED
Last error time:          Fri Jul 12 20:14:04 2024
Last error function:      ext4_journal_check_start
Last error line #:        84
Last error err:           EIO
Checksum type:            crc32c
Checksum:                 0x9b66f54a

Tried to stop docker daemon then umount and run fsck but I was not successfull:

#  systemctl stop docker
Warning: Stopping docker.service, but it can still be activated by:
  docker.socket
#
# umount /dev/sdb8
# umount /dev/sdb8
umount: /dev/sdb8: not mounted.
# fsck /dev/sdb8
fsck from util-linux 2.38
e2fsck 1.46.5 (30-Dec-2021)
/dev/sdb8 is in use.
e2fsck: Cannot continue, aborting.

What is the correct way to run fsck on HAOS?

home-assistant / core