Open cdalexndr opened 2 weeks ago
Related #118139 and #117178
Done some investigation, possible hardware ssd issue. (Note the following commands are run by ssh into HAOS host, eg ssh root@IP -p 22222)
After some time /mnt/data becomes read-only (drive /dev/sdb8)
Seems there are some file system errors:
# tune2fs -l /dev/sdb8
tune2fs 1.46.5 (30-Dec-2021)
Filesystem volume name: hassos-data
Last mounted on: /mnt/data
Filesystem UUID: b432dc1b-9783-4c7c-ae92-7a7327f55dc8
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 7282688
Block count: 29126769
Reserved block count: 1170285
Overhead clusters: 467553
Free blocks: 25275891
Free inodes: 6920345
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Reserved GDT blocks: 146
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Mon Dec 4 15:33:59 2023
Last mount time: Fri Jul 12 18:35:23 2024
Last write time: Fri Jul 12 20:14:04 2024
Mount count: 7
Maximum mount count: -1
Last checked: Wed Jul 10 19:52:57 2024
Check interval: 0 (<none>)
Lifetime writes: 974 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 32
Desired extra isize: 32
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 69ce757a-4283-4998-bbf0-fda383f55458
Journal backup: inode blocks
FS Error count: 8
First error time: Fri Jul 12 18:35:23 2024
First error function: ext4_validate_block_bitmap
First error line #: 421
First error err: EFSCORRUPTED
Last error time: Fri Jul 12 20:14:04 2024
Last error function: ext4_journal_check_start
Last error line #: 84
Last error err: EIO
Checksum type: crc32c
Checksum: 0x9b66f54a
Tried to stop docker daemon then umount and run fsck but I was not successfull:
# systemctl stop docker
Warning: Stopping docker.service, but it can still be activated by:
docker.socket
#
# umount /dev/sdb8
# umount /dev/sdb8
umount: /dev/sdb8: not mounted.
# fsck /dev/sdb8
fsck from util-linux 2.38
e2fsck 1.46.5 (30-Dec-2021)
/dev/sdb8 is in use.
e2fsck: Cannot continue, aborting.
What is the correct way to run fsck on HAOS?
smart data shows no error on drive: smartctl.json
The problem
After some time home assistant UI is not working correctly and there are multiple errors in logs. To recover from this state, I have to manually restart the server. After restart, everything works correctly.
What version of Home Assistant Core has the issue?
core-2024.2.5
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
No response
Link to integration documentation on our website
No response
Diagnostics information
home-assistant.log.1.txt
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
System Information
Home Assistant Cloud
logged_in | false -- | -- can_reach_cert_server | ok can_reach_cloud_auth | ok can_reach_cloud | okHome Assistant Supervisor
host_os | Home Assistant OS 12.0 -- | -- update_channel | stable supervisor_version | supervisor-2024.06.2 agent_version | 1.6.0 docker_version | 24.0.7 disk_total | 109.3 GB disk_used | 12.2 GB healthy | true supported | true board | generic-x86-64 supervisor_api | ok version_api | ok installed_addons | Node-RED (17.0.3), File editor (5.7.0), Let's Encrypt (5.0.14), ESPHome (2024.3.0), Matter Server (5.5.1), Advanced SSH & Web Terminal (17.2.0), Samba NAS (12.2.0-nas1), Network UPS Tools (0.13.0), HDD Tools (1.1.0), InfluxDB (5.0.0), Home Assistant Google Drive Backup (0.112.1)Dashboards
dashboards | 4 -- | -- resources | 0 views | 3 mode | storageRecorder
oldest_recorder_run | June 28, 2024 at 1:15 AM -- | -- current_recorder_run | July 7, 2024 at 9:24 PM estimated_db_size | 83.08 MiB database_engine | sqlite database_version | 3.44.2