Open SaintTDI opened 2 months ago
I can confirm this behavior: the Whisper add-on is fully backed up and the backup file is huge. Every model that I tried is in there.
Yep, models should not be saved into the /data folder, because it's part of the backups, which is not useful.
Bump. My full backups also increased by 2.8GB, which means I now have to delete backups frequently to stay within my 32GB disk partition. My system is currently:
HA OS in a Proxmox partition on an x86-64 PC
Core 2024.5.3
Supervisor 2024.05.1
Operating System 12.3
Frontend 20240501.1
I created a Full Backup (System > Backups > Create backup > Full Backup) taking 3840.1MB, downloaded it to my PC, and opened it.
Note the last entry … core_whisper.tar.gz is 2.8GB out of the 3.8GB file. That file contains the “medium.en” Whisper model I am currently using, plus tiny folders for previously used models.
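For anyone who wants to check their own backup without unpacking it by hand, here is a minimal sketch that lists the largest entries in a tar archive. It assumes the downloaded backup is a plain tar file, as described above; the function name `largest_members` is made up for this example.

```python
import tarfile
from typing import List, Tuple


def largest_members(backup_path: str, top: int = 5) -> List[Tuple[str, int]]:
    """Return the `top` largest regular files in a tar archive.

    Each entry is (member name, size in bytes), largest first.
    `largest_members` is a hypothetical helper, not part of Home Assistant.
    """
    # "r:*" lets tarfile detect the compression (none, gzip, ...) by itself.
    with tarfile.open(backup_path, "r:*") as archive:
        sizes = [(m.name, m.size) for m in archive.getmembers() if m.isfile()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]


def format_report(members: List[Tuple[str, int]]) -> str:
    """Render the (name, size) pairs as one line per member, sizes in MB."""
    return "\n".join(f"{size / 1e6:10.1f} MB  {name}" for name, size in members)
```

Calling `print(format_report(largest_members("my_backup.tar")))` (the file name is a placeholder) should show whether the Whisper archive dominates the backup.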
I see the original post says the documentation states the models should be excluded from backups, but they aren't getting excluded.
https://bford.info/cachedir/ is a standard that some backup programs follow: a CACHEDIR.TAG file with a specific first line causes the directory containing it to be excluded from backups.
That could be a mechanism for Whisper to exclude the model files, if the Home Assistant backup were configured to look for these files and exclude the tagged directories. This could also allow a user to specify that they do want the models backed up, if, say, they were moving to a different system and didn't want to redownload the models. Just a thought; it would all take logic to make happen.
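To make the idea concrete, here is a rough sketch of how a backup routine could honor CACHEDIR.TAG. This is not how the Supervisor works today; `is_cache_dir`, `backup_tree` and the `include_caches` switch are names invented for this example. The signature string is the one given in the cachedir specification linked above.

```python
import os
import tarfile
from typing import List

# Signature that must open a valid CACHEDIR.TAG file, per the specification.
CACHEDIR_SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"


def is_cache_dir(path: str) -> bool:
    """True if `path` holds a CACHEDIR.TAG file starting with the signature."""
    tag = os.path.join(path, "CACHEDIR.TAG")
    try:
        with open(tag, "rb") as handle:
            return handle.read(len(CACHEDIR_SIGNATURE)) == CACHEDIR_SIGNATURE
    except OSError:
        return False


def backup_tree(source: str, archive_path: str, include_caches: bool = False) -> List[str]:
    """Archive the files under `source` into a gzip-compressed tar.

    Tagged cache directories are skipped unless `include_caches` is True
    (the hypothetical "I am migrating, keep the models" option).
    Returns the skipped directories, relative to `source`.
    `archive_path` is assumed to live outside `source`.
    """
    skipped: List[str] = []
    with tarfile.open(archive_path, "w:gz") as archive:
        for root, dirs, files in os.walk(source):
            if not include_caches:
                for name in list(dirs):
                    full = os.path.join(root, name)
                    if is_cache_dir(full):
                        dirs.remove(name)  # stops os.walk from descending
                        skipped.append(os.path.relpath(full, source))
            for name in files:
                full = os.path.join(root, name)
                archive.add(full, arcname=os.path.relpath(full, source))
    return skipped
```

In this sketch the add-on would only need to drop a CACHEDIR.TAG file into its model directory. Note that it skips the whole tagged directory, including the tag file itself, so a restore would have to recreate the tag along with re-downloading the models.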
Describe the issue you are experiencing
Yesterday I installed a local voice assist pipeline. Before installing it, each full backup (made automatically by OneDrive Backup) was about 440MB, but since I installed Whisper, Piper and openWakeWord, yesterday's backup was 1GB and this morning's is 1.6GB.
A partial backup of only the Whisper add-on is 1266MB, and after unzipping the file I can see the 3 models that I tried, with some big files (e.g. path: core_whisper\data\models--rhasspy--faster-whisper-medium-int8\blobs).
The Whisper add-on documentation says:
Backups
Whisper model files can be quite large, so they are automatically excluded from backups. The models will be re-downloaded when the backup is restored.
But that doesn't seem to happen.
What type of installation are you running?
Home Assistant OS
Which operating system are you running on?
Home Assistant Operating System
Which add-on are you reporting an issue with?
Whisper
What is the version of the add-on?
2.0.0
Steps to reproduce the issue
System Health information
System Information
Home Assistant Community Store
GitHub API | ok
-- | --
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 5000
Installed Version | 1.34.0
Stage | running
Available Repositories | 1402
Downloaded Repositories | 35

Home Assistant Cloud

logged_in | false
-- | --
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor

host_os | Home Assistant OS 12.1
-- | --
update_channel | stable
supervisor_version | supervisor-2024.03.1
agent_version | 1.6.0
docker_version | 24.0.7
disk_total | 468.7 GB
disk_used | 27.1 GB
healthy | true
supported | true
board | generic-x86-64
supervisor_api | ok
version_api | ok
installed_addons | File editor (5.8.0), Advanced SSH & Web Terminal (17.2.0), Mosquitto broker (6.4.0), Zigbee2MQTT (1.36.1-1), Studio Code Server (5.15.0), Duck DNS (1.16.0), OneDrive Backup (2.3.1), Grocy (0.21.0), ESPHome (2024.3.1), Piper (1.5.0), Whisper (2.0.0), openWakeWord (1.10.0)

Dashboards

dashboards | 7
-- | --
resources | 26
views | 52
mode | storage

Recorder

oldest_recorder_run | 20 March 2024 at 13:28
-- | --
current_recorder_run | 3 April 2024 at 15:23
estimated_db_size | 996.68 MiB
database_engine | sqlite
database_version | 3.44.2

Spotify

api_endpoint_reachable | ok
-- | --

Anything in the Supervisor logs that might be useful for us?
Anything in the add-on logs that might be useful for us?
Additional information
No response