Closed: GrimD closed this issue 6 months ago
Since the new version, I also get this error on Unraid with the VM Backup plugin for my HAOS VM.
After upgrading to 12.1, a full snapshot on Synology Virtual Machine Manager is not possible. It can only make a "Crash Consistent" snapshot.
No problems "snapshotting" the running HA-OS-VM (12.1) with proxmox.
After upgrading to 12.1, a full snapshot on Synology Virtual Machine Manager is not possible. It can only make a "Crash Consistent" snapshot.
Same here, guest tools are running. Filesystem-consistent snapshots are broken on my Synology VMM since the last HAOS update. From what I have read, Synology also uses the QEMU agent.
@GrimD @felge20000 can you check journalctl from the system after taking the snapshot? Use login on the VM terminal to get access to the OS shell, then run:
journalctl -u qemu-guest.service
As others reported, on Proxmox the freeze seems to work nicely with HAOS 12.1:
# journalctl -f -u qemu-guest.service
Mar 15 14:35:20 homeassistant systemd[1]: Started QEMU Guest Agent.
Mar 15 14:38:31 ha-virt-proxmox qemu-ga[358]: info: guest-ping called
Mar 15 14:38:31 ha-virt-proxmox qemu-ga[358]: info: guest-fsfreeze called
Mar 15 14:38:31 ha-virt-proxmox qemu-ga[358]: info: executing fsfreeze hook with arg 'freeze'
Mar 15 14:38:31 ha-virt-proxmox qemu-ga[358]: info: executing fsfreeze hook with arg 'thaw'
@agners Thanks for providing the commands, I'd be at a loss here otherwise.
The output on my system is exactly as you posted, but the time between freezing and thawing has dropped to 0-1 s, instead of the 10-15 s it took while it was working. So the service gets called and seems to think everything's ok?
Working:
After 12.1 Update, not working
From the VMM log: Warning,2024/03/15 16:09:31,USER ,Took a snapshot [GMT-2024.03.15-15.09.23] from virtual machine [Home Assistant] by [USER] without filesystem consistency. Reason: [Filesystem failed to freeze because the virtual machine is busy]
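The freeze-to-thaw gap described above (0-1 s when broken, 10-15 s when working) can be read off a saved copy of the journalctl output instead of being estimated by eye. The following is only a minimal sketch: the function name `freeze_thaw_gaps` and the `year` parameter are made up for illustration, and it assumes the journal's default short timestamp format shown in the log earlier in this thread.

```python
import re
from datetime import datetime

# Matches journal lines such as:
#   Mar 15 14:38:31 ha-virt-proxmox qemu-ga[358]: info: executing fsfreeze hook with arg 'freeze'
HOOK_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) .*"
    r"executing fsfreeze hook with arg '(?P<arg>freeze|thaw)'"
)


def freeze_thaw_gaps(lines, year=2024):
    """Return the seconds between each freeze hook call and the following thaw."""
    gaps = []
    frozen_at = None
    for line in lines:
        match = HOOK_LINE.search(line.strip())
        if match is None:
            continue  # guest-ping and other agent messages are ignored
        # The short journal format has no year, so one is assumed here.
        when = datetime.strptime(f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S")
        if match["arg"] == "freeze":
            frozen_at = when
        elif frozen_at is not None:
            gaps.append((when - frozen_at).total_seconds())
            frozen_at = None
    return gaps


# Lines copied from the journalctl output posted earlier in this thread.
sample = [
    "Mar 15 14:38:31 ha-virt-proxmox qemu-ga[358]: info: guest-ping called",
    "Mar 15 14:38:31 ha-virt-proxmox qemu-ga[358]: info: guest-fsfreeze called",
    "Mar 15 14:38:31 ha-virt-proxmox qemu-ga[358]: info: executing fsfreeze hook with arg 'freeze'",
    "Mar 15 14:38:31 ha-virt-proxmox qemu-ga[358]: info: executing fsfreeze hook with arg 'thaw'",
]
print(freeze_thaw_gaps(sample))
```

A short gap on its own does not prove a failure (the Proxmox log above shows freeze and thaw in the same second on a working system), so the number is best compared against a known-good run on the same hypervisor.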
So the service gets called and seems to think everything's ok?
Hm, yes, this looks as if all is good. You can also call the freeze hook explicitly using:
/usr/libexec/haos-freeze-hook freeze; echo $?
/usr/libexec/haos-freeze-hook thaw; echo $?
(Call both, otherwise the system will stay frozen.)
Warning,2024/03/15 16:09:31,USER ,Took a snapshot [GMT-2024.03.15-15.09.23] from virtual machine [Home Assistant] by [USER] without filesystem consistency. Reason: [Filesystem failed to freeze because the virtual machine is busy]
@felge20000 hm, I don't think your case is really related to the snapshot freeze feature. This seems to be a Synology/hypervisor-related issue to me.
Yep, freezing and thawing manually also worked. So yeah, it seems the hypervisor and the VM don't like to talk to each other any more since the update. But I guess that's really a separate issue from the OP's. I was just "relieved" to see from @Soleima77 that I'm not the only one with Synology and snapshot problems :) So sorry for taking over this topic @GrimD
@felge20000, not sure, it sounds similar to me, but I'm kind of out of my depth as I'm not much of a Linux tech. All I know is that upgrading to HAOS 12.1 breaks it, but it works fine with 12.0. After initially restoring to 12.0, which fixed it, I then allowed it to upgrade again to double-check that it broke (it did) and to get all the logs etc. to log this. I then restored the VM back to before the 12.1 update again (so back to 12.0) and it's happy once more.
@agners I'll need to find some time to install 12.1 again to try what you have asked. But as I have upgraded to 12.1 twice and it breaks each time, and restoring the whole VM disk to before the update resolves it, it certainly seems to be something in the update that's killing it. Will try to test ASAP.
@agners Well, I ended up just doing it now, since thinking about it, it doesn't take long to update and then restore again. So here is the result:
It seems to issue the freeze and then that's it; I left it for about 10 minutes and it still shows the same. You can see the successful freeze/thaw from my backup last night on version 12.0.
After upgrading to 12.1, a full snapshot on Synology Virtual Machine Manager is not possible. It can only make a "Crash Consistent" snapshot.
Also had this problem. Reverted to 12.0 and snapshot creation is OK now.
Same problem here with HAOS 12.1. Debian bookworm host, QEMU/KVM virtualization. Trying to create a snapshot of the VM with
virsh snapshot-create-as --quiesce --disk-only
results in this error from virsh:
error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': fsfreeze hook has failed with status 1
The journalctl -u qemu-agent output on the HAOS guest console is quite similar to the one quoted above by [felge20000] (thaw only 1 s after freeze, too fast). I tried changing the virtual disk attachment from SATA to virtio, which did not help. According to the HAOS changelog, the last QEMU guest agent update (to 8.0.5) happened in 11.1, so this should not be the cause, as 12.0 does not throw this error. Maybe the problem is related to kernel 6.6.20, in which case it should not be limited to HAOS guests, but I did not find anything related on the net.
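Two different failures show up in this thread: the libvirt error "fsfreeze hook has failed with status 1" (Unraid, Debian) and the Synology VMM warning that the VM is busy. When going through backup logs on the hypervisor side, a small filter can tell whether a given message is the freeze hook failure. This is only a sketch; `parse_fsfreeze_failure` is a hypothetical helper name, and the pattern is built solely from the error text quoted in this issue.

```python
import re

# Pattern taken from the virsh / VM Backup plugin error quoted in this issue.
AGENT_ERROR = re.compile(
    r"unable to execute QEMU agent command '(?P<command>[^']+)': "
    r"fsfreeze hook has failed with status (?P<status>\d+)"
)


def parse_fsfreeze_failure(text):
    """Return (agent_command, hook_exit_status), or None if the text is a different error."""
    match = AGENT_ERROR.search(text)
    if match is None:
        return None
    return match["command"], int(match["status"])


message = (
    "error: internal error: unable to execute QEMU agent command "
    "'guest-fsfreeze-freeze': fsfreeze hook has failed with status 1"
)
print(parse_fsfreeze_failure(message))
```

The Synology message ("Filesystem failed to freeze because the virtual machine is busy") returns None here, which matches the observation above that the two cases look different from the hypervisor's point of view.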
Same here with 12.1
Same issue with 12.1 running on unraid kvm. Please ping me if I should run anything that might help figure this one out.
I upgraded to 12.2 and my backups ran successfully.
Thanks for fixing this.
All good here too on 12.2
Thank you.
My Synology VM is also snapshotting happily since 12.2. Thanks to everyone involved <3
I've got this same issue, but with 12.4 for some reason, with HA as a Proxmox VM.
Describe the issue you are experiencing
With HAOS 12.1, trying to snapshot the VM running HAOS for backup fails with the error "unable to execute QEMU agent command 'guest-fsfreeze-freeze': fsfreeze hook has failed with status 1". I tried shutting down the VM, starting it and running the backup again, but got the same error. The QEMU agent is otherwise running correctly. After restoring a backup of the VM, so that it is back on HAOS 12.0, the snapshot and therefore the backup run fine again.
The hypervisor is KVM within Unraid OS.
What operating system image do you use?
ova (for Virtual Machines)
What version of Home Assistant Operating System is installed?
12.1
Did you upgrade the Operating System?
Yes
Steps to reproduce the issue
1. Upgrade to 12.1
2. Try to back up the VM via the VM backup plugin in Unraid
3. ...
Anything in the Supervisor logs that might be useful for us?
Anything in the Host logs that might be useful for us?
System information
System Information
Home Assistant Community Store

GitHub API | ok
-- | --
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 4913
Installed Version | 1.34.0
Stage | running
Available Repositories | 1399
Downloaded Repositories | 4

Home Assistant Cloud

logged_in | true
-- | --
subscription_expiration | 15 March 2024 at 00:00
relayer_connected | true
relayer_region | eu-central-1
remote_enabled | true
remote_connected | true
alexa_enabled | true
google_enabled | true
remote_server | eu-central-1-10.ui.nabu.casa
certificate_status | ready
instance_id | 45fd9f3fa0764ccca8f73de93c3e5ea3
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor

host_os | Home Assistant OS 12.1
-- | --
update_channel | stable
supervisor_version | supervisor-2024.03.0
agent_version | 1.6.0
docker_version | 24.0.7
disk_total | 30.8 GB
disk_used | 9.8 GB
healthy | true
supported | true
board | ova
supervisor_api | ok
version_api | ok
installed_addons | Node-RED (17.0.9), Terminal & SSH (9.10.0), eWeLink Smart Home (1.4.3), Studio Code Server (5.15.0), Z-Wave JS UI (3.4.1), ESPHome (2024.2.2)

Dashboards

dashboards | 3
-- | --
resources | 0
views | 15
mode | storage

Recorder

oldest_recorder_run | 2 March 2024 at 13:57
-- | --
current_recorder_run | 14 March 2024 at 19:34
estimated_db_size | 373.99 MiB
database_engine | sqlite
database_version | 3.44.2

Additional information
Here is the relevant section from the VM Backup plugin log:
2024-03-14 19:39:54 information: DBHA01 can be found on the system. attempting backup.
2024-03-14 19:39:54 information: creating local DBHA01.xml to work with during backup.
2024-03-14 19:39:54 information: /mnt/user/Backups/Unraid/VMs/DBHA01 exists. continuing.
2024-03-14 19:39:54 information: skip_vm_shutdown is false and use_snapshots is 1. skipping vm shutdown procedure. DBHA01 is running. can_backup_vm set to y.
2024-03-14 19:39:54 information: actually_copy_files is 1.
2024-03-14 19:39:54 information: can_backup_vm flag is y. starting backup of DBHA01 configuration, nvram, and vdisk(s).
sending incremental file list
DBHA01.xml
sent 6,401 bytes received 35 bytes 12,872.00 bytes/sec
total size is 6,294 speedup is 0.98
2024-03-14 19:39:55 information: copy of DBHA01.xml to /mnt/user/Backups/Unraid/VMs/DBHA01/20240314_1936_DBHA01.xml complete.
sending incremental file list
43c3c14f-bbb0-e3b1-af5f-fc81c78df426_VARS-pure-efi-tpm.fd
sent 540,951 bytes received 35 bytes 1,081,972.00 bytes/sec
total size is 540,672 speedup is 1.00
2024-03-14 19:39:55 information: copy of /etc/libvirt/qemu/nvram/43c3c14f-bbb0-e3b1-af5f-fc81c78df426_VARS-pure-efi-tpm.fd to /mnt/user/Backups/Unraid/VMs/DBHA01/20240314_1936_43c3c14f-bbb0-e3b1-af5f-fc81c78df426_VARS-pure-efi-tpm.fd complete.
2024-03-14 19:39:55 information: able to perform snapshot for disk /mnt/user/domains/DBHA01/vdisk1.img on DBHA01. use_snapshots is 1. vm_state is running. vdisk_type is qcow2
2024-03-14 19:39:55 information: qemu agent found. enabling quiesce on snapshot.
error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': fsfreeze hook has failed with status 1
2024-03-14 19:39:55 failure: snapshot command failed on vdisk1.snap for DBHA01.
2024-03-14 19:39:57 failure: snapshot_fallback is 0. skipping backup for DBHA01 to prevent data loss. no cleanup will be performed for this vm.