Open farcaller opened 3 years ago
I'm getting frequent timeouts too, especially during apt package changes.
I'm using ZFS for Docker which results in a large number of volumes as well.
# zfs list|wc -l
6118
Same errors for me. On top of this zsys-gc fails constantly:
~$ sudo systemctl status zsys-gc
● zsys-gc.service - Clean up old snapshots to free space
     Loaded: loaded (/lib/systemd/system/zsys-gc.service; static; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2022-06-08 11:03:55 CEST; 10h ago
TriggeredBy: ● zsys-gc.timer
    Process: 1374531 ExecStart=/sbin/zsysctl service gc (code=exited, status=1/FAILURE)
   Main PID: 1374531 (code=exited, status=1/FAILURE)

Jun 08 11:03:35 zfs-backup-host systemd[1]: Starting Clean up old snapshots to free space...
Jun 08 11:03:55 zfs-backup-host zsysctl[1374531]: level=error msg="couldn't connect to zsys daemon: timed out waiting for server h>
Jun 08 11:03:55 zfs-backup-host systemd[1]: zsys-gc.service: Main process exited, code=exited, status=1/FAILURE
Jun 08 11:03:55 zfs-backup-host systemd[1]: zsys-gc.service: Failed with result 'exit-code'.
Jun 08 11:03:55 zfs-backup-host systemd[1]: Failed to start Clean up old snapshots to free space.
I'm sure it's the same for you guys; you probably just haven't noticed yet.
I've got a feeling it might be related to the large number of snapshots I have:
$ zfs list -t snapshot | wc -l
12398
Most of these are not on rpool/bpool but an external drive so not sure if it's related. System ones are only:
$ zfs list -t snapshot | grep -v backup | wc -l
391
As far as I understand, zsys shouldn't be messing with snapshots of non-system datasets, but maybe the service crashes while waiting for some output?
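To see which pool the snapshots actually live on, the counts above can be broken down per pool. This is only a sketch: the helper name `count_snapshots_by_pool` is made up for illustration, and it assumes the usual `pool/dataset@snapshot` naming, so the pool is whatever comes before the first `/` or `@`.

```shell
# Count snapshots per pool from a list of snapshot names on stdin.
# Hypothetical helper: the pool is taken as the text before the first "/" or "@".
count_snapshots_by_pool() {
    awk -F'[/@]' 'NF { count[$1]++ } END { for (p in count) print count[p], p }' | sort -rn
}

# Usage (needs ZFS, so it is left commented out here):
# -H drops the header line, -o name prints only the snapshot name.
# zfs list -H -t snapshot -o name | count_snapshots_by_pool
```

If nearly all of the snapshots turn out to be on the external backup pool, that would at least be consistent with the guess that the total number, and not only the system snapshots, is what slows the daemon down.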
I have the same issue, also with a large number of snapshots:
zfs list -t snapshot | wc -l
16863
The workaround I use is:
sudo ./zfs-prune-snapshots -R -v 1M
This wipes all snaps older than 1 month. Daemon works fine after this cleanup.
Is this what you are using?
Yep
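Before pruning, it may be worth listing what an age-based cutoff would select. The following is a rough dry-run sketch, not the zfs-prune-snapshots script itself: the function name `snapshots_older_than` is hypothetical, and it assumes input lines of the form `name<TAB>creation`, with the creation time in epoch seconds.

```shell
# Print the names of snapshots older than a cutoff. Nothing is destroyed here.
# Reads "name<TAB>creation-epoch-seconds" lines on stdin.
# $1 = maximum age in days, $2 = current time in epoch seconds (defaults to now).
snapshots_older_than() {
    days=$1
    now=${2:-$(date +%s)}
    awk -F'\t' -v cutoff="$((now - days * 86400))" '$2 < cutoff { print $1 }'
}

# Usage (needs ZFS, so it is left commented out here):
# -H drops headers, -p prints the creation time as epoch seconds.
# zfs list -H -p -t snapshot -o name,creation | snapshots_older_than 30
```

The output is just a list of snapshot names, so it can be reviewed by hand before anything is passed to `zfs destroy`.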
Thanks @TheGrave for sharing zfs-prune-snapshots. Personally I first delete using docker commands:
docker system prune -a -f --volumes
and afterwards using zfs commands (see the attached clean zfs snapshots.md).
@awhitcroft please locate somebody to subscribe and respond to zsys things
Describe the bug
zsysctl commands fail, e.g.
Interestingly enough, after the server is "primed" by e.g. grpcurl, zsysctl seems to work:
To Reproduce
Having non-trivial zfs volumes (e.g. via containerd) seems to help:
Expected behavior
zsysctl should work, even if slowly
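Until the timeout itself is addressed, a retry wrapper might serve as a stop-gap. It rests on an assumption suggested by the grpcurl observation above, namely that a first call that times out still leaves the daemon running for the next attempt. The `retry` function and the attempt and delay values below are illustrative only.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as one attempt succeeds, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep if another attempt is still to come.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage (hypothetical values; "zsysctl service gc" is the command run by zsys-gc.service):
# retry 3 10 sudo zsysctl service gc
```

This does not make the daemon any faster; it only gives a slow start a second chance.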
For Ubuntu users, please run and copy the following:
the log isn't trivially short, pasted in here
Installed versions: