Open josephtate opened 3 years ago
Well, I think I have fixed my system.
Not all these steps are necessary, but I thought starting from a clean slate would be faster than preserving zsys history or docker images (for example). Hopefully this can help someone.
I removed /var/lib/docker and all its contents. Docker complicates the zfs layout, and I only run one thing anyway.
zfs promote bpool/BOOT/ubuntu_r20rzf
zfs destroy -R bpool/BOOT/ubuntu_03fo29tr
zfs promote rpool/ROOT/ubuntu_r20rzf/<mountpoint>
apt install --reinstall linux-image-5.4.0-65-generic
to make sure that I had a good initramfs. Besides the normal noise about missing encryption setup (I didn't set up encryption), the output looked OK. There are two warnings that I still need to resolve, I think, but I'll work on those. My rpool/USERDATA/root_
I rebooted, but the zfs-mount service was still failing to come up.
zfs mount -a
was giving me errors about / not being empty.
zpool import -R /system rpool
to mount the rpool.
zpool export rpool
to unmount and export the pool.
When I rebooted again, it was still failing. I checked my canmount and mountpoint properties and found that I had two zfs datasets with / as the mountpoint, so I set one to "none".
zfs mount -a
completed without a core dump or error messages.
BUT I still had problems in systemd: the zsys-commit service was not starting, but the workaround in #112 helped me get that running too.
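The duplicate-mountpoint check described above can be sketched in shell. This is only an illustration under assumptions: `find_duplicate_mountpoints` is a hypothetical helper name (not part of zfs or zsys), and it expects the tab-separated output of `zfs list -H -t filesystem -o name,mountpoint,canmount` on standard input.

```shell
# Sketch: report mountpoints claimed by more than one dataset.
# find_duplicate_mountpoints is a hypothetical helper, not a zfs command.
# Input (stdin): name<TAB>mountpoint<TAB>canmount, one dataset per line.
find_duplicate_mountpoints() {
    awk -F '\t' '
        # Ignore datasets that have no zfs-managed mountpoint or can never be mounted.
        $2 == "none" || $2 == "legacy" || $2 == "-" || $3 == "off" { next }
        {
            count[$2]++
            # Remember each dataset and its canmount value for the report.
            names[$2] = names[$2] " " $1 "(" $3 ")"
        }
        END {
            for (mp in count)
                if (count[mp] > 1)
                    print mp ":" names[mp]
        }
    '
}

# On a live system (needs the zfs tools, usually as root):
#   zfs list -H -t filesystem -o name,mountpoint,canmount | find_duplicate_mountpoints
```

Any mountpoint it prints is shared by at least two datasets. Which one to change is a judgment call; in the case above, one of the two datasets claiming / had its mountpoint set to "none".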
Describe the bug
Perhaps this is a documentation issue, but it's unclear what the admin needs to do after booting grub from an old snapshot to keep their system working smoothly.
I tried to install a Real Time kernel to do some Ubuntu Studio work, but that was unable to load my zfs pools. So I reverted. Now I have two sets of zfs snapshots, and worse still, several zfs and zsys services don't work, zsys boot-prepare segfaults, and I don't have confidence in the system anymore.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I was expecting there to be some sort of zsys permanesce command that would roll back the system zfs states to the current clone and delete the original. Something that would run zfs promote, for example, and delete the other branch.
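No such zsys command is shown here, so the following is only a rough sketch of the "promote the current clone" half of that idea. It assumes the tab-separated output of `zfs list -H -r -t filesystem -o name,origin` for the booted dataset on standard input; `plan_promote` is a hypothetical helper name, and it only prints commands rather than running them.

```shell
# Sketch: print (do not run) a "zfs promote" for every dataset that is a clone.
# plan_promote is a hypothetical helper, not a zsys or zfs command.
# Input (stdin): name<TAB>origin, where origin is "-" for non-clones.
plan_promote() {
    awk -F '\t' '$2 != "-" && $2 != "" { print "zfs promote " $1 }'
}

# On a live system (needs the zfs tools, usually as root):
#   zfs list -H -r -t filesystem -o name,origin rpool/ROOT/ubuntu_r20rzf | plan_promote
```

The printed commands are meant to be reviewed by hand before anything is run. Deleting the other branch afterwards (for example with `zfs destroy -R` on the old dataset) is deliberately left out of the sketch because it is destructive.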
For ubuntu users, please run and copy the following:
ubuntu-bug zsys --save=/tmp/report
/tmp/report content:
I was unable to generate the report as directed:
*** Collecting problem information
The collected information can be sent to the developers to improve the application. This might take a few minutes. .......
*** Problem in zsys
The problem cannot be reported:
This is not an official KDE package. Please remove any third party package and try again.
Press any key to continue...
No pending crash reports. Try --help for more information.
$ cat /etc/os-release
NAME="KDE neon Plasma LTS"
VERSION="5.18"
ID=neon
ID_LIKE="ubuntu debian"
PRETTY_NAME="KDE neon Plasma LTS Edition 5.18"
VARIANT="Plasma LTS Edition"
VERSION_ID="20.04"
HOME_URL="https://neon.kde.org/"
SUPPORT_URL="https://neon.kde.org/"
BUG_REPORT_URL="https://bugs.kde.org/"
LOGO=start-here-kde-neon
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
rpool/ROOT/ubuntu_r20rzf on / type zfs (rw,relatime,xattr,posixacl)
rpool/USERDATA/username_aqwu6c on /home/jtate type zfs (rw,relatime,xattr,posixacl)
rpool/USERDATA/root_03fo29tr on /root type zfs (rw,relatime,xattr,posixacl)
bpool/BOOT/ubuntu_r20rzf on /boot type zfs (rw,nodev,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/games on /var/games type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/www on /var/www type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/log on /var/log type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/lib on /var/lib type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/usr/local on /usr/local type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/snap on /var/snap type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/spool on /var/spool type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/srv on /srv type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/mail on /var/mail type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/lib/dpkg on /var/lib/dpkg type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/lib/NetworkManager on /var/lib/NetworkManager type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/lib/AccountsService on /var/lib/AccountsService type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/lib/apt on /var/lib/apt type zfs (rw,relatime,xattr,posixacl)
rpool/ROOT/ubuntu_r20rzf/var/lib/0b3174a11e50edb014a03ca2efa4fddfa481f781a2ff233c785668d42c3dac72 on /var/lib/docker/zfs/graph/0b3174a11e50edb014a03ca2efa4fddfa481f781a2ff233c785668d42c3dac72 type zfs (rw,relatime,xattr,posixacl)
$ systemctl status zsys*
● zsys-gc.timer - Clean up old snapshots to free space
     Loaded: loaded (/lib/systemd/system/zsys-gc.timer; enabled; vendor preset: enabled)
     Active: active (waiting) since Thu 2021-01-07 00:07:36 EST; 4 days ago
    Trigger: Tue 2021-01-12 23:09:46 EST; 23h left
   Triggers: ● zsys-gc.service

Jan 07 00:07:36 denali.int.dragonstrider.com systemd[1]: Started Clean up old snapshots to free space.

● zsysd.socket - Socker activation for zsys daemon
     Loaded: loaded (/lib/systemd/system/zsysd.socket; enabled; vendor preset: enabled)
     Active: failed (Result: service-start-limit-hit) since Thu 2021-01-07 00:10:22 EST; 4 days ago
   Triggers: ● zsysd.service
     Listen: /run/zsysd.sock (Stream)

Jan 07 00:07:36 denali.int.dragonstrider.com systemd[1]: Listening on Socker activation for zsys daemon.
Jan 07 00:10:22 denali.int.dragonstrider.com systemd[1]: zsysd.socket: Failed with result 'service-start-limit-hit'.

● zsysd.service - ZSYS daemon service
     Loaded: loaded (/lib/systemd/system/zsysd.service; static; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2021-01-07 00:10:22 EST; 4 days ago
TriggeredBy: ● zsysd.socket
   Main PID: 13566 (code=exited, status=2)

Jan 07 00:10:22 denali.int.dragonstrider.com zsysd[13566]: github.com/ubuntu/zsys/vendor/github.com/spf13/cobra.(*Command).Execute(...)
Jan 07 00:10:22 denali.int.dragonstrider.com zsysd[13566]: github.com/ubuntu/zsys/vendor/github.com/spf13/cobra/command.go:864
Jan 07 00:10:22 denali.int.dragonstrider.com zsysd[13566]: main.main()
Jan 07 00:10:22 denali.int.dragonstrider.com zsysd[13566]: github.com/ubuntu/zsys/cmd/zsysd/main.go:36 +0xdb
Jan 07 00:10:22 denali.int.dragonstrider.com systemd[1]: zsysd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jan 07 00:10:22 denali.int.dragonstrider.com systemd[1]: zsysd.service: Failed with result 'exit-code'.
Jan 07 00:10:22 denali.int.dragonstrider.com systemd[1]: Failed to start ZSYS daemon service.
Jan 07 00:10:22 denali.int.dragonstrider.com systemd[1]: zsysd.service: Start request repeated too quickly.
Jan 07 00:10:22 denali.int.dragonstrider.com systemd[1]: zsysd.service: Failed with result 'exit-code'.
Jan 07 00:10:22 denali.int.dragonstrider.com systemd[1]: Failed to start ZSYS daemon service.

● zsys-gc.service - Clean up old snapshots to free space
     Loaded: loaded (/lib/systemd/system/zsys-gc.service; static; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2021-01-11 23:09:46 EST; 26min ago
TriggeredBy: ● zsys-gc.timer
   Main PID: 2164204 (code=exited, status=1/FAILURE)

Jan 11 23:09:46 denali.int.dragonstrider.com systemd[1]: Starting Clean up old snapshots to free space...
Jan 11 23:09:46 denali.int.dragonstrider.com zsysctl[2164204]: level=error msg="couldn't connect to zsys daemon: connection error: desc = \"transport: Error while dialing dial unix /run/zsysd.sock: connect: connection refused\""
Jan 11 23:09:46 denali.int.dragonstrider.com systemd[1]: zsys-gc.service: Main process exited, code=exited, status=1/FAILURE
Jan 11 23:09:46 denali.int.dragonstrider.com systemd[1]: zsys-gc.service: Failed with result 'exit-code'.
Jan 11 23:09:46 denali.int.dragonstrider.com systemd[1]: Failed to start Clean up old snapshots to free space.