Closed tigoesnumb3rs closed 5 years ago
Hey,
can you please provide more information about your setup? Are you using two pools (bpool and rpool) or a single pool? What is the output of zedenv get?
Best regards Markus
Hey,
Yes, I'm using two pools. I'm actually using a slightly modified version of your zedenv-setup. I did use zedenv and zedenv-grub from johnramsden, not from your repo; other than that it should be pretty similar to your original script.
~$ zfs list
NAME USED AVAIL REFER MOUNTPOINT
bpool 212M 3.64G 96K /
bpool/BOOT 211M 3.64G 96K none
bpool/BOOT/env 203M 3.64G 96K none
bpool/BOOT/env/zedenv-001-2019-10-28 56K 3.64G 141M legacy
bpool/BOOT/env/zedenv-002-2019-10-29 8K 3.64G 141M legacy
bpool/BOOT/env/zedenv-default 203M 3.64G 141M legacy
bpool/BOOT/grub 7.22M 3.64G 7.22M legacy
rpool 119G 273G 96K /
rpool/ROOT 3.58G 273G 96K none
rpool/ROOT/001-2019-10-28 8K 273G 3.11G legacy
rpool/ROOT/002-2019-10-29 8K 273G 3.50G legacy
rpool/ROOT/default 3.58G 273G 3.51G legacy
rpool/home 115G 273G 96K /home
rpool/home/root 328K 273G 328K /root
rpool/home/user 115G 273G 96K none
rpool/home/user/.cache 22.7M 273G 22.7M /home/user/.cache
rpool/home/user/downloads 96K 273G 96K /home/user/downloads
rpool/home/user/opt 96K 273G 96K /home/user/opt
rpool/home/user/src 96K 273G 96K /home/user/src
rpool/home/user/tmp 100K 273G 100K /home/user/tmp
rpool/home/user/uni 1.25M 273G 1.25M /home/user/uni
rpool/home/user/usr 115G 273G 115G /home/user/usr
rpool/opt 2.57M 273G 2.57M /opt
rpool/var 201M 273G 96K none
rpool/var/cache 188M 273G 188M /var/cache
rpool/var/lib 576K 273G 96K none
rpool/var/lib/docker 96K 273G 96K /var/lib/docker
rpool/var/lib/flatpak 96K 273G 96K none
rpool/var/lib/lxc 96K 273G 96K none
rpool/var/lib/lxcfs 96K 273G 96K none
rpool/var/lib/lxd 96K 273G 96K none
rpool/var/log 12.2M 273G 12.2M legacy
rpool/var/snap 96K 273G 96K none
rpool/var/spool 128K 273G 128K legacy
rpool/var/tmp 168K 273G 168K legacy
The output of zedenv get is:
~$ zedenv get
PROPERTY VALUE
org.zedenv:bootloader grub
org.zedenv.grub:boot /boot
Could you try adding the following before line 690:
print(f"be_root=${self.be_root}\nboot_environment=${grub_entry.boot_environment}")
Then run grub-mkconfig and post the output.
Has this worked for you before when using zedenv activate?
zedenv activate runs through without any errors, but I think it didn't activate the boot environment.
The output of grub-mkconfig is:
~$ sudo grub-mkconfig
Sourcing file `/etc/default/grub'
Generating grub configuration file ...
#
# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#
### BEGIN /etc/grub.d/00_header ###
if [ -s $prefix/grubenv ]; then
set have_grubenv=true
load_env
fi
if [ "${next_entry}" ] ; then
set default="${next_entry}"
set next_entry=
save_env next_entry
set boot_once=true
else
set default="0"
fi
if [ x"${feature_menuentry_id}" = xy ]; then
menuentry_id_option="--id"
else
menuentry_id_option=""
fi
export menuentry_id_option
if [ "${prev_saved_entry}" ]; then
set saved_entry="${prev_saved_entry}"
save_env saved_entry
set prev_saved_entry=
save_env prev_saved_entry
set boot_once=true
fi
function savedefault {
if [ -z "${boot_once}" ]; then
saved_entry="${chosen}"
save_env saved_entry
fi
}
function recordfail {
set recordfail=1
# GRUB lacks write support for zfs, so recordfail support is disabled.
}
function load_video {
if [ x$feature_all_video_module = xy ]; then
insmod all_video
else
insmod efi_gop
insmod efi_uga
insmod ieee1275_fb
insmod vbe
insmod vga
insmod video_bochs
insmod video_cirrus
fi
}
terminal_input console
terminal_output console
if [ "${recordfail}" = 1 ] ; then
set timeout=30
else
if [ x$feature_timeout_style = xy ] ; then
set timeout_style=menu
set timeout=2
# Fallback normal timeout code in case the timeout_style feature is
# unavailable.
else
set timeout=2
fi
fi
if [ $grub_platform = efi ]; then
set timeout=30
if [ x$feature_timeout_style = xy ] ; then
set timeout_style=menu
fi
fi
### END /etc/grub.d/00_header ###
### BEGIN /etc/grub.d/05_debian_theme ###
set menu_color_normal=white/black
set menu_color_highlight=black/light-gray
#set_background_image "images/tile.png";
set menu_color_normal=white/black
set menu_color_highlight=black/light-gray
if background_color 0,0,0; then
clear
fi
### END /etc/grub.d/05_debian_theme ###
### BEGIN /etc/grub.d/05_zfs_linux.py ###
/boot/zfsenv/zedenv-default
/boot/zfsenv/zedenv-001-2019-10-28
/boot/zfsenv/zedenv-002-2019-10-29
Traceback (most recent call last):
File "/etc/grub.d/05_zfs_linux.py", line 852, in <module>
for en in Generator().generate_grub_entries():
File "/etc/grub.d/05_zfs_linux.py", line 688, in generate_grub_entries
self.grub_boot_on_zfs, self.grub_boot_device)
File "/etc/grub.d/05_zfs_linux.py", line 110, in __init__
self.be_root, self.boot_environment)
File "/usr/lib/python3.6/posixpath.py", line 94, in join
genericpath._check_arg_types('join', a, *p)
File "/usr/lib/python3.6/genericpath.py", line 149, in _check_arg_types
(funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'NoneType'
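The exception itself is easy to reproduce in isolation. A minimal sketch, assuming (as the thread later confirms) that the boot environment name ends up as None before it reaches os.path.join:

```python
import os

be_root = "rpool/ROOT"   # value reported by the debug prints later in this thread
boot_environment = None  # hypothesis: parsing the BE name failed and returned None

try:
    os.path.join(be_root, boot_environment)
except TypeError as e:
    # Message similar to: join() argument must be str or bytes, not 'NoneType'
    print(e)
```

This matches the traceback: the crash happens inside posixpath.join's argument type check, so one of the two values passed at 05_zfs_linux.py line 110 must be None.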
Is that with the print statement?
Yes. It doesn't seem to make a difference whether the print is there or not.
When I run grub-mkconfig for the first time after booting, I get:
### BEGIN /etc/grub.d/05_zfs_linux.py ###
/boot/zfsenv/zedenv-default
/boot/zfsenv/zedenv-001-2019-10-28
/boot/zfsenv/zedenv-002-2019-10-29
Traceback (most recent call last):
File "/etc/grub.d/05_zfs_linux.py", line 852, in <module>
for en in Generator().generate_grub_entries():
File "/etc/grub.d/05_zfs_linux.py", line 688, in generate_grub_entries
self.grub_boot_on_zfs, self.grub_boot_device)
File "/etc/grub.d/05_zfs_linux.py", line 110, in __init__
self.be_root, self.boot_environment)
File "/usr/lib/python3.6/posixpath.py", line 94, in join
genericpath._check_arg_types('join', a, *p)
File "/usr/lib/python3.6/genericpath.py", line 149, in _check_arg_types
(funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'NoneType'
After that I only get:
### BEGIN /etc/grub.d/05_zfs_linux.py ###
Traceback (most recent call last):
File "/etc/grub.d/05_zfs_linux.py", line 852, in <module>
for en in Generator().generate_grub_entries():
File "/etc/grub.d/05_zfs_linux.py", line 688, in generate_grub_entries
self.grub_boot_on_zfs, self.grub_boot_device)
File "/etc/grub.d/05_zfs_linux.py", line 110, in __init__
self.be_root, self.boot_environment)
File "/usr/lib/python3.6/posixpath.py", line 94, in join
genericpath._check_arg_types('join', a, *p)
File "/usr/lib/python3.6/genericpath.py", line 149, in _check_arg_types
(funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'NoneType'
regardless of whether the print is there or not.
Sorry, I think I got the wrong location; can you try this at L684?
print(f"i=${i}\nj=${j}")
diff:
diff --git a/grub.d/05_zfs_linux.py b/grub.d/05_zfs_linux.py
index ac4816d..95c685d 100755
--- a/grub.d/05_zfs_linux.py
+++ b/grub.d/05_zfs_linux.py
@@ -681,6 +681,8 @@ class Generator:
key=functools.cmp_to_key(Generator.kernel_comparator))
for j in kernels_sorted:
+ print(f"i=${i}\nj=${j}")
+
grub_entry = GrubLinuxEntry(
os.path.join(i['directory'], j), self.grub_os, self.be_root, self.rpool,
self.genkernel_arch, i, self.grub_cmdline_linux,
Sure, the output of the print is this (added some \n's for readability):
i=${'directory': '/boot/zfsenv/zedenv-002-2019-10-29',
'files': ['initrd.img-5.0.0-1015-oem-osp1',
'memtest86+.elf',
'System.map-5.0.0-32-generic',
'config-5.0.0-32-generic',
'memtest86+_multiboot.bin',
'vmlinuz-5.0.0-32-generic',
'grub',
'config-5.0.0-1015-oem-osp1',
'efi',
'initrd.img-5.0.0-32-generic',
'System.map-5.0.0-1015-oem-osp1',
'zfsenv',
'memtest86+.bin',
'vmlinuz-5.0.0-1015-oem-osp1'],
'kernels': ['vmlinuz-5.0.0-32-generic',
'vmlinuz-5.0.0-1015-oem-osp1']}
j=$vmlinuz-5.0.0-32-generic
Traceback (most recent call last):
...
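As a side note, the $ in the printed output is literal: Python f-strings interpolate with {}, not ${}, so f"i=${i}" prints a dollar sign followed by the interpolated value, which is why the output reads i=${'directory': ...}. A quick check:

```python
# ${i} in an f-string is a literal "$" plus the interpolated {i}
i = {"directory": "/boot/zfsenv/zedenv-002-2019-10-29"}
print(f"i=${i}")  # prints: i=${'directory': '/boot/zfsenv/zedenv-002-2019-10-29'}
```

The later debug diff in this thread uses plain {i}, which is why its output has no leading $. The stray $ is cosmetic only and unrelated to the crash.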
I'm not sure if the following information helps, but zedenv list looks like this for me:
Name Active Mountpoint Creation
default NR / Mon-Oct-28-22:17-2019
001-2019-10-28 - Mon-Oct-28-22:56-2019
002-2019-10-29 - Tue-Oct-29-0:34-2019
002-2019-10-29 is the most recently created Boot Environment.
When I create a new Boot Environment (zedenv create 003-2019-10-31) and then run the grub-mkconfig command again, it seems to crash at the same point, still in 002-2019-10-29 and not in 003-2019-10-31, even though the newly created Boot Environment seems to be found:
/boot/zfsenv/zedenv-default
/boot/zfsenv/zedenv-001-2019-10-28
/boot/zfsenv/zedenv-002-2019-10-29
/boot/zfsenv/zedenv-003-2019-10-31
i=${'directory': '/boot/zfsenv/zedenv-002-2019-10-29',
'files': ['initrd.img-5.0.0-1015-oem-osp1',
'memtest86+.elf',
'System.map-5.0.0-32-generic',
'config-5.0.0-32-generic',
'memtest86+_multiboot.bin',
'vmlinuz-5.0.0-32-generic',
'grub',
'config-5.0.0-1015-oem-osp1',
'efi',
'initrd.img-5.0.0-32-generic',
'System.map-5.0.0-1015-oem-osp1',
'zfsenv',
'memtest86+.bin',
'vmlinuz-5.0.0-1015-oem-osp1'],
'kernels': ['vmlinuz-5.0.0-32-generic',
'vmlinuz-5.0.0-1015-oem-osp1']}
j=$vmlinuz-5.0.0-32-generic
Traceback (most recent call last):
...
Strange, the trace makes it seem like None is being passed to join, but everything looks OK so far.
Can you try this?
diff --git a/grub.d/05_zfs_linux.py b/grub.d/05_zfs_linux.py
index ac4816d..666a184 100755
--- a/grub.d/05_zfs_linux.py
+++ b/grub.d/05_zfs_linux.py
@@ -681,6 +681,19 @@ class Generator:
key=functools.cmp_to_key(Generator.kernel_comparator))
for j in kernels_sorted:
+ print(f"i={i}\n")
+ print(f"j={j}\n")
+ print(f"self.grub_os={self.grub_os}\n")
+ print(f"self.be_root={self.be_root}\n")
+ print(f"self.rpool={self.rpool}\n")
+ print(f"self.genkernel_arch={self.genkernel_arch}\n")
+ print(f"self.grub_cmdline_linux={self.grub_cmdline_linux}\n")
+ print(f"self.grub_cmdline_linux_default={self.grub_cmdline_linux_default}\n")
+ print(f"self.grub_devices={self.grub_devices}\n")
+ print(f"self.default={self.default}\n")
+ print(f"self.grub_boot_on_zfs={self.grub_boot_on_zfs}\n")
+ print(f"self.grub_boot_device={self.grub_boot_device}\n")
+
grub_entry = GrubLinuxEntry(
os.path.join(i['directory'], j), self.grub_os, self.be_root, self.rpool,
self.genkernel_arch, i, self.grub_cmdline_linux,
/boot/zfsenv/zedenv-default
/boot/zfsenv/zedenv-001-2019-10-28
/boot/zfsenv/zedenv-002-2019-10-29
/boot/zfsenv/zedenv-003-2019-10-31
i={'directory': '/boot/zfsenv/zedenv-002-2019-10-29', 'files': ['initrd.img-5.0.0-1015-oem-osp1', 'memtest86+.elf', 'System.map-5.0.0-32-generic', 'config-5.0.0-32-generic', 'memtest86+_multiboot.bin', 'vmlinuz-5.0.0-32-generic', 'grub', 'config-5.0.0-1015-oem-osp1', 'efi', 'initrd.img-5.0.0-32-generic', 'System.map-5.0.0-1015-oem-osp1', 'zfsenv', 'memtest86+.bin', 'vmlinuz-5.0.0-1015-oem-osp1'], 'kernels': ['vmlinuz-5.0.0-32-generic', 'vmlinuz-5.0.0-1015-oem-osp1']}
j=vmlinuz-5.0.0-32-generic
self.grub_os=Ubuntu GNU/Linux
self.be_root=rpool/ROOT
self.rpool=rpool
self.genkernel_arch=x86_64
self.grub_cmdline_linux=
self.grub_cmdline_linux_default=
self.grub_devices=['/dev/mapper/root']
self.default=
self.grub_boot_on_zfs=True
self.grub_boot_device=['/dev/nvme0n1p2']
Traceback (most recent call last):
...
Very odd, looks OK. I'll do some more investigation and get back to you. Thanks for your patience.
I can confirm the bug on my system. It seems that the name of a boot environment should not contain a dash.
zedenv create 003-2019-10-31 --> FAILS
zedenv create 003 --> OK
zedenv create test-test --> FAILS
@tigoesnumb3rs: Can you please test creating a new boot environment without a dash in the name?
@mkessler001 With a dash do you get this error?
Yes, the error occurs when a new boot environment contains a dash in the name (e.g. 003-2019-10-31 or test-test). Is this reproducible on your system?
If this error does not appear on your system, I would assume that it has something to do with parsing the name for the separate boot pool (bpool/BOOT/env/zedenv-001-2019-10-28).
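The exact parsing code isn't shown in this thread, but a pattern that refuses dashes inside the name would reproduce the symptom exactly. The regex below is purely hypothetical, for illustration only, not the actual zedenv-grub code:

```python
import re

def be_name(dataset):
    # Hypothetical parser: take everything after "zedenv-" that contains
    # no further dash. Fails (returns None) on dashed BE names.
    m = re.match(r".*/zedenv-([^-]+)$", dataset)
    return m.group(1) if m else None

print(be_name("bpool/BOOT/env/zedenv-003"))             # 003
print(be_name("bpool/BOOT/env/zedenv-003-2019-10-31"))  # None
```

A None returned here and later handed to os.path.join would produce exactly the TypeError from the tracebacks above.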
It doesn't occur on my system, no. I have dashes in most of my envs.
@mkessler001 Where in the sequence does it crash for you? If it's in a different location, could you open a new issue in either zedenv or zedenv-grub, depending on where it occurs. Running with --verbose may give you an idea.
The pull request should fix this. Please report back if this solved your issue.
Best regards Markus
@tigoesnumb3rs could you confirm if this fixes the issue?
If you don't want to pull the new code, you could just change L440 to the following:
Yes, I can confirm, the error is gone.
Thanks a lot for your help!
Hey,
I'm afraid I might have been a bit premature. A few hours ago I pulled the latest version and ran grub-mkconfig and then grub-update.
The error was indeed gone, but I then wasn't able to reboot successfully. Since I'm currently not at home, I only have some mobile networking and a live stick; the following is roughly what happened (as well as I can remember):
Before installing I had 4 Boot Environments: the one I actively use (default) as well as others, which I used as snapshots and which were named by the date I created them. When booting, grub first presented me with an entry that contained boot environments, then with the snapshot boot environments. There I could select which environment to boot from. Then, after entering the LUKS password, I was asked to choose from a list of boot environments; in one case the default boot environment was gone, and in all other cases (meaning in different Boot Environments previously selected in grub) this list was empty.
I don't recall specifically which one worked, but with some combination of the above I got into a systemd rescue shell; something went wrong with the zfs systemd services.
My ZFS layout currently looks weird as well:
NAME USED AVAIL REFER MOUNTPOINT
bpool 212M 3.64G 96K /mnt
bpool/BOOT 211M 3.64G 96K none
bpool/BOOT/env 204M 3.64G 96K none
bpool/BOOT/env/zedenv-001-2019-10-28 72K 3.64G 141M legacy
bpool/BOOT/env/zedenv-002-2019-10-29 72K 3.64G 141M legacy
bpool/BOOT/env/zedenv-003-2019-10-31 72K 3.64G 141M legacy
bpool/BOOT/env/zedenv-default 203M 3.64G 141M legacy
bpool/BOOT/grub 7.21M 3.64G 7.21M legacy
rpool 117G 274G 96K /mnt
rpool/ROOT 3.71G 274G 96K none
rpool/ROOT/001-2019-10-28 8K 274G 3.11G /mnt
rpool/ROOT/002-2019-10-29 8K 274G 3.50G legacy
rpool/ROOT/003-2019-10-31 8K 274G 3.54G legacy
rpool/ROOT/default 3.70G 274G 3.54G /mnt/mnt
rpool/ROOT/default_2019-10-28-22-241202 4.99M 274G 3.11G none
rpool/ROOT/default_2019-10-31-21-679644 5.28M 274G 3.53G none
I suddenly have rpool/ROOT/default_2019-10-28-22-241202 and rpool/ROOT/default_2019-10-31-21-679644, which I never created. I think I was able to boot into one of those, and somehow the corresponding files in bpool were missing.
I'll update this and add more information when I'm home in a few days. My setup is pretty much taken from here; maybe it is reproducible in a virtual machine.
Please give me a moment. I've found the bug and I'm going to commit a fix soon.
Please integrate the fix committed in 0a4f806 and remove the ZFS datasets that were accidentally created by booting a snapshot:
❯ zfs destroy rpool/ROOT/default_2019-10-28-22-241202
❯ zfs destroy rpool/ROOT/default_2019-10-31-21-679644
This happened because the Python regex included the trailing "@" from the output of the command:
❯ grub-mkrelpath /boot
/boot/env/zedenv-test-BE@
The output of the regex parsing is used to specify the root mount in /etc/grub/grub.cfg
:
WRONG: linux /boot/env/zedenv-test-BE@/vmlinuz-4.15.0-66-generic root=ZFS=rpool/root/test-BE@ rw
OK: linux /boot/env/zedenv-test-BE@/vmlinuz-4.15.0-66-generic root=ZFS=rpool/root/test-BE rw
Setting root=ZFS=...@ leads to the behavior that you are asked to select a snapshot to boot.
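In other words, the fix amounts to not letting the trailing "@" (an empty snapshot suffix) survive into the root=ZFS= dataset name. A simplified sketch of that sanitisation, assuming a hypothetical helper rather than the exact code in commit 0a4f806:

```python
import re

def dataset_for_root(mkrelpath_output, pool="rpool", root="root"):
    # Hypothetical helper: derive the root= dataset from grub-mkrelpath
    # output, e.g. "/boot/env/zedenv-test-BE@" -> "rpool/root/test-BE".
    be = mkrelpath_output.rsplit("/", 1)[-1]  # last path component
    be = re.sub(r"^zedenv-", "", be)          # drop the zedenv- prefix
    be = re.sub(r"@$", "", be)                # strip trailing "@" (the fix)
    return f"{pool}/{root}/{be}"

print(dataset_for_root("/boot/env/zedenv-test-BE@"))  # rpool/root/test-BE
```

With the "@" stripped, grub.cfg gets root=ZFS=rpool/root/test-BE instead of root=ZFS=rpool/root/test-BE@, so GRUB boots the dataset directly rather than prompting for a snapshot.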
I've just updated zedenv-grub and removed the zfs datasets. It seems to work fine again!
Thank you so much for your help!
No problem, I'm glad to help. Enjoy your ZFS-based system.
Hey,
when running update-grub, I get this error:
I'm using zedenv version 0.3.7, zedenv-grub version 0.1.7 and pyzfscmds version 0.1.5 on Ubuntu 18.04.