projg2 / eclean-kernel

Installed kernel cleanup tool
GNU General Public License v2.0

Support UKI layout #55

Open pointlessone opened 4 days ago

pointlessone commented 4 days ago

I'm not sure if this should be fixed here or in installkernel. I'd love some input on that from the maintainers.

So, basically, in the efistub layout installkernel puts the kernel into ${efi_root}/EFI/Gentoo/ (where the Gentoo part is configurable), but in the uki layout it puts the kernel into ${efi_root}/EFI/Linux/ (where Linux is hardcoded). As a result, eclean-kernel can find some kernel directories (e.g. with --list-kernels) but not the kernels themselves. It also refuses to remove those directories because it doesn't find any kernels.
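A quick way to see which of the two directories actually holds kernel images is a small listing script. This is only a diagnostic sketch, not part of eclean-kernel or installkernel: the function name find_efi_images, the vendor_dir parameter and the assumption that images carry a .efi suffix are all illustrative.

```python
from pathlib import Path


def find_efi_images(efi_root: Path, vendor_dir: str = "Gentoo") -> dict[str, list[Path]]:
    """List *.efi files under EFI/<vendor_dir> and EFI/Linux (hypothetical helper).

    Assumption: images in both layouts end in ".efi". Adjust the glob
    if your installkernel setup names them differently.
    """
    found: dict[str, list[Path]] = {}
    for name in (vendor_dir, "Linux"):
        directory = efi_root / "EFI" / name
        # A missing directory is reported as an empty list rather than an error.
        found[name] = sorted(directory.glob("*.efi")) if directory.is_dir() else []
    return found


if __name__ == "__main__":
    # /efi is the ESP mount point seen later in this thread; yours may differ.
    for name, images in find_efi_images(Path("/efi")).items():
        print(name, [image.name for image in images])
```

If the images only show up under Linux, the system is using the uki placement described above.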

mgorny commented 4 days ago

CC @Nowa-Ammerlaan

Nowa-Ammerlaan commented 4 days ago

Please try eclean-kernel --layout blspec.

Cleaning UKIs should work (it does for me). The only limitation is that eclean-kernel can only use one layout at a time. There is some auto-detection in place to guess which one you want to use, but it does not always get it right.

pointlessone commented 4 days ago

It doesn't seem to work for me:

# eclean-kernel --layout blspec --list-kernels -D
DEBUG:root:Sorter: <ecleankernel.sort.VersionSort object at 0xffffb1bc3ce0>
DEBUG:root:Layout failed: <class 'ecleankernel.layout.blspec.BlSpecLayout'>; exception: /etc/machine-id not found
usage: eclean-kernel [-h] [-V] [-A] [-l] [-p] [--read-kernel-version KERNEL_PATH] [-b BOOTLOADER] [-L LAYOUT] [-r ROOT] [-a] [-d] [-n NUM] [-s SORT_ORDER] [-D] [-M] [--no-bootloader-update] [--no-kernel-install]
                     [-x EXCLUDE]
eclean-kernel: error: Invalid layout: blspec

I'm not sure what the issue is but I can confirm that I don't have that file.

Nowa-Ammerlaan commented 4 days ago

What version is this? It works fine on my end:

eclean-kernel -aA --layout blspec
Preserving currently running kernel (6.10.12-gentoo-dist)
Legend:
[-] file being removed
[x] file does not exist (anymore)
[+] file being kept (used by other kernels)

pointlessone commented 4 days ago

eclean-kernel 2.99.8

Nowa-Ammerlaan commented 4 days ago

I suppose this is the problem:

        # TODO: according to bootctl(1), we should fall back to IMAGE_ID=
        # and then ID= from os-release
        for path in ("etc/kernel/entry-token", "etc/machine-id"):
            try:
                with open(root / path) as f:
                    self.kernel_id = f.read().strip()
                break
            except FileNotFoundError:
                pass
        else:
            raise LayoutNotFound("/etc/machine-id not found")

pointlessone commented 4 days ago

Yeah, I don't have either of those files. So there seems to be some discrepancy between the blspec and uki layouts in installkernel or somewhere else.
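For illustration, the fallback mentioned in the TODO comment could look roughly like the sketch below. It is not the content of any actual patch to eclean-kernel. The helper names read_os_release_field and guess_entry_token are hypothetical, the os-release parsing is deliberately simplistic (it only strips surrounding quotes), and the lookup order simply follows the TODO: entry-token, machine-id, then IMAGE_ID= and ID= from os-release.

```python
from pathlib import Path
from typing import Optional


def read_os_release_field(root: Path, field: str) -> Optional[str]:
    """Return one field from os-release, or None (hypothetical helper)."""
    # /etc/os-release takes precedence over /usr/lib/os-release.
    for rel in ("etc/os-release", "usr/lib/os-release"):
        try:
            text = (root / rel).read_text()
        except FileNotFoundError:
            continue
        for line in text.splitlines():
            key, sep, value = line.partition("=")
            if sep and key.strip() == field:
                # Simplistic unquoting; real os-release values allow shell escapes.
                return value.strip().strip("\"'") or None
        return None  # file exists but does not define the field
    return None


def guess_entry_token(root: Path) -> Optional[str]:
    """Try the two files first, then IMAGE_ID= and ID= from os-release."""
    for rel in ("etc/kernel/entry-token", "etc/machine-id"):
        try:
            token = (root / rel).read_text().strip()
        except FileNotFoundError:
            continue
        if token:
            return token
    for field in ("IMAGE_ID", "ID"):
        value = read_os_release_field(root, field)
        if value:
            return value
    return None
```

On a system with neither etc/kernel/entry-token nor etc/machine-id, guess_entry_token(Path("/")) would then return the os-release ID instead of failing outright.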

Nowa-Ammerlaan commented 4 days ago

Please let me know if this patch resolves your problem: https://github.com/projg2/eclean-kernel/pull/56

pointlessone commented 1 day ago

Just tried the patch. Sorry for the delay. It still doesn't work, but it fails differently.

# eclean-kernel -aA --layout blspec -D
DEBUG:root:Sorter: <ecleankernel.sort.VersionSort object at 0xffffb05fb410>
DEBUG:root:Layout: <ecleankernel.layout.blspec.BlSpecLayout object at 0xffffb06ebe30>
DEBUG:root:Bootloader failed: <class 'ecleankernel.bootloader.lilo.LILO'>
DEBUG:root:Bootloader failed: <class 'ecleankernel.bootloader.grub2.GRUB2'>
DEBUG:root:Bootloader failed: <class 'ecleankernel.bootloader.grub.GRUB'>
DEBUG:root:Bootloader failed: <class 'ecleankernel.bootloader.yaboot.Yaboot'>
DEBUG:root:Bootloader: <ecleankernel.bootloader.symlinks.Symlinks object at 0xffffb0237560>
DEBUG:root:Unrecognized potential kernel image: PE file /efi/EFI/Linux/gentoo-6.6.51-gentoo-dist-hardened.efi: EOF in section table!
DEBUG:root:in get_removal_list()
Traceback (most recent call last):
  File "/usr/lib/python-exec/python3.12/eclean-kernel", line 8, in <module>
    sys.exit(setuptools_main())
             ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/ecleankernel/__main__.py", line 391, in setuptools_main
    sys.exit(main(sys.argv[1:]))
             ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/ecleankernel/__main__.py", line 251, in main
    removals = get_removal_list(
               ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/ecleankernel/process.py", line 81, in get_removal_list
    raise SystemError(
SystemError: No vmlinuz found. This seems ridiculous, aborting.

Nowa-Ammerlaan commented 1 day ago

Okay, so now it is detecting the layout properly (so the PR does what it should), but there is something weird with this UKI:

DEBUG:root:Unrecognized potential kernel image: PE file /efi/EFI/Linux/gentoo-6.6.51-gentoo-dist-hardened.efi: EOF in section table!

How did you build this, with ukify or dracut?

pointlessone commented 1 day ago

I believe it's built with dracut. At least that's what it looks like when the kernel is being installed.

I wonder if arch is important here. This is an arm64 box.

Nowa-Ammerlaan commented 1 day ago

> I believe it's built with dracut. At least that's what it looks like when the kernel is being installed.

Which version of dracut? And which type of objcopy are you using (binutils or llvm)?

It is a known "problem" that dracut creates UKIs that are slightly different from the ones ukify makes. In any case, this issue is separate from the layout problem, so @mgorny I think we can merge #56.

pointlessone commented 1 day ago

Is there a ticket for that other issue? Do you want me to file one?

Nowa-Ammerlaan commented 23 hours ago

> Is there a ticket for that other issue? Do you want me to file one?

We can re-use this one I think.

The interesting thing is that ukify support was implemented here: https://github.com/projg2/eclean-kernel/pull/47

The particular check that is failing for you now was not touched, suggesting that dracut itself might not be the issue.

In essence, dracut just uses objcopy to build the UKI. We already know that llvm's objcopy sometimes behaves very differently from the binutils version, which could be the cause of your problem if you are on an llvm profile. Another possibility is that this EOF comes directly from the input files.
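To check independently whether the UKI's section table is really truncated, a minimal PE header walk like the one below may help. It is a standalone sketch based on the documented PE/COFF layout, not eclean-kernel's own parser. The function name list_pe_sections and its error messages are made up for this example.

```python
import struct
import sys


def list_pe_sections(data: bytes) -> list[tuple[str, int, int]]:
    """Return (name, raw_offset, raw_size) per section; raise ValueError if truncated."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    coff_offset = pe_offset + 4
    if len(data) < coff_offset + 20:
        raise ValueError("EOF in COFF header")
    # COFF header: Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics (20 bytes in total).
    _, num_sections, _, _, _, opt_size, _ = struct.unpack_from("<HHIIIHH", data, coff_offset)
    table_offset = coff_offset + 20 + opt_size
    sections = []
    for index in range(num_sections):
        entry_offset = table_offset + 40 * index  # each section header is 40 bytes
        if len(data) < entry_offset + 40:
            raise ValueError(f"EOF in section table (entry {index} of {num_sections})")
        # Name, VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData
        name, _, _, raw_size, raw_offset = struct.unpack_from("<8sIIII", data, entry_offset)
        sections.append((name.rstrip(b"\0").decode("ascii", "replace"), raw_offset, raw_size))
    return sections


if __name__ == "__main__":
    with open(sys.argv[1], "rb") as image:
        for name, offset, size in list_pe_sections(image.read()):
            print(f"{name:<10} offset={offset:#x} size={size:#x}")
```

Run it with the path to the .efi file as the only argument. If it lists sections such as .linux and .initrd without raising, the file's section table is at least structurally complete, and the failure is more likely in how the image is being read.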

pointlessone commented 22 hours ago

Oh, I forgot to tell you the versions.

dracut: 103-r4
objcopy: binutils 2.42-r1