projg2 / eclean-kernel2

Reboot of eclean-kernel [now defunct in favor of reviving ek1]
BSD 2-Clause "Simplified" License

eclean kernel does not recognize vmlinuz file on arm64 and removes running kernel #19

Open stikonas opened 6 years ago

stikonas commented 6 years ago

Possibly related to EFI stub in the kernel

Version 1.99.4 prints

# eclean-kernel -p
The following kernels would be removed:

== 4.19.0-rc6-16400-g23adc26c1ac2 ==
Rationale:
[-] stale files (no matching kernel)
Files:
- /boot/System.map-4.19.0-rc6-16400-g23adc26c1ac2
- /boot/config-4.19.0-rc6-16400-g23adc26c1ac2
- /boot/initramfs-4.19.0-rc6-16400-g23adc26c1ac2.img

The following command would be run: grub-mkconfig -o /boot/grub/grub.cfg

So all files except vmlinuz are removed.

Older 0.4x.x instead prints Invalid magic for kernel file /boot/vmlinuz-4.19.0-rc1-00117-g5b32735eb348 (!= HdrS)

I've also reported it at https://bugs.gentoo.org/668352 but it was closed (emerge logs not provided). Not sure what logs I should provide as there is nothing else...

mgorny commented 6 years ago

Would you be able to either attach the kernel or at least paste hexdump of its header?

stikonas commented 6 years ago

I uploaded it here

https://stikonas.eu/files/vmlinuz-4.19.0-rc6-16400-g23adc26c1ac2

Note that it's an ARM64 kernel, but I think that's a direct cause of this bug. grub2 can only boot kernels with an EFI stub on arm64.

/boot/dtbs/$KERNEL_VERSION also doesn't seem to be cleaned on ARM....

mgorny commented 6 years ago

Do you happen to have some resources on that file format? I can't find kernel magic bytes in it, and my search engine fails to provide anything useful.

stikonas commented 6 years ago

file seems to report it as MS-DOS executable.

There is a tiny bit of documentation in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/Documentation/efi-stub.txt

mgorny commented 6 years ago

I'm afraid I can't solve this without extra help. I think the EFI stub format is not really meant to be compatible with bzImage, and I can't find any useful documentation on what's inside it.

In a regular bzImage, I grab some offset from header on top, and then locate version relative to that offset. The version string is nicely aligned and separated there.
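The bzImage lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code eclean-kernel actually uses: the function name is hypothetical, and the offsets (the "HdrS" magic at 0x202, the 16-bit kernel_version pointer at 0x20E, and the 0x200 bias added to it) are taken from the kernel's x86 boot protocol documentation rather than from this thread.

```python
import struct

HDRS_OFFSET = 0x202      # "HdrS" magic in the x86 setup header
KVER_PTR_OFFSET = 0x20E  # 16-bit little-endian pointer to the version string
KVER_BIAS = 0x200        # the pointer is relative to this file offset


def bzimage_version(data: bytes):
    """Return the version token of an x86 bzImage, or None if not found.

    Hypothetical helper for illustration only.
    """
    if len(data) < KVER_PTR_OFFSET + 2:
        return None
    if data[HDRS_OFFSET:HDRS_OFFSET + 4] != b"HdrS":
        return None  # this is the "Invalid magic" case
    (ptr,) = struct.unpack_from("<H", data, KVER_PTR_OFFSET)
    if ptr == 0:
        return None  # header carries no version pointer
    start = ptr + KVER_BIAS
    end = data.find(b"\0", start)
    if end == -1:
        return None
    banner = data[start:end].decode("ascii", errors="replace")
    # The string looks like "4.18.16-gentoo (user@host) #1 SMP ...";
    # only the first token is the version used to locate modules.
    return banner.split(" ", 1)[0] or None
```

An arm64 Image has no "HdrS" magic at 0x202, so a lookup like this returns None for it, which matches the "Invalid magic" message from the older version.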

However, in your kernel the version string seems to be preceded by "Linux version" at an aligned offset. Which is weird but might be common to all EFI stub kernels. That's found at offset 0x00F400A0. The offset 0x00F40000 seems to correspond to some data following a large gap of NUL bytes, and I would presume that to be the start of the actual kernel. However, I have no clue if that offset is somehow static or if it should be obtained or calculated from the stub itself.
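For experimenting with the sample kernels, the banner can be located by scanning for the literal prefix instead of relying on a fixed offset. The following is only a heuristic sketch under that assumption: the function name is made up, nothing guarantees the first match is the right one, and a compressed image will not contain the string at all.

```python
import re

# "Linux version " followed by the version token, as seen in the banner.
BANNER = re.compile(rb"Linux version (\S+)")


def scan_linux_banner(data: bytes):
    """Return (offset, version) of the first banner match, or None.

    Hypothetical helper; a plain byte scan, not a real format parser.
    """
    match = BANNER.search(data)
    if match is None:
        return None
    version = match.group(1).decode("ascii", errors="replace")
    return match.start(), version
```

Printing the returned offset in hex for several kernels would make it easy to compare where the banner lands in each image.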

Getting some EFI stub kernels for different architectures and different kernel versions might be helpful. However, I don't really have the time to do this.

stikonas commented 6 years ago

I should be able to compile an EFI kernel on amd64 too. I don't use EFI stubs on amd64, but I can try to enable it temporarily.

I will report back here when done (might not be for a few days)

stikonas commented 6 years ago

This is EFI kernel for amd64 with EFI stub. https://stikonas.eu/files/vmlinuz-4.18.16-gentoo It seems to have different structure at first glance.

Can we look at the MS-DOS executable magic bytes (MZ) and/or the file name?

mgorny commented 6 years ago

The former is not enough, the latter is not always correct. We really need to get the 'internal' kernel version (which is usually a subset of filename) that is used to locate modules.
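To illustrate why the MZ bytes alone do not settle much: an EFI stub kernel starts with MZ on both x86 and arm64, so the check can at best narrow down the container, not yield a version. A rough sketch of magic-based sniffing, assuming the offsets given in the kernel's boot documentation ("HdrS" at 0x202 for x86, "ARM\x64" at byte 56 of the arm64 Image header); the function and tag names are invented for this example:

```python
def sniff_kernel_format(data: bytes) -> set:
    """Return a set of format tags guessed from magic bytes.

    Hypothetical helper; the tags are labels made up for this sketch.
    """
    tags = set()
    if data[:2] == b"MZ":
        tags.add("pe-stub")      # PE/COFF header, i.e. an EFI stub
    if data[0x202:0x206] == b"HdrS":
        tags.add("x86-bzimage")  # x86 setup header magic
    if data[56:60] == b"ARM\x64":
        tags.add("arm64-image")  # arm64 Image header magic
    return tags
```

Under these assumptions, an amd64 EFI stub kernel would be tagged as both "pe-stub" and "x86-bzimage", while an arm64 one would be tagged "pe-stub" and "arm64-image". Neither result says anything about the internal version.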

mgorny commented 6 years ago

Yeah, it looks like amd64 kernel is actually compatible with the standard bzImage format.

stikonas commented 6 years ago

I've also compiled aarch64 kernel without UEFI stub: https://stikonas.eu/files/vmlinuz-4.19.0-rc8-next-20181019-00002-gec6b9cb29aa9

I can't boot it though as grub tells me to recompile with UEFI stub.

mgorny commented 6 years ago

Thanks. That's an interesting data point. It isn't like the x86 format either. Here also 'Linux version ...' can be found but at 0xf400a0 (previously it was 0xf500a0). So there's some offset to be gotten here after all ;-/.

mgorny commented 6 years ago

Ok. Let's try something else. Could you tar and give me — for a single EFI stub arm64 kernel ebuild — the output file, and various generated Image* and vmlinu* files from the build tree? I'd like to figure out how they're being combined.

stikonas commented 6 years ago

https://stikonas.eu/files/images.tar.xz

Keep in mind that tarball contains hidden files too (hidden files have objcopy command that assembles zImage)

mgorny commented 6 years ago

Ok, so I think I know how to get the version from kernel ELF binary. Apparently we can use the vermagic symbol for that, so it's just a matter of processing the symbol table, finding .rodata section and then getting the version string at appropriate offset. I suppose that's all doable with libelf or alike.

The remaining part is to figure out how to find the ELF executable embedded in the kernel image. I suppose there should be a better way than scanning the file for ELF header.

stikonas commented 6 years ago

I think we should also keep in mind that kernel can be gzip compressed (make zinstall target instead of make install). This is an example:

https://stikonas.eu/files/vmlinuz-4.19.0-11708-g7e7fa7512127
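A gzip-compressed image could be handled by checking for the gzip magic bytes (1f 8b) and decompressing before any further inspection. A minimal sketch, assuming the whole file is a plain gzip stream; the function name is hypothetical:

```python
import gzip
import zlib


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed data if it looks like gzip, else data unchanged.

    Hypothetical helper for illustration only.
    """
    if data[:2] != b"\x1f\x8b":
        return data  # no gzip magic, assume an uncompressed image
    try:
        return gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        # Magic matched but the stream is not valid gzip; leave it alone.
        return data
```

The result can then be passed to whatever version lookup applies to the uncompressed format.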

mgorny commented 6 years ago

Hmm, but that's not something you'd boot by EFI, right? ;-)

stikonas commented 6 years ago

> Hmm, but that's not something you'd boot by EFI, right? ;-)

I have now booted this kernel using grub2 which only supports EFI kernels on arm64 (I tried compiling without EFI_STUB and grub2 complained that it only supports booting EFI kernels). I haven't tried this with u-boot's bootefi though.

But u-boot's booti, which boots an uncompressed Image, doesn't seem to be able to boot Image.gz (https://lists.debian.org/debian-arm/2017/02/msg00016.html)