FOGProject / fos

FOG Operating System

Sapphire Rapids CPU won't boot using 6.6.34 bzImage #88

Open mmarcini opened 6 days ago

mmarcini commented 6 days ago

A Sapphire Rapids CPU won't EFI boot using the 6.6.34 kernel.

The BIOS is:

BIOS Information
    Vendor: Intel Corporation
    Version: SE5C741.86B.01.02.0001.2401260138

The issue also exists with a follow-on CPU.

An initial workaround was to disable virtualization in the BIOS:

under Processor: Virtualization and X2APIC

under IO: VT-D

Future SKUs with a larger core count may also need to be limited to 56 cores in the BIOS.

mmarcini commented 6 days ago

The same kernel build with the attached config file boots fine on a RHEL 9.3 server.

config-6.6.34.txt

The config file was built by using /boot/config-5.14.0-362.8.1.el9_3.x86_64 as the starting .config and then running make olddefconfig.

mastacontrola commented 5 days ago

I'm not sure what problem you're trying to solve.

These kernels aren't for booting on RHEL 9.3 at all, but rather for FOG's initrd system.

mmarcini commented 5 days ago

> I'm not sure what problem you're trying to solve.
>
> These kernels aren't for booting on RHEL 9.3 at all, but rather for FOG's initrd system.

I wanted to prove that the hardware "could" boot. I generated a 6.6.34 kernel based on a non-FOS config just to confirm that.

I have a config file patch that adds some specific Intel IOMMU settings, which I will present in a pull request shortly. I'm just doing some final debugging with a USB stick. To find the missing config settings, I looked at the differences between kernelx64.config and the attached config.
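For anyone repeating that comparison, here is a minimal Python sketch of diffing two kernel config files. It is not the procedure used in this thread. The sample option names are real Kconfig symbols, but their values are illustrative only and are not the actual contents of kernelx64.config or config-6.6.34.txt. Treating an option that is absent from a file as "n" is a simplifying assumption. The kernel tree also ships scripts/diffconfig for the same job.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_configs(base_text, other_text):
    """Return {option: (base_value, other_value)} for options that differ.

    Assumption: an option missing from one file is treated as 'n'.
    """
    base = parse_config(base_text)
    other = parse_config(other_text)
    changes = {}
    for name in sorted(set(base) | set(other)):
        old, new = base.get(name, "n"), other.get(name, "n")
        if old != new:
            changes[name] = (old, new)
    return changes


# Illustrative inputs only; NOT the real configs from this issue.
FOS_SAMPLE = """\
CONFIG_X86_X2APIC=y
# CONFIG_INTEL_IOMMU is not set
CONFIG_E1000E=y
"""

RHEL_SAMPLE = """\
CONFIG_X86_X2APIC=y
CONFIG_INTEL_IOMMU=y
CONFIG_IRQ_REMAP=y
CONFIG_E1000E=m
"""

if __name__ == "__main__":
    for name, (old, new) in diff_configs(FOS_SAMPLE, RHEL_SAMPLE).items():
        print(f"{name}: {old} -> {new}")
```

In practice the two strings would be read from the real files, and the resulting list would still need a manual pass to decide which differences matter for booting.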

mmarcini commented 5 days ago

BTW, I noticed that on the master branch the kernel headers used for init generation and the bzImage kernel are out of sync: 6.1.62 vs. 6.6.34.

Is that intentional?

mastacontrola commented 5 days ago

Yes, the inits show up as 6.1.62 while the kernel configs use 6.6.34 directly. The init generation can build a kernel as part of the FS, though generally we don't worry too much about this. I believe there are parts where we needed to build the kernel alongside the FS for things that must be modular because of FS->kernel interactions, where we cannot build the piece directly into the kernel.

That said I think it's nothing to worry about at this point. That can be fixed or adjusted as necessary later on.

Looking at what I could of the config you provided, I see you have set a lot of the items to modules. While this can work, the idea of these kernels is to provide everything necessary built in. Modules aren't really something we've dug into, since we have to build the kernels independently of the initrd. Is there any way you could limit the config changes to those required to make the kernel build and operate?

Here are the changes I see currently: configdiffs.diff.txt
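As a rough illustration of the "modules versus built in" point, the Python sketch below lists the options a config sets to `m` and rewrites them to `y`. The function names and the sample config are hypothetical and do not come from the attached files. A rewritten config would still need a pass through make olddefconfig so Kconfig can resolve dependencies, and not every option is guaranteed to work built in.

```python
def modular_options(config_text):
    """Return the sorted names of options set to 'm' in kernel .config text."""
    names = []
    for line in config_text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name.startswith("CONFIG_") and value == "m":
            names.append(name)
    return sorted(names)


def modules_to_builtin(config_text):
    """Return the config text with every 'CONFIG_X=m' line changed to '=y'.

    All other lines, including comments, are passed through unchanged.
    """
    out = []
    for line in config_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.startswith("CONFIG_") and value.strip() == "m":
            line = f"{name}=y"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sample; NOT taken from the configs attached to this issue.
SAMPLE = """\
CONFIG_INTEL_IOMMU=y
CONFIG_E1000E=m
CONFIG_NVME_CORE=m
# CONFIG_DEBUG_INFO is not set
"""

if __name__ == "__main__":
    print("modular:", ", ".join(modular_options(SAMPLE)))
    print(modules_to_builtin(SAMPLE), end="")
```

Listing the `=m` entries first makes it easier to review which ones are actually needed before converting any of them.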

mastacontrola commented 5 days ago

I will work on seeing if I can get the headers up to date with the 6.6.34 version.

mastacontrola commented 5 days ago

I've updated the build script to use the 2024.5 version of Buildroot and updated the FS configs to bump the binutils and gcc versions, along with the headers for 6.6 (Buildroot uses 6.6.32 right now).

A release is building right now.

mmarcini commented 5 days ago

> Looking at what I could of the config you provided, I see you have set a lot of the items to modules. While this can work, the idea of these kernels is to provide everything necessary built in. Modules aren't really something we've dug into, since we have to build the kernels independently of the initrd. Is there any way you could limit the config changes to those required to make the kernel build and operate?

I'm in the process of doing that now!

mmarcini commented 4 days ago

kernelx64.config.txt

Here is a FOS config file that boots. I have this in a fork ready to go. Just strip off the .txt suffix.

It is relative to the 6.6.34 kernel.

I was delayed because the BMC KVM stopped working. The SOL did work, but I'm guessing the kernel command line needs serial console settings that the USB GRUB setup didn't have. Running ipmitool mc reset cold brought the KVM back.

jmalanto commented 4 days ago

I just wanted to add that the issue we were seeing was the same as https://forums.fogproject.org/topic/16993/client-hangs-at-efi-stub. The change resolves the issue without disabling VT and X2APIC in the BIOS.

mmarcini commented 4 days ago

> I just wanted to add that the issue we were seeing was the same as https://forums.fogproject.org/topic/16993/client-hangs-at-efi-stub. The change resolves the issue without disabling VT and X2APIC in the BIOS.

I don't see a change other than the BIOS detours, which we know about. Did I miss something in my quick pass through?

We need the BIOS settings in place for post image use.

I suspect the config changes could be condensed a bit.

mastacontrola commented 4 days ago

Sorry for the typo on the name, but I have pushed this into the branch and am building an experimental as we speak.

Thank you @mmarcini