linuxboot / heads

A minimal Linux that runs as a coreboot or LinuxBoot ROM payload to provide a secure, flexible boot environment for laptops, workstations and servers.
https://osresearch.net/
GNU General Public License v2.0

W541 dGPU unable to boot Qubes #1607

Open TrustExecutor opened 9 months ago

TrustExecutor commented 9 months ago

Please identify some basic details to help process the report

A. Provide Hardware Details

1. What board are you using (see list of boards here)?

I have a W540 and use the w541-hotp-maximized board. The model number is 20BH-A01YMS. The dGPU is an Nvidia K2100M.

2. Does your computer have a dGPU or is it iGPU-only?

3. Who installed Heads on this computer?

4. What PGP key is being used?

5. Are you using the PGP key to provide HOTP verification?

B. Identify how the board was flashed

1. Is this problem related to updating heads or flashing it for the first time?

2. If the problem is related to an update, how did you attempt to apply the update?

3. How was Heads initially flashed

4. Was the board flashed with a maximized or non-maximized/legacy rom?

5. If Heads was externally flashed, was IFD unlocked?

C. Identify the rom related to this bug report

1. Did you download or build the rom at issue in this bug report?

3. If you built your rom, which repository:branch did you use?

4. What version of coreboot did you use in building?

5. In building the rom where did you get the blobs?

Please describe the problem

See PR: https://github.com/linuxboot/heads/pull/1606

Describe the bug

I built with CONFIG_USE_OPTION_TABLE=y set in the coreboot config file and then ran nvramtool -w enable_dual_graphics=Enable. The card showed up in lspci on Debian 12 without any vbios blobs, but Qubes gets a kernel panic at boot. If I set it to disable, Qubes boots normally.
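For anyone reproducing this, the toggle can be wrapped in a small helper. This is only a sketch: the option name and the `Enable` value come from the report above, the `Disable` spelling is an assumption about how the option table names the other enum value, and the helper names are made up for illustration. By default it only builds the command; pass `dry_run=False` on the machine itself (nvramtool needs root) to actually run it.

```python
import subprocess

# Option name as given in the report above.
OPTION = "enable_dual_graphics"


def nvramtool_write_cmd(value):
    """Build the argv for `nvramtool -w OPTION=value`."""
    return ["nvramtool", "-w", f"{OPTION}={value}"]


def nvramtool_read_cmd():
    """Build the argv for `nvramtool -r OPTION` (prints the current value)."""
    return ["nvramtool", "-r", OPTION]


def set_dual_graphics(enabled, dry_run=True):
    """Return the write command and, unless dry_run is set, execute it.

    "Disable" is an assumed spelling of the off value; check the values
    nvramtool accepts on your build before relying on it.
    """
    cmd = nvramtool_write_cmd("Enable" if enabled else "Disable")
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

Keeping a dry-run default makes it harder to flip the flag by accident on a machine that then needs a CMOS reset to recover.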

I inspected the backup rom with UEFITool and found two Intel VGA BIOS images with PCI ID 8086,0406 and a number of Nvidia VGA BIOS images. Two of the Nvidia images have the same PCI ID (10de,11fc) that lspci prints under the stock firmware. The Intel PCI ID under the stock firmware was 8086,0416.
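The PCI ID of an extracted image can also be read directly from the file. The sketch below assumes the file is a legacy PCI option ROM that starts at offset 0: a 0x55AA signature, a little-endian pointer at offset 0x18 to the "PCIR" data structure, and the vendor and device IDs at bytes 4 and 6 of that structure. The function name is made up for illustration, and only the first image in the file is inspected.

```python
import struct
from typing import Tuple


def read_option_rom_ids(rom: bytes) -> Tuple[int, int]:
    """Return (vendor_id, device_id) from the first image of a PCI option ROM."""
    if len(rom) < 0x1A or rom[0:2] != b"\x55\xaa":
        raise ValueError("missing 0x55AA option ROM signature")
    # Offset 0x18 holds a little-endian pointer to the PCI Data Structure.
    (pcir_off,) = struct.unpack_from("<H", rom, 0x18)
    if rom[pcir_off:pcir_off + 4] != b"PCIR":
        raise ValueError("missing PCIR signature")
    # Vendor ID and device ID follow the 4-byte "PCIR" signature.
    vendor_id, device_id = struct.unpack_from("<HH", rom, pcir_off + 4)
    return vendor_id, device_id
```

Usage would look like `read_option_rom_ids(open("vbios.rom", "rb").read())`, where `vbios.rom` is a placeholder file name.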

I then created another board and included vbios binaries for both iGPU and dGPU in the coreboot config.

I also tried adapting the script from the w530/t530 board targets, which downloads and extracts vbios blobs from a Lenovo W540/W541 BIOS update .exe using vbiosfinder, innoextract, and rom-parser. See the script in the PR. The script works and extracts two verified blobs, but the hashes of the output files differ from the ones I extracted from the backup rom with UEFITool.
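A hash mismatch only says that the bytes differ somewhere, not where. A minimal sketch for narrowing that down is shown here; the function names are made up for illustration, and it works on byte strings already read from the two files.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a blob."""
    return hashlib.sha256(data).hexdigest()


def first_difference(a: bytes, b: bytes):
    """Return the first offset where a and b differ, or None if identical.

    If one blob is a prefix of the other, the offset returned is the
    length of the shorter blob.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))
    return None
```

If the first difference sits exactly at the end of the shorter file, the two blobs may differ only in trailing data; a difference near the start points to genuinely different images.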

The iGPU blob has the generic Intel PCI ID 8086,0406. I tried using rom-fixer to change the PCI ID to 8086,0416, which is what lspci on Debian reports (I also changed it in the coreboot config), without luck. Same result.
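For reference, changing a device ID in a legacy option ROM involves two steps: rewriting the ID in the PCIR structure and restoring the rule that all bytes of the image sum to zero modulo 256. The sketch below is not a reimplementation of rom-fixer. It assumes a legacy image at offset 0 and, as a further assumption, that the last byte of the image is free to use as the checksum byte; other tools may adjust a different byte.

```python
import struct


def patch_device_id(rom: bytes, new_device_id: int) -> bytes:
    """Return a copy of rom with the PCIR device ID replaced and checksum fixed."""
    out = bytearray(rom)
    if len(out) < 0x1A or out[0:2] != b"\x55\xaa":
        raise ValueError("missing 0x55AA option ROM signature")
    (pcir_off,) = struct.unpack_from("<H", out, 0x18)
    if out[pcir_off:pcir_off + 4] != b"PCIR":
        raise ValueError("missing PCIR signature")
    # Device ID lives at offset 6 of the PCI Data Structure.
    struct.pack_into("<H", out, pcir_off + 6, new_device_id)
    # Byte 2 of the header gives the image length in 512-byte units.
    image_len = out[2] * 512
    if image_len == 0 or image_len > len(out):
        raise ValueError("implausible image length in header")
    # Assumption: the final byte of the image can serve as checksum byte.
    out[image_len - 1] = 0
    out[image_len - 1] = (-sum(out[:image_len])) % 256
    return bytes(out)
```

With the IDs from this thread the call would be `patch_device_id(data, 0x0416)`. A correct ID and checksum only make the image acceptable to the loader; they do not guarantee the VBIOS code itself matches the hardware.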

After the kernel panic in Qubes I get beeps and LED flashes when using the blobs extracted from the Lenovo .exe. When this happens I need to pull the CMOS battery to reset, then boot again (insecure boot) to set the nvramtool flag back to disabled.

tlaurion commented 9 months ago

@TrustExecutor

Reworked your branch at https://github.com/tlaurion/heads/tree/w541-dgpu so that CI provides artifacts to download and test, which rules out host tool issues.

Also see comment on your PR https://github.com/linuxboot/heads/pull/1606#issuecomment-1946627492

Direct download links for CI produced rom images for commit 20df526 and related changes https://github.com/linuxboot/heads/compare/master...tlaurion:heads:w541-dgpu :

tlaurion commented 9 months ago

Just stumbled upon https://doc.coreboot.org/northbridge/intel/haswell/known-issues.html#known-issues-with-haswell

Seems like https://review.coreboot.org/c/30456 might be needed? Edit: this turns off iGPU.

Edit: ditching mrc blobs altogether and using NRI-based RAM init instead might be the way forward, while accepting the issues (still present?) with resume from suspend (S3). Last time I checked, RAM speed could be lower and fewer RAM modules were supported than with mrc-based raminit.

Tldr: per the notes on the coreboot page referred above, dGPU and iGPU cannot be activated with the mrc blob borrowed from the downloaded Chromebook (peppy) image, and NRI is needed instead for the dGPU to be functional. Otherwise, dGPU rendering offload is possible with output to the iGPU.

TrustExecutor commented 9 months ago

@tlaurion Thanks for cleaning up the branch and for your input. What is really interesting is that Debian 12 (booted from Heads) lists both the iGPU and the dGPU in lspci with no vbios blobs and libgfxinit. This is after setting nvramtool -w enable_dual_graphics=Enable. Only in Qubes does this setting make the system non-bootable.

This makes me think it may instead be a nouveau driver issue in Qubes and not something mrc/coreboot related. I also found a YouTube video where the same setup is used and the author got the same result in Debian, with both GPUs showing up. https://youtu.be/abDv1SAGNSA?t=135

TrustExecutor commented 9 months ago

Here is also some information about running the dGPU under coreboot on the T440p, which is also Haswell. https://crayphish.github.io/posts/t440p-adventures/

tlaurion commented 9 months ago

> Here is also some information about running the dGPU under coreboot on the T440p, which is also Haswell. https://crayphish.github.io/posts/t440p-adventures/

@TrustExecutor :

> The current coreboot code for the T440p has the PCI Express Graphics (PEG) device that handles the dGPU disabled. Thanks to nullenvk, you can restore dGPU functions by applying this commit, which adds the “Enable dedicated graphics (experimental)” option to nconfig. Enabling this will re-enable the PEG device in your next coreboot build. You need to include the VBIOS for both the iGPU and dGPU. You can extract both using VBiosFinder. If you’re building coreboot from master branch, the current SeaBIOS payload will result in a bootloop. Use SeaBIOS 1.15 or TianoCore (UEFI OS only).

Also note that the non-Tldr section of the referred article lists many things that are broken when using Prime, many workarounds that need to be activated in the OS, and vertical tearing. All of this explains why Nvidia is not encouraged on the QubesOS side and why coreboot hasn't invested much effort in supporting it correctly, even today.

Still, I could add the PEG patch from coreboot so you can test the resulting rom, provided you commit to testing, tweaking configs, and documenting the board config properly so that other interested users can get results similar to yours, and as long as you are really interested, understand the limitations, and have realistic expectations.

Unfortunately, that also means that if the abandoned coreboot PEG patch doesn't apply directly, Heads would need to maintain that patch across coreboot version bumps. That would be a burden on me, and one I cannot commit to once the patch stops applying directly. The result would be an UNMAINTAINED_ board needing assistance to be supported like the other boards in tree.

A similar story happened recently with fhd/edp, which finally got merged into coreboot: carrying it limited testing of more recent coreboot versions, and other untested things have broken since coreboot 4.19. I've decided not to go that route anymore if it means staying behind because of a specific board variant. The PEG patch referred to in this thread and in the article is currently abandoned and would need input from interested board owners and committed testers to become part of coreboot master. Tldr: Heads intends to be a user of coreboot, not a maintainer of it.


@TrustExecutor can you commit to testing and to bridging the gap upstream, commenting on the coreboot patch to help get it upstreamed and acting as the bridge between the testing that would happen under Heads and the reports upstream? If yes, I can try to adapt the patchset here and have you comment on it once it builds. Note that the rom extractor uses obsolete ruby gems and also needs work in its upstream project; otherwise Heads would also be committing to maintaining those tools, which is not planned. Tldr on that: the rom extractor won't work on debian-12 nor on the nix buildsystem that Heads plans to switch to in the mid-term. Meaning: dgpu variants also become a burden on Heads maintenance, with nobody actually reporting usage/testing (the variants are all officially untested now).

tlaurion commented 4 months ago

NRI was revived in June 2024 https://github.com/linuxboot/heads/issues/1711

tlaurion commented 4 months ago

@TrustExecutor: don't hesitate to ping me here with your dgpu+nri testing progress; you would have my attention.