Nitrokey / heads

A minimal Linux that runs as a coreboot or LinuxBoot ROM payload to provide a secure, flexible boot environment for laptops and servers.
http://osresearch.net/
GNU General Public License v2.0

ns50 v2.3 boots to blank screen #28

Closed commandline-be closed 10 months ago

commandline-be commented 10 months ago

Please identify some basic details to help process the report

A. Provide Hardware Details

1. What board are you using (see list of boards here)?

2. Does your computer have a dGPU or is it iGPU-only?

3. Who installed Heads on this computer?

4. What PGP key is being used?

5. Are you using the PGP key to provide HOTP verification?

B. Identify how the board was flashed

1. Is this problem related to updating heads or flashing it for the first time?

2. If the problem is related to an update, how did you attempt to apply the update?

3. How was Heads initially flashed

4. Was the board flashed with a maximized or non-maximized/legacy rom?

5. If Heads was externally flashed, was IFD unlocked?

C. Identify the rom related to this bug report

1. Did you download or build the rom at issue in this bug report?

2. If you downloaded your rom, where did you get it from?

Please provide the release number or otherwise identify the rom downloaded

3. If you built your rom, which repository:branch did you use?

4. What version of coreboot did you use in building?

5. In building the rom where did you get the blobs?

Please describe the problem

Describe the bug A clear and concise description of what the bug is.

After boot there is a black screen, presumably because intel_iommu=igfx_off is still in place. Attempts to boot manually using kexec-boot -b /boot -e 'kernel /boot/vmlinuz|initrd /boot/initrd.img|....' failed.

To Reproduce Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior A clear and concise description of what you expected to happen.

Normal boot process requesting LUKS password

Screenshots If applicable, add screenshots to help explain your problem.

Additional context Add any other context about the problem here.

commandline-be commented 10 months ago

Confirmed: the blank screen is caused by not changing intel_iommu=igfx_off to intel_iommu=on before updating the firmware.

A simple recovery procedure for anyone who did not prepare for the update: adjust grub.cfg to replace igfx_off with on.

  1. choose to exit to the recovery shell
  2. remount /boot as writable ( mount -o remount,rw /boot )
  3. in grub.cfg, replace igfx_off with on for all intel_iommu= keys
  4. save the file
  5. sign the modified boot configuration ( kexec-sign-config -p /boot ); note this requires the user PIN
  6. reboot by typing reboot
  7. confirm signing the modified grub.cfg; note this also requires the user PIN
  8. you should now see the LUKS prompt again
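Step 3 of the procedure above can be sketched as a small shell function. This is a minimal sketch, not a tested Heads tool: the function name fix_iommu is hypothetical, and it assumes a sed is available in the shell where it is run. It keeps a backup copy next to the edited file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Heads): replace every
# intel_iommu=igfx_off with intel_iommu=on in the given config file.
fix_iommu() {
    cfg="$1"
    # Keep a backup of the original next to it before rewriting.
    cp "$cfg" "$cfg.bak" || return 1
    # Read from the backup and write the edited text back to the
    # original path; this avoids relying on "sed -i".
    sed 's/intel_iommu=igfx_off/intel_iommu=on/g' "$cfg.bak" > "$cfg"
}
```

Assuming grub.cfg lives at /boot/grub/grub.cfg and /boot is already remounted writable, it would be called as fix_iommu /boot/grub/grub.cfg, followed by the signing step ( kexec-sign-config -p /boot ).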

This worked for me. Next, set GRUB to apply the change to all entries:

Edit /etc/default/grub and change intel_iommu=igfx_off to intel_iommu=on, save the file, then run update-grub and reboot.

Now the config is safe for future reboots.
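After the reboot it can be worth checking that the running kernel really was started with the new value. The following is a small sketch for that check; the function name cmdline_has is hypothetical, and it only does an exact, whitespace-delimited match on a command-line string.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel command line given in $1
# contains the exact parameter given in $2.
cmdline_has() {
    # Pad both sides with spaces so only whole parameters match.
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

On the booted system, cmdline_has "$(cat /proc/cmdline)" intel_iommu=igfx_off should now fail, while the same call with intel_iommu=on should succeed.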

commandline-be commented 10 months ago

I can also confirm this does indeed appear to greatly reduce how rapidly the CPU heats up. There is barely any change now.

It is unclear which drivers work best with the NS50 iGPU. After reboot the system reports use of i915 and Mesa Intel Graphics (ADL GT2).

Don't forget to enable hardware acceleration in the browser.

commandline-be commented 10 months ago

It is unknown at this time whether nomodeset should be kept or can be removed from the kernel boot parameters.

tlaurion commented 10 months ago

@daringer that was https://github.com/linuxboot/heads/pull/1522 related.

Deactivating the IOMMU should not happen anymore (the IOMMU should be activated by default per the Heads kernel config), and neither should igfx_off, since there is no longer a double attempt to use the i915 driver on both the firmware and OS paths to drive the console and framebuffer. Only the final OS should use i915.

Note that KERNEL_REMOVE, specified in the board config, should also remove those unneeded parameters if they are present in the grub config of the final OS. They were useful only for QubesOS because of i915.

QubesOS will probably keep those parameters by default on a new installation, but they should be removed if unneeded per board config. That should be the case here, as for all other boards under https://github.com/linuxboot/heads/pull/1522 that were switched from i915 to efifb under coreboot.

Also note the recommendation under https://github.com/linuxboot/heads/pull/1522 to use a branding directory to limit unneeded Nitrokey patches against upstream in the future.

I wish https://github.com/linuxboot/heads/pull/1522 had been reviewed, commented on and tested prior to releasing Nitrokey 2.3 downstream, as usual. It makes issues like this one odd, because they should not have happened, and they require a point release and actually more work both downstream and upstream.

commandline-be commented 10 months ago

Thanks for the informative update. I don't know what you mean by KERNEL_REMOVE, this is new to me.

Thus far I've noticed repeated issues with the second screen not being detected and with suspend not working; the system appears frozen. For the freeze I point to the Arch Linux Intel Graphics article, specifically item 6.18 'Freeze after wake from sleep/suspend with Alder Lake-P', where 2 freedesktop issues are mentioned (5531 and 6401).

tlaurion commented 10 months ago

https://github.com/Nitrokey/heads/blob/nitropad/boards/nitropad-nv41/nitropad-nv41.config#L42

daringer commented 10 months ago

@commandline-be thanks for testing, we'll switch the v2.3 to pre-release until we release a cleaner 2.4 version which is more consistent with the upstream state ...

tlaurion commented 10 months ago

https://github.com/linuxboot/heads/pull/1522#issuecomment-1821383858

commandline-be commented 10 months ago

@tlaurion @daringer are the references i shared (archlinux and freedesktop) of any relevance, i'm curious if compiling in the suggested patch is of any use.

right now I'm leaving the laptop with acpi_osi=Linux set as boot parameter to see if this makes a difference at all

another thing is Ubuntu Gnome reports use of Mesa while i assume this is not what i915 is about, this may require a configuration adjustment as well, is this correct ?

tlaurion commented 10 months ago

@commandline-be a patch for efi was missing under 2.3. Please see the PR under heads for testing, but only do so if you have a way of flashing externally, as this might break console and display.

https://github.com/linuxboot/heads/pull/1522#issuecomment-1821567219

As said there, no promises, but the missing patch was definitely causing the EFI framebuffer to be unusable by the efifb driver on the Heads Linux side.

More explanations are under that PR. As said there, having i915+drm under Heads Linux without a proper command_line passed from coreboot to Linux makes no sense, and the missing coreboot patch made efifb unable to parse the "efi" coreboot tables handed to the Linux kernel. Reusing i915 in both Heads and the final OS also required quirks that were removed.

2.3 was a mess.

commandline-be commented 10 months ago

I have not had time to read through all of #1522, but unless I'm deluding myself, gfx appears to be working for me, since CPU load is remarkably lower. Modifying the intel_iommu value to 'on' works around the blank screen (as required for normal use). The acpi_osi=Linux command-line parameter may be working; I'll know for sure in the morning.

tlaurion commented 10 months ago

> @tlaurion @daringer are the references i shared (archlinux and freedesktop) of any relevance, i'm curious if compiling in the suggested patch is of any use.
>
> right now I'm leaving the laptop with acpi_osi=Linux set as boot parameter to see if this makes a difference at all
>
> another thing is Ubuntu Gnome reports use of Mesa while i assume this is not what i915 is about, this may require a configuration adjustment as well, is this correct ?

I checked real quick but again, suspend is supposed to be patched for the NV41 under coreboot; the same may not be true for the NS50, AFAIK. But I might also be wrong. I don't know all the details, I neither own nor sell these platforms, and I just followed the coreboot mailing list and read some discussions.

On graphical issues: if the firmware doesn't mess anything up on its side, the OS should be able to deal with the graphics adapter as if it were the first one initializing it. The GOP blob should provide the bare minimum for coreboot to provide the tables required by efifb to use the framebuffer without 3D init, and let Heads just draw in 2D to that FB correctly. In theory.

After that, just as for all other Intel-based boards under Heads, kexec into the final OS should let i915+drm do the right thing without quirks. But that is to be tested and validated. 2.3 never offered the proper coreboot base (the efifb patch was even missing from the Heads Linux kernel) to do proper tests. I have no idea what happened there and I'm sorry for the confusion.

commandline-be commented 10 months ago

For now I think it is safe to say that, for the NS50, acpi_osi=Linux works fine to remedy the suspend issues I had thus far. Personally I don't mind a list of command-line boot parameters as long as 'stuff' just works.

more testing tomorrow morning

tlaurion commented 10 months ago

> for now i think it is safe to say for the NS50 acpi_osi=Linux works fine to remedy any suspend issues i had thus far
> personally i don't mind a list of command-line boot parameters as long as 'stuff' just works
>
> more testing tomorrow morning

And that is where KERNEL_ADD should be used to add parameters required per board configuration, without users needing to punch them in under grub after installation.

@commandline-be ROMs are downloadable from CircleCI per the upstream https://osresearch.net instructions (search for "download"). Of course, validate the hash, or you can also build the ROM yourself. Have an external programmer at hand and be ready to restore a backup of the 2.2 release externally.

commandline-be commented 10 months ago

Let's call 2.3 a beta release and add a simple script that modifies /etc/default/grub to do the dirty work? If it works, it works; if people get some acceleration and less heat on the CPU, that will already be much appreciated. Simply adding a readme would also do, I guess.

tlaurion commented 10 months ago

To be clear: for 2.2 to become 2.3, kernel_remove statements would have been required in the kernel config, if I understand correctly.

2.3 currently pretends to use efifb but is not actually using it, so it is not feature-ready.

I'm not part of the release team here, but both 2.2 and 2.3 have issues passing the proper parameters to do what is supposed to be done.

A 2.2.1 release could fix the i915 parameters to boot into the final OS correctly, applying a mitigation to the release ROMs while things are fixed properly for a 2.4 release that actually enforces efifb.

As of now, anybody trying to boot other OSes, for example tinycore, would fail miserably on both releases, since those require an unaccelerated FB that does not depend on i915+drm.

If resources are scarce, it's about doing something that works minimally at this point and then working up to maximum performance.

In 2.2.1, as said upstream, someone would need to apply a revert of the Purism branch, moving away from efifb, to have something that works with acceleration on nv41/ns50. The tweaks you seem to have discovered to fix suspend issues would be passed directly to the final OS through kernel_add/kernel_remove per board config, and the Heads i915 kernel parameters would need to match the requirements, which took a lot of work to figure out, were worked on upstream, can be replicated and are known to work.

That would be a working 2.2.1 release until 2.4 is workable without needing fixes.

Assuming that people will run a script and understand what to do is a general mistake. The firmware must do the right thing (tm).

commandline-be commented 10 months ago

https://bbs.archlinux.org/viewtopic.php?id=248107

True, then again, I'm looking at this from a consumer point of view. glmark2 shows promising results, which makes me happy already.

acpi_osi=Linux does not work as expected, I can now say with confidence. My assumption at this time is that the issue is not with suspend itself but with deeper sleep states, which cause a no-resume state no matter what is tried at the keyboard.

This issue here appears to describe exactly what I've observed. It dates back to 2019 ... it seems to be yet another multi-year issue with Linux/OSS (not trying to flame, just observing).

commandline-be commented 10 months ago

Found a test: this lists all idle states counting from 0; for the NS50 it goes up to state4, which is C10.

grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name
# which outputs
/sys/devices/system/cpu/cpu0/cpuidle/state0/name:POLL
.....
/sys/devices/system/cpu/cpu0/cpuidle/state4/name:C10

The question here is whether the Heads build supports the power states of the NS50 correctly. pm-hibernate works as expected when executed manually, also when setting processor.max_cstate=5 intel_idle.max_cstate=5. Ubuntu suspend does result in a freeze, however.
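To compare idle states between firmware versions or kernel parameter sets, the grep above can be extended to also show each state's exit latency. This is a sketch only: the function name list_cstates is hypothetical, and it assumes the standard Linux cpuidle sysfs layout where each stateN directory contains name and latency files.

```shell
#!/bin/sh
# Hypothetical helper: print "stateN NAME LATENCY" for each cpuidle
# state found under the given directory (defaults to cpu0).
list_cstates() {
    base="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for dir in "$base"/state*; do
        # Skip silently if the glob matched nothing or the file is unreadable.
        [ -r "$dir/name" ] || continue
        printf '%s %s %s\n' \
            "${dir##*/}" \
            "$(cat "$dir/name")" \
            "$(cat "$dir/latency" 2>/dev/null)"
    done
}
```

Calling list_cstates without arguments on the running system prints one line per state; saving the output before and after a parameter change makes the two easy to diff.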

tlaurion commented 10 months ago

Nitrokey would need to follow Dasharo coreboot releases closely. Power-related issues and the like come from coreboot, on which Heads is based.

https://docs.dasharo.com/variants/novacustom_ns5x_adl/releases/
https://docs.dasharo.com/variants/novacustom_nv4x_adl/releases/

~As one can see, those issues are fixed for NV41 12th Gen but not ns50.~ edit: those seem to be fixed in the latest coreboot per the latest November release notes, which neither the 2.2 nor the 2.3 Nitrokey release uses.

See the relevant issues and verify the coreboot hash under modules/coreboot for the novacustom/nitrokey Dasharo coreboot fork.

@commandline-be for issues not related to the blank screen, you should open separate issues asking Nitrokey to base its releases on the latest Dasharo release. Issues reported here that are unrelated to this issue's title will get lost in the noise.

commandline-be commented 10 months ago

Hey tlaurion, thanks for sharing that link. The contents support my experience: using pm-suspend or pm-hibernate there is indeed no issue. So the issue is how Ubuntu power management is set up, which I believe works through systemd.

The blank screen was already resolved by changing the intel_iommu= value from igfx_off to on.

commandline-be commented 10 months ago

blank screen resolved by changing the intel_iommu= value from igfx_off to on in /etc/default/grub then running update-grub

tlaurion commented 10 months ago

> Hey tlaurion, thanks for sharing that link. The contents support my experience, using pm-suspend or pm-hibernate there is indeed no issue. So the issue is how Ubuntu powermanagement is set to work, which i believe works with systemd.
>
> the blank screen was resolved already by changing the intel_iommu= value from igfx_off to on.

@commandline-be

Note that https://github.com/linuxboot/heads/pull/1522 is now on the right path: it adds the missing efifb patch, which permits removing the i915 driver from Heads, and it was reported working by @nestire there.

kernel_remove in the board config, as stated before, would then do what you suggested doing in grub, but automatically and without manual intervention, once the next firmware upgrade is flashed, either from Nitrokey when available or from Heads upstream if desired.
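The idea behind kernel_add/kernel_remove can be illustrated with a small shell function that drops a list of parameters from a kernel command line and appends another list. This is purely illustrative and is not the Heads implementation: the function name adjust_cmdline and its argument order are assumptions made for this sketch, and it only handles exact, whitespace-separated parameters.

```shell
#!/bin/sh
# Illustrative only, NOT the Heads implementation.
# $1: original kernel command line
# $2: space-separated parameters to remove (exact match)
# $3: space-separated parameters to append
adjust_cmdline() (
    cmdline="$1"
    remove="$2"
    add="$3"
    set -f   # disable globbing while word-splitting (subshell-local)
    out=""
    for param in $cmdline; do
        keep=yes
        for r in $remove; do
            if [ "$param" = "$r" ]; then keep=no; fi
        done
        if [ "$keep" = yes ]; then out="${out:+$out }$param"; fi
    done
    for a in $add; do
        out="${out:+$out }$a"
    done
    printf '%s\n' "$out"
)
```

For example, adjust_cmdline "root=/dev/sda1 quiet intel_iommu=igfx_off" "intel_iommu=igfx_off" "" prints root=/dev/sda1 quiet, which is the effect the manual grub edit in this issue achieved by hand for a single parameter.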