MrChromebox / firmware

Issue tracker for firmware issues

Linux Kernels Sometimes Do Not Boot. #52

Closed ReddestDream closed 5 years ago

ReddestDream commented 7 years ago

So, this is a very long-standing issue, and it's time it was properly documented and investigated as part of making the UEFI firmwares more friendly for Linux users.

Ever since the first UEFI ROMs, Linux kernels have sometimes failed to boot. This seems to occur roughly once every 10 to 20 boots. Kernel version is irrelevant. Once the system is in the failing state, that state can be preserved by forcing a warm reboot. A cold restart usually resolves it.
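Because the failure is intermittent, a handful of clean boots says little about whether a change actually helped. The sketch below estimates how many consecutive clean boots are needed before a failure-free run is unlikely to be luck. It assumes each boot fails independently at a fixed rate, which is a simplification (the failing state persists across warm reboots, so this only makes sense for cold boots); the function name `boots_needed` is made up for illustration.

```python
import math


def boots_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Smallest n such that n clean boots in a row would occur with
    probability <= 1 - confidence if the bug were still present.

    Assumes independent boots with a constant failure rate (an assumption,
    not something established in this thread).
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(n clean boots) = (1 - p)^n ; solve (1 - p)^n <= 1 - confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    for rate in (1 / 10, 1 / 20):
        print(f"failure rate {rate:.2f}: {boots_needed(rate)} clean cold boots")
```

Under these assumptions, a 1-in-10 rate needs 29 clean cold boots and a 1-in-20 rate needs 59 before a workaround can be called effective with about 95% confidence.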

It seems to occur most commonly with GRUB, but it can also affect EFISTUB booting via rEFInd. With GRUB, the system freezes after kernel selection. With rEFInd EFISTUB, the freeze occurs on the splash screen that lists the kernel parameters. If it happens when booting off a USB drive, the drive's lights stop blinking and stay solid (on or off). This appears to affect every device we support and every boot medium possible (USB, SATA, eMMC, etc.).

This issue is strange and insidious. For months, I thought it was an issue with my drives or setup until I had confirmation of the problem from @MattDevo, @coolstar, and other users. I've also confirmed that this issue only occurs on Chrome hardware. These same drives do not cause this issue on non-Chrome hardware. At this point, I suspect there is a real issue with the firmware that causes this to happen. Based on my notes, the issue started with the first UEFI ROMs and has persisted since.

A few other observations:

  1. The issue is more likely to occur when the system is started up totally cold (i.e., been off for at least a few hours; EC is off/hibernating).

  2. The likelihood of this issue occurring seems to increase when booting off USB drives if more USB ports are occupied.

  3. The likelihood of this issue occurring appears to increase after repeated Refresh + Power cycles.

Edit: This seems to be caused by the mode setting the Linux Intel Graphics driver does. It doesn't happen with "nomodeset" set.

Edit: There seems to be more to it than just the mode . . .

ReddestDream commented 7 years ago

My experiments seem to show that this is /much/ more likely to occur on the i3 C720 than on any other device. Perhaps I was onto something with graphics being part of the issue (HD4400).

ReddestDream commented 7 years ago

Issue confirmed by mpoly and still confirmed by me to be present on 07/14.

coolstar commented 7 years ago

This is likely a problem with EFI GRUB on newer UEFI firmwares. This issue also happens on the non-Chrome Lenovo Yoga 720-15.

ReddestDream commented 7 years ago

@coolstar Could be. But I am unable to reproduce the issue described here on any non-Haswell devices that I have . . .

wfleurant commented 6 years ago

Please append "panic=5" to your kernel parameters (probably in GRUB). This is the only thing I've found to fix Chromebox CN60 and CN62 hardware panics during cold/early boots. I found roughly a 25% rate of failure with reboot testing.
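For anyone wanting to try this, here is a minimal sketch of adding a kernel parameter to the GRUB defaults. It assumes a distribution that reads `GRUB_CMDLINE_LINUX_DEFAULT` from `/etc/default/grub` (Debian/Ubuntu style) and that the GRUB config is regenerated afterwards (e.g. `update-grub` or `grub-mkconfig`); the helper name `add_kernel_param` is made up for illustration. Note that `panic=5` tells the kernel to reboot 5 seconds after a panic, so it works around a panic rather than preventing it, and it may not help if the machine hangs without panicking.

```python
import re


def add_kernel_param(grub_text: str, param: str,
                     key: str = "GRUB_CMDLINE_LINUX_DEFAULT") -> str:
    """Return grub_text with `param` appended to the quoted value of `key`.

    Hypothetical helper: it works on the file contents as a string and
    leaves reading/writing /etc/default/grub to the caller.
    """
    pattern = re.compile(rf'^({re.escape(key)}=")([^"]*)(")', re.MULTILINE)

    def _append(match: re.Match) -> str:
        params = match.group(2).split()
        if param not in params:  # do not add the same parameter twice
            params.append(param)
        return f'{match.group(1)}{" ".join(params)}{match.group(3)}'

    new_text, count = pattern.subn(_append, grub_text, count=1)
    if count == 0:
        raise ValueError(f"no {key}=\"...\" line found")
    return new_text


if __name__ == "__main__":
    sample = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    print(add_kernel_param(sample, "panic=5"))
```

The same helper could be used with `nomodeset` to test the mode-setting theory from earlier in the thread.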