NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

linux-rpi: 6.6 kernel causes boot loop on Raspberry Pi 4 #325473

Open Scrumplex opened 2 months ago

Scrumplex commented 2 months ago

Describe the bug

On my Raspberry Pi 4, upgrading from 6.1.63-stable_20231123 to 6.6.31-stable_20240529 causes a reset after 20-40s of uptime.

This is not a software-initiated reboot, as there are no kernel messages before it resets, even with loglevel=7.

Steps To Reproduce

Steps to reproduce the behavior:

  1. Update to a recent nixos-unstable revision
  2. Activate new NixOS version
  3. Reboot

Expected behavior

When booting my Raspberry Pi 4 with 6.6.31-stable_20240529, it should run indefinitely, just as it does with 6.1.63-stable_20231123.

Screenshots

N/A

Additional context

My current workaround is to just use the kernel from nixos-24.05 like this:

```nix
{inputs, pkgs, ...}: {
  boot.kernelPackages = inputs.nixpkgs-stable.legacyPackages.${pkgs.system}.linuxKernel.packages.linux_rpi4;
}
```
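For anyone copying this workaround: the module only evaluates if `inputs` is passed to NixOS modules and a `nixpkgs-stable` flake input exists. A minimal sketch of a flake that could provide both is shown here. The host name `rpi4`, the `./configuration.nix` path, and the exact input URLs are assumptions for illustration, not taken from the reporter's setup.

```nix
{
  # Hypothetical flake.nix; the input names are chosen to match the module above.
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    # Assumed stable input that still carries the working 6.1 linux_rpi4 kernel.
    nixpkgs-stable.url = "github:NixOS/nixpkgs/nixos-24.05";
  };

  outputs = inputs@{ nixpkgs, ... }: {
    # "rpi4" is a placeholder host name.
    nixosConfigurations.rpi4 = nixpkgs.lib.nixosSystem {
      system = "aarch64-linux";
      # Makes `inputs` available as a module argument, so `{inputs, pkgs, ...}:` resolves.
      specialArgs = { inherit inputs; };
      modules = [ ./configuration.nix ];
    };
  };
}
```

With a layout like this, the rest of the system follows nixos-unstable while only the kernel package set comes from the stable input.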

This was caused by #292880

Notify maintainers

@peat-psuwit

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"aarch64-linux"`
 - host os: `Linux 6.1.63, NixOS, 24.11 (Vicuna), 24.11.20240703.9f4128e`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.4`
 - channels(root): `"nixos-22.11.3299.9ef6e7727f4"`
 - nixpkgs: `/etc/nix/channels/nixpkgs`


Cryolitia commented 1 month ago

Could you test Raspbian with the same kernel?

peat-psuwit commented 1 month ago

Well... this is unfortunate.

I tested the kernel upgrade by applying it first on my RPi 4 running NixOS 23.05. The kernel was built with the NixOS 23.05 kernel config and toolchain, and it runs fine.

However, NixOS's default kernel config has changed since then, and there are toolchain differences too, so it's difficult to test without a spare RPi with NixOS unstable installed. Unfortunately I don't have one, and I can't upgrade my RPi 4 to NixOS unstable (it's running a task in my home), so I won't be able to help.

Do you have a UART serial connector? It should be able to give you kernel debug info even if the DRM system is broken.

bddvlpr commented 1 month ago

Also running into this issue on b5e96f2.

bddvlpr commented 1 month ago

Managed to find a very old UART USB adapter, but I'm rather confused: I don't think the corruption is caused by the UART connector or the pins.

~~The log file contains two boots; one initial boot and a second one because of the bootloop. ttyUSB0.log~~

Managed to find another UART USB adapter that doesn't cause corruption. Is there a way to make it more verbose? ttyUSB0.log

peat-psuwit commented 1 month ago

> Managed to find another UART USB adapter that doesn't cause corruption. Is there a way to make it more verbose?

Maybe try adding loglevel=8 to the kernel command line. This should enable the highest verbosity level in the kernel. It can go even more verbose per component, but let's start from here.
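On NixOS, one way to get `loglevel=8` onto the command line is the `boot.consoleLogLevel` option, which, as far as I know, NixOS turns into a `loglevel=` kernel parameter. This is only a sketch that assumes the setting lives in an ordinary NixOS module; it is not taken from anyone's actual configuration in this thread.

```nix
{ ... }: {
  # Raise the console log level to 8 so that debug-level messages
  # (priority 7) are also printed to the console, including the UART.
  # NixOS should emit this as "loglevel=8" on the kernel command line.
  boot.consoleLogLevel = 8;
}
```

After rebuilding and rebooting, `cat /proc/cmdline` should show whether the parameter actually reached the kernel.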

bddvlpr commented 1 month ago

With loglevel set to 8, not much difference. Still reboots out of nowhere with no signs in the kernel log. ttyUSB0.log