tpwrules / nixos-apple-silicon

Resources to install NixOS bare metal on Apple Silicon Macs
MIT License

Hang up on startup #160

Closed drksnw closed 4 months ago

drksnw commented 4 months ago

Hello!

I have an issue where the system hangs on startup and then immediately reboots without showing any error message. The issue happens on a MacBook Pro M2 Pro.

This problem never happened on Fedora Asahi.

However, sometimes the system manages to boot and is perfectly usable after that, so it doesn't look like an issue with the system generation.

Here is a journalctl boot log from a failed boot:

failed_boot.txt

Here is my configuration.nix:

# Edit this configuration file to define what should be installed on
# your system. Help is available in the configuration.nix(5) man page, on
# https://search.nixos.org/options and in the NixOS manual (`nixos-help`).

{ config, lib, pkgs, ... }:

{
  imports =
    [ # Include the results of the hardware scan.
      ./hardware-configuration.nix
      <apple-silicon-support/apple-silicon-support>
    ];

  # Use the systemd-boot EFI boot loader.
  boot.loader.systemd-boot.enable = true;
  boot.loader.efi.canTouchEfiVariables = false;

  networking.hostName = "supernova"; # Define your hostname.
  # Pick only one of the below networking options.
  # networking.wireless.enable = true;  # Enables wireless support via wpa_supplicant.
  networking.networkmanager.enable = true;  # Easiest to use and most distros use this by default.

  # Set your time zone.
  time.timeZone = "Europe/Zurich";

  # Configure network proxy if necessary
  # networking.proxy.default = "http://user:password@proxy:port/";
  # networking.proxy.noProxy = "127.0.0.1,localhost,internal.domain";

  # Select internationalisation properties.
  #i18n.defaultLocale = "en_US.UTF-8";
  #console = {
  #  font = "Lat2-Terminus16";
  #  keyMap = "ch";
  #  useXkbConfig = true; # use xkb.options in tty.
  #};

  # Enable the X11 windowing system.
  # services.xserver.enable = true;

  # Configure keymap in X11
  # services.xserver.xkb.layout = "us";
  # services.xserver.xkb.options = "eurosign:e,caps:escape";

  # Enable CUPS to print documents.
  # services.printing.enable = true;

  # Sway
  programs.sway.enable = true;
  security.polkit.enable = true;
  hardware.opengl.enable = true;
  hardware.opengl.driSupport = true;

  # Enable sound.
  #sound.enable = true;
  #nixpkgs.config.pulseaudio = true;
  #hardware.pulseaudio.enable = true;

  # Enable touchpad support (enabled default in most desktopManager).
  # services.xserver.libinput.enable = true;

  # Define a user account. Don't forget to set a password with ‘passwd’.
  users.users.yaska = {
    isNormalUser = true;
    extraGroups = [ "wheel" ]; # Enable ‘sudo’ for the user.
  };

  # List packages installed in system profile. To search, run:
  # $ nix search wget
  environment.systemPackages = with pkgs; [
    vim # Do not forget to add an editor to edit configuration.nix! The Nano editor is also installed by default.
    wget
    firefox
  ];

  # Some programs need SUID wrappers, can be configured further or are
  # started in user sessions.
  # programs.mtr.enable = true;
  # programs.gnupg.agent = {
  #   enable = true;
  #   enableSSHSupport = true;
  # };

  # List services that you want to enable:

  # Enable the OpenSSH daemon.
  # services.openssh.enable = true;

  # Open ports in the firewall.
  # networking.firewall.allowedTCPPorts = [ ... ];
  # networking.firewall.allowedUDPPorts = [ ... ];
  # Or disable the firewall altogether.
  # networking.firewall.enable = false;

  # Copy the NixOS configuration file and link it from the resulting system
  # (/run/current-system/configuration.nix). This is useful in case you
  # accidentally delete configuration.nix.
  system.copySystemConfiguration = true;

  # This option defines the first version of NixOS you have installed on this particular machine,
  # and is used to maintain compatibility with application data (e.g. databases) created on older NixOS versions.
  #
  # Most users should NEVER change this value after the initial install, for any reason,
  # even if you've upgraded your system to a new NixOS release.
  #
  # This value does NOT affect the Nixpkgs version your packages and OS are pulled from,
  # so changing it will NOT upgrade your system - see https://nixos.org/manual/nixos/stable/#sec-upgrading for how
  # to actually do that.
  #
  # This value being lower than the current NixOS release does NOT mean your system is
  # out of date, out of support, or vulnerable.
  #
  # Do NOT change this value unless you have manually inspected all the changes it would make to your configuration,
  # and migrated your data accordingly.
  #
  # For more information, see `man configuration.nix` or https://nixos.org/manual/nixos/stable/options#opt-system.stateVersion .
  system.stateVersion = "24.05"; # Did you read the comment?

  hardware.asahi.peripheralFirmwareDirectory = ./firmware;
  hardware.asahi.withRust = true;
  hardware.asahi.addEdgeKernelConfig = true;
  hardware.asahi.useExperimentalGPUDriver = true;
  hardware.asahi.experimentalGPUInstallMode = "replace";

}

Please tell me if you need more information.

Thanks!

zzywysm commented 4 months ago

FYI, I've been seeing random system crashes very early in boot recently too, but the problem is not exclusive to Asahi. I see it in my vanilla QEMU+KVM virtual machines running both NixOS and Fedora. I suspect there is some weird intermittent problem in recent arm64 kernels or possibly recent versions of systemd.

drksnw commented 4 months ago

Yeah you're right, there's definitely something strange in systemd...

I tried raising systemd's log level with this modification to my configuration.nix:

  boot.kernelParams = [
    "systemd.log_level=debug"
    "systemd.log_target=journal"
  ];

And... the system now boots every time. That will be a pain to debug...

rowanG077 commented 4 months ago

This happens to me too after updating to the latest nixpkgs-unstable.

tpwrules commented 4 months ago

I can replicate this on my machine. It seems related to graphics drivers, as when a successful boot occurs, I have only ever gotten software rendering. I updated to the 2024-02-14 Mesa and then X just crashes and the login screen doesn't come up on a "successful" boot (instead of the whole machine rebooting).

I'm not having much luck debugging either; I haven't identified a more recent nixpkgs that works. I will likely do a release soon with the most recent nixpkgs that works, then come back to this later.

zzywysm commented 4 months ago

> I can replicate this on my machine. It seems related to graphics drivers, as when a successful boot occurs, I have only ever gotten software rendering. I updated to the 2024-02-14 Mesa and then X just crashes and the login screen doesn't come up on a "successful" boot (instead of the whole machine rebooting).

FYI, the 20240214 Mesa (and the subsequent bug fix releases) will not work if you're not also running linux-asahi-6.6-15.

tpwrules commented 4 months ago

It turns out I accidentally broke Rust support with some other changes, which is why GPU acceleration never worked. So I no longer think the graphics drivers have much to do with this issue.

But I still don't see anything obvious. I'm considering shipping the nixpkgs in the wip branch, where crashes seem very rare instead of frequent, but I can't seem to totally prevent or reliably trigger the issue. Sometimes, even on very crashy nixpkgs versions, I can get several reboots in a row without issues.

tpwrules commented 4 months ago

A weekend of bisecting and several hundred reboots later... it's fixed. Please upgrade to the latest release.

Determining why the SD reader is at fault and what, if anything, might need to be fixed in the kernel or whatever, I leave to someone else.