NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

KDE/SDDM fails to start on NVIDIA proprietary driver v560.35.03 + Kernel 6.11.0 (Could not initialize egl/EGL not available) #344167

Open opl- opened 6 days ago

opl- commented 6 days ago

Updating NixOS to nixpkgs c04d5652cfa9742b1d519688f65d1bbccea9eb7e results in SDDM crashing on startup with "Could not initialize egl" and "EGL not available" errors logged in the journal.

Additional context

 - nixpkgs: c04d5652cfa9742b1d519688f65d1bbccea9eb7e
 - Kernel: v6.11.0
 - NVIDIA driver: v560.35.03 (crashes with both open and non-open kernel module)
 - KDE: v6.1.5 (wayland)
 - dGPU: NVIDIA RTX 3070 Ti Laptop

Previous working generation was running nixpkgs c374d94f1536013ca8e92341b540eba4c22f9c62 (Linux kernel v6.10.6 with the beta v560.31.02 NVIDIA driver).

# configuration.nix
boot.kernelPackages = pkgs.linuxPackages_latest;
services.xserver.videoDrivers = [ "nvidia" ];
hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.beta;
hardware.nvidia.modesetting.enable = true;
hardware.nvidia.open = true; # either crashes
hardware.nvidia.powerManagement.enable = true;
hardware.nvidia.powerManagement.finegrained = false;
hardware.nvidia.prime.sync.enable = true;

sudo journalctl -b -1 | grep sddm

Nearly identical with open and non-open kernel module, the only difference being the `HDMI-A-1` display being named unknown.

```console
sddm[1664]: Greeter session started successfully
sddm-helper-start-wayland[1874]: Starting Wayland process "/nix/store/yxy38krm4jpq9f4xbb3i31bszyp5dvv3-kwin-6.1.5/bin/kwin_wayland --no-global-shortcuts --no-kactivities --no-lockscreen --locale1" "sddm"
sddm-helper-start-wayland[1874]: started succesfully "/nix/store/yxy38krm4jpq9f4xbb3i31bszyp5dvv3-kwin-6.1.5/bin/kwin_wayland --no-global-shortcuts --no-kactivities --no-lockscreen --locale1"
sddm-helper-start-wayland[1874]: "No backend specified, automatically choosing drm\n"
sddm-helper-start-wayland[1874]: Directory "/run/user/175" has changed, checking for Wayland socket
sddm-helper-start-wayland[1874]: Found Wayland socket "/run/user/175/wayland-0"
sddm-helper-start-wayland[1874]: "Accepting client connections on sockets: QList(\"wayland-0\")\n"
sddm-greeter-qt6[1893]: High-DPI autoscaling Enabled
sddm-helper-start-wayland[1874]: "\"applications.menu\" not found in QList(\"/run/current-system/sw/etc/xdg/menus\")\n"
sddm-helper-start-wayland[1874]: "kwin_scene_opengl: Creating the OpenGL rendering failed: \"Could not initialize egl\"\n"
sddm-greeter-qt6[1893]: Reading from "/nix/store/7j5hgwyngfx5vpdkyh29ar8bzg43xdip-desktops/share/wayland-sessions/plasma.desktop"
sddm-greeter-qt6[1893]: Reading from "/nix/store/7j5hgwyngfx5vpdkyh29ar8bzg43xdip-desktops/share/xsessions/plasmax11.desktop"
sddm-greeter-qt6[1893]: Loading theme configuration from "/run/current-system/sw/share/sddm/themes/breeze/theme.conf"
sddm-greeter-qt6[1893]: Connected to the daemon.
sddm[1664]: Message received from greeter: Connect
sddm-greeter-qt6[1893]: EGL not available
sddm-greeter-qt6[1893]: Loading file:///run/current-system/sw/share/sddm/themes/breeze/Main.qml...
sddm-greeter-qt6[1893]: failed to acquire GL context to resolve capabilities, using defaults..
sddm-greeter-qt6[1893]: Adding view for "HDMI-A-1" QRect(800,0 2048x1152)
sddm-greeter-qt6[1893]: Loading file:///run/current-system/sw/share/sddm/themes/breeze/Main.qml...
sddm-greeter-qt6[1893]: failed to acquire GL context to resolve capabilities, using defaults..
sddm-greeter-qt6[1893]: Adding view for "eDP-2" QRect(2848,0 1707x1067)
sddm-greeter-qt6[1893]: Loading file:///run/current-system/sw/share/sddm/themes/breeze/Main.qml...
sddm-greeter-qt6[1893]: failed to acquire GL context to resolve capabilities, using defaults..
sddm-greeter-qt6[1893]: Adding view for "Unknown-1" QRect(0,0 800x600)
sddm-greeter-qt6[1893]: Message received from daemon: Capabilities
sddm-greeter-qt6[1893]: Message received from daemon: HostName
sddm-greeter-qt6[1893]: QRhiGles2: Failed to create temporary context
sddm-greeter-qt6[1893]: QRhiGles2: Failed to create context
sddm-greeter-qt6[1893]: Failed to create RHI (backend 2)
sddm-greeter-qt6[1893]: Failed to initialize graphics backend for OpenGL.
systemd-coredump[2002]: Process 1893 (sddm-greeter-qt) of user 175 terminated abnormally with signal 6/ABRT, processing...
systemd-coredump[2003]: Process 1893 (sddm-greeter-qt) of user 175 dumped core.
Module sddm-greeter-qt6 without build-id.
#20 0x00000000004125b4 main (sddm-greeter-qt6 + 0x125b4)
#23 0x0000000000412a25 _start (sddm-greeter-qt6 + 0x12a25)
sddm-helper-start-wayland[1874]: wayland greeter finished 6 QProcess::CrashExit
sddm-helper-start-wayland[1874]: quitting helper-start-wayland
sddm-helper-start-wayland[1874]: Stopping... "/nix/store/yxy38krm4jpq9f4xbb3i31bszyp5dvv3-kwin-6.1.5/bin/kwin_wayland"
sddm-helper-start-wayland[1874]: wayland compositor finished 15 QProcess::NormalExit
sddm-helper-start-wayland[1874]: quitting helper-start-wayland
sddm-helper[1764]: [PAM] Closing session
sddm-helper[1764]: pam_systemd(sddm-greeter:session): New sd-bus connection (system-bus-pam-systemd-1764) opened.
drkonqi-coredump-processor[2004]: "/nix/store/shlcpqycfm5ni30aigipjfig8lxg112w-sddm-unwrapped-0.21.0/bin/sddm-greeter-qt6" 1893 "/var/lib/systemd/coredump/core.sddm-greeter-qt.175.8d57ab7e4618474cabfaa73d494e5ada.1893.1727162623000000.zst"
drkonqi-coredump-launcher[2034]: Unable to find file for pid 1893 expected at "kcrash-metadata/sddm-greeter-qt6.8d57ab7e4618474cabfaa73d494e5ada.1893.ini"
sddm-helper[1764]: [PAM] Ended.
sddm[1664]: Auth: sddm-helper exited successfully
sddm[1664]: Greeter stopped. SDDM::Auth::HELPER_SUCCESS
(sd-pam)[1790]: pam_unix(systemd-user:session): session closed for user sddm
```

The simple-framebuffer section is not present in the drmdevice output when using my previous system generation.

nix shell nixpkgs#libdrm^bin -c drmdevice

```console
--- Checking the number of DRM device available ---
--- Devices reported 3 ---
--- Retrieving devices information (PCI device revision is ignored) ---
device[0]
+-> available_nodes 0x01
+-> nodes
|   +-> nodes[0] /dev/dri/card0
+-> bustype 0002
|   +-> platform
|       +-> fullname simple-framebuffer
+-> deviceinfo
    +-> platform
        +-> compatible simple-framebuffer

--- Opening device node /dev/dri/card0 ---
--- Retrieving device info, for node /dev/dri/card0 ---
device[0]
+-> available_nodes 0x01
+-> nodes
|   +-> nodes[0] /dev/dri/card0
+-> bustype 0002
|   +-> platform
|       +-> fullname simple-framebuffer
+-> deviceinfo
    +-> platform
        +-> compatible simple-framebuffer

device[1]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card2
|   +-> nodes[2] /dev/dri/renderD129
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus 01
|       +-> dev 00
|       +-> func 0
+-> deviceinfo
    +-> pci
        +-> vendor_id 10de
        +-> device_id 24a0
        +-> subvendor_id 1043
        +-> subdevice_id 1a8c
        +-> revision_id IGNORED

--- Opening device node /dev/dri/card2 ---
--- Retrieving device info, for node /dev/dri/card2 ---
device[1]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card2
|   +-> nodes[2] /dev/dri/renderD129
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus 01
|       +-> dev 00
|       +-> func 0
+-> deviceinfo
    +-> pci
        +-> vendor_id 10de
        +-> device_id 24a0
        +-> subvendor_id 1043
        +-> subdevice_id 1a8c
        +-> revision_id a1

--- Opening device node /dev/dri/renderD129 ---
--- Retrieving device info, for node /dev/dri/renderD129 ---
device[1]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card2
|   +-> nodes[2] /dev/dri/renderD129
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus 01
|       +-> dev 00
|       +-> func 0
+-> deviceinfo
    +-> pci
        +-> vendor_id 10de
        +-> device_id 24a0
        +-> subvendor_id 1043
        +-> subdevice_id 1a8c
        +-> revision_id a1

device[2]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card1
|   +-> nodes[2] /dev/dri/renderD128
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus 00
|       +-> dev 02
|       +-> func 0
+-> deviceinfo
    +-> pci
        +-> vendor_id 8086
        +-> device_id 46a6
        +-> subvendor_id 1043
        +-> subdevice_id 1a8c
        +-> revision_id IGNORED

--- Opening device node /dev/dri/card1 ---
--- Retrieving device info, for node /dev/dri/card1 ---
device[2]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card1
|   +-> nodes[2] /dev/dri/renderD128
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus 00
|       +-> dev 02
|       +-> func 0
+-> deviceinfo
    +-> pci
        +-> vendor_id 8086
        +-> device_id 46a6
        +-> subvendor_id 1043
        +-> subdevice_id 1a8c
        +-> revision_id 0c

--- Opening device node /dev/dri/renderD128 ---
--- Retrieving device info, for node /dev/dri/renderD128 ---
device[2]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card1
|   +-> nodes[2] /dev/dri/renderD128
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus 00
|       +-> dev 02
|       +-> func 0
+-> deviceinfo
    +-> pci
        +-> vendor_id 8086
        +-> device_id 46a6
        +-> subvendor_id 1043
        +-> subdevice_id 1a8c
        +-> revision_id 0c
```

Notify maintainers

@Kiskae @edwtjo

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

 - system: `"x86_64-linux"`
 - host os: `Linux 6.11.0, NixOS, 24.11 (Vicuna), 24.11.20240919.c04d565`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.5`
 - nixpkgs: `/nix/store/hiasfhl8f5yy88hcfbr3s8s4bm63wsjw-source`

Add a :+1: reaction to issues you find important.

opl- commented 6 days ago

Linking issue #343774 as it might be related, but the errors in the logs given there differ from mine.

This comment on that issue links to an Arch forum thread, where someone explains the issue is caused by "simpledrm" not being automatically disabled by the NVIDIA driver due to header changes in kernel v6.11.0.

With the open kernel modules, SDDM and KDE start correctly after applying the suggested workaround of adding initcall_blacklist=simpledrm_platform_driver_init to the kernel parameters. However, the workaround makes console TTYs freeze almost immediately during boot, so they only ever show the first two lines of the boot log. Without the open kernel module, I think KDE crashed twice.

To quickly test whether this fixes the issue, I selected the NixOS generation with kernel v6.11.0 in GRUB, pressed [e], appended initcall_blacklist=simpledrm_platform_driver_init to the end of the text box at the bottom (separated from the rest by a space), and pressed [enter] to boot.
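
For anyone who wants the parameter to persist across reboots instead of editing the boot entry each time, a minimal sketch of the equivalent NixOS setting follows. It assumes the standard `boot.kernelParams` option and that the fragment is imported as a module from configuration.nix. The TTY freeze described above would presumably apply here as well.

```nix
{ ... }:
{
  # Prevents the simpledrm platform driver from initializing, so the
  # simple-framebuffer DRM device is never created.
  # Assumption: same parameter as tested manually from the boot menu above.
  boot.kernelParams = [ "initcall_blacklist=simpledrm_platform_driver_init" ];
}
```

After a rebuild and reboot, the parameter should appear in /proc/cmdline.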

opl- commented 6 days ago

There's already a PR to the NVIDIA open-gpu-kernel-modules repository which adds support for the renamed kernel header files.

I tried to test it with the following NixOS configuration change after merging the PR into the v560.35.03 kernel module. I think this is technically incorrect as I'm not globally overriding the linuxPackages.nvidia_x11 package, but the Nix documentation again failed to assist me in doing that.

As a result SDDM was no longer crashing, but wasn't rendering correctly either, staying as a black screen. The only reason I realized it's running is because it briefly flashed (at the wrong resolution) when I switched to a console TTY.

After blindly entering my password into the black SDDM screen, KDE crashed with the same errors as in #343774.

I guess I'm finally experiencing the reasons why people always say not to run the latest kernel with NVIDIA proprietary drivers.

{ config, pkgs, ... }: {
  # This does not work. Kind of.
  hardware.nvidia.open = true;
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.beta.overrideAttrs {
    open = config.boot.kernelPackages.nvidiaPackages.beta.open.overrideAttrs {
      src = pkgs.fetchFromGitHub {
        owner = "opl-";
        repo = "open-gpu-kernel-modules";
        rev = "main";
        hash = "sha256-SzbXewSU1Mn8uFtLlDGiJKJSEkXBoTRpLlFzlvZiliU=";
      };
    };
  };
}

opl- commented 6 days ago

And indeed, Kernel v6.10.11 ({ boot.kernelPackages = pkgs.linuxPackages_6_10; }) works fine with NVIDIA proprietary v560.35.03 + open kernel module.
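
Put together with the options from the original report, the working setup looks roughly like the sketch below. It only combines settings already quoted in this thread. The one assumption is that `pkgs.linuxPackages_6_10` is still available in the nixpkgs revision being used, since kernel package sets get removed once they reach end of life.

```nix
{ config, pkgs, ... }:
{
  # Pin the kernel to the 6.10 series instead of linuxPackages_latest (6.11).
  boot.kernelPackages = pkgs.linuxPackages_6_10;

  services.xserver.videoDrivers = [ "nvidia" ];

  # The driver package follows whichever kernel package set is selected above.
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.beta;
  hardware.nvidia.open = true;
  hardware.nvidia.modesetting.enable = true;
}
```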

VeilSilence commented 6 days ago

Nvidia issue. Stay at 6.10 until new driver release.