Closed Mr-Andersen closed 1 month ago
@Mr-Andersen Internal ticket has been created to investigate this issue. Thanks!
Hi @Mr-Andersen, are your kernel and mesa libs up to date? It seems like there are some similar issues caused by out-of-date kernel and mesa versions.
@schung-amd I've upgraded to Linux 6.6.47 and Mesa 24.2.0, still having the issue. Can't bump Linux further yet, since I need ZFS. In which versions were those similar issues resolved?
Thanks for checking! It's unclear which versions help; one user's issues were fixed by a kernel update in 6.5.x with mesa 23.1.7 (https://www.reddit.com/r/linux_gaming/comments/16jhxnz/starfield_crashes_amd_radeon_rx_6600/), while others are having issues with more recent versions. Some users report that RAM problems are related (https://gitlab.freedesktop.org/drm/amd/-/issues/2943). These issues are for different workloads on various hardware, so your underlying issue may be different, but they might provide a clue. I'll try to reproduce your issue with wezterm specifically on similar hardware and get back to you.
Hi @Mr-Andersen, I was unable to reproduce your issue, but I may be missing something in the NixOS configuration. On a fresh install of NixOS 24.05 on an RX 6400, I installed ROCm and wezterm through the config file by
environment.systemPackages = with pkgs; [
  pkgs.rocmPackages.rpp
  pkgs.wezterm
];
in /etc/nixos/configuration.nix followed by a sudo nixos-rebuild switch, and I can use wezterm without any obvious issue. Is there a crash or hang occurring when you encounter this issue, or are the error messages the only symptom?
I've also tried enabling OpenGL in the config file, but this doesn't cause wezterm to break. Do you have other options enabled which are related to GPU acceleration?
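For anyone retrying this reproduction, the setup described above might look roughly like the fragment below. This is a sketch, not the exact file used: the thread does not say which option was used to enable OpenGL, so `hardware.opengl.enable` (the option name on NixOS 24.05) is an assumption.

```nix
# /etc/nixos/configuration.nix (fragment), a sketch for NixOS 24.05
{ pkgs, ... }:
{
  # The two packages named in the comment above.
  environment.systemPackages = [
    pkgs.rocmPackages.rpp
    pkgs.wezterm
  ];

  # Assumption: this is the "OpenGL" switch meant above. On 24.05 the option
  # lives under hardware.opengl; newer releases rename it to hardware.graphics.
  hardware.opengl.enable = true;
}
```

After editing the file, sudo nixos-rebuild switch applies it, as described in the comment.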
Hey @schung-amd, here are all the relevant options from my config:
{ config, ... }: {
  boot.initrd.kernelModules = [ "amdgpu" ];
  boot = {
    kernelPackages = config.boot.zfs.package.latestCompatibleLinuxPackages;
    kernelParams = [ "amdgpu.runpm=0" ]; # <-- this was me trying to fix the issue by reading Arch forums :)
  };
  hardware = {
    graphics = {
      enable = true;
      enable32Bit = true;
    };
  };
  services = {
    displayManager = {
      defaultSession = "xfce";
    };
    xserver = {
      enable = true;
      displayManager.lightdm.enable = true;
      desktopManager.xfce.enable = true;
    };
  };
}
My current nixpkgs commit is 12228ff1752d7b7624a54e9c1af4b222b3c1073b. I am currently on the github:NixOS/nixpkgs/nixos-unstable branch, but I started seeing the issue while using nixos-24.05.
Here is how I experience it:
I should've provided this config from the beginning, sorry about that. This is my first serious bug report :)
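Since the report names an exact nixpkgs commit, one way to check which package versions that commit provides is to import it directly. The sketch below is an illustration only: the file name `pinned.nix` is hypothetical, and it assumes `pkgs.mesa` and `pkgs.wezterm` expose a `version` attribute at that commit.

```nix
# pinned.nix (hypothetical file name)
let
  # Commit quoted in the comment above.
  rev = "12228ff1752d7b7624a54e9c1af4b222b3c1073b";
  nixpkgs = builtins.fetchTarball "https://github.com/NixOS/nixpkgs/archive/${rev}.tar.gz";
  pkgs = import nixpkgs { system = "x86_64-linux"; };
in
{
  # Versions of the packages discussed in this thread, as provided by that commit.
  mesa = pkgs.mesa.version;
  wezterm = pkgs.wezterm.version;
}
```

Evaluating it with nix-instantiate --eval --strict pinned.nix should print both version strings, which makes it easier to compare against the versions mentioned in similar reports.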
No worries, thanks for the config information. A couple follow-up questions so I can try to reproduce the issue:
pkgs.rocmPackages.rpp
) or was amdgpu built-in?pkgs.rocmPackages.rpp
Sure, a list of packages couldn't hurt. I'm more interested in the other config options, in case we could narrow this down to a config change, but if you haven't tested on a fresh install that's ok.
I forgot there is also a hardware-configuration.nix
{ config, lib, pkgs, modulesPath, ... }:
{
  imports =
    [ (modulesPath + "/installer/scan/not-detected.nix")
    ];
  boot.initrd.availableKernelModules = [ "xhci_pci" "ahci" "usbhid" "usb_storage" "sd_mod" ];
  boot.initrd.kernelModules = [ ];
  boot.kernelModules = [ "kvm-amd" ];
  boot.extraModulePackages = [ ];
  nixpkgs.hostPlatform = lib.mkDefault "x86_64-linux";
  hardware.cpu.amd.updateMicrocode = lib.mkDefault config.hardware.enableRedistributableFirmware;
}
To clarify - by "fresh install" you mean "an install with as little customization as possible"? My install is new - I have had these issues since the first boot. Maybe that's what you meant?
Yes, that's what I meant, sorry for any confusion. That will make this much easier, thanks. I'll try running a high load as you suggest and see if I can reproduce the issue.
I am unable to reproduce the issue on XFCE, even at high load. Could you upload your configuration.nix file? You can scrub out your user information if you want to. hardware-configuration.nix should be automatically generated, but uploading this might help as well, so I can check for any discrepancies between your system and what I'm trying to repro with. Thanks!
Closing this as I can't reproduce the issue. If you'd still like support on this issue, feel free to reopen with your configuration.nix and hardware-configuration.nix files, and ideally with a consistent method of reproducing the issue.
Sorry for leaving this thread. It seems that the issue was fixed somewhere upstream: https://discourse.nixos.org/t/getting-amdgpu-error-that-crashes-desktop/50510/8?u=mr-andersen
Glad to hear your issue is resolved, thanks for the update!
Problem Description
I get [gfxhub] page fault and then [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
I couldn't find my GPU (AMD Radeon RX 6400 / gfx1034) in the list below :( So I chose the first one. Additionally, I didn't find instructions on how to find out my ROCm version; I am guessing it's 6.0.2 since it's the default one on current NixOS.
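On the question of finding the ROCm version on NixOS: one possible approach is to ask nixpkgs for the version of a ROCm package. The sketch below assumes a channel-based setup where `<nixpkgs>` resolves, and that `rocmPackages.clr` exists alongside the `rocmPackages.rpp` package named in this thread; on a flake-based system the lookup path may differ.

```nix
let
  # Assumption: <nixpkgs> points at the same nixpkgs the system was built from.
  pkgs = import <nixpkgs> { };
in
{
  # rpp is the package mentioned in this thread; clr is assumed to be present.
  rpp = pkgs.rocmPackages.rpp.version;
  clr = pkgs.rocmPackages.clr.version;
}
```

Saved to a file, this can be evaluated with nix-instantiate --eval --strict; the reported versions would indicate which ROCm release the package set ships.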
Operating System
NixOS 24.05 (Uakari)
CPU
AMD Ryzen 5 3600 6-Core Processor
GPU
AMD Instinct MI300X
ROCm Version
ROCm 6.0.0
ROCm Component
No response
Steps to Reproduce
Run wezterm; wait
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
Additional Information