CachyOS / linux-cachyos

Archlinux Kernel based on different schedulers and some other performance improvements.
https://cachyos.org
GNU General Public License v3.0

NVIDIA not working in Cachy-OS kernel since few weeks #324

Closed: mysteryx93 closed this issue 1 week ago

mysteryx93 commented 3 weeks ago

Since updating a few weeks ago, my NVIDIA GPU (RTX 2060) has not been working when booting with the CachyOS kernel, so the HDMI output doesn't work after boot. This problem has happened in the past after updates, and running nvidia-all to reinstall the driver fixed it, but this time I tried installing various versions and couldn't fix it. Switching to the linux-lts kernel does solve the issue, so the problem is with the kernel. I waited a few weeks and just updated again, and the problem persists.

Here's my inxi output when booted into linux-lts:

System:
Kernel: 6.6.59-1.1-lts arch: x86_64 bits: 64 compiler: gcc v: 14.2.1
clocksource: tsc avail: hpet,acpi_pm
parameters: BOOT_IMAGE=/@/boot/vmlinuz-linux-lts
root=UUID=51ef4a7a-fb89-43c3-a466-7318e0363e7e rw rootflags=subvol=@
quiet loglevel=3 ibt=off
Desktop: KDE Plasma v: 6.2.2 tk: Qt v: N/A info: frameworks v: 6.7.0
wm: kwin_x11 vt: 2 dm: SDDM Distro: Garuda base: Arch Linux
Machine:
Type: Laptop System: Acer product: Predator PH315-53 v: V2.04
serial: <superuser required>
Mobo: CML model: QX50_CMS v: V2.04 serial: <superuser required>
part-nu: 0000000000000000 uuid: <superuser required> UEFI: Insyde v: 2.04
date: 08/20/2021
Battery:
ID-1: BAT1 charge: 40.7 Wh (100.0%) condition: 40.7/58.8 Wh (69.3%)
volts: 16.4 min: 15.4 model: SMP AP18E7M type: Li-ion serial: <filter>
status: full
CPU:
Info: model: Intel Core i7-10750H bits: 64 type: MT MCP arch: Comet Lake
gen: core 10 level: v3 note: check built: 2020 process: Intel 14nm family: 6
model-id: 0xA5 (165) stepping: 2 microcode: 0xFC
Topology: cpus: 1x dies: 1 clusters: 6 cores: 6 threads: 12 tpc: 2
smt: enabled cache: L1: 384 KiB desc: d-6x32 KiB; i-6x32 KiB L2: 1.5 MiB
desc: 6x256 KiB L3: 12 MiB desc: 1x12 MiB
Speed (MHz): avg: 1061 min/max: 800/5000 scaling: driver: intel_pstate
governor: powersave cores: 1: 1061 2: 1061 3: 1061 4: 1061 5: 1061 6: 1061
7: 1061 8: 1061 9: 1061 10: 1061 11: 1061 12: 1061 bogomips: 62431
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Vulnerabilities: <filter>
Graphics:
Device-1: Intel CometLake-H GT2 [UHD Graphics] vendor: Acer Incorporated ALI
driver: i915 v: kernel arch: Gen-9.5 process: Intel 14nm built: 2016-20
ports: active: none off: eDP-1 empty: HDMI-A-2 bus-ID: 00:02.0
chip-ID: 8086:9bc4 class-ID: 0300
Device-2: NVIDIA TU106M [GeForce RTX 2060 Mobile]
vendor: Acer Incorporated ALI driver: nvidia v: 565.57.01
alternate: nouveau,nvidia_drm non-free: 550.xx+ status: current (as of
2024-09; EOL~2026-12-xx) arch: Turing code: TUxxx process: TSMC 12nm FF
built: 2018-2022 pcie: gen: 1 speed: 2.5 GT/s lanes: 16 link-max: gen: 3
speed: 8 GT/s ports: active: none off: HDMI-A-1 empty: DP-1
bus-ID: 01:00.0 chip-ID: 10de:1f15 class-ID: 0300
Device-3: Quanta HD User Facing driver: uvcvideo type: USB rev: 2.0
speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-5:3 chip-ID: 0408:a061
class-ID: 0e02
Display: x11 server: X.Org v: 21.1.14 with: Xwayland v: 24.1.4
compositor: kwin_x11 driver: X: loaded: modesetting,nvidia unloaded: nouveau
alternate: fbdev,intel,nv,vesa dri: iris gpu: i915,nvidia,nvidia-nvswitch
display-ID: :0 screens: 1
Screen-1: 0 s-res: 1920x1080 s-dpi: 96 s-size: 508x285mm (20.00x11.22")
s-diag: 582mm (22.93")
Monitor-1: HDMI-A-1 mapped: HDMI-1-0 note: disabled pos: primary
model: Samsung serial: <filter> built: 2016 res: 1920x1080 hz: 60 dpi: 40
gamma: 1.2 size: 1210x680mm (47.64x26.77") diag: 1168mm (46") ratio: 16:9
modes: max: 1920x1080 min: 640x480
Monitor-2: eDP-1 note: disabled model: AU Optronics 0x82ed built: 2018
res: 1920x1080 dpi: 142 gamma: 1.2 size: 344x194mm (13.54x7.64")
diag: 394mm (15.5") ratio: 16:9 modes: 1920x1080
API: EGL v: 1.5 hw: drv: intel iris drv: nvidia platforms: device: 0
drv: nvidia device: 2 drv: iris device: 3 drv: swrast gbm: drv: nvidia
surfaceless: drv: nvidia x11: drv: iris inactive: wayland,device-1
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: intel mesa v: 24.2.6-arch1.1.1
glx-v: 1.4 direct-render: yes renderer: Mesa Intel UHD Graphics (CML GT2)
device-ID: 8086:9bc4 memory: 7.54 GiB unified: yes
API: Vulkan v: 1.3.295 layers: 10 device: 0 type: integrated-gpu
name: Intel UHD Graphics (CML GT2) driver: mesa intel v: 24.2.6-arch1.1.1
device-ID: 8086:9bc4 surfaces: xcb,xlib device: 1 type: discrete-gpu
name: NVIDIA GeForce RTX 2060 driver: nvidia v: 565.57.01
device-ID: 10de:1f15 surfaces: xcb,xlib device: 2 type: cpu name: llvmpipe
(LLVM 18.1.8 256 bits) driver: mesa llvmpipe v: 24.2.6-arch1.1.1 (LLVM
18.1.8) device-ID: 10005:0000 surfaces: xcb,xlib
Audio:
Device-1: Intel Comet Lake PCH cAVS vendor: Acer Incorporated ALI
driver: snd_hda_intel v: kernel alternate: snd_soc_skl,snd_sof_pci_intel_cnl
bus-ID: 00:1f.3 chip-ID: 8086:06c8 class-ID: 0403
Device-2: NVIDIA TU106 High Definition Audio vendor: Acer Incorporated ALI
driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16
bus-ID: 01:00.1 chip-ID: 10de:10f9 class-ID: 0403
Device-3: Texas Instruments PCM2900B Audio CODEC
driver: hid-generic,snd-usb-audio,usbhid type: USB rev: 2.0 speed: 12 Mb/s
lanes: 1 mode: 1.1 bus-ID: 1-1.3:6 chip-ID: 08bb:29b0 class-ID: 0300
API: ALSA v: k6.6.59-1.1-lts status: kernel-api tools: N/A
Server-1: sndiod v: N/A status: off tools: aucat,midicat,sndioctl
Server-2: PipeWire v: 1.2.6 status: active with: 1: pipewire-pulse
status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
Device-1: Intel Comet Lake PCH CNVi WiFi vendor: Rivet Networks Dual Band
Wi-Fi 6 Killer AX1650i 160MHz 2x2 driver: iwlwifi v: kernel
bus-ID: 00:14.3 chip-ID: 8086:06f0 class-ID: 0280
IF: wlp0s20f3 state: up mac: <filter>
Device-2: Realtek Killer E2600 GbE vendor: Acer Incorporated ALI
driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1 port: 3000
bus-ID: 08:00.0 chip-ID: 10ec:2600 class-ID: 0200
IF: enp8s0 state: down mac: <filter>
IF-ID-1: docker0 state: down mac: <filter>
IF-ID-2: virbr0 state: down mac: <filter>
Info: services: NetworkManager, smbd, systemd-timesyncd, wpa_supplicant
Bluetooth:
Device-1: Intel AX201 Bluetooth driver: btusb v: 0.8 type: USB rev: 2.0
speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-14:5 chip-ID: 8087:0026
class-ID: e001
Report: btmgmt ID: hci0 rfk-id: 1 state: down bt-service: enabled,running
rfk-block: hardware: no software: yes address: <filter> bt-v: 5.2 lmp-v: 11
status: discoverable: no pairing: no
Drives:
Local Storage: total: 3.19 TiB used: 1.95 TiB (61.1%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Western Digital model: PC SN730
SDBQNTY-512G-1014 size: 476.94 GiB block-size: physical: 512 B
logical: 512 B speed: 31.6 Gb/s lanes: 4 tech: SSD serial: <filter>
fw-rev: 11101100 temp: 24.9 C scheme: GPT
ID-2: /dev/nvme1n1 maj-min: 259:2 vendor: Western Digital
model: WD Blue SN570 2TB size: 1.82 TiB block-size: physical: 512 B
logical: 512 B speed: 31.6 Gb/s lanes: 4 tech: SSD serial: <filter>
fw-rev: 234200WD temp: 37.9 C scheme: GPT
ID-3: /dev/sda maj-min: 8:0 vendor: HGST (Hitachi) model: HTS721010A9E630
size: 931.51 GiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
tech: HDD rpm: 7200 serial: <filter> fw-rev: A3J0 scheme: GPT
Partition:
ID-1: / raw-size: 1.73 TiB size: 1.73 TiB (100.00%) used: 897.6 GiB (50.6%)
fs: btrfs dev: /dev/nvme1n1p3 maj-min: 259:5
ID-2: /boot/efi raw-size: 625 MiB size: 623.7 MiB (99.80%)
used: 584 KiB (0.1%) fs: vfat dev: /dev/nvme1n1p1 maj-min: 259:3
ID-3: /home raw-size: 1.73 TiB size: 1.73 TiB (100.00%)
used: 897.6 GiB (50.6%) fs: btrfs dev: /dev/nvme1n1p3 maj-min: 259:5
ID-4: /var/log raw-size: 1.73 TiB size: 1.73 TiB (100.00%)
used: 897.6 GiB (50.6%) fs: btrfs dev: /dev/nvme1n1p3 maj-min: 259:5
ID-5: /var/tmp raw-size: 1.73 TiB size: 1.73 TiB (100.00%)
used: 897.6 GiB (50.6%) fs: btrfs dev: /dev/nvme1n1p3 maj-min: 259:5
Swap:
Kernel: swappiness: 133 (default 60) cache-pressure: 100 (default) zswap: no
ID-1: swap-1 type: zram size: 15.45 GiB used: 0 KiB (0.0%) priority: 100
comp: zstd avail: lzo,lzo-rle,lz4,lz4hc,842 max-streams: 12 dev: /dev/zram0
ID-2: swap-2 type: partition size: 11.72 GiB used: 0 KiB (0.0%)
priority: -2 dev: /dev/nvme1n1p2 maj-min: 259:4
Sensors:
System Temperatures: cpu: 51.0 C pch: 64.0 C mobo: N/A
Fan Speeds (rpm): N/A
Info:
Memory: total: 16 GiB available: 15.45 GiB used: 5.29 GiB (34.2%)
Processes: 363 Power: uptime: 6m states: freeze,mem,disk suspend: deep
avail: s2idle wakeups: 0 hibernate: platform avail: shutdown, reboot,
suspend, test_resume image: 6.13 GiB services: org_kde_powerdevil,
power-profiles-daemon, upowerd Init: systemd v: 256 default: graphical
tool: systemctl
Packages: pm: pacman pkgs: 1768 libs: 471 tools: octopi,paru Compilers:
clang: 18.1.8 gcc: 14.2.1 Shell: garuda-inxi default: fish v: 3.7.1
running-in: konsole inxi: 3.3.36
Garuda (2.6.26-1):
System install date:     2024-07-17
Last full system update: 2024-11-05
Is partially upgraded:   No
Relevant software:       snapper NetworkManager dracut
Windows dual boot:       No/Undetected
Failed units:

1Naim commented 3 weeks ago

It looks like you're using another Arch-based distribution without using our repos. If that's the case, then NVIDIA is failing due to #286. Either patch dkms with https://github.com/dell/dkms/pull/417/ or grab the dkms package from our repos.

mysteryx93 commented 3 weeks ago

Is that issue going to be resolved over time without requiring manual interventions?

1Naim commented 3 weeks ago

Yes, the fix has been merged into upstream dkms, and the issue will be resolved when the next version comes out.

michaelsebero commented 1 week ago

When using the stock linux-cachyos kernel I still have this issue with nvidia-open-dkms and nvidia-dkms.

ptr1337 commented 1 week ago

> When using the stock linux-cachyos kernel I still have this issue with nvidia-open-dkms and nvidia-dkms.

You need to use the CachyOS repos or manually compile the dkms package from the "cachyos-pkgbuilds" repository. Alternatively, switch to GCC as the compiler.

michaelsebero commented 1 week ago

> > When using the stock linux-cachyos kernel I still have this issue with nvidia-open-dkms and nvidia-dkms.

> You need to use the CachyOS repos or manually compile the dkms package from the "cachyos-pkgbuilds" repository. Alternatively, switch to GCC as the compiler.

I already have GCC as my compiler, and the repo I used to install the linux-cachyos package was functional until about 2 weeks ago, around when the OP started having issues. I've also tried the manual intervention and it doesn't seem to fix my issue. On my AMD system linux-cachyos works fine, though.

1Naim commented 1 week ago

Output of dkms status?

michaelsebero commented 1 week ago

I'll reinstall linux-cachyos again and give the output. I ran mkinitcpio -P afterwards on my last install to make sure and it gave me the same issue.

michaelsebero commented 1 week ago

> Output of dkms status?

nvidia/565.57.01, 6.11.8-artix1-2, x86_64: installed
nvidia/565.57.01, 6.12.0-1-cachyos, x86_64: installed
openrazer-driver/3.9.0, 6.11.8-artix1-2, x86_64: installed
openrazer-driver/3.9.0, 6.12.0-1-cachyos, x86_64: installed

Also I'd note that when you boot up the kernel you see a lot of debug logs in the init before you get to the cli login screen.
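
For readers comparing their own systems against the dkms status output above, the following is a minimal sketch of filtering that output down to the modules recorded as installed for one kernel release. The function name dkms_installed_for is hypothetical, and the sketch assumes the "module/version, kernel-release, arch: status" line format shown in this thread; other dkms versions may print a different layout.

```shell
# List DKMS modules reported as installed for one kernel release.
# Expects `dkms status`-style lines on stdin, in the format quoted above:
#   module/version, kernel-release, arch: installed
# (dkms_installed_for is a hypothetical helper name, not part of dkms.)
dkms_installed_for() {
  awk -F', ' -v kernel="$1" '$2 == kernel && /: installed/ { print $1 }'
}

# Sample input copied from the output above.
dkms_installed_for "6.12.0-1-cachyos" <<'EOF'
nvidia/565.57.01, 6.11.8-artix1-2, x86_64: installed
nvidia/565.57.01, 6.12.0-1-cachyos, x86_64: installed
openrazer-driver/3.9.0, 6.11.8-artix1-2, x86_64: installed
openrazer-driver/3.9.0, 6.12.0-1-cachyos, x86_64: installed
EOF
```

On a live system the same filter could be fed with `dkms status | dkms_installed_for "$(uname -r)"`. As this thread shows, an "installed" status does not by itself mean the module works, so this is only a first check.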

1Naim commented 1 week ago

Okay.. Is this an Artix install with CachyOS repos or is this only an Artix install?

ptr1337 commented 1 week ago

Either compile the nvidia module with: https://github.com/CachyOS/linux-cachyos/blob/master/linux-cachyos/PKGBUILD#L135-L142

Or ask Artix to update their nvidia-dkms for the kernel. I have pushed fixes for the 6.11 and 6.12 kernels to Arch Linux. Artix should adopt them.

michaelsebero commented 1 week ago

> Okay.. Is this an Artix install with CachyOS repos or is this only an Artix install?

Artix mirrors Arch's repos and this cachyos package comes from there.

ptr1337 commented 1 week ago

> > Okay.. Is this an Artix install with CachyOS repos or is this only an Artix install?

> Artix mirrors Arch's repos and this cachyos package comes from there.

Can you show the output of pacman -Qs nvidia-utils?

michaelsebero commented 1 week ago

> > > Okay.. Is this an Artix install with CachyOS repos or is this only an Artix install?

> > Artix mirrors Arch's repos and this cachyos package comes from there.

> Can you show the output of pacman -Qs nvidia-utils?

local/lib32-nvidia-utils 565.57.01-1
    NVIDIA drivers utilities (32-bit)
local/nvidia-utils 565.57.01-1
    NVIDIA drivers utilities
local/nvidia-utils-s6 20240813-1 (s6-world)
    s6-rc service scripts for nvidia-utils

ptr1337 commented 1 week ago

> local/lib32-nvidia-utils 565.57.01-1
>     NVIDIA drivers utilities (32-bit)
> local/nvidia-utils 565.57.01-1
>     NVIDIA drivers utilities
> local/nvidia-utils-s6 20240813-1 (s6-world)
>     s6-rc service scripts for nvidia-utils

Yes, Artix does not follow Arch Linux, and therefore it is not working. File a bug report with Artix so that they sync their packages properly.

michaelsebero commented 1 week ago

> > local/lib32-nvidia-utils 565.57.01-1
> >     NVIDIA drivers utilities (32-bit)
> > local/nvidia-utils 565.57.01-1
> >     NVIDIA drivers utilities
> > local/nvidia-utils-s6 20240813-1 (s6-world)
> >     s6-rc service scripts for nvidia-utils

> Yes, Artix does not follow Arch Linux, and therefore it is not working. File a bug report with Artix so that they sync their packages properly.

Artix generally follows Arch Linux's repos, but I need to know: is this an issue with the nvidia driver or with nvidia-utils?

ptr1337 commented 1 week ago

I pushed these changes to Arch Linux around a week ago: https://gitlab.archlinux.org/archlinux/packaging/packages/nvidia-utils/-/commit/f3777d92d102a0f6313228a2b6d19d8ef852679d

Artix does not seem to sync/apply these changes.

gloatoriginal commented 1 week ago

Hey, I installed the CachyOS kernel and headers on an up-to-date Arch installation that uses the nvidia-open-dkms modules. However, when boot reached the graphical display under the CachyOS kernel, it hung there. I was still able to Ctrl+Alt+F# to other TTYs, and nvidia-smi still appeared to recognize my graphics cards and the driver, but it looked as if Xorg had not been loaded onto the GPU at that point, because no applications were loaded onto it. Sorry I didn't capture logs, but I can reboot into it and gather further information if that will help. This LT does work with nvidia-open-dkms on mainline, lts and zen.

ptr1337 commented 1 week ago

> Hey, I installed the CachyOS kernel and headers on an up-to-date Arch installation that uses the nvidia-open-dkms modules. However, when boot reached the graphical display under the CachyOS kernel, it hung there. I was still able to Ctrl+Alt+F# to other TTYs, and nvidia-smi still appeared to recognize my graphics cards and the driver, but it looked as if Xorg had not been loaded onto the GPU at that point, because no applications were loaded onto it. Sorry I didn't capture logs, but I can reboot into it and gather further information if that will help. This LT does work with nvidia-open-dkms on mainline, lts and zen.

You need to use our dkms package. This is an issue in dkms when using a clang-built kernel. It has been fixed upstream.

Either build the package yourself or pull it from the repository.
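
Since the dkms issue described here applies to clang-built kernels, it can help to confirm which compiler built the kernel you are booted into. The following is a minimal sketch, assuming /proc/version carries the compiler string in the kernel banner as it does on typical Linux systems; the function name kernel_compiler is hypothetical.

```shell
# Classify a kernel version banner by the compiler it names.
# (kernel_compiler is a hypothetical helper name.)
kernel_compiler() {
  case "$1" in
    *clang*) echo clang ;;
    *gcc*)   echo gcc ;;
    *)       echo unknown ;;
  esac
}

# /proc/version holds the banner of the running kernel (Linux only).
if [ -r /proc/version ]; then
  kernel_compiler "$(cat /proc/version)"
fi
```

If this prints clang, the dkms package from the CachyOS repositories, or a dkms build that includes the upstream fix, is what the maintainers recommend above.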