NVIDIA / open-gpu-kernel-modules

NVIDIA Linux open GPU kernel module source

Not entering D3cold on 4070 laptop (Lenovo Legion Slim 5 16APH8) #730

Open ngbomford opened 3 weeks ago

ngbomford commented 3 weeks ago

NVIDIA Open GPU Kernel Modules Version

565.57.01

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

Operating System and Version

Arch Linux

Kernel Release

6.11.6-arch1-1

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

Hardware: GPU

NVIDIA GeForce RTX 4070 Laptop GPU

Describe the bug

Hi everyone,

After switching to the open driver, I noticed that my laptop (Lenovo Legion Slim 5 16APH8) does not enter D3cold. No processes are accessing the GPU at the time, as this is right after a fresh reboot. Running "watch -n 1 cat /sys/class/drm/card*/device/power_state" shows the GPU sitting almost constantly in D0: it switches to D3cold for a second or so, then immediately returns to D0.

This issue is not present with the closed driver 565.57.01: after a reboot, the laptop enters D3cold in less than 30 seconds.

I only noticed because power consumption was higher than normal while on battery.
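The `watch` command above can be turned into a small logger that records only the transitions, which makes the brief D3cold/D0 flapping easier to capture. This is a minimal sketch, not part of the original report: the function names `gpu_power_states` and `log_power_transitions` are hypothetical, and the sysfs root is a parameter only so the function can be pointed at a test tree. It assumes the standard sysfs attributes `vendor`, `power_state`, and `power/control` are present on the GPU's PCI device.

```shell
#!/bin/sh
# Sketch: report the PCI power state of NVIDIA GPUs exposed under /sys/class/drm.
# The root argument defaults to the real sysfs path; it exists for testing.

gpu_power_states() {
    root="${1:-/sys/class/drm}"
    for dev in "$root"/card*/device; do
        # Skip entries without a readable PCI vendor file (e.g. connector nodes).
        [ -r "$dev/vendor" ] || continue
        # 0x10de is NVIDIA's PCI vendor ID.
        [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
        card=${dev%/device}
        card=${card##*/}
        state=$(cat "$dev/power_state" 2>/dev/null) || state=unknown
        control=$(cat "$dev/power/control" 2>/dev/null) || control=unknown
        echo "$card state=$state control=$control"
    done
}

# Hypothetical helper: print a timestamped line only when the report changes.
# Runs until interrupted with Ctrl-C.
log_power_transitions() {
    last=""
    while :; do
        now=$(gpu_power_states "$@")
        if [ "$now" != "$last" ]; then
            printf '%s %s\n' "$(date '+%H:%M:%S')" "$now"
            last=$now
        fi
        sleep 1
    done
}
```

Running `log_power_transitions` with no arguments polls the real sysfs tree once per second. The `control` field is included because kernel runtime power management only suspends a device whose `power/control` is set to `auto`. Whether that is relevant to this particular bug is not established here.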

To Reproduce

Install nvidia-open-dkms package on Arch Linux

Bug Incidence

Always

nvidia-bug-report.log.gz

nvidia-bug-report.log.gz

More Info

No response

mtijanic commented 3 weeks ago

Hey there, thanks for the report!

> This issue is not present with the closed driver 565.57.01: after a reboot, the laptop enters D3cold in less than 30 seconds.

Looking at the attached log I see some logs from this run but I can't tell if it was with GSP enabled or disabled. Can you confirm this? If unsure, you can boot with the proprietary driver in the no-repro mode and just run nvidia-smi -q | grep GSP. Or you can run nvidia-bug-report.sh while in the no-repro mode and attach those logs as well.
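The `nvidia-smi -q | grep GSP` check can be wrapped so that the answer is explicit. This is a rough sketch with a hypothetical function name, `gsp_status`. It assumes the query output contains a `GSP Firmware Version` line and that the value is `N/A` when GSP firmware is not in use. Treat that as an assumption to verify against your driver version.

```shell
#!/bin/sh
# Sketch: classify GSP firmware usage from `nvidia-smi -q` output read on stdin.
gsp_status() {
    # Keep only the value after "GSP Firmware Version :" from the first match.
    version=$(sed -n 's/^[[:space:]]*GSP Firmware Version[[:space:]]*:[[:space:]]*//p' | head -n 1)
    case "$version" in
        ""|N/A) echo "disabled-or-unreported" ;;
        *)      echo "enabled $version" ;;
    esac
}
```

Usage on a machine with the driver loaded would be `nvidia-smi -q | gsp_status`.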

From the logs we do see some errors that could be relevant. Could I maybe also trouble you to reload the open driver once with NVreg_RmMsg=":" and attach those logs too?

Thanks again!
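For anyone following along, reloading the driver with `NVreg_RmMsg=":"` might look like the sketch below. The function names are hypothetical, and the unload step assumes nothing (display server, compute process) is holding the NVIDIA modules. If something is, the persistent `modprobe.d` line followed by a reboot is the likely alternative.

```shell
#!/bin/sh
# Sketch: two ways to pass NVreg_RmMsg to the nvidia kernel module.

# Print a modprobe.d line that sets the option persistently, e.g. for a
# hypothetical file such as /etc/modprobe.d/nvidia-rmmsg.conf.
rmmsg_options_line() {
    printf 'options nvidia NVreg_RmMsg="%s"\n' "$1"
}

# Reload the driver once with the option set (needs root). Unloading fails
# if the modules are in use; other NVIDIA modules are not reloaded here.
reload_nvidia_with_rmmsg() {
    filter="${1:-:}"
    modprobe -r nvidia_drm nvidia_modeset nvidia_uvm nvidia || return 1
    modprobe nvidia NVreg_RmMsg="$filter"
}
```

After the reload, running `nvidia-bug-report.sh` again, as requested above, collects the more verbose logs.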

ngbomford commented 3 weeks ago

Thanks for the reply.

> Looking at the attached log I see some logs from this run but I can't tell if it was with GSP enabled or disabled. Can you confirm this? If unsure, you can boot with the proprietary driver in the no-repro mode and just run nvidia-smi -q | grep GSP. Or you can run nvidia-bug-report.sh while in the no-repro mode and attach those logs as well.

From the proprietary driver, it looks like GSP firmware is enabled:

    nvidia-smi -q | grep GSP
    GSP Firmware Version : 565.57.01

> From the logs we do see some errors that could be relevant. Could I maybe also trouble you to reload the open driver once with NVreg_RmMsg=":" and attach those logs too?

I've attached the logs with NVreg_RmMsg=":" from the open driver.

nvidia-bug-report.log.gz

Thanks for looking into this.