Dasharo / dasharo-issues

The Dasharo issue tracker
https://dasharo.com/

NVIDIA card permanently active, never suspended in V560TNE #999

Open filipleple opened 1 month ago

filipleple commented 1 month ago

Component

Dasharo firmware

Device

NovaCustom V56 14th Gen

Dasharo version

v0.9.1-rc2

Dasharo Tools Suite version

No response

Test case ID

NVI002.001

Brief summary

NVIDIA card permanently active, never suspended in V560TNE

How reproducible

100%

How to reproduce

Run the NVI002.001 test

Expected behavior

It should pass

Actual behavior

The card is never suspended. No additional GUI apps that could utilize HW acceleration are running.

------------------------------------------------------------------------------
NVI002.001 NVIDIA Graphics power management (Ubuntu) :: Check whet... ....
Checking if mesa-utils is installed...

Package mesa-utils is installed
.
Checking if pciutils is installed...

Package pciutils is installed
NVI002.001 NVIDIA Graphics power management (Ubuntu) :: Check whet... | FAIL |
'active' does not contain 'suspended'
------------------------------------------------------------------------------
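The failing assertion above compares the GPU's runtime power-management status against `suspended`. A minimal sketch of reading that status by hand is shown here. It assumes the card is exposed at `/sys/class/drm/card1/device/power/runtime_status` (the path quoted later in this thread); the card index can differ between machines. The function names and the polling defaults are hypothetical and not taken from the test suite.

```shell
#!/bin/sh
# Read the runtime PM status of a device from sysfs.
# The default path is an assumption taken from this thread; the card index
# (card0, card1, ...) can differ between machines.
gpu_runtime_status() {
    status_file="${1:-/sys/class/drm/card1/device/power/runtime_status}"
    if [ -r "$status_file" ]; then
        cat "$status_file"
    else
        echo "unknown"
        return 1
    fi
}

# Poll until the status reads "suspended" or the attempts run out.
# Arguments: status file, number of attempts, delay in seconds
# (the defaults of 30 attempts and 1 second are illustrative only).
wait_for_gpu_suspend() {
    status_file="$1"
    attempts="${2:-30}"
    delay="${3:-1}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ "$(gpu_runtime_status "$status_file")" = "suspended" ]; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

Polling rather than reading once may help when the GPU needs a few seconds of idle time before it powers down, although that does not apply if the status stays `active` indefinitely, as reported here.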

Screenshots

No response

Additional context

No response

Solutions you've tried

No response

mkopec commented 3 weeks ago

should be fixed by https://github.com/Dasharo/coreboot/pull/548/commits/66abce327e64a05f36f84a05a92266187459fa9b

philipandag commented 3 weeks ago

Issue still exists in v0.9.1-rc3

scripts/run.sh dasharo-compatibility/nvidia.robot -- -t "NVI002*"
dasharo-compatibility/nvidia.robot -- -t NVI002*
robot -L TRACE -l logs/novacustom-v560tne/2024_08_20_15_07_00/dasharo-compatibility/nvidia.robot__log.html -r logs/novacustom-v560tne/2024_08_20_15_07_00/dasharo-compatibility/nvidia.robot__report.html -o logs/novacustom-v560tne/2024_08_20_15_07_00/dasharo-compatibility/nvidia.robot__out.xml -b logs/novacustom-v560tne/2024_08_20_15_07_00/dasharo-compatibility/nvidia.robot__debug.log -v rte_ip:127.0.0.1 -v config:novacustom-v560tne -v device_ip:192.168.4.168 -v fw_file:novacustom_v560tnx.rom -t NVI002* dasharo-compatibility/nvidia.robot
==============================================================================
Nvidia                                                                        
==============================================================================
NVI002.001 NVIDIA Graphics power management (Ubuntu) :: Check whet... ....
Checking if mesa-utils is installed...

Package mesa-utils is installed
.
Checking if pciutils is installed...

Package pciutils is installed
NVI002.001 NVIDIA Graphics power management (Ubuntu) :: Check whet... | FAIL |
'active' does not contain 'suspended'
------------------------------------------------------------------------------
Nvidia                                                                | FAIL |
1 test, 0 passed, 1 failed
==============================================================================

mkopec commented 3 weeks ago

is the nvidia driver loaded? Please check lsmod | grep nvidia

philipandag commented 3 weeks ago

@mkopec

ubuntu@3mdeb:~$ lsmod | grep -i nvidia
nvidia_uvm           5021696  0
nvidia_drm            122880  2
nvidia_modeset       1507328  2 nvidia_drm
nvidia               8781824  32 nvidia_uvm,nvidia_modeset
ecc                    45056  2 ecdh_generic,nvidia
video                  73728  3 xe,i915,nvidia_modeset
ubuntu@3mdeb:~$ 

mkopec commented 3 weeks ago

in an x11 session, Xorg keeps the gpu powered on. fixed by switching to GNOME Wayland session:

cat /proc/driver/nvidia/gpus/0000\:01\:00.0/power
Runtime D3 status:          Enabled (fine-grained)
Video Memory:               Off

GPU Hardware Support:
 Video Memory Self Refresh: Supported
 Video Memory Off:          Supported

S0ix Power Management:
 Platform Support:          Supported
 Status:                    Disabled

philipandag commented 2 weeks ago

> in an x11 session, Xorg keeps the gpu powered on. fixed by switching to GNOME Wayland session:

Xorg was not used here (in my case). Wayland was used from the beginning. Switching to Xorg and back to Wayland does not change the output of cat /sys/class/drm/card1/device/power/runtime_status, which is what the test documentation and the automated tests use.

Also, running cat /proc/driver/nvidia/gpus/0000\:01\:00.0/power results in cat: '/proc/driver/nvidia/gpus/0000:01:00.0/power': No such file or directory

lsmod | grep -i nvidia returns nothing, suggesting the driver is not even loaded, but lspci | grep -i nvidia detects the card. I tried reinstalling the driver and dkms in version 550-open and rebooting, which did not make any difference.
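The two observations above (card visible on the PCI bus, kernel module absent) can be checked together. The following is a sketch only; the helper names are hypothetical, and it assumes the usual `/proc/modules` layout in which each line starts with the module name followed by a space.

```shell
#!/bin/sh
# Report whether a kernel module is listed in a /proc/modules-style file.
# Arguments: module name, modules file (defaults to /proc/modules).
module_loaded() {
    module="$1"
    modules_file="${2:-/proc/modules}"
    # Anchor on the name plus a trailing space so that "nvidia" does not
    # match "nvidia_uvm" or "nvidia_drm".
    grep -q "^${module} " "$modules_file"
}

# Report whether lspci-style output read from stdin mentions an NVIDIA device.
nvidia_device_listed() {
    grep -qi "nvidia"
}

# Usage on a live system:
#   lspci | nvidia_device_listed && echo "card visible on the PCI bus"
#   module_loaded nvidia || echo "nvidia kernel module not loaded"
```

If the card is listed but the module is not loaded, the problem is more likely in the driver installation than in the firmware's power management.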

philipandag commented 1 week ago

I think the NVIDIA GPU is not working at all on the V560TNE with Ubuntu 24.04 and kernel 6.9, which is what is installed on our device. I tried installing the -open variant of the driver in versions 535, 550 and 600, using apt as well as Ubuntu's Software & Updates app, rebooting after every install, but to no avail: running lsmod | grep -i nvidia always yielded no results. I checked which GPU Chromium uses by going to chrome://gpu, and it used the iGPU despite hardware acceleration being allowed in the settings. The GPU works fine on Windows, so maybe I am doing something wrong.

philipandag commented 2 days ago

V540TND with v0.9.1-rc5 and kernel 6.9 behaves the same as described in https://github.com/Dasharo/dasharo-issues/issues/999#issuecomment-2312465168, in both Wayland and Xorg sessions (switched using the cogwheel in the lower-right corner of the login screen).

Before and after this unfortunate reboot it was true that:

Logs after the reboot: cbmem-nvidia.log dmesg-nvidia.log

dkms status:

ubuntu@3mdeb:~$ dkms status
acpi-call/1.2.2, 6.8.0-35-generic, x86_64: installed
acpi-call/1.2.2, 6.8.0-39-generic, x86_64: installed
nvidia/535.183.01: added
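In the output above the nvidia module is in the `added` state, not `installed`, so it has been registered with DKMS but is not installed for any kernel. A small sketch for spotting such entries follows; the function name is hypothetical, and it assumes the state is the text after the last `: ` on each line, as in the output shown here.

```shell
#!/bin/sh
# Print DKMS entries whose state is not "installed".
# Reads `dkms status`-style lines from stdin; the state is assumed to be
# the last ": "-separated field, as in the output quoted in this thread.
dkms_not_installed() {
    awk -F': ' 'NF && $NF !~ /^installed/ { print $0 }'
}

# Usage on a live system:
#   dkms status | dkms_not_installed
```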

apt-cache policy nvidia-driver-530:

nvidia-driver-530:
  Installed: (none)
  Candidate: 535.183.01-0ubuntu0.24.04.1
  Version table:
     535.183.01-0ubuntu0.24.04.1 500
        500 http://pl.archive.ubuntu.com/ubuntu noble-updates/restricted amd64 Packages
        500 http://security.ubuntu.com/ubuntu noble-security/restricted amd64 Packages
     535.171.04-0ubuntu2 500
        500 http://pl.archive.ubuntu.com/ubuntu noble/restricted amd64 Packages

Looking at the outputs, I suspect it is related to updating to kernel 6.9. Maybe the command in the Dasharo docs is wrong? I have just been copy-pasting it.

sudo apt install nvidia-driver-550-open contains:

(...)
Module build for kernel 6.9.0-060900-generic was skipped since the kernel headers for this kernel do not seem to be installed.
(...)

mkopec commented 2 days ago

You also need to install the kernel headers if you're installing a new kernel. It's a separate package. Without it, the nvidia kernel driver will not be built and it won't be available.

philipandag commented 2 days ago

When running the command from docs.dasharo

sudo apt install ./linux-headers-6.9.0-060900_6.9.0-060900.202405122134_all.deb ./linux-image-unsigned-6.9.0-060900-generic_6.9.0-060900.202405122134_amd64.deb ./linux-modules-6.9.0-060900-generic_6.9.0-060900.202405122134_amd64.deb

The output contains:

(...)
linux-headers-6.9.0-060900 is already the newest version (6.9.0-060900.202405122134).
linux-image-unsigned-6.9.0-060900-generic is already the newest version (6.9.0-060900.202405122134).
linux-modules-6.9.0-060900-generic is already the newest version (6.9.0-060900.202405122134).
(...)

suggesting the headers are installed.
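One thing that may be worth cross-checking: apt reports `linux-headers-6.9.0-060900` as installed, while the skipped module build mentioned kernel `6.9.0-060900-generic`. The sketch below tests for the build directory of a given kernel release. It assumes DKMS looks for headers through `/lib/modules/<release>/build`, which is its usual default on Ubuntu; the function name is hypothetical, and whether a missing directory is the cause here is not confirmed in this thread.

```shell
#!/bin/sh
# Check that the kernel build directory DKMS compiles against exists.
# Arguments: kernel release (default: running kernel),
#            modules root (default: /lib/modules).
kernel_headers_present() {
    release="${1:-$(uname -r)}"
    modules_root="${2:-/lib/modules}"
    # -d follows symlinks, so a dangling "build" link counts as missing.
    [ -d "${modules_root}/${release}/build" ]
}

# Usage on a live system:
#   kernel_headers_present || echo "no build directory for $(uname -r)"
```

If the directory is missing for the running kernel, the package manager's view ("already the newest version") and DKMS's view ("headers do not seem to be installed") would both be consistent.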

wessel-novacustom commented 21 hours ago

FWIW: Suspend seems to work fine with Pop!_OS, while the NVIDIA graphics card is working.