bayasdev / envycontrol

Easy GPU switching for Nvidia Optimus laptops under Linux
MIT License

[QUESTION] Will the removal of the nvidia devices from the PCIe bus work for others? #157

Closed klmcwhirter closed 4 months ago

klmcwhirter commented 7 months ago

Does the approach described below work for most people (different NVIDIA hardware, Linux distros, etc.)?

I came across a technique that removes the NVIDIA devices from the PCIe bus so that the NVIDIA hardware can be powered off, with the goal of improving battery life.

I have put together a POC and am working with @bayasdev on potentially including this in envycontrol upon switch to integrated mode.

The approach I have found has these benefits.

Prerequisites

I reference the blog post behind the first URL from the POC README above.

  1. JS mentions that the absolute bare minimum Linux kernel version is 5.12, but recommends 5.14 or 5.15. I suspect this has to do with the version of the intel/nouveau drivers and the tmpfiles.d support, although he doesn't say specifically why 5.12 is the minimum.

He just says at the end of that paragraph that:

I'm personally using Fedora, but everything listed here should work with any distribution running a modern kernel and systemd.

  2. Obviously, the bus IDs will need to be tweaked for your machine. I have tried to generalize that process at https://github.com/klmcwhirter/nvidia-more-battery/blob/master/nvidia_more_battery/services/tmpfiles.py#L69-L75, but I only have the one system to test with.

I should also mention that I have an Acer Nitro 5 with an RTX 3050 Ti.

nitro5-neofetch.png

$ lspci | grep NVIDIA
0000:01:00.0 VGA compatible controller: NVIDIA Corporation GA107M [GeForce RTX 3050 Ti Mobile] (rev a1)
0000:01:00.1 Audio device: NVIDIA Corporation Device 2291 (rev a1)
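
To find the bus IDs on a different machine, the lspci output above can be parsed mechanically. The following is only a minimal sketch, not the POC's code: the function name `nvidia_addresses` is hypothetical, and it assumes that lspci may omit the PCI domain, in which case `0000` is used.

```python
import re

# Optional PCI domain, then bus:device.function, at the start of an lspci line.
_ADDR = re.compile(r"^(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s")


def nvidia_addresses(lspci_output: str) -> list:
    """Return full PCI addresses (domain:bus:dev.fn) of lines mentioning NVIDIA."""
    found = []
    for line in lspci_output.splitlines():
        if "NVIDIA" not in line:
            continue
        match = _ADDR.match(line)
        if match:
            # Assumption: when lspci omits the domain, it is the default 0000.
            domain = match.group(1) or "0000"
            found.append(f"{domain}:{match.group(2)}")
    return found


# The two lines from the lspci output shown above.
sample = """\
0000:01:00.0 VGA compatible controller: NVIDIA Corporation GA107M [GeForce RTX 3050 Ti Mobile] (rev a1)
0000:01:00.1 Audio device: NVIDIA Corporation Device 2291 (rev a1)
"""
print(nvidia_addresses(sample))
```

On a real system the `sample` string would be replaced by the captured output of `lspci`.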

Here is my /etc/tmpfiles.d/nvidia_no_gpu.conf file:

$ cat /etc/tmpfiles.d/nvidia_no_gpu.conf 
d /run/no-nvidia 0755 1000 1000
f /run/no-nvidia/in-effect 0444 1000 1000 - 1
w /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/remove - - - - 1
w /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/remove - - - - 1
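
The two `w` lines above write `1` to each device's sysfs `remove` file at boot. As a rough sketch of how such lines could be generated for other bus IDs (again, not the POC's actual code; `tmpfiles_remove_lines` and `fake_resolve` are hypothetical names), one can rely on `/sys/bus/pci/devices/<address>` being a symlink into the `/sys/devices/...` tree:

```python
import os
from typing import Callable, Iterable


def tmpfiles_remove_lines(
    addresses: Iterable[str],
    resolve: Callable[[str], str] = os.path.realpath,
) -> list:
    """Build tmpfiles.d 'w' lines that write 1 to each device's sysfs remove file."""
    lines = []
    for addr in addresses:
        # /sys/bus/pci/devices/<addr> is a symlink to the device's real sysfs directory.
        device_dir = resolve(f"/sys/bus/pci/devices/{addr}")
        lines.append(f"w {device_dir}/remove - - - - 1")
    return lines


def fake_resolve(link: str) -> str:
    """Stand-in resolver that mirrors the topology of the config above (for illustration)."""
    addr = link.rsplit("/", 1)[-1]
    return f"/sys/devices/pci0000:00/0000:00:01.0/{addr}"


for line in tmpfiles_remove_lines(["0000:01:00.0", "0000:01:00.1"], fake_resolve):
    print(line)
```

With the default resolver (`os.path.realpath`) the function has to run on the target machine while the devices are still present on the bus; the stand-in resolver is only there so the sketch can be tried anywhere.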

I'll be happy to answer any questions you may have about using the POC code over in the https://github.com/klmcwhirter/nvidia-more-battery repo.

Please comment with your feedback to help us decide whether this can or even should be included in envycontrol directly, or simply added to the documentation as another option.

Thanks for the help.

Miaua commented 5 months ago

I'm using EnvyControl with integrated GPU mode enabled, but the lspci command can still wake the disabled devices up.

I'm viewing my devices in powertop before and after I run the lspci command. After I run "sudo lspci -m -k", the Host bridge Device 14e8 gets replaced with PCI bridge Device 14ed, and the power draw roughly triples and stays stuck at that level. My system: Lenovo Legion Slim 5 16" Gen 8, 7840HS, 780M, RTX 4060, Fedora 40 KDE.

Before I run "sudo lspci -m -k":
The battery reports a discharge rate of 7.25 W
The energy consumed was 152 J

          Usage     Device name
          2,2%        CPU misc
          2,2%        CPU core

        00:00.0 "Host bridge" "Advanced Micro Devices, Inc. [AMD]" "Device 14e8" -p00 "Lenovo" "Device 3802"

        PCI Device: Advanced Micro Devices, Inc. [AMD] Device 14ed

After I ran "sudo lspci -m -k":
The battery reports a discharge rate of 20.5 W
The energy consumed was 419 J

          Usage     Device name
         11,4%        CPU misc
         11,4%        CPU core

        00:01.1 "PCI bridge" "Advanced Micro Devices, Inc. [AMD]" "Device 14ed" -p00 "Advanced Micro Devices, Inc. [AMD]" "Device 1453"

        00:02.3 "PCI bridge" "Advanced Micro Devices, Inc. [AMD]" "Device 14ed" -p00 "Advanced Micro Devices, Inc. [AMD]" "Device 1453"

klmcwhirter commented 4 months ago

This approach only seems to work with a specific set of hardware / software and is not a general solution.