FeralInteractive / gamemode

Optimise Linux system performance on demand
BSD 3-Clause "New" or "Revised" License

Issue setting overclock values on Nvidia card #170

Open dylanparry opened 5 years ago

dylanparry commented 5 years ago

Describe the bug I get an error when I attempt to overclock my Nvidia card

To Reproduce Steps used to reproduce the behavior:

  1. Launch gamemode with the following config file:
[general]
; The reaper thread will check every 5 seconds for exited clients and for config file changes
reaper_freq=5

; The desired governor is used when entering GameMode instead of "performance"
desiredgov=performance
; The default governer is used when leaving GameMode instead of restoring the original value
defaultgov=powersave

; GameMode can change the scheduler policy to SCHED_ISO on kernels which support it (currently
; not supported by upstream kernels). Can be set to "auto", "on" or "off". "auto" will enable
; with 4 or more CPU cores. "on" will always enable. Defaults to "off".
softrealtime=off

; GameMode can renice game processes. You can put any value between 0 and 20 here, the value
; will be negated and applied as a nice value (0 means no change). Defaults to 0.
renice=0

; By default, GameMode adjusts the iopriority of clients to BE/0, you can put any value
; between 0 and 7 here (with 0 being highest priority), or one of the special values
; "off" (to disable) or "reset" (to restore Linux default behavior based on CPU priority),
; currently, only the best-effort class is supported thus you cannot set it here
ioprio=0

; Sets whether gamemode will inhibit the screensaver when active
; Defaults to 1
inhibit_screensaver=1

[gpu]

; Setting this to the keyphrase "accept-responsibility" will allow gamemode to apply GPU optimisations such as overclocks
apply_gpu_optimisations=accept-responsibility

; The DRM device number on the system (usually 0), ie. the number in /sys/class/drm/card0/
gpu_device=1

; Nvidia specific settings
nv_powermizer_mode=1
nv_core_clock_mhz_offset=150
nv_mem_clock_mhz_offset=1000
  2. Run gamemoderun glxgears
  3. View the output of 'systemctl --user status gamemoded.service'
  4. See error:
● gamemoded.service - gamemoded
   Loaded: loaded (/usr/lib/systemd/user/gamemoded.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2019-08-16 21:54:11 BST; 19s ago
 Main PID: 18148 (gamemoded)
   Status: "GameMode is now active."
   CGroup: /user.slice/user-1000.slice/user@1000.service/gamemoded.service
           └─18148 /usr/bin/gamemoded -l

Aug 16 21:54:28 kratos /usr/bin/gamemoded[18148]: Adding game: 18231 [/usr/bin/glxgears]
Aug 16 21:54:28 kratos /usr/bin/gamemoded[18148]: Entering Game Mode...
Aug 16 21:54:28 kratos /usr/bin/gamemoded[18148]: governor was initially set to [powersave]
Aug 16 21:54:28 kratos /usr/bin/gamemoded[18148]: Requesting update of governor policy to performance
Aug 16 21:54:28 kratos pkexec[18238]: pam_unix(polkit-1:session): session opened for user root by (uid=1000)
Aug 16 21:54:28 kratos /usr/bin/gamemoded[18148]: Requesting GPU optimisations on device:1
Aug 16 21:54:28 kratos pkexec[18266]: pam_unix(polkit-1:session): session opened for user root by (uid=1000)
Aug 16 21:54:28 kratos gamemoded[18148]: ERROR: Error assigning value 150 to attribute 'GPUGraphicsClockOffset' (kratos:1[gpu:0]) as specified in assignment '[gpu:0]/GPUGraphic>
Aug 16 21:54:28 kratos gamemoded[18148]: ERROR: Error assigning value 1000 to attribute 'GPUMemoryTransferRateOffset' (kratos:1[gpu:0]) as specified in assignment '[gpu:0]/GPUM>
Aug 16 21:54:28 kratos /usr/bin/gamemoded[18148]: Setting ioprio value...

Expected behavior The GPU should be overclocked.

System Info (please complete the following information):

Additional context I've found that the only settings that allow my GPU to be overclocked are:

These differ slightly from the ones GameMode attempts to set in the status log above.

aejsmith commented 5 years ago

What GPU and driver version are you using?

dylanparry commented 5 years ago

GPU is GeForce GTX 1050, and driver version is 430.40

mdiluz commented 5 years ago

Hmm, wouldn't be surprised if something has changed in the Nvidia driver since I implemented this - for instance on my machine nvidia-settings doesn't even expose GPUGraphicsClockOffset or GPUMemoryTransferRateOffset anymore.

What happens when you run gamemoded -t? I get this when I check now, on a laptop GPU (perhaps part of the problem).

::: Verifying GPU Optimisations
ERROR: Failed to parse output for "[gpu:0]/GPUGraphicsClockOffset[3]" output was ""!
ERROR: External process failed with exit code 1
ERROR: Output was: 
ERROR: Failed to call gpuclockctl, could not get values!
ERROR: Failed to parse output for "[gpu:0]/GPUGraphicsClockOffset[3]" output was ""!
ERROR: External process failed with exit code 1
ERROR: Output was: 
ERROR: Failed to call gpuclockctl, could not get values!
ERROR: Could not get current GPU info, see above!

dylanparry commented 5 years ago

I get this:

~$ gamemoded -t
: Loading config
Loading config file [/etc/gamemode.ini]
: Running tests

:: Basic client tests
:: Passed

:: Dual client tests
gamemode request succeeded and is active
Quitting by request...
:: Passed

:: Gamemoderun and reaper thread tests
ERROR: gamemode_query_status failed to return other client connected (expected 1)!
...Waiting for child to quit...
...Waiting for reaper thread (reaper_frequency set to 5 seconds)...
:: Supervisor tests
:: Passed

: Client tests failed, skipping feature tests
: Tests Failed!

Not sure what the "other client" it refers to is as there's no games running.

From the reading I did, it looks like GPUGraphicsClockOffset and GPUMemoryTransferRateOffset were replaced with GPUGraphicsClockOffsetAllPerformanceLevels and GPUMemoryTransferRateOffsetAllPerformanceLevels respectively. I couldn't find anywhere that knew exactly when this change occurred, but it looks like it was some time during the 390.x series of drivers. I also couldn't find any official documentation for this anywhere :\
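If the rename described above holds for this driver, the manual equivalent of the config values would be to assign the two "AllPerformanceLevels" attributes through nvidia-settings. The following is only a sketch under that assumption: the function names and the DRY_RUN switch are made up for illustration, the GPU index 0 is taken from the `[gpu:0]` in the log, and the offsets 150 and 1000 come from the config file in the report. It prints the commands by default instead of running them.

```shell
#!/bin/sh
# Sketch only: builds nvidia-settings assignments using the newer attribute
# names mentioned above. nv_offset_assignments, nv_apply_offsets and DRY_RUN
# are hypothetical names, not part of GameMode or nvidia-settings.

# Print one "[gpu:N]/Attribute=value" assignment per line.
# $1 = GPU index, $2 = core clock offset, $3 = memory clock offset
nv_offset_assignments() {
    printf '[gpu:%s]/GPUGraphicsClockOffsetAllPerformanceLevels=%s\n' "$1" "$2"
    printf '[gpu:%s]/GPUMemoryTransferRateOffsetAllPerformanceLevels=%s\n' "$1" "$3"
}

# Apply the assignments, or only print the commands when DRY_RUN=1 (default).
nv_apply_offsets() {
    nv_offset_assignments "$1" "$2" "$3" | while IFS= read -r assignment; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            printf "nvidia-settings -a '%s'\n" "$assignment"
        else
            # Needs a running X session (DISPLAY set) to succeed.
            nvidia-settings -a "$assignment" < /dev/null
        fi
    done
}
```

Calling `nv_apply_offsets 0 150 1000` prints the two commands; setting `DRY_RUN=0` first would actually run them, which may still be rejected depending on the driver and its overclocking settings.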

mdiluz commented 5 years ago

Not the best log message, but:

ERROR: gamemode_query_status failed to return other client connected (expected 1)!

This is complaining that, with an extra fake client connected, gamemode_query_status didn't return 1 to indicate that a client was active. That's strange. Does the issue repeat? If so, the output of journalctl --user --unit=gamemoded --follow captured while the tests are running would be interesting to see.

The tests didn't even get to taking a look at the GPU stuff anyway, because of that core failure :(

: Client tests failed, skipping feature tests

> From the reading I did, it looks like GPUGraphicsClockOffset and GPUMemoryTransferRateOffset were replaced with GPUGraphicsClockOffsetAllPerformanceLevels and GPUMemoryTransferRateOffsetAllPerformanceLevels respectively. I couldn't find anywhere that knew exactly when this change occurred, but it looks like it was some time during the 390.x series of drivers. I also couldn't find any official documentation for this anywhere :\

Ugh yeah, that's happened before, and now the code would need a special case for each version. I'm starting to think more complicated GPU overclocking should be handled outside of GameMode, using the custom start/end scripts.

dylanparry commented 5 years ago

This is what I get:

~$ journalctl --user --unit=gamemoded --follow
-- Logs begin at Tue 2019-07-02 15:47:54 BST. --
Aug 26 22:55:19 kratos /usr/bin/gamemoded[1724]: Successfully initialised bus with name [com.feralinteractive.GameMode]...
Aug 26 22:55:19 kratos systemd[1708]: Started gamemoded.
Aug 27 00:20:41 kratos /usr/bin/gamemoded[1724]: Failure when processing the bus: Connection reset by peer
Aug 27 00:20:41 kratos systemd[1708]: gamemoded.service: Main process exited, code=exited, status=1/FAILURE
Aug 27 00:20:41 kratos systemd[1708]: gamemoded.service: Failed with result 'exit-code'.
-- Reboot --
Aug 27 12:12:49 kratos systemd[1675]: Starting gamemoded...
Aug 27 12:12:49 kratos /usr/bin/gamemoded[1690]: v1.4
Aug 27 12:12:49 kratos /usr/bin/gamemoded[1690]: Loading config file [/etc/gamemode.ini]
Aug 27 12:12:49 kratos /usr/bin/gamemoded[1690]: Successfully initialised bus with name [com.feralinteractive.GameMode]...
Aug 27 12:12:49 kratos systemd[1675]: Started gamemoded.
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: Adding game: 6457 [/usr/bin/gamemoded]
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: Entering Game Mode...
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: governor was initially set to [powersave]
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: Requesting update of governor policy to performance
Aug 27 12:21:20 kratos pkexec[6458]: pam_unix(polkit-1:session): session opened for user root by (uid=1000)
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Getting Nvidia parameters requires DISPLAY to be set - will likely fail!
Aug 27 12:21:20 kratos gamemoded[1690]: Unable to init server: Could not connect: Connection refused
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: The control display is undefined; please run `/usr/bin/nvidia-settings --help` for usage information.
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: External process failed with exit code 1
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Output was:
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Failed to get [gpu:0]/GPUPerfModes!
Aug 27 12:21:20 kratos gamemoded[1690]: Unable to init server: Could not connect: Connection refused
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: The control display is undefined; please run `/usr/bin/nvidia-settings --help` for usage information.
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: External process failed with exit code 1
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Output was:
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Failed to get [gpu:0]/GPUGraphicsClockOffset[-1]!
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: External process failed with exit code 1
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: Output was:
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: Failed to call gpuclockctl, could not get values!
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: Requesting GPU optimisations on device:1
Aug 27 12:21:20 kratos pkexec[6469]: pam_unix(polkit-1:session): session opened for user root by (uid=1000)
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Setting Nvidia parameters requires DISPLAY and XAUTHORITY to be set - will likely fail!
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Getting Nvidia parameters requires DISPLAY to be set - will likely fail!
Aug 27 12:21:20 kratos gamemoded[1690]: Unable to init server: Could not connect: Connection refused
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: The control display is undefined; please run `/usr/bin/nvidia-settings --help` for usage information.
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: External process failed with exit code 1
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Output was:
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Failed to get [gpu:0]/GPUPerfModes!
Aug 27 12:21:20 kratos gamemoded[1690]: Unable to init server: Could not connect: Connection refused
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: The control display is undefined; please run `/usr/bin/nvidia-settings --help` for usage information.
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: External process failed with exit code 1
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Output was:
Aug 27 12:21:20 kratos gamemoded[1690]: ERROR: Failed to set [gpu:0]/GPUPowerMizerMode=1!
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: External process failed with exit code 255
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: Output was:
Aug 27 12:21:20 kratos /usr/bin/gamemoded[1690]: Failed to call gpuclockctl, could not apply optimisations!

I've resorted to running a start/end script that does the overclocking manually for now, and that works just fine.
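For anyone landing here with the same problem, a start/end script along these lines might look roughly like the sketch below. This is not the script I actually use; the script path, the function names, and the values applied on "end" (offsets back to 0, PowerMizer mode 0) are assumptions for illustration, and the attribute names are the renamed ones discussed earlier in this thread. The `[custom]` hook shown in the comments assumes the `start=` and `end=` keys from the example gamemode.ini.

```shell
#!/bin/sh
# Hypothetical start/end script for manual overclocking.
# Assumed hook in gamemode.ini (the path is illustrative):
#   [custom]
#   start=/path/to/gpu-oc.sh start
#   end=/path/to/gpu-oc.sh end

# Print the nvidia-settings assignments for a phase ("start" or "end").
gpu_oc_assignments() {
    case "$1" in
        start) core=150; mem=1000; mizer=1 ;;  # values from the config in the report
        end)   core=0;   mem=0;    mizer=0 ;;  # assumed "back to default" values
        *)     return 2 ;;
    esac
    printf '[gpu:0]/GPUPowerMizerMode=%s\n' "$mizer"
    printf '[gpu:0]/GPUGraphicsClockOffsetAllPerformanceLevels=%s\n' "$core"
    printf '[gpu:0]/GPUMemoryTransferRateOffsetAllPerformanceLevels=%s\n' "$mem"
}

# Apply the assignments for the given phase with nvidia-settings.
gpu_oc() {
    # The log above shows nvidia-settings failing without these variables.
    if [ -z "${DISPLAY:-}" ] || [ -z "${XAUTHORITY:-}" ]; then
        echo "gpu-oc: DISPLAY and XAUTHORITY must be set" >&2
        return 1
    fi
    assignments="$(gpu_oc_assignments "$1")" || {
        echo "usage: gpu-oc.sh start|end" >&2
        return 2
    }
    printf '%s\n' "$assignments" | while IFS= read -r assignment; do
        nvidia-settings -a "$assignment" < /dev/null
    done
}

# Only act when a phase argument was passed on the command line.
if [ "$#" -gt 0 ]; then
    gpu_oc "$1"
fi
```

Keeping the list of assignments in its own function makes it easy to check what would be applied (for example, `gpu_oc_assignments start`) without touching the GPU.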