sibradzic / amdgpu-clocks

Simple script to control power states of amdgpu driven GPUs
GNU General Public License v2.0

Systemd service can't set the clocks on Manjaro #33

Closed xcom169 closed 3 years ago

xcom169 commented 3 years ago

ápr 20 18:23:35 x-a320ms2hv2 systemd[1]: Starting Set custom amdgpu clocks & voltages...
ápr 20 18:23:35 x-a320ms2hv2 amdgpu-clocks[769]: ls: cannot access '/sys/class/drm/card0/device/hwmon': No such file or directo>
ápr 20 18:23:35 x-a320ms2hv2 amdgpu-clocks[758]: WARNING: /sys/class/drm/card0/device/pp_od_clk_voltage does not exist, skippin>
ápr 20 18:23:35 x-a320ms2hv2 systemd[1]: Finished Set custom amdgpu clocks & voltages.

sibradzic commented 3 years ago

What does ls -alh /sys/class/drm show? Are you starting it via systemd?

xcom169 commented 3 years ago

Yes, via Systemd service. Maybe it's connected with the Wayland session?

total 0
drwxr-xr-x 2 root root 0 2021 ápr 21 .
drwxr-xr-x 69 root root 0 2021 ápr 21 ..
lrwxrwxrwx 1 root root 0 2021 ápr 21 card0 -> ../../devices/pci0000:00/0000:00:03.1/0000:0a:00.0/0000:0b:00.0/0000:0c:00.0/drm/card0
lrwxrwxrwx 1 root root 0 2021 ápr 21 card0-DP-1 -> ../../devices/pci0000:00/0000:00:03.1/0000:0a:00.0/0000:0b:00.0/0000:0c:00.0/drm/card0/card0-DP-1
lrwxrwxrwx 1 root root 0 2021 ápr 21 card0-DP-2 -> ../../devices/pci0000:00/0000:00:03.1/0000:0a:00.0/0000:0b:00.0/0000:0c:00.0/drm/card0/card0-DP-2
lrwxrwxrwx 1 root root 0 2021 ápr 21 card0-DP-3 -> ../../devices/pci0000:00/0000:00:03.1/0000:0a:00.0/0000:0b:00.0/0000:0c:00.0/drm/card0/card0-DP-3
lrwxrwxrwx 1 root root 0 2021 ápr 21 card0-HDMI-A-1 -> ../../devices/pci0000:00/0000:00:03.1/0000:0a:00.0/0000:0b:00.0/0000:0c:00.0/drm/card0/card0-HDMI-A-1
lrwxrwxrwx 1 root root 0 2021 ápr 21 renderD128 -> ../../devices/pci0000:00/0000:00:03.1/0000:0a:00.0/0000:0b:00.0/0000:0c:00.0/drm/renderD128
lrwxrwxrwx 1 root root 0 2021 ápr 21 ttm -> ../../devices/virtual/drm/ttm
-r--r--r-- 1 root root 4,0K 2021 ápr 21 version

sibradzic commented 3 years ago

This may be similar to what's happening in https://github.com/sibradzic/amdgpu-clocks/issues/26: /sys/class/drm/card0 may not be present yet at the time systemd triggers the service start.

Does it run when you start it manually, without systemd?

xcom169 commented 3 years ago

Yes, it works fine when started manually. Running systemctl restart amd..clock manually after boot also works.

sibradzic commented 3 years ago

So it must be the systemd timing issue. Can you try those two workarounds I suggested @ https://github.com/sibradzic/amdgpu-clocks/issues/26#issuecomment-783582949 ?
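For illustration only, and not necessarily one of the workarounds linked above: a minimal shell sketch of one way to tolerate this kind of race, by polling for the sysfs node before touching it. The function name wait_for_path and the 30-second default are assumptions made for this example; they are not part of amdgpu-clocks.

```shell
#!/bin/sh
# wait_for_path PATH [TIMEOUT_SECONDS]
# Hypothetical helper: polls once per second until PATH exists.
# Returns 0 if PATH shows up, 1 if the timeout expires first.
wait_for_path() {
    path=$1
    timeout=${2:-30}
    elapsed=0
    while [ ! -e "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example: give card0 up to 30 seconds to expose its overdrive node.
# wait_for_path /sys/class/drm/card0/device/pp_od_clk_voltage 30 ||
#     echo "pp_od_clk_voltage never appeared" >&2
```

Polling avoids guessing a fixed delay: the wait ends as soon as the node exists, and the timeout keeps boot from hanging if the driver never creates it.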

xcom169 commented 3 years ago

I will try. For systemd, do both X and Wayland count as a graphical target? Are both fine?


sibradzic commented 3 years ago

I am no systemd expert, but I think the login screen itself (gdm3 and such) qualifies as graphical target... Try these commands to check it out:

systemd-analyze
systemd-analyze critical-chain

zakk4223 commented 2 years ago

I was having the same issue on Arch; this is what fixed it for me in the unit file:

Wants=modprobe@amdgpu.service

xcom169 commented 2 years ago

Good idea! I added a 20-30 second delay, which also helped a bit.


sibradzic commented 2 years ago

@zakk4223 merged:

--- a/amdgpu-clocks.service
+++ b/amdgpu-clocks.service
@@ -1,6 +1,7 @@
 [Unit]
 Description=Set custom amdgpu clocks & voltages
 After=multi-user.target rc-local.service systemd-user-sessions.service
+Wants=modprobe@amdgpu.service

 [Service]
 Type=oneshot

spagootie commented 2 years ago

This error is happening for me on Arch Linux, maybe reopen? It's exactly the same as described in this issue, but the fix that was merged didn't fix it for me.

FlyingWombat commented 2 years ago

This is also still happening for me: Arch Linux 5.18.14, systemd 251.3, GPU RX 6700 XT. Changing to Requires=modprobe@amdgpu.service doesn't fix it. Including amdgpu in the initramfs doesn't fix it either.

Also, all of the pp_* sysfs files are just gone, even after login. So, restarting amdgpu-clocks.service or manually running the script doesn't work. If I disable the service and reboot, things go back to normal.

Having the service disabled, and running systemctl start amdgpu-clocks.service after login works.

EDIT: For unknown reasons, removing Wants=modprobe@amdgpu.service fixed it for me.

--- a/amdgpu-clocks.service
+++ b/amdgpu-clocks.service
@@ -1,7 +1,6 @@
 [Unit]
 Description=Set custom amdgpu clocks & voltages
 After=multi-user.target rc-local.service systemd-user-sessions.service
-Wants=modprobe@amdgpu.service

 [Service]
 Type=oneshot

EDIT2: (Updating here for completeness) As stated in my later comment, the reason Wants=modprobe@amdgpu.service didn't work for me is simply that it changed the initialization order of my GPUs. With it, my AMD GPU was assigned to "card0" instead of the usual "card1" that I had set my config for.

sibradzic commented 2 years ago

@FlyingWombat What does the systemd-analyze critical-chain say, with and without Wants=modprobe@amdgpu.service?

FlyingWombat commented 1 year ago

They are the same:

# without modprobe@amdgpu.service
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

graphical.target @5.486s
└─multi-user.target @5.486s
  └─getty.target @5.486s
    └─getty@tty1.service @5.486s
      └─systemd-user-sessions.service @5.482s +3ms
        └─network.target @5.480s
          └─NetworkManager.service @5.458s +22ms
            └─dbus.service @5.439s +15ms
              └─basic.target @5.436s
                └─sockets.target @5.436s
                  └─dbus.socket @5.436s
                    └─sysinit.target @5.427s
                      └─systemd-udev-settle.service @207ms +5.220s
                        └─systemd-udev-trigger.service @157ms +49ms
                          └─systemd-udevd-kernel.socket @153ms

# with modprobe@amdgpu.service
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

graphical.target @5.473s
└─multi-user.target @5.473s
  └─getty.target @5.473s
    └─getty@tty1.service @5.472s
      └─systemd-user-sessions.service @5.467s +4ms
        └─network.target @5.464s
          └─NetworkManager.service @5.441s +22ms
            └─dbus.service @5.423s +14ms
              └─basic.target @5.420s
                └─sockets.target @5.420s
                  └─dbus.socket @5.419s
                    └─sysinit.target @5.419s
                      └─systemd-udev-settle.service @218ms +5.200s
                        └─systemd-udev-trigger.service @165ms +52ms
                          └─systemd-udevd-kernel.socket @160ms

FlyingWombat commented 1 year ago

Update: I found out why Wants=modprobe@amdgpu.service doesn't work for me. It is so simple I can't believe I didn't notice earlier. Normally my AMD GPU is initialized after the Intel iGPU and is assigned to /sys/class/drm/card1. With Wants=modprobe@amdgpu.service the order changes, and my AMD GPU is assigned to card0.
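Since the card index can shift like this, it may help to check which cardN the AMD GPU actually received before naming a config file after it. A minimal sketch, assuming the standard sysfs layout where /sys/class/drm/cardN/device/vendor holds the PCI vendor ID (0x1002 for AMD/ATI); the function name find_amd_cards is hypothetical and not part of amdgpu-clocks.

```shell
#!/bin/sh
# find_amd_cards [DRM_ROOT]
# Hypothetical helper: prints the name (card0, card1, ...) of every DRM
# card whose PCI vendor ID is 0x1002 (AMD/ATI). DRM_ROOT defaults to
# /sys/class/drm; it is a parameter only so the function can be tried
# against a fake directory tree.
find_amd_cards() {
    root=${1:-/sys/class/drm}
    for dir in "$root"/card[0-9]*; do
        # Skip connector entries such as card0-DP-1.
        case ${dir##*/} in
            *-*) continue ;;
        esac
        vendor_file=$dir/device/vendor
        [ -r "$vendor_file" ] || continue
        if [ "$(cat "$vendor_file")" = "0x1002" ]; then
            echo "${dir##*/}"
        fi
    done
}

# Example: list AMD cards on the running system.
# find_amd_cards
```

On a machine with an Intel iGPU plus one AMD card, this prints whichever of card0 or card1 the AMD GPU ended up as on that boot.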

sibradzic commented 1 year ago

Interesting observation. So, when your 6700XT gets assigned as card0, does amdgpu-clocks work for you as expected, provided you rename your custom state file so it ends with .card0?

For the sake of science, you could try commenting out the line starting with After in the service file, followed by a daemon-reload and restart, and see if the /sys/class/drm card ordering is as expected.
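A rough sketch of that experiment in shell. The helper name comment_out_after and the unit file path in the commented usage are assumptions; adjust the path to wherever the unit file is actually installed.

```shell
#!/bin/sh
# comment_out_after UNIT_FILE
# Hypothetical helper: prefixes every line that starts with "After="
# with "#", keeping a backup copy next to the original as UNIT_FILE.bak.
comment_out_after() {
    unit_file=$1
    cp "$unit_file" "$unit_file.bak" &&
        sed 's/^After=/#After=/' "$unit_file.bak" > "$unit_file"
}

# Assumed install location, run as root:
# comment_out_after /etc/systemd/system/amdgpu-clocks.service
# systemctl daemon-reload
# systemctl restart amdgpu-clocks.service   # or reboot, to re-run GPU init
# ls -l /sys/class/drm
```

Restoring the .bak copy and running daemon-reload again undoes the change.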