erpalma / throttled

Workaround for Intel throttling issues in Linux.
MIT License

Fedora 34 failure: Unable to write to msr #250

Open timrichardson opened 3 years ago

timrichardson commented 3 years ago

Fedora 34 and Fedora 33 both have kernel 5.11. throttled works well on F33. On F34, there is a service failure.

I have the COPR version installed from the repository mentioned in the throttled readme. I suppose this is a kernel configuration difference between F33 and F34.


[tim@fedora proc]$ cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.11.12-300.fc34.x86_64 root=UUID=0d527167-cbd7-4cef-80b3-6629bc800868 ro rhgb quiet
[tim@fedora proc]$ 

× throttled.service - Stop Intel throttling
     Loaded: loaded (/usr/lib/systemd/system/throttled.service; enabled; vendor preset: disabled)
     Active: failed (Result: exit-code) since Sun 2021-04-11 12:10:45 AEST; 28min ago
   Main PID: 752 (code=exited, status=1/FAILURE)
        CPU: 410ms

Apr 11 12:10:37 fedora systemd[1]: Started Stop Intel throttling.
Apr 11 12:10:44 fedora throttled[752]: [I] Detected CPU architecture: Intel Kaby Lake (R)
Apr 11 12:10:44 fedora throttled[752]: [I] Loading config file.
Apr 11 12:10:44 fedora throttled[752]: [E] Unable to write to MSR. Try to disable Secure Boot and check if your kernel does not restrict access to MSR.
Apr 11 12:10:45 fedora systemd[1]: throttled.service: Main process exited, code=exited, status=1/FAILURE
Apr 11 12:10:45 fedora systemd[1]: throttled.service: Failed with result 'exit-code'.

timrichardson commented 3 years ago

Note: I get the same result when installing from git, on master, with the default config file (HWP mode is off)

(although the service is now called lenovo_fix.service)

timrichardson commented 3 years ago

although when run manually (as in python lenovo_fix.py) it works fine.

timrichardson commented 3 years ago

If I replace the fatal('Unable to write to MSR. Try to disable Secure Boot'...) call with a warning() in writemsr(), the service runs, and the CPUs reach 85 C under load (I am currently using the default config file).

neil1969 commented 3 years ago

I tried your fix and realized that while the service was up and running, it was not actually doing anything, since every MSR write was still being denied. I decided to look at similar issues with similar packages like intel-undervolt and found a suggestion that it was SELinux related. Of course, this being Fedora and SELinux being SELinux, it should always be one of the first things to look at when something does not work.

grep throttled /var/log/audit/audit.log | audit2allow -M throttled-policy
semodule -i throttled-policy.pp

Everything now works as it did previously. No modifications to the lenovo_fix.py needed.
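
For anyone who wants the two steps above spelled out, here is a rough sketch, not something taken from throttled itself. The function name build_policy and its arguments are made up for illustration; the only real tools involved are grep, audit2allow -M and semodule -i. It assumes the audit log lives at /var/log/audit/audit.log, as in the commands above, and that you run it as root.

```shell
# Sketch only: build_policy and its arguments are illustrative names,
# not part of throttled. Run as root on the affected machine.
build_policy() {
    pattern="$1"                            # e.g. throttled or lenovo_fix
    module="$2"                             # e.g. throttled-policy
    log="${3:-/var/log/audit/audit.log}"    # audit log to search

    # Collect the matching audit records; stop early if there are none.
    denials="$(grep -- "$pattern" "$log")" || {
        echo "no audit entries match '$pattern' in $log" >&2
        return 1
    }

    # Step 1: audit2allow -M writes $module.te (readable rules) and
    # $module.pp (compiled module) into the current directory.
    printf '%s\n' "$denials" | audit2allow -M "$module" || return 1

    # Step 2 is left to the user so the rules can be reviewed first.
    echo "Review $module.te, then load it with: semodule -i $module.pp"
}
```

Calling build_policy throttled throttled-policy mirrors the first command above. Stopping before semodule -i is a deliberate choice here: it gives you a chance to read the generated .te file and see exactly what is being allowed before loading it.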

timrichardson commented 3 years ago

For me, the throttling is definitely improved after my one-line fix, but not as much as it should be.

that is, do stress-ng -c 8 in one terminal

The config file now has 95C as the AC trigger point. After reboot, the systemd script runs, but throttling happens at 67 C, so it's not working after running the script from /opt in the venv

I see my temps get to 82 C, so it is better, but not working properly.
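
In case it helps anyone reproduce these readings, this is a minimal sketch of how the temperatures can be watched from a second terminal while stress-ng runs. It assumes the kernel exposes thermal zones under /sys/class/thermal, with temp given in millidegrees Celsius; print_temps is a made-up helper name, and which zone corresponds to the CPU package differs between machines (check the type shown in the output).

```shell
# Sketch only: print_temps is an illustrative helper, not part of throttled.
# Prints every readable thermal zone in whole degrees Celsius.
print_temps() {
    root="${1:-/sys/class/thermal}"    # optional override, handy for testing
    for zone in "$root"/thermal_zone*; do
        # Skip zones without a readable temp file (or an unmatched glob).
        [ -r "$zone/temp" ] || continue
        kind="unknown"
        [ -r "$zone/type" ] && kind="$(cat "$zone/type")"
        milli="$(cat "$zone/temp")"    # sysfs reports millidegrees Celsius
        echo "$(basename "$zone") ($kind): $(( ${milli} / 1000 )) C"
    done
}
```

With stress-ng -c 8 running in one terminal, a loop such as while sleep 1; do print_temps; done in another shows where the temperature levels off.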

I am using the git version.

# grep throttled /var/log/audit/audit.log | audit2allow -M throttled-policy

gives Nothing to do

and it does not create a .pp file

Are you using the COPR version?

neil1969 commented 3 years ago

I tested the modified lenovo_fix.py with both the copr package and by installing from git. In both instances my temperatures remained limited, and my freqs would quickly (in seconds) drop to 2.3GHz per core.

Now, after adding the SELinux policy, it reaches 95C and hits freqs of up to 3.9GHz under stress-ng. I ran this for 5 minutes with just the usual fluctuation between 3.5 and 3.9GHz.

The command likely failed for you with the git version because the service runs as lenovo_fix, not throttled, so

grep lenovo_fix /var/log/audit/audit.log | audit2allow -M lenovo_fix-policy

should provide the recommended action and allow you to create the .pp file.

lakotamm commented 3 years ago

> grep throttled /var/log/audit/audit.log | audit2allow -M throttled-policy
> semodule -i throttled-policy.pp

I can confirm that this fixes the issue for me. Thank you!

timrichardson commented 3 years ago

Thanks. I did my real upgrade to F34 today as opposed to running from a usb stick, and after the copr install your semodule fix worked. I will leave this open until the copr maintainer gets around to fixing the install.

yatian-liu commented 3 years ago

> For me, the throttling is definitely improved after my one-line fix, but not as much as it should be.
>
> that is, do stress-ng -c 8 in one terminal
>
> The config file now has 95C as the AC trigger point. After reboot, the systemd script runs, but throttling happens at 67 C, so it's not working after running the script from /opt in the venv
>
> I see my temps get to 82 C, so it is better, but not working properly.
>
> I am using the git version.
>
> # grep throttled /var/log/audit/audit.log | audit2allow -M throttled-policy
>
> gives Nothing to do
>
> and it does not create a .pp file
>
> Are you using the COPR version?

I think this is because the git version has a different service name than the copr version. For the git version the service is called lenovo_fix.service while for the copr version the service is called throttled.service.

Update: on my system grep throttled ... somehow still doesn't work, but grep "MSR" ... does. It attributes the MSR access to python3 rather than to the throttled service. As long as the grep matches the denied MSR access, it should work.
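
To illustrate the point about matching on the MSR access rather than the service name, here is a hedged sketch that lists which process names show up in MSR-related denials. It assumes denial records contain "avc:" followed by "denied" and a comm="..." field, which is the usual audit log layout; msr_denial_comms is a made-up helper name, and the case-insensitive match is my own choice, not something from this thread.

```shell
# Sketch only: msr_denial_comms is an illustrative helper name.
# Lists the distinct process names (comm=) seen in AVC denials
# that mention msr, ignoring case.
msr_denial_comms() {
    log="${1:-/var/log/audit/audit.log}"
    grep 'avc:.*denied' "$log" \
        | grep -i 'msr' \
        | sed -n 's/.*comm="\([^"]*\)".*/\1/p' \
        | sort -u
}

# The matching lines can then be fed to audit2allow in the same way as
# the earlier commands, for example (module name is hypothetical):
#   grep -i msr /var/log/audit/audit.log | audit2allow -M throttled-msr
```

If this prints python3 rather than throttled or lenovo_fix, that would explain why grepping for the service name finds nothing to do.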

lakotamm commented 3 years ago

Not sure how much this helps, but I must not restart my laptop if I want it to run without throttling. That means I have to shut it down manually and power it on again instead of restarting.

timrichardson commented 3 years ago

> For me, the throttling is definitely improved after my one-line fix, but not as much as it should be.
>
> that is, do stress-ng -c 8 in one terminal
>
> The config file now has 95C as the AC trigger point. After reboot, the systemd script runs, but throttling happens at 67 C, so it's not working after running the script from /opt in the venv
>
> I see my temps get to 82 C, so it is better, but not working properly.
>
> I am using the git version.
>
> # grep throttled /var/log/audit/audit.log | audit2allow -M throttled-policy
>
> gives Nothing to do
>
> and it does not create a .pp file
>
> Are you using the COPR version?

I had the same experience as you. There is some improvement when using the git version, with the throttling threshold increased a bit. When I used the copr version, the suggested fix to the SELinux policy worked, and throttling is back to the setting in the config file (95°C is the default on AC, I think). I don't know why the SELinux policy changes don't work with the git version, even after allowing for the different name of the service, and I didn't spend much time trying to work it out.

SuitedBadge401 commented 3 years ago

> > For me, the throttling is definitely improved after my one-line fix, but not as much as it should be.
> >
> > that is, do stress-ng -c 8 in one terminal
> >
> > The config file now has 95C as the AC trigger point. After reboot, the systemd script runs, but throttling happens at 67 C, so it's not working after running the script from /opt in the venv
> >
> > I see my temps get to 82 C, so it is better, but not working properly.
> >
> > I am using the git version.
> >
> > # grep throttled /var/log/audit/audit.log | audit2allow -M throttled-policy
> >
> > gives Nothing to do
> >
> > and it does not create a .pp file
> >
> > Are you using the COPR version?
>
> I think this is because the git version has a different service name than the copr version. For the git version the service is called lenovo_fix.service while for the copr version the service is called throttled.service.
>
> Update: for my system somehow grep throttled ... still doesn't work but grep "MSR" ... works. It identifies the access to MSR to be from python3 instead of the throttled service. As long as you grepped the denied access to MSR it should work.

What commands did you run exactly? I tried MSR.service to no avail.