Closed Eraden closed 12 months ago
Thanks for implementing this. When testing a locally built amdfand from this PR, the UsagePoint matrix currently doesn't seem to blast the fans during gaming with the following mapping.toml.
Whichever matrix has the highest speed defined should take precedence, right?
Yes. This is still in progress, but it should work now. When did you clone and build? Only the newest version works correctly.
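For readers following along, the precedence rule confirmed above can be sketched in a few lines. This is only an illustration: `combined_fan_speed` is a hypothetical helper, not a function from amdfand, and it assumes both curves already yield a speed in percent.

```rust
/// Hypothetical helper (not part of amdfand): given the speed requested by
/// the temperature curve and the speed requested by the usage curve, let the
/// more demanding one win.
fn combined_fan_speed(temp_speed: f64, usage_speed: f64) -> f64 {
    // `f64::max` picks the larger request; `clamp` keeps the result a valid percentage.
    temp_speed.max(usage_speed).clamp(0.0, 100.0)
}
```

With this rule a cool but busy GPU still gets the airflow its usage curve asks for, and a hot but idle GPU still gets the airflow its temperature curve asks for.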
I built it yesterday 23:21:39 CEST (Central European Summer Time). Testing with a rebuilt version now yields the same result. If the matrix calculations are working correctly, then I hypothesize the usage calculation needs to be calculated differently.
Some game engines tend to batch-queue draw calls, so the GPU sits at 99% load for a few milliseconds and at 0-10% the rest of the time.
A way to combat this would be to calculate an accurate busy percentage by comparing measurements, similar to how nvtop implements it here and here.
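One way to read "comparing measurements" is to derive the busy percentage from the difference between two readings of a cumulative counter instead of from a single instantaneous value. The sketch below is not taken from nvtop or amdfand; `BusySample` and `busy_percent` are hypothetical names, and it assumes the driver exposes a monotonically increasing busy-time counter.

```rust
/// Hypothetical reading of a cumulative GPU busy-time counter.
#[derive(Clone, Copy, Debug)]
struct BusySample {
    /// Total time the GPU has spent busy so far, in nanoseconds.
    busy_ns: u64,
    /// Monotonic timestamp at which the counter was read, in nanoseconds.
    wall_ns: u64,
}

/// Average busy percentage over the interval between two samples.
/// Returns `None` when no time has passed or the timestamps go backwards.
fn busy_percent(prev: BusySample, curr: BusySample) -> Option<f64> {
    let wall = curr.wall_ns.checked_sub(prev.wall_ns)?;
    if wall == 0 {
        return None;
    }
    // A counter reset would make the difference negative; treat it as zero.
    let busy = curr.busy_ns.saturating_sub(prev.busy_ns);
    Some((busy as f64 / wall as f64 * 100.0).min(100.0))
}
```

Because the result is an average over the whole interval, a short 99% burst followed by idle time shows up as a proportionally smaller but non-zero load rather than as whichever value happened to be visible at the instant of polling.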
Thanks for the resources. I will also try to solve this ASAP.
Hi, the current code should work properly. I tested it with glmark2, and amdfand did use GPU usage to manage fan speed.
I need to fix the GUI, so this will be on hold until that fix, but you can compile from this branch and use it safely.
cargo build --release --bin amdfand
It does detect GPU usage now and ramps up accordingly! However, it now seems to sporadically set the speed to 100.
I presume this line here is the root cause.
Instead of fixed constants, the min and max should be variable, similar to the min_speed and max_speed functions (refactored to use the usage matrix).
The speed still sporadically goes to 100, while the defined max is set to 33.
With light desktop usage around the 5-10% GPU load, it seems to think the max is reached somehow.
I added some additional logs. Can you share your logs with me?
Please git fetch && git reset origin/feature-gpu-usage and compile.
Then run with at least the Debug log level and share the output once at least a couple of full-speed occurrences have happened.
This will allow me to understand why this is happening.
Also consider adding 2 additional points to the usage matrix to tune the service a little.
Appreciate the detailed instructions, but you don't have to bother. I have created an AUR package called amdfand and am building from a locally modified version that pulls the feature-gpu-usage branch. 🙂
With 2 additional points in the usage matrix:
Without any usage matrix defined (commented out), the speed sporadically goes up higher, too. This time to 94 instead of 100.
Longer debug log. Tested with both matrices enabled again, with an additional point added to the usage matrix so that both matrices have six points. Same thing happens.
I'm currently stuck with my day-to-day job. I'll try to improve this implementation ASAP, but I can't promise anything right now.
As git diff:
diff --git a/crates/amdgpu-config/src/fan.rs b/crates/amdgpu-config/src/fan.rs
index 41573c9..ed7c22d 100644
--- a/crates/amdgpu-config/src/fan.rs
+++ b/crates/amdgpu-config/src/fan.rs
@@ -108,11 +108,11 @@ impl Config {
     pub fn fan_speed_for_temp(&self, temp: f64) -> f64 {
         let idx = match self.temp_matrix.iter().rposition(|p| p.temp <= temp) {
             Some(idx) => idx,
-            _ => return self.min_speed(),
+            _ => return self.min_speed_for_temp(),
         };

         if idx == self.temp_matrix.len() - 1 {
-            return self.max_speed();
+            return self.max_speed_for_temp();
         }

         linear_map(
@@ -125,12 +125,13 @@ impl Config {
     }

     pub fn fan_speed_for_usage(&self, usage: f64) -> f64 {
-        let Some(idx) = self.usage_matrix.iter().rposition(|p| p.usage <= usage) else {
-            return 0.0;
+        let idx = match self.usage_matrix.iter().rposition(|p| p.usage <= usage) {
+            Some(idx) => idx,
+            _ => return self.min_speed_for_usage(),
         };

         if idx == self.usage_matrix.len() - 1 {
-            return 100.0;
+            return self.max_speed_for_usage();
         }

         linear_map(
@@ -158,14 +159,22 @@ impl Config {
         self.update_rate
     }

-    fn min_speed(&self) -> f64 {
+    fn min_speed_for_temp(&self) -> f64 {
         self.temp_matrix.first().map(|p| p.speed).unwrap_or(0f64)
     }

-    fn max_speed(&self) -> f64 {
+    fn max_speed_for_temp(&self) -> f64 {
         self.temp_matrix.last().map(|p| p.speed).unwrap_or(100f64)
     }

+    fn min_speed_for_usage(&self) -> f64 {
+        self.usage_matrix.first().map(|p| p.speed).unwrap_or(0f64)
+    }
+
+    fn max_speed_for_usage(&self) -> f64 {
+        self.usage_matrix.last().map(|p| p.speed).unwrap_or(100f64)
+    }
+
     fn default_refresh_delay() -> u64 {
         4000
     }
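To make the effect of the patch easier to check outside the crate, here is a standalone sketch of the same lookup. It is an approximation under stated assumptions: `UsagePoint` mirrors only the `usage` and `speed` fields visible in the diff, the free function takes the matrix as a slice instead of reading it from `Config`, and the `linear_map` signature is hypothetical because its real arguments are cut off in the hunks above.

```rust
/// Mirrors the two fields the diff reads from each usage matrix entry.
#[derive(Clone, Copy, Debug)]
struct UsagePoint {
    usage: f64,
    speed: f64,
}

/// Hypothetical signature: maps `x` from the range [x0, x1] onto [y0, y1].
fn linear_map(x: f64, x0: f64, x1: f64, y0: f64, y1: f64) -> f64 {
    if (x1 - x0).abs() < f64::EPSILON {
        return y0; // degenerate segment, avoid dividing by zero
    }
    y0 + (x - x0) * (y1 - y0) / (x1 - x0)
}

/// Fan speed for a GPU usage value, bounded by the curve's own endpoints
/// rather than by the fixed 0.0 and 100.0 constants.
fn fan_speed_for_usage(matrix: &[UsagePoint], usage: f64) -> f64 {
    // Index of the last point whose usage threshold is not above the reading.
    let idx = match matrix.iter().rposition(|p| p.usage <= usage) {
        Some(idx) => idx,
        // Below the first point (or empty matrix): use the curve's minimum.
        None => return matrix.first().map(|p| p.speed).unwrap_or(0.0),
    };
    // At or beyond the last point: use the curve's maximum.
    if idx == matrix.len() - 1 {
        return matrix[idx].speed;
    }
    let (lo, hi) = (matrix[idx], matrix[idx + 1]);
    linear_map(usage, lo.usage, hi.usage, lo.speed, hi.speed)
}
```

With a curve whose last point has speed 33, any reading at or above that point's usage now yields 33 instead of jumping to 100, which matches the symptom reported earlier in this thread.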
Debug log and mapping with the local changes suggested above. It now works as expected, ramping up as defined within the mapping:
During gaming, the fans still ramp up and down between the low and high ends of the curve a little too much. The game's GPU usage varies from frame to frame: one frame it could be 0, the next it could be 80.
To even out the RPM adjustments, amdfand could implement another polling thread only for the GPU usage calculation. It would poll more frequently (e.g. every 100ms), independently of the defined update_rate.
Then, during the regular tick cycle, the highest usage encountered by that independent polling thread would be picked.
That would lower the potential of a single frame 0% GPU usage affecting the card's RPM.
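A rough sketch of that peak-hold idea, assuming usage is reported as a whole-number percentage. `PeakUsage`, `spawn_sampler` and the `read_usage` callback are hypothetical names for illustration; none of them exist in amdfand.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical cell holding the highest usage seen since the last tick.
struct PeakUsage(AtomicU32);

impl PeakUsage {
    fn new() -> Self {
        Self(AtomicU32::new(0))
    }

    /// Called by the fast sampler: keep the larger of the stored and new value.
    fn record(&self, usage_percent: u32) {
        self.0.fetch_max(usage_percent.min(100), Ordering::Relaxed);
    }

    /// Called once per regular tick: return the peak and reset it to zero.
    fn take(&self) -> u32 {
        self.0.swap(0, Ordering::Relaxed)
    }
}

/// Spawns the independent polling thread. `read_usage` stands in for however
/// the GPU usage is actually read; returning `None` stops the thread.
fn spawn_sampler<F>(
    peak: Arc<PeakUsage>,
    interval: Duration,
    mut read_usage: F,
) -> thread::JoinHandle<()>
where
    F: FnMut() -> Option<u32> + Send + 'static,
{
    thread::spawn(move || {
        while let Some(usage) = read_usage() {
            peak.record(usage);
            thread::sleep(interval);
        }
    })
}
```

The regular loop would call `take()` once per update_rate tick and feed the result into the usage curve, so a single 0% sample between two busy frames no longer drags the fan speed down.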
Resolves #59