Closed londresarthur closed 2 years ago
It's a good question which temperature is the right one... I get temps by reading them from the BIOS; let me check your BIOS (I have a dump) - maybe they add some correction.
Meanwhile, try to run alienfx-mon and enable ESIF sensors - does one of them show the same readings as GPU-Z?
as the temperature slowly increases during use.
It's correct. You have shared cooling system between CPU and GPU, so CPU and MB fry GPU after some time even if not used.
I'm 100% sure that the correct one for the dGPU is the "Temp 2" sensor, I did several tests and the temperature reported on the GPU-Z is exactly the same as that reported by the Temp 2 sensor.
Yes, it's a bug (in fact, two bugs) in your system BIOS.
Temps are shifted down (CPU too). For the CPU, the difference should be 0-3C; for the GPU (JFYI - they use \_SB.PC00.LPCB.ECDV.TVGA._TMP()) the difference can be up to 27C.
Let me think - I can't use ESIF data for fan control directly (it comes from WMI, so it has a heavy impact on both size and performance), but I'll try to figure out how to detect incorrect data.
Meanwhile, I recommend starting fan boost early (at about 20C); this should compensate for it.
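As an illustration of the "start boost early" advice, here is a minimal sketch of a fan curve that begins ramping at a reported 20C, so that a sensor reading up to 27C too low still triggers cooling in time. The curve points, the `CURVE` name and the `boost_for` helper are made up for this example; they are not taken from alienfan-tools.

```python
from bisect import bisect_right

# Hypothetical curve: (reported temperature in C, fan boost in percent).
# The first point sits at 20C because the sensor may read far below the real value.
CURVE = [(20, 0), (40, 30), (60, 70), (75, 100)]

def boost_for(temp_c, curve=CURVE):
    """Linearly interpolate the boost for a reported temperature."""
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        return float(curve[0][1])
    if temp_c >= temps[-1]:
        return float(curve[-1][1])
    i = bisect_right(temps, temp_c)          # index of the first point above temp_c
    (t0, b0), (t1, b1) = curve[i - 1], curve[i]
    return b0 + (b1 - b0) * (temp_c - t0) / (t1 - t0)
```

With these illustrative points, a reported 30C already asks for some boost, even though the real GPU temperature may be much higher.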
I believe this sensor reported as GPU Internal Thermistor simply has nothing to do with the actual GPU temperature. When I do a load test on the GPU the temperature reported by this sensor does not change, it only changes with hours of use. So I believe that the sensor reports some temperature from some part of the chassis, but that has nothing to do with the GPU.
The CPU temperature appears to be correct, it is reporting the same temperature as the Throttlestop, which reads directly from the CPU.
BTW, sensors Temp 5, 6 and 7 are all CPU Temperature too, they report the same temperature as the Throttlestop, and as you can see, the temperature of these sensors in the last screenshot reports very similar temperatures as GPU sensor (Temp 3).
I believe this sensor reported as GPU Internal Thermistor simply has nothing to do with the actual GPU temperature.
Even more interesting. As I see in the reading function, they have a barrier based on another ACPI flag - so it does not report real sensor data in some cases. And, even worse, the BIOS controls the GPU fan based on this data!
In fact, I wonder why ESIF data is not exposed in thermal zones (it's a G5 "feature", all of them are like this, but all Alienware do so); it seems they use different ACPI blocks for it. You can try to locate it in your BIOS, and I'll add support for reading and control.
Update: Uff... It's in \_SB.PC00.LPCB.ECDV. But there is an issue. All temps can be read through the method KDRT with a number as the parameter. But their names... are in different blocks with non-numeric device names.
BTW, you also have CPU/GPU VR temp, GPU mem temp and battery temp.
Ok, let's test it!
Here is a test version of alienfan-cli - alienfan-cli.zip
Can you please:
[...] kdl.dll and hwacc.sys)
alienfan-cli test=X, where X is from 0 to N (at least 6, maybe more).
I'm interested in the output log (to determine how to count the number of sensors). Meanwhile, check the data from the sensors; it should be the same as ESIF.
WARNING! Be careful, an incorrect input value can cause a BSOD, so close all other apps. You won't break anything, though (no data is modified).
Was it supposed to happen like this?
No, this means method call failed. It should be 2 strings... Let me check....
I tested up to n=30, and nothing.
Just in case, there is my ACPI dump:
Thanks. I removed it - it has some sensitive data inside!
it has some sensitive data inside!
what kind of data?
Yes! I found the issue - this BIOS has a different way of returning values!
Please try this CLI - alienfan-cli.zip
Task is the same (in case "Test result" is 1, not 0!).
what kind of data?
Your full system data (tags, manufacturing info) and Windows security keys. Better not to share it publicly.
test=5 gave me the actual GPU temperature
Thank you!
Is it possible to read the temperature of the VRM through this method?
from test=0 to test=15:
Yes, it's correct now! BTW, AWCC (and their BIOS functions) reads sensor 4. But the right one is sensor 5 (even by name).
Is it possible to read the temperature of the VRM through this method?
You can check what all of this means:
Can you please check MORE sensors? You have 6, but I'm interested in what happens if you ask for the 10th or so.
Oh, i see! 255 (-1). Niiiiice!
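The "255 (-1)" remark suggests a way to count sensors: probe increasing indices until the reading comes back as 255, which is -1 when the byte is interpreted as signed. Below is a minimal sketch of that idea. `read_sensor` stands in for whatever actually performs the query, and the fake readings are invented for the example; neither is the real alienfan-cli code.

```python
NO_SENSOR = 0xFF  # 255 as an unsigned byte, -1 as a signed byte

def to_signed_byte(value):
    """Interpret the low 8 bits of value as a signed byte."""
    value &= 0xFF
    return value - 256 if value >= 128 else value

def count_sensors(read_sensor, max_index=32):
    """Probe indices 0..max_index-1 and stop at the first 'no sensor' reply."""
    count = 0
    for index in range(max_index):
        if read_sensor(index) == NO_SENSOR:
            break
        count += 1
    return count

# Stand-in for the real query: six made-up readings, then the sentinel.
FAKE_READINGS = [45, 50, 48, 52, 38, 61]

def fake_read(index):
    return FAKE_READINGS[index] if index < len(FAKE_READINGS) else NO_SENSOR
```

The `max_index` cap is a safety limit chosen for this sketch, so a source that never returns the sentinel cannot loop forever.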
Ok, let's test. Here are the AlienFan tools - AlienFan.zip
First, test the CLI - alienfan-cli temp - it should expose 7 sensors (but the names are weird for some; I haven't figured out how to read them correctly yet).
If this works, start the GUI....
IT WORKED!!! Thank you very much!
You are welcome!
Let me do some polish and maybe figure out how to get the names, so wait for the new official release. If all works well in it, I'll close this task.
PS: Looking at your curve, I recommend spinning the fans up earlier - they need some time to spin up, especially to overboost.
- Method(_TMP
Ok, so:
01 - CPU Package Sensor
02 - CPU VR Sensor
03 - dGPU VR Sensor
04 - dGPU VRAM Sensor
05 - AWCC (?)
06 - dGPU Sensor
I just didn't understand the sensor called "AWCC".
especially to overboost.
Overboost doesn't work on my laptop, never goes above 5000 rpm.
I just didn't understand the sensor called "AWCC".
It's what AWCC used as a GPU sensor ^_^
Overboost doesn't work on my laptop, never goes above 5000 rpm.
Overboost doesn't work in G-Mode. I think I need to disable it before testing; I just forgot to do so.
By the way, here is a new version - it adds ECDV sensors for different BIOS variations. Can you check all still correct for your gear? (Also, it has some fixes for overboost - see #150).
Overboost doesn't work in G-Mode. I think I need to disable it before testing; I just forgot to do so.
Overboost still doesn't work, even with G-mode off, the RPM doesn't go beyond 5000 rpm at all:
Can you check all still correct for your gear?
Yes, everything is still working correctly:
Interesting... For the G-series, overboost is usually quite high - about 150+. But it seems your BIOS is nicely tuned.
By the way, you can experiment - alienfan-cli setover=0,150 for example. But I'm not sure your fans can run above 5000....
Anyway, thank you for testing!
It's what AWCC used as a GPU sensor ^_^
Yes, but I don't understand where this value comes from. It doesn't seem to be something random and doesn't seem to be influenced by the other sensors, very curious.
You need to study the hardware design to answer your question. From the software side, it's just one of the temperature sensors connected to the EC bus. I have no idea where it is really connected...
... But I can guess it may be an ambient sensor (I have one on my gear) or an SSD sensor (I have that too).
Describe the bug
The software is incorrectly reporting the GPU temperature; it is probably reading an ambient temperature sensor, just like AWCC does.
Expected behavior
It would be useful if the software read the actual temperature of the dGPU, as other software (such as GPU-Z and MSI Afterburner) does, so that a temperature curve can be defined for the dGPU too.
Additional context
The same error happens with AWCC. I'm not sure whether this is an ambient temperature sensor, but it appears to be, as the temperature slowly increases during use.