T-Troll / alienfx-tools

Alienware systems lights, fans, and power control tools and apps
MIT License

Max Fan Boost Power Level N vs Manual Mode 100% #315

Closed TheSQLGuru closed 1 year ago

TheSQLGuru commented 1 year ago

Describe the bug

When I set a manual mode fan level to 100% for a given temp and hit that temp while stress testing, the fan goes to speed X (X varies per fan). If I set a Power Level of 4+, every fan will spin at a much greater rate than the X achieved using 100% on Manual Mode.

It seems that Manual Mode does not scale the fan speed to the maximum achieved by the higher Power Level settings. I believe Manual Mode should provide the same scaling to the same maximum speed for each fan that is achieved by the highest Power Levels.

To Reproduce

1) Set Manual Fan curves of 100% for all fans.
2) Make CPU or GPU or both go above the temp required for 100% fan speed.
3) Note the speed for each fan.
4) Choose Power Level 4, 5 or 6. Repeat steps 2 and 3 above.

Expected behavior

Fan speeds between the different runs above should always max out to the same rpm for a given fan.
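The comparison described in the reproduction steps can be sketched in a few lines of Python. This is only an illustration: the function name `compare_peak_rpm`, the 5% tolerance, and the sample numbers are all hypothetical placeholders, not readings from any real system or anything provided by alienfx-tools.

```python
def compare_peak_rpm(manual_run, power_level_run, tolerance=0.05):
    """Return fans whose manual-mode peak falls short of the power-level peak.

    Each run maps a fan name to the RPM samples noted during the stress test.
    A fan is reported when its manual peak is more than `tolerance` below
    the peak seen under the Power Level setting.
    """
    shortfalls = {}
    for fan, manual_samples in manual_run.items():
        manual_peak = max(manual_samples)
        level_peak = max(power_level_run[fan])
        if manual_peak < level_peak * (1 - tolerance):
            shortfalls[fan] = (manual_peak, level_peak)
    return shortfalls


# Placeholder numbers purely for illustration, not measurements.
manual = {"CPU fan": [2000, 3000, 3100], "GPU fan": [2000, 2900, 3000]}
level4 = {"CPU fan": [2500, 4000, 4100], "GPU fan": [2500, 3000, 3050]}

for fan, (manual_peak, level_peak) in compare_peak_rpm(manual, level4).items():
    print(f"{fan}: manual peak {manual_peak} rpm vs power level peak {level_peak} rpm")
```

An empty result would match the expected behavior stated above; any fan listed is one where Manual Mode at 100% stops short of the Power Level maximum.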

Screenshots

None needed, but I can provide them if requested.

System (please complete the following information):

X17 R1, 11980HK, 3080
Win 10 x64 22H2
nVidia 528.49
AFC - current version (and almost certainly many/all prior versions)

T-Troll commented 1 year ago

No, this is correct behavior.

Manual mode ALWAYS has a lower RPM cap than Performance/Full speed. On SOME systems, the difference can be leveled out using overboost (that's what it's for).

PS: Some systems also show interesting BIOS behavior: fan RPMs can't stay at the top for more than 1 minute and drop afterwards. I think it's related to PWM temperature or overload.

TheSQLGuru commented 1 year ago

I did run the Check Max Boost for my system. And I do notice fans start out at a higher rpm and then rather quickly drop back.

Is there any way to get the Manual Mode to use the same rpm range as the Performance modes do?

T-Troll commented 1 year ago

Let's say it's unknown to me right now.

But this topic is quite interesting in general. And the question is not about different profiles and caps (that's reasonable for different power limits), but why is boost control disabled for every mode except manual? I see no reasonable cause for this.

TheSQLGuru commented 1 year ago

That is definitely curious. It sucks having to have the fans wide open full time to get the maximum cooling. While there are obviously workloads that require that, playing most games usually doesn't.

Oh well. I do recommend putting a note somewhere in the Wiki about this to inform others who may notice the same behavior.

You can close this one out whenever you wish.

T-Troll commented 1 year ago

UPD: I checked your BIOS. Interesting... Setting boost is blocked in the AWCC methods, but it's EC-controlled. The EC block is in RAM at 0xFE0B0800, size 0x1000. There are 4 boost variables there; I know the names but am too lazy to calculate the offsets. You can easily find them yourself using the CLI. Then you can try switching power mode, setting something there, and checking the result.

Also, there's a very interesting variable there, named... MAXQ. Can it be disabled? O_O

So, in conclusion: let's dig into it deeper. We may find something useful. For now (and I think I mention it in the docs), max RPM is, in fact, a function of the total power limit. Reasonable.
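One way to approach the "switch power mode and check the result" suggestion is to dump the EC block twice and diff the two dumps. The Python sketch below assumes the block has been saved as a raw binary file of exactly 0x1000 bytes (for example with a memory tool such as RWEverything); that file format, the helper names `load_dump` and `diff_ec_dumps`, and the synthetic data in the demo are assumptions for illustration. Only the base address and size come from the thread.

```python
EC_BASE = 0xFE0B0800  # EC block address quoted in the thread
EC_SIZE = 0x1000      # EC block size quoted in the thread


def load_dump(path, size=EC_SIZE):
    """Read a raw binary dump and check it has the expected size (assumed format)."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) != size:
        raise ValueError(f"expected {size} bytes, got {len(data)}")
    return data


def diff_ec_dumps(before, after, base=EC_BASE):
    """Return (offset, absolute address, old byte, new byte) for every changed byte."""
    if len(before) != len(after):
        raise ValueError("dumps must be the same length")
    return [
        (offset, base + offset, old, new)
        for offset, (old, new) in enumerate(zip(before, after))
        if old != new
    ]


# Synthetic demo data: a zeroed block with one byte changed at a made-up offset.
# With real dumps, use load_dump() on two files saved in different power modes.
before = bytes(EC_SIZE)
changed = bytearray(before)
changed[0x10] = 0x64
after = bytes(changed)

for offset, address, old, new in diff_ec_dumps(before, after):
    print(f"offset 0x{offset:03X} (0x{address:08X}): 0x{old:02X} -> 0x{new:02X}")
```

Bytes that change only when the power mode changes are candidates for the boost variables; sensor values in the same block will also change between dumps, so several dumps per mode help to filter out the noise. This only reads and compares saved files; it does not write to the EC.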

TheSQLGuru commented 1 year ago

It is late here, and my brain is foggy. I will try to understand and dig into the technical aspects of what you wrote tomorrow.

I note that MAX-Q used to be an nVidia whole-laptop-power "feature". It worked both the GPU and CPU to "optimize" power/performance/battery life.

I thought it was deprecated for some years now. But it is very much active, just not marketed any longer:

https://www.nvidia.com/en-us/geforce/gaming-laptops/max-q-technologies/

That page states that nVidia's 40xx series laptop GPUs have "Fifth-Gen Max-Q Technologies".

I wonder if this is what you are seeing in that variable.

What am I checking my BIOS for? The Overclocking and Fan Settings? OC is off, Fans are set at normal.

I note that I do still have AWCC installed, although it is completely shutdown. I still have work to do to create the matrix of data you need to build the lighting module for X17 R1 laptops.

I DEFINITELY question the assertion that max rpm is a function of total power limit. When I was doing my testing, I had Bitsum's Highest Performance power mode enabled for all testing, as well as Throttle Stop set to help CPU max out perf and all-core clock speed while just preventing thermal throttling in Cinebench23. How would your Power Levels make the total power limit any higher than it would be for that configuration? Oh, and is there a way I can see the currently set "actual" power limit?

TheSQLGuru commented 1 year ago

I finally got a chance to circle back to this. And unfortunately my knowledge of HOW to find/check/set the things you mention is very limited. I am familiar with debuggers and some utilities that allow accessing firmwares and memory, but only from a high level. Could you point me to some primers/utilities I could study to come up to speed?

Also, if I flesh out my VS 2022 Enterprise Edition with all of the C/C++ goodies it has, can I debug your code to view/access the values you mention?

T-Troll commented 1 year ago

> Could you point me to some primers/utilities I could study to come up to speed?

RWEverything is a nice tool for digging into the system/BIOS. But be careful - one mistake, and you get a BSOD (if you're lucky), or need a power cycle or even a BIOS reset!

> Also, If I flesh out my VS 2022 Enterprise Edition

Too bad. Only 2019 can be used. The reason is simple, 3 letters - WDK. You can review my code to understand exactly what I set and how.

TheSQLGuru commented 1 year ago

I do have RWEverything, and have played around with it a bit. I even think I gave you some data from my system from it. But I really don't know enough about the app's interface and capabilities nor the data I am viewing (yet).

My old laptop has VS 2019. But assuming you mean Windows Driver Kit with WDK, surely it has been released officially since this post a year ago:

https://learn.microsoft.com/en-us/windows-hardware/drivers/installing-preview-versions-wdk

Well f*ck. Looks like I may not be able to use the new laptop after all. Per this page, the latest WDK build requires VS 2022 AND Windows 11 22H2. I am on Windows 10 21H2. I wonder if I can install the Win 11 WDK on Win 10. :-D

T-Troll commented 1 year ago

Yep. You are right about WDK - it's such a mess... That's why I'm stuck with 2019. Good news: you CAN build Fan SDK v2 in 2022, but not v1.

The difference is that v1 uses direct ACPI calls (BTW... ALL of ACPI is one big security hole, and you can't avoid it. It's even OS-independent, so don't tell me something like "Linux is secure" anymore - no, it isn't), while v2 uses a hole Dell left in WMI (at the cost of some overhead). The methods are the same.

Why do I ask for a BIOS dump (RWEverything - ACPI->Save All)? I just check it for the common Alienware functions and their interfaces, as well as what they really do. Also sensors and other stuff. I now also have a tool that decrypts Windows PNP data from a BIOS dump - it gives me a hint about which functions your system has and how they map to ACPI method calls.

Good: Proprietary AWCC methods are universal. Any Alienware/G-series has the same set, with small variations. Bad: Some functions are buggy (it seems Dell has 3 different BIOS developer teams - it's visible by style, and one of them is total CRAP), and some can't be accessed this way (but some workarounds are available as well).

I don't mess with EC in v2, but it's possible via v1.

TheSQLGuru commented 1 year ago

HUH, it seems that I can install Win 11 Driver Kit. And if your stuff requires kernel mode debugging a quick review makes me think I might need my old XPS laptop to connect to and debug my X17 R1. Will I need to do that, or can I debug the AlienFX stuff all within my X17? That would obviously be preferable.

That tool you have is probably handy. When I was poking around with RWE I was able to find a lot of code sections that looked pretty hairy to sort out.

What is "EC"?

I'm not surprised you can notice different development teams. I have come across that at many clients over the years too. I hope the X17 stuff is NOT done by the DUMB @SS crew!! LOL

I have a crazy-busy week (or few) coming up, so probably I won't be able to try to get my dev environment configured to build/debug through your code. Well, unless I spend time doing that when I should be doing other stuff! ;-P

T-Troll commented 1 year ago

> And if your stuff requires kernel mode debugging

No, you don't need kernel debugging unless you alter the driver (hwacc.sys). BTW, both hwacc and KDL sources are only available in 6.x.x.x - antivirus goes crazy about KDL, so I dropped it from the current project.

> That tool you have is probably handy.

It's handy for revealing the WMI mapping. A dump + brain is enough to understand how it really works.

> What is "EC"?

Embedded controller. See the BIOS dump, as well as the "EC" button in RWE. It stores hardware data variables.

> I hope the X17 stuff is NOT done by the DUMB

It's interesting... No, it's the 2nd team (they like function calls instead of just changing data). But it seems Dell knows about me - they are trying some obfuscation in method calls, so it's harder to understand what they really change.

> I have a crazy-busy week

Well... Me too, I'm even in the middle of relocating to a different country (or maybe not, I like my current one). Good luck and feel free to ask questions!