nicehash / excavator

NiceHash's proprietary low-level CUDA miner
https://www.nicehash.com

Fanspeed minimums not being utilized if under fanspeed start #302

Closed DroptheHammer closed 3 years ago

DroptheHammer commented 3 years ago

I've got my Smartfan set as attached at the bottom. My intent is that the fans initially use "Start_Level" at a value of 70% or higher, so that the Junction temps don't run away into the 95+ range BEFORE the Smartfan can adjust properly. Ideally, if VRAM_targ (mine is 92C) is not being hit, the fan speed should slow down to my accepted minimum (in this case 60%, for fan_override_level_min=60).

The issue I'm highlighting is that my fan speed never drops below the "Start_Level".

Even if I make the Start Level something like 80 and the minimum something much lower like 20%, AND I'm running Junction/VRAM temps FAR LOWER than VRAM_targ, the speeds never drop below the Start Level, and I end up at something like VRAM 78C with fans at 80% (Start_Level).

In a perfect world, my fans would start fairly high to catch runaway temps (I have a 3090 that can get to 98 Junction REALLY fast). Then Smartfan would slowly adjust the speed down over time, as far as possible, to minimize noise while staying at my VRAM_targ.

I will say that increasing the speed when temperatures rise OVER VRAM_targ does work correctly. Say the fans go to 100% as VRAM spikes (for example, I left my window closed and the room heated up): Smartfan will lower the speed back down to Start_Level, but it will never continue past that to the minimum speed, even if the card is cooler than VRAM_targ.

Thanks, Hammer

"event": "on_quickminer.start",
"commands": [
{ "id": 1, "method": "device.smartfan.set", "params": ["0", "3", "60", "92"] },
{ "id": 1, "method": "device.smartfan.set.advanced", "params": ["0", "70", "60", "90", "0", "0", "0", "0"] },
{ "id": 1, "method": "device.set.oc_profile2", "params": ["0", "1110", "10352"] }
]

nicehashdev commented 3 years ago

The values you provided for the method device.smartfan.set.advanced are invalid and break the algorithm. You should call devices.get first to see what the default values are, and use those instead of 0.

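In fact, the current version has the following default values (in order, after the GPU ID): start_level 75, override_level_min -1, override_level_max -1, decrease_k 200, increase_k 2000, increase_n_gpu -3, increase_n_vram 0.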

You can see this from the following output when calling devices.get:

..."smartfan":{"mode":2,"fixed_speed":100,"target_gpu":58,"target_vram":90,"start_level":75,"override_level_min":-1,"override_level_max":-1,"decrease_k":200,"increase_k":2000,"increase_n_gpu":-3,"increase_n_vram":0}...

So you should modify your call like this:

...{
"id": 1,
"method": "device.smartfan.set.advanced",
"params": ["0", "70", "60", "90", "200", "2000", "-3", "0"]
}...
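
As a rough illustration of the advice above, here is a minimal Python sketch that starts from the "smartfan" object in the devices.get output and overrides only the three fan levels, leaving the constants at their reported values. The helper name `build_advanced_command` is hypothetical, and the mapping from devices.get field names to parameter positions is an assumption inferred from the field names and the suggested call; it is not taken from the Excavator documentation.

```python
import json

# Assumed positional order after the device ID, inferred from the field names
# in the devices.get output and from the suggested call in this thread.
ADVANCED_FIELDS = [
    "start_level",
    "override_level_min",
    "override_level_max",
    "decrease_k",
    "increase_k",
    "increase_n_gpu",
    "increase_n_vram",
]


def build_advanced_command(device_id, smartfan, overrides=None, cmd_id=1):
    """Hypothetical helper: merge overrides into the reported smartfan values."""
    overrides = overrides or {}
    unknown = set(overrides) - set(ADVANCED_FIELDS)
    if unknown:
        raise ValueError(f"unknown smartfan fields: {sorted(unknown)}")
    # Start from the values devices.get reported, then apply the overrides.
    values = {name: smartfan[name] for name in ADVANCED_FIELDS}
    values.update(overrides)
    # All parameters are passed as strings, as in the calls shown in this thread.
    params = [str(device_id)] + [str(values[name]) for name in ADVANCED_FIELDS]
    return {
        "id": cmd_id,
        "method": "device.smartfan.set.advanced",
        "params": params,
    }


# The "smartfan" object copied from the devices.get output quoted above.
reported = json.loads(
    '{"mode":2,"fixed_speed":100,"target_gpu":58,"target_vram":90,'
    '"start_level":75,"override_level_min":-1,"override_level_max":-1,'
    '"decrease_k":200,"increase_k":2000,"increase_n_gpu":-3,"increase_n_vram":0}'
)

command = build_advanced_command(
    "0",
    reported,
    {"start_level": 70, "override_level_min": 60, "override_level_max": 90},
)
print(json.dumps(command))
```

With these inputs the resulting params list matches the suggested call: the three fan levels come from the overrides and the four constants from the reported defaults.
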

DroptheHammer commented 3 years ago

| Command parameter # | Type | Description |
|---|---|---|
| 1 | string | Device ID or Device UUID. |
| 2 | string | Starting fan level when mode 2 or 3 is used. |
| 3 | string | Override default min level. |
| 4 | string | Override default max level. |
| 5 | string | Decrease multi K constant. |
| 6 | string | Increase multi K constant. |
| 7 | string | Increase add n constant (when GPU target is used). |
| 8 | string | Increase add n constant (when VRAM target is used). |

So I found this in the Wiki to work from, but I didn't know what values 5 - 8 really mean (since I don't know what multi-K and n-constant are).

So you've set 200, 2000, -3, and 0 here. What is a 200 multi K? What is a 2000 multi K? What are the -3 and the 0? Are these percentages? RPM? What am I adjusting, and to what end?

Thanks for responding, amazing support from Nicehash!
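
To make the positional parameters easier to read, here is a small Python sketch that labels a params list and shows which entries differ between the original call and the suggested one. The label names are assumptions borrowed from the devices.get field names, and `label_params` is a hypothetical helper. The sketch only names the positions; it does not explain the units or the meaning of the constants.

```python
# Labels for the eight positional parameters, following the wiki table quoted
# above. The short names are assumed from the devices.get output.
PARAM_NAMES = [
    "device_id",
    "start_level",
    "override_level_min",
    "override_level_max",
    "decrease_k",
    "increase_k",
    "increase_n_gpu",
    "increase_n_vram",
]


def label_params(params):
    """Hypothetical helper: pair each positional parameter with its label."""
    if len(params) != len(PARAM_NAMES):
        raise ValueError(f"expected {len(PARAM_NAMES)} params, got {len(params)}")
    return dict(zip(PARAM_NAMES, params))


# Params from the original config and from the suggested call in this thread.
original = label_params(["0", "70", "60", "90", "0", "0", "0", "0"])
suggested = label_params(["0", "70", "60", "90", "200", "2000", "-3", "0"])

# Keep only the entries that differ, as (original, suggested) pairs.
changed = {
    name: (original[name], suggested[name])
    for name in PARAM_NAMES
    if original[name] != suggested[name]
}
for name, (old, new) in changed.items():
    print(f"{name}: {old} -> {new}")
```

Only the two K constants and the GPU n constant differ; the fan levels and the VRAM n constant are the same in both calls.
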

nicehashdev commented 3 years ago

I have updated the docs a bit, check them out.

DroptheHammer commented 3 years ago

Thanks @nicehashdev! I think I can figure it out from that. Maybe in the LONG term, breaking it down into less algebra would make it easier for birdbrains such as myself, but I think I can see how this works.

Thanks!