Closed by xberg 5 years ago
Tried to do some debugging. I added these lines just before line 127:
    prf "gpu:"$gpu
    prf "current_t:"$current_t
    prf "old_t:"${old_t[$gpu]}
Result: $gpu = 0, $current_t = 78, and old_t is blank.
So there seems to be an issue with old_t at launch on a multi-GPU setup.
EUREKA: found the bug. In temp.sh, line 188 should be:
    for i in $(seq 0 "$num_gpus_loop"); do
        old_t["$i"]="0"
    done
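For anyone following along, here is a minimal sketch of why that initialization matters. It is not the actual temp.sh code: the values of `num_gpus` and `current_t` are hard-coded stand-ins, and it assumes `num_gpus_loop` holds the highest GPU index. Only the loop and the comparison are taken from this thread.

```bash
#!/usr/bin/env bash
# Stand-in values (assumptions); the real script detects these at startup.
num_gpus=6
num_gpus_loop=$((num_gpus - 1))   # assumed: highest GPU index
current_t=78

# The fix from above: give every GPU slot a numeric starting value.
declare -a old_t
for i in $(seq 0 "$num_gpus_loop"); do
    old_t["$i"]="0"
done

# The comparison from line 127 now always sees two integers.
for gpu in $(seq 0 "$num_gpus_loop"); do
    if [ "$current_t" -ne "${old_t[$gpu]}" ]; then
        echo "gpu $gpu: temperature changed from ${old_t[$gpu]} to $current_t"
        old_t["$gpu"]="$current_t"
    fi
done
```

Without the initialization loop, `${old_t[$gpu]}` expands to an empty string and `[` rejects it as a non-integer.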
... but not everything is OK yet. Unless I misunderstood how to use your software, only GPU0 is affected. All other GPUs are at 0 fan speed, creating a potential hazard :)
Found and fixed a second bug related to my comment just above. Line 222 should be: gpu="$i" and NOT gpus="$i"
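A small sketch of how a one-letter typo like this would leave only GPU0 controlled. The loop below is hypothetical and is not copied from temp.sh. It assumes `gpu` starts at 0 and simply records which index each pass would act on.

```bash
#!/usr/bin/env bash
# Hypothetical loop for illustration; num_gpus_loop is a stand-in value.
num_gpus_loop=2

# With the typo: the loop assigns to "gpus", so "gpu" never changes.
gpu=0
updated_buggy=""
for i in $(seq 0 "$num_gpus_loop"); do
    gpus="$i"
    updated_buggy="$updated_buggy $gpu"
done

# With the fix: "gpu" follows the loop counter.
updated_fixed=""
for i in $(seq 0 "$num_gpus_loop"); do
    gpu="$i"
    updated_fixed="$updated_fixed $gpu"
done

echo "with typo:$updated_buggy"   # with typo: 0 0 0
echo "fixed:$updated_fixed"       # fixed: 0 1 2
```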
Not sure who did your multi GPU debugging but I hope you did not pay him too much haha :)
Oh wow, you did quite some work there! Haha, yeah, I haven't checked in with my multi-GPU tester in a while, so there were bound to be some bugs. I'll have a look and see what I can find! Thanks for the work you've done with debugging it, it certainly reduces the amount of work I have to do xD
Alright, with those two bugs fixed, uncomment line 154 (the echo_info line). What is the error now? Or is it still the same as before, with line 127?
Sorry that I was not clear. With the 2 bugs I fixed above, temp.sh works perfectly; no need to fix anything else. I have not yet tested the other script, update.sh, which logically should have the same bugs. Thanks for a great and very useful piece of code.
No problem! Glad you're finding it useful :)
Thanks for finding and fixing the bug! I'll be putting your name in the README for your help. :D
Hi, I'm running version 17 with 6 GPUs. I launched temp.sh without changing the default config and get the following error:
./temp.sh: line 127: [: : integer expression expected
Your line 127 is: if [ "$current_t" -ne "${old_t[$gpu]}" ]; then
The script started nicely:
    Number of Fans detected: 6
    Number of GPUs detected: 6
    tdiff average: 10
Attribute 'GPUFanControlState' (central2:0[gpu:0]) assigned value 1.
...
I ran the following command to check that all fans are detected: nvidia-settings -q fans
6 Fans on central2:0
My setup: Ubuntu 18.04.1, Nvidia 396.54, 6x 1060 GTX 3GB.
Good luck, and let me know if I can help you with anything.
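The "integer expression expected" message reported for line 127 can be reproduced in isolation. This is a minimal sketch with hard-coded stand-in values, where `old_t_value` plays the role of an unset `${old_t[$gpu]}`. The `${...:-0}` default shown second is only one possible defensive variant and is not something temp.sh is known to use.

```bash
#!/usr/bin/env bash
current_t=78
old_t_value=""   # what ${old_t[$gpu]} expands to when the slot was never set

# Comparing an integer with an empty string: [ prints an error to stderr
# and returns status 2 (neither true nor false).
status_empty=0
[ "$current_t" -ne "$old_t_value" ] || status_empty=$?

# Defensive variant: ${var:-0} substitutes 0 when the value is empty or unset.
status_defaulted=0
[ "$current_t" -ne "${old_t_value:-0}" ] || status_defaulted=$?

echo "empty operand: exit $status_empty, defaulted operand: exit $status_defaulted"
```

An exit status of 2 from `[` means the test itself failed to evaluate, which is why the `if` on line 127 printed an error instead of taking either branch cleanly.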