Closed: valeriob01 closed this issue 3 years ago
Please try with two processes per GPU. While one process has a mild slowdown, two shows a speed-up.
1359 us/it, however this way the GEC time is doubled. Not worth it.
Why not worth it? If you run 2 processes in parallel, you should divide all the times by 2: 1360/2 = 680 us/it, and doubled-GEC/2 stays the same.
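To make the divide-by-2 argument concrete, here is a minimal sketch of the arithmetic. The 1359 us/it figure is the one reported above. The 706 us/it single-process figure is taken from elsewhere in this thread and is assumed to be a comparable baseline. The GEC cost is a made-up placeholder unit, since no actual GEC timing was posted.

```python
def effective_per_gpu(per_process_time: float, processes: int) -> float:
    """Time per unit of work for the whole GPU when `processes` jobs run concurrently."""
    return per_process_time / processes


# Reported above: each of the two processes runs at 1359 us/it.
two_proc_us_per_it = 1359.0
# Assumption: single-process figure quoted elsewhere in this thread.
one_proc_us_per_it = 706.0

effective_us_per_it = effective_per_gpu(two_proc_us_per_it, 2)
print(f"two processes: {effective_us_per_it:.1f} us/it effective per GPU")
print(f"one process:   {one_proc_us_per_it:.1f} us/it")

# Hypothetical GEC cost in arbitrary units (no real value was posted).
gec_one_proc = 1.0
gec_two_proc = 2.0 * gec_one_proc  # "the GEC time is doubled"
gec_effective = effective_per_gpu(gec_two_proc, 2)
print(f"GEC cost per job, amortized over two processes: {gec_effective:.1f} (unchanged)")
```

Under these assumptions the two-process setup comes out ahead per GPU, and the doubled GEC time cancels out because two jobs are being checked in that time.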
I prefer the classical way of 1 job per GPU; it makes me feel the system is more stable.
Sure, that's fine; I also run one GPU in one-task-only mode. The slowdown is mild, there is a moderate decrease in memory allocation at the same time, and who knows, maybe things will improve in the future (i.e. the slowdown gets cancelled). We'll see.
ROCm 3.9 is out, not testing it for now. I am busy with other things. :-)
I tried it, no improvement for me.
Mild speedup with latest commit, from 706 us/it to 696 us/it.
Big speedup with carry32, from 696 us/it to 680 us/it.
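For a sense of scale, the two improvements reported above can be expressed as relative changes. This is just a quick sketch using the us/it figures from the two comments; the helper name is made up for illustration.

```python
def relative_speedup_percent(before_us_per_it: float, after_us_per_it: float) -> float:
    """Percentage reduction in time per iteration going from `before` to `after`."""
    return (before_us_per_it - after_us_per_it) / before_us_per_it * 100.0


# (label, before, after) in us/it, as reported in the comments above.
steps = [
    ("latest commit", 706.0, 696.0),
    ("carry32", 696.0, 680.0),
]

for label, before, after in steps:
    pct = relative_speedup_percent(before, after)
    print(f"{label}: {before:.0f} -> {after:.0f} us/it ({pct:.2f}% less time per iteration)")

# Combined effect of both steps, 706 -> 680 us/it.
total_pct = relative_speedup_percent(706.0, 680.0)
print(f"total: {total_pct:.2f}%")
```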
from 6xx us/it to 706 us/it, exponent 106xxxxxx.