preda / gpuowl

GPU Mersenne primality test.
GNU General Public License v3.0

Slowdown with ROCm 3.7 -- 333M exponents -- Radeon VII #188

Closed by valeriob01 4 years ago

valeriob01 commented 4 years ago

ROCm 3.3 performance:

        gpu1         gpu2
min:    2404 us/it   2427 us/it
max:    2443 us/it   2510 us/it

ROCm 3.7 performance:

        gpu1         gpu2
min:    2501 us/it   2524 us/it
max:    2538 us/it   2612 us/it

ROCm 3.7 is about 95 to 102 us/it slower than 3.3 (roughly 4%) on both GPUs; the largest gap is gpu2's max, at 102 us/it.
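For anyone who wants to check the arithmetic, here is a minimal sketch that computes the per-GPU slowdown from the figures reported above. The `timings` layout and the `slowdown` helper are illustrative names only and are not part of gpuowl.

```python
# Timings in microseconds per iteration (us/it), copied from the report above.
# The dictionary layout is an illustrative choice, not a gpuowl format.
timings = {
    "3.3": {"gpu1": {"min": 2404, "max": 2443}, "gpu2": {"min": 2427, "max": 2510}},
    "3.7": {"gpu1": {"min": 2501, "max": 2538}, "gpu2": {"min": 2524, "max": 2612}},
}


def slowdown(old, new):
    """Return (absolute delta in us/it, relative delta in percent)."""
    delta = new - old
    return delta, 100.0 * delta / old


# Compare ROCm 3.7 against ROCm 3.3 for each GPU and each statistic.
results = {}
for gpu in ("gpu1", "gpu2"):
    for stat in ("min", "max"):
        old = timings["3.3"][gpu][stat]
        new = timings["3.7"][gpu][stat]
        results[(gpu, stat)] = slowdown(old, new)
        delta, pct = results[(gpu, stat)]
        print(f"{gpu} {stat}: +{delta} us/it ({pct:.1f}%)")
```

Because the comparison uses only the min and max of each run, it says nothing about variance between runs; a longer sample would be needed for that.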

preda commented 4 years ago

Thanks. I guess I'll be looking forward to ROCm 4.x or 5.x; hopefully by then they'll start looking into performance as well.

preda commented 4 years ago

https://github.com/RadeonOpenCompute/ROCm/issues/1124 seems to indicate similar performance between 3.5 and 3.7. (I didn't measure, as I see no reason to upgrade to 3.7 at this point.)