cbuchner1 / CudaMiner

a CUDA accelerated litecoin mining application based on pooler's CPU miner

[2014-01-20] 10% Performance regression vs. 2013-12-18 release #79

Closed by andreweller 10 years ago

andreweller commented 10 years ago

Summary: 2013-12-18 release was reliably pushing 227-235 khash/s for multiple weeks. Just upgraded to 2014-01-20 release and now I'm seeing 200-212 khash/s.

Config: 2013 iMac w/ GeForce GTX 780M, OS X 10.9.1 (13B42)

Command run on 2013-12-18: ./cudaminer -H 1 -d 0 -i 0 -l K16x16 -o stratum+tcp://:3333 -O :

Avg. rate: 235 khash/s

Commands run on 2014-01-20: ./cudaminer -H 1 -d 0 -i 0 -l K16x16 -o stratum+tcp://:3333 -O :

Avg. rate: 190 khash/s

./cudaminer -o stratum+tcp://:3333 -O :

Avg. rate: 200 khash/s

./cudaminer -H 1 -d 0 -i 0 -l K256x3 -o stratum+tcp://:3333 -O :

Avg. rate: 207 khash/s
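For reference, the size of the drop can be worked out from the averages reported above. This is a minimal sketch that only uses the numbers in this post; the dictionary labels and the `percent_change` helper are illustrative names, not part of cudaminer.

```python
# Average hash rates (khash/s) as reported in this issue.
BASELINE = 235  # 2013-12-18 build with -l K16x16

reported = {
    "K16x16": 190,      # 2014-01-20 build, -l K16x16
    "no -l flag": 200,  # 2014-01-20 build, launch config not specified
    "K256x3": 207,      # 2014-01-20 build, -l K256x3
}


def percent_change(new, old):
    """Relative change from old to new, in percent (negative means slower)."""
    return (new - old) / old * 100.0


changes = {label: percent_change(rate, BASELINE) for label, rate in reported.items()}

for label, pct in changes.items():
    print(f"{label}: {pct:+.1f}% vs. {BASELINE} khash/s")
```

With these figures the three runs come out roughly 12% to 19% below the 2013-12-18 average, so the 10% in the title is, if anything, on the low side.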

cbuchner1 commented 10 years ago

There is no 2014-01-20 release. You're working with in-development code.

use the Y kernel instead of K

-l Y

Y with (#SMX)x32 may be the best config, or use autotune.
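As a rough illustration of the "(#SMX)x32" suggestion, the sketch below builds the `-l` argument from an SMX count. It assumes the launch string follows the same pattern as the configs already shown in this thread (kernel letter, then two numbers joined by `x`, as in K16x16). The function name is hypothetical, and the SMX count of 8 is an assumed value for the GTX 780M, so check your own card before relying on it.

```python
def smx_launch_config(smx_count, kernel="Y", second=32):
    """Build a launch config string such as 'Y8x32' for cudaminer's -l flag.

    smx_count is supplied by the caller; this helper does not query the GPU.
    """
    if smx_count < 1 or second < 1:
        raise ValueError("both numbers in the launch config must be positive")
    return f"{kernel}{smx_count}x{second}"


# Assumed SMX count for illustration only; verify it for your own device.
ASSUMED_SMX_COUNT = 8

config = smx_launch_config(ASSUMED_SMX_COUNT)
print(f"./cudaminer -H 1 -d 0 -i 0 -l {config} ...")
```

Treat the result as a starting point and compare it against other configs, since the best setting can differ from this rule of thumb.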

andreweller commented 10 years ago

Ah, thanks. Y kernel is helping... pushing around 222 khash/s now. Still a bit slower, but this narrows the gap.

andreweller commented 10 years ago

oh, ACTUALLY... Y256x3 was 222 khash/s... but Y16x16 is 246 khash/s!

sweet

cbuchner1 commented 10 years ago

The Y kernel was submitted by an NVIDIA performance engineer. You bet it's faster!