cbuchner1 / CudaMiner

a CUDA accelerated litecoin mining application based on pooler's CPU miner

memcpy/kernel concurrency #83

Open whitesand77 opened 10 years ago

whitesand77 commented 10 years ago

I ran the latest code (commit 132) through the NVIDIA Visual Profiler and noticed that memcpy operations and kernel execution do not overlap at all. Is this known?

I did a small test that makes the current code actually use the 2 streams it is written for and got a 3-5% improvement, but the code as it stands still showed very little concurrency. I don't understand the code well enough to refactor it to be fully concurrent.

In other code I've seen a 30-40% improvement from proper use of streams.

cbuchner1 commented 10 years ago

It's known. It is an issue-order problem.
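For readers unfamiliar with the term, here is a minimal sketch of what issue order means with two streams. It is not CudaMiner's code: `work_kernel`, `run_two_stream_pipeline`, and the buffer sizes are hypothetical placeholders. The host issues all uploads first, then all kernels, then all downloads (breadth-first), using pinned host memory and `cudaMemcpyAsync` so copies can overlap with kernels. Whether breadth-first or depth-first order overlaps better depends on the GPU's copy engines and work queues, so both orders are worth profiling.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__, cudaGetErrorString(e)); \
    return -1; } } while (0)

// Hypothetical stand-in for the real hashing kernel: adds 1 to every element.
__global__ void work_kernel(const unsigned int *in, unsigned int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] + 1u;
}

// Returns the number of wrong results, or -1 on a CUDA error.
int run_two_stream_pipeline(void)
{
    const int kStreams = 2;
    const int kChunk = 1 << 20;                       // elements per stream
    const size_t bytes = kChunk * sizeof(unsigned int);
    const int threads = 256;
    const int blocks = (kChunk + threads - 1) / threads;

    unsigned int *h_in[kStreams], *h_out[kStreams];
    unsigned int *d_in[kStreams], *d_out[kStreams];
    cudaStream_t stream[kStreams];

    for (int s = 0; s < kStreams; ++s) {
        CHECK(cudaStreamCreate(&stream[s]));
        // Pinned host memory is required for truly asynchronous copies.
        CHECK(cudaMallocHost((void **)&h_in[s], bytes));
        CHECK(cudaMallocHost((void **)&h_out[s], bytes));
        CHECK(cudaMalloc((void **)&d_in[s], bytes));
        CHECK(cudaMalloc((void **)&d_out[s], bytes));
        for (int i = 0; i < kChunk; ++i) h_in[s][i] = (unsigned int)(i + s);
    }

    // Breadth-first issue order: the same stage for every stream, then the next stage.
    for (int s = 0; s < kStreams; ++s)
        CHECK(cudaMemcpyAsync(d_in[s], h_in[s], bytes, cudaMemcpyHostToDevice, stream[s]));
    for (int s = 0; s < kStreams; ++s) {
        work_kernel<<<blocks, threads, 0, stream[s]>>>(d_in[s], d_out[s], kChunk);
        CHECK(cudaGetLastError());
    }
    for (int s = 0; s < kStreams; ++s)
        CHECK(cudaMemcpyAsync(h_out[s], d_out[s], bytes, cudaMemcpyDeviceToHost, stream[s]));

    // Wait for both streams, then verify the results on the host.
    int mismatches = 0;
    for (int s = 0; s < kStreams; ++s) {
        CHECK(cudaStreamSynchronize(stream[s]));
        for (int i = 0; i < kChunk; ++i)
            if (h_out[s][i] != h_in[s][i] + 1u) ++mismatches;
    }

    for (int s = 0; s < kStreams; ++s) {
        CHECK(cudaFree(d_in[s]));
        CHECK(cudaFree(d_out[s]));
        CHECK(cudaFreeHost(h_in[s]));
        CHECK(cudaFreeHost(h_out[s]));
        CHECK(cudaStreamDestroy(stream[s]));
    }
    return mismatches;
}

int main(void)
{
    int bad = run_two_stream_pipeline();
    printf("mismatches: %d\n", bad);
    return bad == 0 ? 0 : 1;
}
```

A depth-first variant would put the upload, kernel launch, and download for one stream inside a single loop iteration. Running both variants under the profiler shows which one the hardware actually overlaps.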

cbuchner1 commented 10 years ago

Use -H 2 for fewer memcpy operations. Soon the remaining part will be eliminated by checking hashes on the GPU.
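The idea of checking hashes on the GPU can be sketched as follows. This is an illustration under assumptions, not the planned CudaMiner implementation: `toy_hash` is a simple integer mixer standing in for scrypt, and `search_kernel`, `gpu_search`, and `NO_NONCE` are hypothetical names. Each thread compares its own hash against the target and records a winning nonce in a single device word, so the host copies back 4 bytes per launch instead of every hash.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define NO_NONCE 0xFFFFFFFFu

#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__, cudaGetErrorString(e)); \
    return -1; } } while (0)

// Toy stand-in for the real hash function (NOT scrypt, NOT cryptographic).
__host__ __device__ unsigned int toy_hash(unsigned int nonce)
{
    unsigned int x = nonce * 2654435761u;
    x ^= x >> 16;
    x *= 0x85EBCA6Bu;
    x ^= x >> 13;
    return x;
}

// Each thread tests one nonce; only a winner writes to global memory.
__global__ void search_kernel(unsigned int start_nonce, unsigned int count,
                              unsigned int target, unsigned int *found)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= count) return;
    unsigned int nonce = start_nonce + i;
    if (toy_hash(nonce) <= target)
        atomicMin(found, nonce);   // keep the smallest winning nonce
}

// Writes the winning nonce (or NO_NONCE) to *result. Returns 0 on success, -1 on error.
int gpu_search(unsigned int start_nonce, unsigned int count,
               unsigned int target, unsigned int *result)
{
    unsigned int *d_found = NULL;
    unsigned int init = NO_NONCE;
    const unsigned int threads = 256;
    const unsigned int blocks = (count + threads - 1) / threads;

    CHECK(cudaMalloc((void **)&d_found, sizeof(unsigned int)));
    CHECK(cudaMemcpy(d_found, &init, sizeof(init), cudaMemcpyHostToDevice));

    search_kernel<<<blocks, threads>>>(start_nonce, count, target, d_found);
    CHECK(cudaGetLastError());

    // The only device-to-host traffic is this single 4-byte word.
    CHECK(cudaMemcpy(result, d_found, sizeof(unsigned int), cudaMemcpyDeviceToHost));
    CHECK(cudaFree(d_found));
    return 0;
}

int main(void)
{
    const unsigned int target = 0x0000FFFFu;   // arbitrary example target
    unsigned int nonce = NO_NONCE;
    if (gpu_search(0u, 1u << 20, target, &nonce) != 0) return 1;

    if (nonce == NO_NONCE) {
        printf("no nonce found in this range\n");
    } else {
        // Re-check the candidate on the CPU before trusting it.
        printf("nonce %u, hash 0x%08x, valid: %s\n", nonce, toy_hash(nonce),
               toy_hash(nonce) <= target ? "yes" : "no");
    }
    return 0;
}
```

Re-checking the reported nonce on the CPU keeps the host in control of what gets submitted while still avoiding the bulk copy of all hashes.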
