tromp / cuckoo

a memory-bound graph-theoretic proof-of-work system

Reduce cudaMemcpy size of hostA to sizeof(u32) #76

Closed by 0xSSoul 5 years ago

0xSSoul commented 5 years ago

In cuckoo/mean.cu, cuckaroo/mean.cu, and cuckatoo/mean.cu, the call

cudaMemcpy(hostA, indexesE, NX * NY * sizeof(u32), cudaMemcpyDeviceToHost);

should be changed to

cudaMemcpy(hostA, indexesE, sizeof(u32), cudaMemcpyDeviceToHost);

to reduce I/O between device and host a little?

tromp commented 5 years ago

Well spotted! This line dates back to before we added the Tail kernel for edge compaction. Will fix in the next update... Thanks!
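
For illustration only, here is a minimal sketch of the pattern discussed above: when a final device-side step leaves the only value the host needs in the first slot of a counter array, copying `sizeof(u32)` is enough. The kernel `sum_into_first` and the helper `read_total_with_small_copy` are hypothetical stand-ins and are not taken from mean.cu. The sketch assumes, as the thread suggests, that the host reads only the first counter after compaction.

```cuda
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

typedef uint32_t u32;

// Abort with a message if a CUDA runtime call fails.
#define CHECK_CUDA(call)                                              \
  do {                                                                \
    cudaError_t err_ = (call);                                        \
    if (err_ != cudaSuccess) {                                        \
      fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err_));  \
      exit(1);                                                        \
    }                                                                 \
  } while (0)

// Hypothetical stand-in for a compaction step: a single thread sums all
// per-bucket counters into counters[0]. This is NOT the real Tail kernel.
__global__ void sum_into_first(u32 *counters, u32 n) {
  if (blockIdx.x == 0 && threadIdx.x == 0) {
    u32 total = 0;
    for (u32 i = 0; i < n; i++) total += counters[i];
    counters[0] = total;
  }
}

// Uploads the counters, runs the stand-in kernel, and reads back only the
// first counter instead of the whole array.
u32 read_total_with_small_copy(const std::vector<u32> &bucket_counts) {
  const u32 n = (u32)bucket_counts.size();
  assert(n > 0);

  u32 *dev = nullptr;
  CHECK_CUDA(cudaMalloc(&dev, n * sizeof(u32)));
  CHECK_CUDA(cudaMemcpy(dev, bucket_counts.data(), n * sizeof(u32),
                        cudaMemcpyHostToDevice));

  sum_into_first<<<1, 1>>>(dev, n);
  CHECK_CUDA(cudaGetLastError());

  // The host needs one value, so copy one u32 rather than n of them.
  // This blocking copy runs after the kernel on the default stream.
  u32 total = 0;
  CHECK_CUDA(cudaMemcpy(&total, dev, sizeof(u32), cudaMemcpyDeviceToHost));

  CHECK_CUDA(cudaFree(dev));
  return total;
}
```

Calling `read_total_with_small_copy({1, 2, 3, 4})` from a `main` should return the sum of the four counters. The saving per call is small (a few kilobytes at most for typical bucket counts), which matches the "a little" in the original suggestion.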