yogevb / a-dda

Automatically exported from code.google.com/p/a-dda

GPU acceleration #118

Closed. GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
The idea is to use modern graphics cards to accelerate ADDA.

Marcus Huntemann and Georg Heygster have succeeded in porting ADDA to OpenCL, 
so incorporating their code should address this issue.

Original issue reported on code.google.com by yurkin on 2 Dec 2010 at 5:02

GoogleCodeExporter commented 9 years ago
The current revision, r1023, can perform the matrix-vector multiplication on 
the GPU, as long as no dimension is bigger than 512 on Nvidia GPUs or 256 on 
AMD GPUs. This restriction comes from the FFT routine that is used.
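
As a rough illustration of that restriction, the check could be sketched as below. The helper name `fits_gpu_fft` and the `MAX_FFT_DIM` table are hypothetical and are not taken from the ADDA sources; only the 512 and 256 limits come from the comment above.

```python
# Per-dimension limits quoted in the comment above (512 on Nvidia, 256 on AMD).
# The table and function names are illustrative assumptions, not ADDA identifiers.
MAX_FFT_DIM = {"nvidia": 512, "amd": 256}


def fits_gpu_fft(dims, vendor):
    """Return True if no dimension is bigger than the vendor's FFT limit."""
    limit = MAX_FFT_DIM[vendor.lower()]
    return all(d <= limit for d in dims)
```

Under these assumptions, a problem with dimensions (512, 256, 128) would pass the check for Nvidia but not for AMD.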

This may deserve a separate issue, but it also belongs under GPU acceleration:
AMD GPUs seem to suffer much more from unaligned reads and writes to global 
GPU memory inside a kernel. So the next task is to align global memory access 
by using local memory as a cache. Both AMD and Nvidia GPUs will benefit from 
that.
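
The staging idea can be emulated on the CPU to show the access pattern. This is only a sketch under stated assumptions: the function `read_via_local_tile` and the tile size of 16 elements are hypothetical, and a real kernel would do the tile load cooperatively across the work-items of a work-group.

```python
def read_via_local_tile(global_mem, start, count, tile=16):
    """Emulate staging an unaligned read through a local buffer.

    Only whole, tile-aligned blocks are fetched from global_mem; the
    requested range is then served from the local copy.
    Returns (data, number_of_aligned_tile_loads).
    """
    first_tile = start // tile
    last_tile = (start + count - 1) // tile
    local = []
    for t in range(first_tile, last_tile + 1):
        # Aligned, full-tile load from "global memory" into "local memory".
        local.extend(global_mem[t * tile:(t + 1) * tile])
    # Unaligned access now hits only the local copy.
    offset = start - first_tile * tile
    return local[offset:offset + count], last_tile - first_tile + 1
```

For example, reading 20 elements starting at index 5 with 16-element tiles takes two aligned tile loads, and all unaligned indexing happens in the local buffer.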

Original comment by Marcus.H...@gmail.com on 20 Feb 2011 at 8:26

GoogleCodeExporter commented 9 years ago

Original comment by yurkin on 10 Jun 2011 at 2:05

GoogleCodeExporter commented 9 years ago
I have added a few specific issues for further development of adda_ocl. 
adda_ocl is already quite mature and almost ready for the ADDA 1.1 release, so 
I am closing this issue.

Original comment by yurkin on 16 Apr 2012 at 1:38