Open nmrcardoso opened 9 years ago
It is likely that a future version of cuFFT will support multiple GPUs in a node, with each GPU controlled by MPI. There is already a multi-GPU interface (cufftXt), but it can only be deployed from a single thread.
I coded up a distributed FFT for QDP++ a while ago, and I think it can be revamped to work with QUDA as a standalone component. It requires MPI (or at least access to the MPI communicators in QMP), but it should work. I am currently preparing material for the Lattice conference, but I will work on adapting it to QUDA after that.
Thorsten, it would be great if you implemented that in QUDA.
I have not checked but maybe http://accfft.org/ is of interest.
I already sent an e-mail to Amir Gholami, from accfft, last week asking if they are willing to support 4D FFTs. Still waiting for a response, hope for a positive one.
Even without 4D support, you could still use this, couldn't you?
I saw a poster on this at SC15 last week and spoke to the presenter (whose name I don't recall). It looked very promising.
I just looked at the source code. It has been licensed under GPL so just a heads up to be careful that none of its source code is migrated into QUDA.
I have not yet looked closely, but I hope that I can still use 2D+2D or 3D+1D FFT decompositions for a parallel 4D FFT with this library. However, if they do support 4D, the code on our side will be cleaner.
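The 2D+2D idea relies on the multidimensional DFT being separable: transforming two axes and then the other two gives the same result as one 4D transform. Below is a minimal single-process sketch of that property using a naive O(N^2) DFT. It does not use accfft, cuFFT, or MPI; all function names are made up for illustration, and in a distributed setting a data transpose between the two stages would be needed, which is omitted here.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;
using Dims = std::array<std::size_t, 4>;  // lattice extent in each of the 4 directions

// Row-major linear index of the site with coordinates x.
std::size_t site_index(const Dims& n, const Dims& x) {
    return ((x[0] * n[1] + x[1]) * n[2] + x[2]) * n[3] + x[3];
}

// Inverse of site_index: coordinates of the site with linear index idx.
Dims site_coords(const Dims& n, std::size_t idx) {
    Dims x{};
    for (std::size_t d = 4; d-- > 0;) {
        x[d] = idx % n[d];
        idx /= n[d];
    }
    return x;
}

// Naive 1D DFT along one axis, applied to every line of the 4D field.
std::vector<cplx> dft_along_axis(const std::vector<cplx>& in, const Dims& n, std::size_t axis) {
    assert(axis < 4 && in.size() == n[0] * n[1] * n[2] * n[3]);
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<cplx> out(in.size());
    for (std::size_t idx = 0; idx < in.size(); ++idx) {
        const Dims k = site_coords(n, idx);  // k[axis] is the output frequency
        Dims x = k;
        cplx sum(0.0, 0.0);
        for (std::size_t j = 0; j < n[axis]; ++j) {
            x[axis] = j;
            const double phase = -two_pi * double(k[axis] * j) / double(n[axis]);
            sum += in[site_index(n, x)] * std::polar(1.0, phase);
        }
        out[idx] = sum;
    }
    return out;
}

// Reference: direct 4D DFT, summing over all sites for every output momentum.
std::vector<cplx> dft_4d_direct(const std::vector<cplx>& in, const Dims& n) {
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<cplx> out(in.size());
    for (std::size_t ik = 0; ik < in.size(); ++ik) {
        const Dims k = site_coords(n, ik);
        cplx sum(0.0, 0.0);
        for (std::size_t ix = 0; ix < in.size(); ++ix) {
            const Dims x = site_coords(n, ix);
            double phase = 0.0;
            for (std::size_t d = 0; d < 4; ++d) phase += double(k[d] * x[d]) / double(n[d]);
            sum += in[ix] * std::polar(1.0, -two_pi * phase);
        }
        out[ik] = sum;
    }
    return out;
}

// 2D+2D decomposition: transform axes (0,1) first, then axes (2,3).
std::vector<cplx> dft_2d_plus_2d(const std::vector<cplx>& in, const Dims& n) {
    const auto stage1 = dft_along_axis(dft_along_axis(in, n, 0), n, 1);
    return dft_along_axis(dft_along_axis(stage1, n, 2), n, 3);
}

// Largest element-wise deviation between two fields of equal size.
double max_abs_diff(const std::vector<cplx>& a, const std::vector<cplx>& b) {
    double m = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) m = std::max(m, std::abs(a[i] - b[i]));
    return m;
}
```

Calling `max_abs_diff(dft_4d_direct(f, n), dft_2d_plus_2d(f, n))` on a small lattice should give a value at round-off level. A 3D+1D split works the same way, with axes (0,1,2) in the first stage and axis 3 in the second.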
Gauge fixing with FFTs, computeGaugeFixingFFTQuda(...), does not support multiple GPUs. Multi-GPU gauge fixing is only available with the overrelaxation method, computeGaugeFixingOVRQuda(...).
Also, in the FFT gauge fixing code, the memory reads for Delta(x) and g(x) use texture memory by default:
#define GAUGEFIXING_SITE_MATRIX_LOAD_TEX
To disable this, the only option for now is to comment out this line. In addition, to reduce memory usage in the FFT gauge fixing, the pre-calculation of g(x) is disabled by default:
#define GAUGEFIXING_DONT_USE_GX
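For readers unfamiliar with this kind of compile-time switch, here is a small sketch of how the two macros named above could select code paths through `#ifdef`. Only the macro names come from this thread; the two helper functions and their return strings are hypothetical and are not taken from the QUDA source.

```cpp
#include <string>

// Defaults as described above: both macros are defined.
// Commenting out a line switches to the other branch at compile time.
#define GAUGEFIXING_SITE_MATRIX_LOAD_TEX
#define GAUGEFIXING_DONT_USE_GX

// Hypothetical helper: reports how Delta(x) and g(x) would be read.
std::string site_matrix_load_path() {
#ifdef GAUGEFIXING_SITE_MATRIX_LOAD_TEX
    return "texture";
#else
    return "direct";
#endif
}

// Hypothetical helper: reports whether g(x) would be pre-calculated.
std::string gx_strategy() {
#ifdef GAUGEFIXING_DONT_USE_GX
    return "g(x) not precomputed";
#else
    return "g(x) precomputed";
#endif
}
```

Because the choice is made by the preprocessor, changing either option means editing the source and rebuilding; there is no runtime flag in this sketch.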