AMReX-Codes / amrex

AMReX: Software Framework for Block Structured AMR
https://amrex-codes.github.io/amrex

FFT Poisson Solver: Neumann and Dirichlet Boundaries #4202

WeiqunZhang closed this 3 weeks ago

WeiqunZhang commented 3 weeks ago

Add support for Neumann and Dirichlet boundaries in the FFT-based Poisson solver. This requires cosine and sine transforms. For CPU builds, we use FFTW for these transforms. For GPU builds, we have implemented the cosine and sine transforms on top of the real-to-complex transforms provided by cuFFT, rocFFT and oneMKL.
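To illustrate why Dirichlet boundaries call for a sine transform, here is a minimal 1D sketch. It is not the code in this PR. It assumes node-centered unknowns with homogeneous Dirichlet values at both end points, which pairs with DST-I; cell-centered data or Neumann boundaries would pair with other DST/DCT variants. The function names (`dst1`, `solve_poisson_dirichlet_1d`, `residual`) are hypothetical, and the transform is a naive O(n^2) sum standing in for a library call.

```python
import math

def dst1(x):
    """Naive O(n^2) DST-I: X[k] = sum_j x[j] * sin(pi*(j+1)*(k+1)/(n+1))."""
    n = len(x)
    return [sum(x[j] * math.sin(math.pi * (j + 1) * (k + 1) / (n + 1))
                for j in range(n))
            for k in range(n)]

def solve_poisson_dirichlet_1d(f, h):
    """Solve (u[i-1] - 2*u[i] + u[i+1]) / h**2 = f[i] with u = 0 at both ends."""
    n = len(f)
    f_hat = dst1(f)
    # The sine modes are eigenvectors of the 3-point Laplacian with these
    # (strictly negative) eigenvalues, so the solve is a pointwise division.
    lam = [(2.0 * math.cos(math.pi * (k + 1) / (n + 1)) - 2.0) / h**2
           for k in range(n)]
    u_hat = [fh / l for fh, l in zip(f_hat, lam)]
    # DST-I is its own inverse up to the factor 2/(n+1).
    return [2.0 / (n + 1) * v for v in dst1(u_hat)]

def residual(u, f, h):
    """Max-norm residual of the discrete equation, with zero boundary values."""
    p = [0.0] + list(u) + [0.0]
    return max(abs((p[i] - 2.0 * p[i + 1] + p[i + 2]) / h**2 - f[i])
               for i in range(len(u)))

if __name__ == "__main__":
    n = 15
    h = 1.0 / (n + 1)
    f = [math.exp((i + 1) * h) for i in range(n)]
    u = solve_poisson_dirichlet_1d(f, h)
    print("max residual:", residual(u, f, h))
```

The forward transform, the division by the eigenvalues, and the scaled backward transform are the only steps; a multi-dimensional solver repeats the transform along each direction.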

WeiqunZhang commented 3 weeks ago

Notes on the implementation of the cosine and sine transforms are available at https://www.overleaf.com/read/krjbcfhfgvmj#f7c9e1.

AlexanderSinn commented 3 weeks ago

I once made an optimized FFT Poisson solver for GPU in HiPACE++: https://github.com/Hi-PACE/hipace/blob/development/src/fields/fft_poisson_solver/FFTPoissonSolverDirichletFast.cpp. It does a single-rank 2D DST-I using the Fast Sine Transform algorithm from page 238 of Computational Frameworks for the Fast Fourier Transform by Charles Van Loan. This does not require expanding the domain by 2x or 4x, as is currently done in this PR for the R2R FFTs. The following pages of the book have similar algorithms for DST-II, DST-III, DCT-II and DCT-III.

I also found that it was better to implement the R2R FFT directly in the Poisson solver instead of in the FFT wrapper so that the pre- and post-processing GPU kernels can be combined with the transposes (here ParallelCopy). 
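The difference between the two approaches can be sketched in 1D. The block below is an illustration only: it is neither the code in this PR nor the HiPACE++ implementation, and the same-size variant is a classic textbook trick based on a real FFT, not necessarily the exact Van Loan formulation. All function names are hypothetical, and `dft` is a naive stand-in for a library real-to-complex FFT that returns only the first M/2 + 1 outputs.

```python
import cmath
import math

def dft(y):
    """Naive DFT of real data, returning outputs 0..M/2 like an R2C FFT."""
    M = len(y)
    return [sum(y[m] * cmath.exp(-2j * math.pi * m * k / M) for m in range(M))
            for k in range(M // 2 + 1)]

def dst1_reference(x):
    """Direct DST-I: X[k] = sum_j x[j] * sin(pi*(j+1)*(k+1)/(n+1))."""
    n = len(x)
    return [sum(x[j] * math.sin(math.pi * (j + 1) * (k + 1) / (n + 1))
                for j in range(n))
            for k in range(n)]

def dst1_by_odd_extension(x):
    """DST-I of n points via a transform of length 2*(n+1), about 2x the data."""
    n = len(x)
    # Odd extension: [0, x, 0, -reversed(x)].
    y = [0.0] + list(x) + [0.0] + [-v for v in reversed(x)]
    Y = dft(y)
    # For odd input the DFT is purely imaginary: Y[k] = -2i * X[k-1].
    return [-Y[k].imag / 2.0 for k in range(1, n + 1)]

def dst1_same_size(x):
    """DST-I of n points via a transform of length n+1 (n+1 must be even)."""
    N = len(x) + 1
    if N % 2 != 0:
        raise ValueError("this sketch needs an odd number of input points")
    f = [0.0] + list(x)
    # Pre-processing: mix a symmetric and an antisymmetric combination.
    y = [0.0] + [math.sin(math.pi * m / N) * (f[m] + f[N - m])
                 + 0.5 * (f[m] - f[N - m]) for m in range(1, N)]
    Y = dft(y)
    # Post-processing: even outputs come from the imaginary parts, odd
    # outputs from a running sum of the real parts.
    F = [0.0] * N
    F[1] = 0.5 * Y[0].real
    for k in range(1, N // 2):
        F[2 * k] = -Y[k].imag
        F[2 * k + 1] = F[2 * k - 1] + Y[k].real
    return F[1:]

if __name__ == "__main__":
    x = [1.0, -2.0, 3.5, 0.25, 4.0]
    ref = dst1_reference(x)
    for name, fn in (("extension", dst1_by_odd_extension),
                     ("same size", dst1_same_size)):
        err = max(abs(a - b) for a, b in zip(fn(x), ref))
        print(name, "max error:", err)
```

In the same-size variant the pre- and post-processing loops are simple pointwise kernels, which is what makes it attractive to fuse them with neighbouring data-movement steps.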

WeiqunZhang commented 3 weeks ago

That's good to know. What we need is batched 1D DST and DCT. I guess that might be even easier than the 2D FFT you have implemented.
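As a rough picture of what "batched 1D" means here, the sketch below applies the same 1D DST-I independently to every row of a 2D array and builds a 2D transform from two such passes with transposes in between. It is a plain-Python illustration under assumed names (`dst1_batched`, `transpose`, `dst1_2d`, `dst1_2d_reference`), not AMReX code, and the naive sine table stands in for a batched library transform.

```python
import math

def dst1_batched(rows):
    """Apply the same 1D DST-I independently to every row (the batch)."""
    n = len(rows[0])
    # Build the sine table once and reuse it for every row in the batch.
    table = [[math.sin(math.pi * (j + 1) * (k + 1) / (n + 1)) for j in range(n)]
             for k in range(n)]
    return [[sum(s * v for s, v in zip(table[k], row)) for k in range(n)]
            for row in rows]

def transpose(a):
    """Swap the two axes of a list-of-lists array."""
    return [list(col) for col in zip(*a)]

def dst1_2d(a):
    """2D DST-I as two batched 1D passes with transposes in between."""
    step = dst1_batched(a)                 # transform along each row
    step = dst1_batched(transpose(step))   # transform along each column
    return transpose(step)

def dst1_2d_reference(a):
    """Direct double-sum 2D DST-I, used only to check the batched version."""
    ny, nx = len(a), len(a[0])
    return [[sum(a[j][i]
                 * math.sin(math.pi * (j + 1) * (l + 1) / (ny + 1))
                 * math.sin(math.pi * (i + 1) * (k + 1) / (nx + 1))
                 for j in range(ny) for i in range(nx))
             for k in range(nx)]
            for l in range(ny)]

if __name__ == "__main__":
    a = [[1.0, 2.0, -1.0],
         [0.5, 0.0, 4.0]]
    fast = dst1_2d(a)
    ref = dst1_2d_reference(a)
    err = max(abs(p - q) for rf, rr in zip(fast, ref) for p, q in zip(rf, rr))
    print("max difference from direct 2D sum:", err)
```

Because each pass only ever transforms along one axis, the same batched 1D routine serves any number of dimensions, and mixed boundary types can use a different transform kind per direction.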

WeiqunZhang commented 3 weeks ago

Ready for review, but let's not merge it until after the monthly release.