betterscientificsoftware / bssw.io

Better Scientific Software Homepage
https://bssw.io

FFTs on Exascale Platforms: New and Ongoing Challenges for Algorithms and Standardization #1779

Open acanning99 opened 10 months ago

acanning99 commented 10 months ago

FFTs are heavily used in many modern scientific codes, including those represented among the Exascale Computing Project (ECP) applications in materials science, chemistry, molecular dynamics, accelerator design, experimental X-ray data imaging and cosmology. Over the last decade or so the FFTW API has become the de facto standard FFT interface. Vendors that provide FFT libraries still ship their own proprietary interfaces for backward compatibility, and all current high-performance vendor libraries, including those from AMD, Intel, IBM and NVIDIA, implement only a subset of the FFTW interface. FFTW itself is not supported on GPUs.

In addition, most scientific codes use distributed parallel 3D FFTs, and there is no clear standard for the distributed data layout: most codes use their own domain-specific data structures and hand-written 3D FFTs. This is in contrast to linear algebra, where the ScaLAPACK parallel data layout has become the standard and is used in most scientific codes. For codes that maintain their own 3D FFTs, the result has been an enormous cost in code rewriting when moving to new architectures, in particular with the proliferation of new GPUs and their specific programming languages.

Finally, while single GPUs on exascale machines have become more and more powerful and very efficient for FFTs, the communication bandwidth between nodes (and between GPUs on a node) has not increased at the same rate. Keeping communication to a minimum is therefore a very significant algorithmic challenge for the efficient parallel scaling of distributed FFTs on exascale architectures. In many scientific application codes the FFTs are the bottleneck to efficient parallel scaling and hence can limit the physical sizes, resolutions and time scales of the scientific problems that can be studied.

In this blog article we will discuss the issues listed above in more detail, with examples, and review the progress made on them in recent years. In particular, we will cover the FFT software available for exascale-class machines, such as the heFFTe and FFTX packages developed through the ECP, and the lessons learned in developing these packages.
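As a rough illustration of why communication dominates distributed 3D FFTs, here is a minimal single-process sketch in plain Python. It simulates one common approach, a slab decomposition: each rank transforms the planes it owns, then one all-to-all transpose makes the remaining axis local. It is not taken from heFFTe, FFTX or FFTW; the function names (`dft1d`, `slab_fft3d`, `dft3d_direct`) and the element-counting model are assumptions made for this example, and a naive DFT stands in for an optimized FFT.

```python
import cmath

def dft1d(v):
    """Naive O(n^2) 1D DFT; stands in for an optimized single-node FFT."""
    n = len(v)
    return [sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def slab_fft3d(a, nranks):
    """Simulate a slab-decomposed 3D DFT of a[x][y][z] on `nranks` ranks.

    Returns (result, sent), where `sent` counts the array elements that
    would cross rank boundaries in the single all-to-all transpose.
    """
    nx, ny, nz = len(a), len(a[0]), len(a[0][0])
    assert nx % nranks == 0 and ny % nranks == 0
    xs, ys = nx // nranks, ny // nranks  # planes per rank before/after transpose

    # Stage 1 (no communication): rank r owns x-planes [r*xs, (r+1)*xs)
    # and transforms each of them along z, then along y.
    b = [[dft1d(row) for row in plane] for plane in a]    # along z
    for x in range(nx):
        for z in range(nz):
            col = dft1d([b[x][y][z] for y in range(ny)])  # along y
            for y in range(ny):
                b[x][y][z] = col[y]

    # Stage 2 (all-to-all): redistribute so each rank owns a block of y
    # with every x local. An element moves if its owning rank changes.
    sent = sum(nz for x in range(nx) for y in range(ny) if x // xs != y // ys)

    # Stage 3 (no communication): transform along x, which is now local.
    for y in range(ny):
        for z in range(nz):
            col = dft1d([b[x][y][z] for x in range(nx)])
            for x in range(nx):
                b[x][y][z] = col[x]
    return b, sent

def dft3d_direct(a):
    """Reference: 3D DFT computed straight from the definition."""
    nx, ny, nz = len(a), len(a[0]), len(a[0][0])
    w = lambda j, k, n: cmath.exp(-2j * cmath.pi * j * k / n)
    return [[[sum(a[x][y][z] * w(x, kx, nx) * w(y, ky, ny) * w(z, kz, nz)
                  for x in range(nx) for y in range(ny) for z in range(nz))
              for kz in range(nz)] for ky in range(ny)] for kx in range(nx)]
```

In this model a fraction 1 - 1/P of the whole array crosses rank boundaries in the transpose on P ranks, however fast the local transforms are, which is one way to see why interconnect bandwidth rather than GPU speed tends to limit scaling.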

acanning99 commented 10 months ago

I am planning to write this blog article in collaboration with people in the ECP CoPA-FFTX project (which I am part of) and the ECP heFFTe project. Stan Tomov (head of heFFTe) has approved this abstract for submission as a blog.

rinkug commented 10 months ago

Thank you @acanning99. @bernhold FYI.

bernhold commented 10 months ago

Thanks @acanning99 ! We're looking forward to your contribution. Can you give me a date for the initial draft? We won't hold you to it, but it will help us plan our blog publications and keep us from bugging you too often. Thanks

acanning99 commented 10 months ago

Hi David, sorry for the slow response; I was on vacation this week. I was thinking end of Sept, but with all the deadlines for ECP deliverables at the end of Sept, maybe mid Oct would be a more reasonable date.

Andrew

On Sun, Aug 20, 2023 at 1:24 PM David E. Bernholdt @.***> wrote:

Thanks @acanning99 ! We're looking forward to your contribution. Can you give me a date for the initial draft? We won't hold you to it, but it will help us plan our blog publications and keep us from bugging you too often. Thanks


bernhold commented 10 months ago

Thanks Andrew. We'll look forward to something in the mid-October time frame.

bernhold commented 8 months ago

Hi @acanning99, with mid-October fast approaching, I just wanted to check how the article is coming along. Do we need to revise the plan? Thanks

acanning99 commented 8 months ago

Hi David, maybe a more reasonable date is towards the end of October, as everyone still seems fairly busy with ECP stuff. Regards,

Andrew

On Wed, Oct 11, 2023 at 3:15 PM David E. Bernholdt @.***> wrote:

Hi @acanning99, with mid-October fast approaching, I just wanted to check how the article was coming? Do we need to revise the plan? Thanks


bernhold commented 6 months ago

Hi @acanning99 just checking in for an update. Have you started writing yet? Thanks

bernhold commented 5 months ago

Hi @acanning99 wanted to check one more time for any updates on this? Thanks