Is gearshift also a high-level interface to computing fft's?

emmenlau commented 5 years ago

First, two thumbs up for this great work!

I found gearshifft recently and find it useful for offline analysis of the best-performing FFT implementations and devices. But to me, the logical next step would be to unify this analysis with the application. My question: is gearshifft an abstract (high-level) interface to the different FFT implementations? In other words, is gearshifft also a suitable FFT library?

I did not check the code, but was slightly surprised not to find this highlighted in the documentation or paper. It seems only logical that HPC software (with a need for frequent FFTs) would no longer need to access all the different native implementations, but would rather use gearshifft both to run an online benchmark for the requested FFT size and to compute the requested FFT result.

Is this already possible, and/or an intended use? Or do you see obstacles for such a use?

tdd11235813 commented 5 years ago

Thanks a bunch for your feedback, very much appreciated! :) You are right, a generic FFT interface that wraps different libraries (at compile time) is a must-have when you are into heterogeneous programming. gearshifft does not provide such an API, as it focuses on benchmarking round-trip FFTs.
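For readers unfamiliar with the term, a round-trip FFT benchmark times a forward transform followed by the inverse and checks that the input is recovered. The following is only a minimal sketch of that idea in pure Python, not gearshifft code: the naive `dft` stands in for a real FFT backend, and the names `dft` and `round_trip` are hypothetical.

```python
import cmath
import time

def dft(x, sign=-1):
    """Naive O(n^2) DFT (stand-in for a real backend).

    sign=-1 is the forward transform, sign=+1 the unnormalized inverse.
    """
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def round_trip(x):
    """Run forward + inverse; return (elapsed seconds, max abs error vs. input)."""
    n = len(x)
    start = time.perf_counter()
    spectrum = dft(x, sign=-1)
    restored = [v / n for v in dft(spectrum, sign=+1)]  # normalize the inverse
    elapsed = time.perf_counter() - start
    error = max(abs(a - b) for a, b in zip(x, restored))
    return elapsed, error

signal = [complex(i % 7, 0) for i in range(64)]
seconds, error = round_trip(signal)
print(f"round trip of {len(signal)} points: {seconds:.6f} s, max error {error:.2e}")
```

Checking the reconstruction error alongside the timing guards against a backend that is fast only because it is wrong.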

However, there is a C++11 library that tries to abstract FFT libraries: liFFT, which was developed in our team (credits to @Flamefire). We really want to continue its development because we also need such a thing for our applications, like image reconstruction algorithms, which we also implement with alpaka. liFFT provides roughly the least common denominator of the FFT APIs (fftw, cufft[, clfft]). It surely needs some updates, and some transform scenarios cannot be performed efficiently with liFFT at the moment. We are still growing our team to get more C++ power, but of course, good C++ coders do not appear out of the blue ;) If you want to know more about liFFT or gearshifft, just let us know.

emmenlau commented 5 years ago

Dear @tdd11235813, thanks for the quick and helpful response! https://github.com/ComputationalRadiationPhysics/liFFT looks very interesting and I was not aware of it! We were on the brink of starting our own development, but liFFT and gearshifft combine most of what we're looking for.

But two questions:

1. I very much cherish the idea of combining benchmark and execution into one. It would spare us the pain of hard-coding specs of all sorts of devices into our execution; instead we would evaluate the actual execution performance on the client's hardware online. In this scenario we would not execute a full-blown benchmark, but we could quickly identify the order of magnitude of performance of the available hardware devices on a client machine, and dispatch to the fastest (set of) device(s). Isn't this superior to keeping benchmark and execution separate?
2. Maintaining two such closely related projects seems like a bit of maintenance overhead, doesn't it? Would you continue maintaining both gearshifft and liFFT in parallel?
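The probe-then-dispatch idea from the first question can be sketched as follows. This is an illustration only and assumes nothing about gearshifft or liFFT: two pure-Python transforms stand in for real devices or libraries, and `probe`, `Dispatcher`, and `BACKENDS` are hypothetical names. The sketch assumes power-of-two sizes, because the radix-2 stand-in requires them.

```python
import cmath
import time

def naive_dft(x):
    """O(n^2) reference transform (stand-in for a slow backend)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def radix2_fft(x):
    """Recursive Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = radix2_fft(x[0::2])
    odd = radix2_fft(x[1::2])
    half = n // 2
    tw = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(half)]
    return ([even[k] + tw[k] for k in range(half)] +
            [even[k] - tw[k] for k in range(half)])

# Hypothetical registry; in practice each entry would wrap a library or device.
BACKENDS = {"naive_dft": naive_dft, "radix2_fft": radix2_fft}

def probe(backends, n, repeats=3):
    """Quick timing pass: best-of-`repeats` wall time per backend for size n."""
    sample = [complex(i % 5, 0) for i in range(n)]
    timings = {}
    for name, fn in backends.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(sample)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings

class Dispatcher:
    """Probes each new size once, then routes calls to the fastest backend."""

    def __init__(self, backends):
        self.backends = backends
        self.choice = {}  # transform size -> name of the chosen backend

    def fft(self, x):
        n = len(x)
        if n not in self.choice:
            timings = probe(self.backends, n)
            self.choice[n] = min(timings, key=timings.get)
        return self.backends[self.choice[n]](x)

dispatcher = Dispatcher(BACKENDS)
data = [complex(i, 0) for i in range(256)]
result = dispatcher.fft(data)
print("backend chosen for n=256:", dispatcher.choice[256])
```

Caching the decision per size keeps the probing cost to a one-time expense, which only pays off if the application repeats transforms of the same size. A real implementation would also have to account for plan creation and host-device transfer costs, which this sketch ignores.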

A last comment: https://github.com/ComputationalRadiationPhysics/liFFT is currently licensed under LGPL, which makes it a bit more challenging to use in commercial settings than gearshifft. It's not prohibitive, but if you ever think about changing to an even more liberal license, please count in my humble vote for it! :) :)

tdd11235813 commented 5 years ago

You mean gearshifft should use liFFT instead of the raw FFT backends? The liFFT layer is much more complex than the FFT layer in gearshifft. Once we are sure that liFFT is zero-overhead for all backends, and the backends are implemented in liFFT anyway, we can do this and save that maintenance overhead :smiley: I already started to benchmark liFFT with gearshifft, but I needed some additions to make round-trip FFTs efficient with liFFT, and I needed access to options like the fftw rigors (measure, estimate, wisdom, ...).

It is easier to add an FFT backend to gearshifft, including support for library-specific features. liFFT has more use cases to address, so its design is more advanced, while aiming for the least-common-denominator concept. If you want to try out a specific feature in liFFT, you might end up redesigning the whole API. Of course, when you want to benchmark distributed FFTs, FFT batch processing, or FFT callbacks, gearshifft also needs an additional benchmark workflow. Currently, I would still favor gearshifft with the raw backends plus a liFFT backend for benchmarking FFTs, so the benchmark skeleton code would not be part of liFFT itself.

Regarding the license, I guess we can change it, but I need to sync with our devs first. Edit: will go for MPL2 for liFFT

mpicbg-scicomp / gearshifft

Is gearshift also a high-level interface to computing fft's? #130