mpicbg-scicomp / gearshifft

Benchmark Suite for Heterogeneous FFT Implementations
Apache License 2.0

Installation on Fedora 28 #121

Open bkmgit opened 6 years ago

bkmgit commented 6 years ago

Hi,

I am trying to install on a clean system with Fedora 28 and GCC 8.1.1. I have installed cmake 3.10.3, boost 1.67 and fftw 3.3.8 (double precision, shared and static libraries). A makefile is generated by running cmake as follows:

../../cmake-3.10.3install/bin/cmake .. -DFFTW_LIBRARY_DIR=/home/benson/projects/gearshifft/fftw-3.3.8install -DCMAKE_SHARED_LINKER_FLAGS=-lpthread -DBOOST_DIR=/home/benson/projects/gearshifft/boost_1_67_0install -DBOOST_INCLUDEDIR=/home/benson/projects/gearshifft/boost_1_67_0install/include
>> Setting up Release build
>> CXX11_ABI enabled.
CMake Warning at /home/benson/projects/gearshifft/cmake-3.10.3install/share/cmake-3.10/Modules/FindBoost.cmake:801 (message):
  New Boost version may have incorrect or missing dependencies and imported
  targets
Call Stack (most recent call first):
  /home/benson/projects/gearshifft/cmake-3.10.3install/share/cmake-3.10/Modules/FindBoost.cmake:907 (_Boost_COMPONENT_DEPENDENCIES)
  /home/benson/projects/gearshifft/cmake-3.10.3install/share/cmake-3.10/Modules/FindBoost.cmake:1542 (_Boost_MISSING_DEPENDENCIES)
  CMakeLists.txt:111 (find_package)

CMake Warning at /home/benson/projects/gearshifft/cmake-3.10.3install/share/cmake-3.10/Modules/FindBoost.cmake:801 (message):
  New Boost version may have incorrect or missing dependencies and imported
  targets
Call Stack (most recent call first):
  /home/benson/projects/gearshifft/cmake-3.10.3install/share/cmake-3.10/Modules/FindBoost.cmake:907 (_Boost_COMPONENT_DEPENDENCIES)
  /home/benson/projects/gearshifft/cmake-3.10.3install/share/cmake-3.10/Modules/FindBoost.cmake:1542 (_Boost_MISSING_DEPENDENCIES)
  CMakeLists.txt:111 (find_package)

CMake Warning at /home/benson/projects/gearshifft/cmake-3.10.3install/share/cmake-3.10/Modules/FindBoost.cmake:801 (message):
  New Boost version may have incorrect or missing dependencies and imported
  targets
Call Stack (most recent call first):
  /home/benson/projects/gearshifft/cmake-3.10.3install/share/cmake-3.10/Modules/FindBoost.cmake:907 (_Boost_COMPONENT_DEPENDENCIES)
  /home/benson/projects/gearshifft/cmake-3.10.3install/share/cmake-3.10/Modules/FindBoost.cmake:1542 (_Boost_MISSING_DEPENDENCIES)
  CMakeLists.txt:111 (find_package)

CUDA_TOOLKIT_ROOT_DIR not found or specified
-- Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
<< cuFFT benchmark disabled.
-- Could NOT find OpenCL (missing: OpenCL_LIBRARY OpenCL_INCLUDE_DIR)
<< clFFT benchmark disabled.
-- Could NOT find HCFFT (missing: HCFFT_LIBRARIES HCFFT_INCLUDE_DIRS)
<< hcFFT benchmark disabled.
FFTW_SERIAL_STUBLIB empty for fftw3f .. :FFTW_SERIAL_STUBLIB-NOTFOUND:
FFTW_OPENMP_STUBLIB empty for fftw3f .. :FFTW_OPENMP_STUBLIB-NOTFOUND:
FFTW_OPENMP_STUBLIB empty for fftw3f .. :FFTW_THREADS_STUBLIB-NOTFOUND:
FFTW_OPENMP_STUBLIB empty for fftw3 .. :FFTW_THREADS_STUBLIB-NOTFOUND:
++ FindFFTW
++ FFTW_INCLUDES    : /home/benson/projects/gearshifft/fftw-3.3.8install/include
++ FFTW_LIBRARIES   :  fftw3 fftw3_omp
++ FFTW_SERIAL_LIBS : /home/benson/projects/gearshifft/fftw-3.3.8install/lib/libfftw3.so
++ FFTW_THREADS_LIBS:
++ FFTW_OPENMP_LIBS : /home/benson/projects/gearshifft/fftw-3.3.8install/lib/libfftw3_omp.so
++ FFTW_INCLUDE_DIR : /home/benson/projects/gearshifft/fftw-3.3.8install/include
++ FFTW_LIBRARY_DIR : /home/benson/projects/gearshifft/fftw-3.3.8install/lib
>> FFTW -> fftw3;fftw3_omp /home/benson/projects/gearshifft/fftw-3.3.8install/include
-- Configuring done
-- Generating done
-- Build files have been written to: /home/benson/projects/gearshifft/gearshifft/release

After typing make I get the following:

[ 20%] Building CXX object src/CMakeFiles/gearshifft_fftw.dir/benchmark.cpp.o
[ 40%] Building CXX object src/CMakeFiles/gearshifft_fftw.dir/options.cpp.o
[ 60%] Linking CXX executable ../gearshifft_fftw
/usr/bin/ld: CMakeFiles/gearshifft_fftw.dir/benchmark.cpp.o: undefined reference to symbol 'pthread_join@@GLIBC_2.2.5'
//usr/lib64/libpthread.so.0: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make[2]: *** [src/CMakeFiles/gearshifft_fftw.dir/build.make:126: gearshifft_fftw] Error 1
make[1]: *** [CMakeFiles/Makefile2:86: src/CMakeFiles/gearshifft_fftw.dir/all] Error 2
make: *** [Makefile:141: all] Error 2

Any suggestions?

bkmgit commented 6 years ago

It compiled and tests passed with

../../cmake-3.10.3install/bin/cmake .. -DFFTW_LIBRARY_DIR=/home/benson/projects/gearshifft/fftw-3.3.8install -DCMAKE_SHARED_LINKER_FLAGS="-lfftw3 -lfftw3f -L/home/benson/projects/gearshifft/boost_1_67_0install/lib -lboost_program_options -L/lib64 -lpthread -lm" -DBOOST_DIR=/home/benson/projects/gearshifft/boost_1_67_0install -DBOOST_INCLUDEDIR=/home/benson/projects/gearshifft/boost_1_67_0install/include -DBoost_ADDITIONAL_VERSIONS=1.67.0 -DCMAKE_INSTALL_PREFIX=/home/benson/projects/gearshifft/gearshifftinstall -DCMAKE_EXE_LINKER_FLAGS="-lfftw3 -lfftw3f -L/home/benson/projects/gearshifft/boost_1_67_0install/lib -lboost_program_options -L/lib64 -lpthread -lm"

Single precision seems to be needed.

bkmgit commented 6 years ago

It may be helpful to indicate that example configuration files are in share/gearshifft.

tdd11235813 commented 6 years ago

Thanks for your feedback. I will look into the build process with your specs in the next few days. For now, I can confirm that fftwf (single precision) is required, as in src/benchmark.cpp

using Precisions = List<float, double>;

both the fftw and fftwf backends are loaded. As the combined fftw+fftwf build process can be error-prone, an option for selecting a specific precision would be nice to have. As for the example configurations, I will look into placing some hints about share/gearshifft in the program and the docs.
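A minimal sketch of how such a precision option could look at compile time. The `List` template, the `PrecisionName` helper and the `SKETCH_SINGLE_ONLY` / `SKETCH_DOUBLE_ONLY` switches are assumptions made for illustration; they are not taken from gearshifft's sources.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a variadic type list; the real List in
// gearshifft may be defined differently.
template <typename... Ts>
struct List {};

// Maps a precision type to a printable name.
template <typename T> struct PrecisionName;
template <> struct PrecisionName<float>  { static const char* value() { return "float"; } };
template <> struct PrecisionName<double> { static const char* value() { return "double"; } };

// Assumed build switches (not existing gearshifft options): define one of
// them on the compiler command line to benchmark a single precision only.
#if defined(SKETCH_SINGLE_ONLY)
using Precisions = List<float>;
#elif defined(SKETCH_DOUBLE_ONLY)
using Precisions = List<double>;
#else
using Precisions = List<float, double>;
#endif

// Returns the names of all precisions in the list, in declaration order.
template <typename... Ts>
std::vector<std::string> precision_names(List<Ts...>) {
    return { std::string(PrecisionName<Ts>::value())... };
}
```

With neither switch defined, `precision_names(Precisions{})` yields both entries, so only the matching FFTW variant would have to be linked when one switch is set.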

tdd11235813 commented 6 years ago

I could reproduce this issue and updated the CMake config in PR #122. The README has been updated as well and now points to the FFT extent configuration files. However, the program cannot know about that location, since it can change when gearshifft is installed with a custom install prefix for the configuration files. The README also includes guides for compiling Boost and FFTW from scratch and for letting cmake know about the paths.

bkmgit commented 6 years ago

Thanks for the updated installation guides. You may want the compilation flags for FFTW to be included in submissions as optional further information. Do you expect to allow for interoperability with Fortran FFT libraries, for example Ooura FFT, FFTE, fftpack, pfft, KISS FFT or Glassman FFT? Any suggestions for clFFT and hcFFT? One issue here might be clang and gcc interoperability.

psteinb commented 6 years ago

clFFT and hcFFT/rocFFT are on our radar. We indeed haven't looked too much into the Fortran libraries yet, as nobody around us uses them. Let's keep in mind how to propagate library flags. Btw, can you give an example of those?

bkmgit commented 6 years ago

Ok. Got clfft version to build. Main steps I followed:

a) Build boost using clang compiler
b) Build fftw3.3.8 using clang compiler
c) Install opencl from Fedora repository, beignet and beignet-devel
d) export PKG_CONFIG_PATH and LD_LIBRARY_PATH to also have own built versions of fftw and boost libraries
e) export CMAKE_PREFIX_PATH=/home/benson/projects/gearshifft/fftw-3.3.8-clang-install:/home/benson/projects/gearshifft/boost_1_67_0_clang_install:/home/benson/projects/gearshifft/boost_1_67_0_clang_install/lib
f) build gearshifft:

../../cmake-3.10.3install/bin/cmake .. -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DFFTW_LIBRARY_DIR="/home/benson/projects/gearshifft/fftw-3.3.8-clang-install/lib" -DCMAKE_SHARED_LINKER_FLAGS="-L/home/benson/projects/gearshift/fftw3-3.3.8-clang-install/lib -lfftw3 -lfftw3f -L/home/benson/projects/gearshifft/boost_1_67_0_clang_install/lib -lboost_program_options -L/lib64 -lpthread -lm" -DBOOST_DIR=/home/benson/projects/gearshifft/boost_1_67_0_clang_install -DBOOST_INCLUDEDIR=/home/benson/projects/gearshifft/boost_1_67_0_clang_install/include -DBoost_ADDITIONAL_VERSIONS=1.67.0 -DCMAKE_INSTALL_PREFIX=/home/benson/projects/gearshifft/gearshifftinstall2 -DCMAKE_EXE_LINKER_FLAGS="-L/home/benson/projects/gearshift/fftw3-3.3.8-clang-install/lib -lfftw3 -lfftw3f -L/home/benson/projects/gearshifft/boost_1_67_0_clang_install/lib -lboost_program_options -L/lib64 -lpthread -lm" -D_CLFFT_INCLUDE_DIRS="/home/benson/projects/gearshifft/clfftinstall/include" -DOpenCL_INCLUDE_DIR="/usr/include" -DOpenCL_LIBRARY="/usr/lib64/libOpenCL.so.1" -D_CLFFT_LIBRARY="/home/benson/projects/gearshifft/clfftinstall/lib64/libclFFT.so" -DCLFFT_ROOT_DIR="/home/benson/projects/gearshifft/clfftinstall" -DFFTW_INCLUDES="/home/benson/projects/gearshifft/fftw-3.3.8-clang-install/include" -DFFTW_LIBRARIES=/home/benson/projects/gearshifft/fftw-3.3.8-clang-install/lib -DCMAKE_BUILD_TYPE=Release -DGEARSHIFFT_INSTALL_CONFIG_PATH=/home/benson/projects/gearshifft/gearshifftinstall2/share/gearshifft

bkmgit commented 6 years ago

Examples of flags

1) Boost

./bootstrap.sh --with-toolset=clang --prefix=/home/benson/projects/gearshifft/boost_1_67_0_clang_install --with-libraries=program_options,filesystem,system,test
./b2 install --variant=release

2) FFTW single precision

./configure CC=/usr/bin/clang --enable-static --enable-shared --enable-avx2 --enable-sse2 --prefix=/home/benson/projects/gearshifft/fftw-3.3.8-clang-install --enable-float --disable-fortran

3) FFTW double precision

./configure CC=/usr/bin/clang --enable-static --enable-shared --enable-avx2 --enable-sse2 --prefix=/home/benson/projects/gearshifft/fftw-3.3.8-clang-install --disable-fortran

bkmgit commented 6 years ago

Have you done any tests with rocFFT or hcFFT? If so, any configuration suggestions? Have you tried anything on ARM?

tdd11235813 commented 6 years ago

Thanks for all the config information, I will look into this in the next few days. I am only familiar with cuFFT, clFFT and FFTW so far. I also tested FFTW flags two years ago (--enable-avx2, --enable-sse2, ..) and ended up with a single --enable-sse2 (on Haswell), as other combinations were slower. I just want to say that even there, there is room for experimenting. Regarding the results submissions: yes, it would be helpful to have information about the compile flags for the corresponding FFT lib. The only way is to provide an additional file describing the compile procedure, maybe a README.md file: results/bkmgit-intel-i5-6200/README.md. What do you think? I can incorporate new compile procedures into the main README.md later on.

psteinb commented 6 years ago

I am planning on testing rocFFT or hcFFT soon. I need to upgrade my rocm installation first and I only have an 'oldish' R9 Nano. But it should be enough for getting the hang of the software. I am in contact with an ARM integrator, but have not yet succeeded in getting my hands on a system.

I personally agree that we should look into configuration issues. However, including that in the results file is tricky: we would have to forward this information from cmake into the code (which is possible), but I am wondering how easy it is to get the configure flags from any given FFT library into cmake. For now, a note in a dedicated README.md appears most feasible to me.
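For the forwarding half of this, a rough sketch of what "from cmake into the code" could mean: the build system passes a string as a preprocessor definition and the program writes it next to the results. The macro name `SKETCH_FFT_BUILD_FLAGS` and both functions are hypothetical and only illustrate the idea. Obtaining the flags from the FFT library in the first place is not covered.

```cpp
#include <string>

// Assumed macro name; a build system could define it, for example with
// -DSKETCH_FFT_BUILD_FLAGS="\"--enable-sse2\"" on the compiler command line.
#ifndef SKETCH_FFT_BUILD_FLAGS
#define SKETCH_FFT_BUILD_FLAGS "unknown"
#endif

// Returns the flags recorded at compile time, or "unknown" if none were given.
std::string fft_build_flags() {
    return SKETCH_FFT_BUILD_FLAGS;
}

// Formats a comment line that could precede the benchmark results.
std::string build_info_line(const std::string& library) {
    return "; " + library + " configure flags: " + fft_build_flags();
}
```

When the definition is missing, the fallback keeps the output well-formed and makes the gap visible in the results.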

bkmgit commented 6 years ago

Reproducibility is a big issue in high performance computing. Any progress on this would be great. The following may be relevant:

psteinb commented 6 years ago

We know of the 2 software packages you mentioned. gearshifft is already on spack:

$ ./bin/spack list gearshifft
==> 1 packages.
gearshifft

I am not sure we should support popper from within gearshifft. I was just saying that inferring the build flags of external libraries, either from the gearshifft C++ code or from the cmake build system, is far from trivial. Libraries come with all variants of reproducibility support: from none, to a libname_config, to internal variables which yield a string with the configure flags.

bkmgit commented 6 years ago

Added a README.md here. Still not fully reproducible, but hopefully enough. Let me know what further information would be helpful. Expect to add a few more results on AMD hardware.

bkmgit commented 6 years ago

Making FFTW optional would be great. Adding a simple default C/Fortran library that does not require a lengthy installation procedure would be very helpful, even if one can only compare FFTs for a few transform sizes.

psteinb commented 6 years ago

Which C/Fortran library would you suggest?

psteinb commented 6 years ago

AMD hardware would be helpful to have. I only have access to a Ryzen Threadripper as of recently. But I think EPYC measurements would be lovely to have ... not to speak of VEGA numbers under ROCm. Volunteers?

bkmgit commented 6 years ago

Kiss FFT, FFTE, Glassman FFT or Ooura's FFT routines. What do you want the default to provide? Rough accuracy test for a variety of problem sizes? Are factors of 2,3,5 and 7 enough? Is comparison to 1D enough, or are 2D and 3D also helpful? For default, probably a C/C++ code is best. One can also add option of measuring Fortran codes, but think this should be optional to minimize setup complexity.
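To make the question about factors concrete, here is a small sketch that lists transform lengths whose only prime factors are 2, 3, 5 and 7. Both function names are made up for this example, and the choice of range is left to the caller.

```cpp
#include <cstddef>
#include <vector>

// True if n has no prime factors other than 2, 3, 5 and 7.
bool has_only_small_factors(std::size_t n) {
    if (n == 0) return false;
    const std::size_t primes[] = {2, 3, 5, 7};
    for (std::size_t p : primes) {
        while (n % p == 0) n /= p;
    }
    return n == 1;
}

// Lists all such sizes in [lo, hi], e.g. as candidate 1D extents.
std::vector<std::size_t> small_factor_sizes(std::size_t lo, std::size_t hi) {
    std::vector<std::size_t> sizes;
    for (std::size_t n = lo; n <= hi; ++n) {
        if (has_only_small_factors(n)) sizes.push_back(n);
    }
    return sizes;
}
```

For 2D and 3D tests, the same list could be reused per dimension.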

tdd11235813 commented 6 years ago

I am completely open to other FFT libraries. It should at least be an FFT library that is used by the community. I cannot judge this for the libraries mentioned above.

I also thought about the installation procedure, which could be automated via cmake routines. The user could choose between an already installed FFTW and an automatic installation during the make process. Files are downloaded and built into a directory ext/, and the import targets for cmake will be set properly, as it knows all the paths and flags. Btw, the requirement for single and double precision benchmarks will be relaxed to single and/or double precision, so one FFTW precision is ok. The clFFT back-end will perhaps be only semi-automatic. The point is, FFTW could be the default one, too, since download and build can be done by cmake. If FFTW is not ok, then another, more streamlined FFT library can be taken as the default, still with an automatic install routine in the background via a cmake script. What do you think?

tdd11235813 commented 6 years ago

@psteinb I can play around with a vega64 GPU, but I am not sure when time will allow it. Btw, it might be a simple task to include ROCm's rocFFT, as there is a HIP interface. So just some copying and renaming of the cufft routines, and it should go :) Then we could also see if rocFFT/ROCm is producing correct results ;) No access to an EPYC platform though.

bkmgit commented 6 years ago

The benchmark code is nice and simple, so with an embedded FFT routine, compilation could be done with a simple makefile - this would allow for compile and run without much configuration. Not ideal for speed, but would greatly enhance adoption. FFTW performance depends on compilation options as well. Making boost optional would be great as well - on many systems, only the headers are set up.

Great on vega64 GPU. clFFT and MESA-OpenCL on Fedora 28 seems to work, though am getting a few accuracy errors on a Polaris GPU.

tdd11235813 commented 6 years ago

Boost is mandatory for gearshifft, but it can also be part of the cmake install routine if it cannot be found on the system. An FFT library is integrated into gearshifft by implementing the routines for plan creation, data transfer and plan execution. I am not sure what you mean by a simple makefile and an embedded FFT routine. Do you mean out of the box, outside of the gearshifft tests?
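As a rough illustration of those three steps, here is a sketch built around a direct O(n^2) DFT. The class and its method names are hypothetical; they do not reproduce gearshifft's actual client interface, and the "device" is simply host memory.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical client; method names are illustrative only and do not
// reproduce gearshifft's actual client interface.
class NaiveDftClient {
public:
    using Complex = std::complex<double>;

    // Plan creation: precompute the n twiddle factors exp(-2*pi*i*k/n).
    void create_plan(std::size_t n) {
        const double pi = std::acos(-1.0);
        twiddles_.resize(n);
        for (std::size_t k = 0; k < n; ++k) {
            const double angle =
                -2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
            twiddles_[k] = std::polar(1.0, angle);
        }
    }

    // Data transfer to the "device" (here just a host-side copy).
    void upload(const std::vector<Complex>& input) {
        if (input.size() != twiddles_.size()) {
            throw std::invalid_argument("input length does not match plan");
        }
        buffer_ = input;
    }

    // Plan execution: direct O(n^2) forward DFT on the uploaded buffer.
    void execute() {
        const std::size_t n = twiddles_.size();
        if (buffer_.size() != n) {
            throw std::logic_error("no matching data uploaded");
        }
        std::vector<Complex> out(n);
        for (std::size_t k = 0; k < n; ++k) {
            for (std::size_t j = 0; j < n; ++j) {
                out[k] += buffer_[j] * twiddles_[(j * k) % n];
            }
        }
        buffer_.swap(out);
    }

    // Data transfer back to the caller.
    std::vector<Complex> download() const { return buffer_; }

private:
    std::vector<Complex> twiddles_;
    std::vector<Complex> buffer_;
};
```

A unit impulse of length 4 should come back as four ones, which gives a quick sanity check without any external FFT library.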

At the moment I would tend to provide a separate cmake-based setup of FFTW to allow a test scenario that is independent of installed FFT libs. This includes a test case where gearshifft is not involved. This is already implemented, see here. So you can check whether everything is running. Afterwards you can go ahead with real benchmarks of real FFT libs :)

bkmgit commented 6 years ago

Boost is used mostly to check precision, for which one can write a simple wrapper that gives most of the functionality and have reduced error reporting - as an example clFFT can use googletest and FFTW to verify correctness, but one need not install these to get clFFT to work. The netlib blas and lapack routines require a fortran compiler that is relatively close to the standard to be installed and run tests. This makes it easy to get a basic but not optimal installation working.

It would also be helpful for tolerance for errors in fft to vary with precision, usually expect close to machine precision for exact FFT routines - at the moment the default is 4 digits, which is ok for half precision, but will not give warnings for single and double precision.

bkmgit commented 6 years ago

Accuracy errors on Polaris GPU are from certain transforms not being implemented in clFFT. Those that are implemented seem to pass. Seems like there is some FFT implementation to do.

tdd11235813 commented 6 years ago

> Boost is used mostly to check precision, for which one can write a simple wrapper that gives most of the functionality and have reduced error reporting - as an example clFFT can use googletest and FFTW to verify correctness, but one need not install these to get clFFT to work.

I have trouble understanding what you want to test or what you want to see in gearshifft. Do you want to test that gearshifft is running, or the FFT library itself? As the libraries mostly have their own test suites, testing the FFT libs outside of gearshifft is not a must-have for me. The whole gearshifft benchmark framework is based on boost, e.g., the test tree infrastructure, the metaprogramming structures and so on. gearshifft does not work without boost.

> The netlib blas and lapack routines require a fortran compiler that is relatively close to the standard to be installed and run tests. This makes it easy to get a basic but not optimal installation working.

The basic installation for gearshifft requires at least a C++14 compiler, Boost and cmake, so this is what should be tested. cmake binaries can be downloaded directly and used without root privileges. Boost can be downloaded either manually or during cmake build system preparation. The C++14 compiler, however, is expected to be provided by the system.

> It would also be helpful for tolerance for errors in fft to vary with precision, usually expect close to machine precision for exact FFT routines - at the moment the default is 4 digits, which is ok for half precision, but will not give warnings for single and double precision.

The precision can be configured via cmake. Call ccmake . in your build directory to see the options. Values lower than zero make the program set the error margin depending on the precision type of the FFT. I should probably set the default value to -1 to get the dynamic behavior.
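A sketch of that dynamic behavior, under assumptions: the function name and the factor of 1000 times machine epsilon are chosen only for illustration and are not gearshifft's actual rule.

```cpp
#include <limits>

// Sketch only: the fallback factor is an assumption, not gearshifft's rule.
template <typename T>
double error_bound(double configured) {
    if (configured >= 0.0) {
        return configured;  // explicit user-supplied bound
    }
    // Negative value: derive the bound from the precision type.
    const double assumed_factor = 1000.0;
    return assumed_factor *
           static_cast<double>(std::numeric_limits<T>::epsilon());
}
```

With a negative setting, `error_bound<double>(-1.0)` comes out far tighter than `error_bound<float>(-1.0)`, which is the precision-dependent tolerance requested above.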

bkmgit commented 6 years ago

To collect data and compare different implementations, it is nice to have a reproducible setup. This includes:

a) Some default transform sizes, upper bound can be increased, but the default sizes should be specified
b) Some default error tolerances
c) A standardized format for reporting results
d) Some way of describing the machine and setup (this is more challenging)
e) A description for how to integrate new FFT libraries into the test suite

Gearshifft seems to provide most of these, but does not have a default easy setup configuration. Downloading dependencies from internet is overkill for this, as I think most of the functionality can be programmed directly. The FFT is also of importance on embedded devices and IoT platforms, for which memory and processor speed constraints may warrant a lighter package with minimal dependencies, see for example here, here, here and here

tdd11235813 commented 6 years ago

I will think about it, but Boost will remain a dependency. Do you have experience with embedded devices and compiling code (plus things like boost) for them? I will look for some numbers or issues on that to estimate the limits. I thought Boost was modular enough even for such purposes. Integrating a lightweight FFT library is on the ToDo list.

bkmgit commented 6 years ago

It would be nice if it could run on an atmel atmega chip without too much modification, but expect this may be too much to ask. Making it easy to run on a phone with minor modifications would be great as well.

psteinb commented 6 years ago

Just had a look at this chip. It appears to be very powerful. What is the application you are after? From the docs, I couldn't work out what the development environment around it is. LLVM and the like now also have RISC backends, but I am unclear whether people already use this in production. If clang can build it, using boost shouldn't be a big deal. But I am not an expert... just guessing here.