sbrisard / janus

Discretization of the Lippmann--Schwinger equation with periodic boundary conditions
BSD 3-Clause "New" or "Revised" License

Do not link `fft._serial` against `libfftw3_mpi` #22

Closed sbrisard closed 4 years ago

sbrisard commented 5 years ago

At present, `fft._serial` is linked against `libfftw3_mpi`. We have found that this causes issues on some platforms, e.g. macOS with Homebrew.

sbrisard commented 4 years ago

The problem comes from the installation procedure. Indeed, the libraries option in the [build_ext] section of the setup.cfg file must be set to

libraries = fftw3, fftw3_mpi

for the parallel implementation of Janus. The -lfftw3_mpi option is then passed to the compiler for every single extension, which is a bug. In fact, every single binary is linked against fftw, even in purely serial setups. The solution is to introduce a more granular setup.cfg, with two new sections

[fftw]
include_dirs = /path/to/headers
library_dirs = /path/to/binaries
libraries = fftw3
[fftw_mpi]
include_dirs = /path/to/headers
library_dirs = /path/to/binaries
libraries = fftw3_mpi

Compilation with MPI will occur only if the [fftw_mpi] section is present.
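The per-section lookup described above can be sketched in Python as follows. This is a minimal illustration, not the code actually used in Janus's setup.py: the helper name `extension_options` is hypothetical, and the paths are the placeholders from the example above. It only shows how the presence of an [fftw_mpi] section could decide whether the parallel extension gets built, and how each extension would receive its own list of libraries.

```python
import configparser

# Same layout as the granular setup.cfg proposed above (placeholder paths).
SETUP_CFG = """
[fftw]
include_dirs = /path/to/headers
library_dirs = /path/to/binaries
libraries = fftw3

[fftw_mpi]
include_dirs = /path/to/headers
library_dirs = /path/to/binaries
libraries = fftw3_mpi
"""


def extension_options(config, section):
    """Return per-extension build options for `section`, or None if absent.

    Hypothetical helper: comma-separated values are split into lists, which
    is the form expected by the `include_dirs`, `library_dirs` and
    `libraries` arguments of an Extension.
    """
    if not config.has_section(section):
        return None
    options = {}
    for key in ("include_dirs", "library_dirs", "libraries"):
        raw = config.get(section, key, fallback="")
        options[key] = [item.strip() for item in raw.split(",") if item.strip()]
    return options


config = configparser.ConfigParser()
config.read_string(SETUP_CFG)

serial_options = extension_options(config, "fftw")
parallel_options = extension_options(config, "fftw_mpi")

# The parallel extension is only built when [fftw_mpi] is present.
build_parallel = parallel_options is not None
```

With this layout, extensions that need neither section (such as checkarray) would be given no extra libraries at all.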

mrambausek commented 4 years ago

Unfortunately, the proposed "more granular" setup.cfg does not seem to fix this issue on CentOS 7. Compilation works, but the tests fail due to an import error reporting "undefined symbol: fftw_mpi_local_size_2d". Moreover, for "libraries = fftw3, fftw3_mpi" to work, I also had to pass "-lmpi" to the serial version for the tests to run. This could be rooted in the way Ubuntu's fftw packages are linked.

sbrisard commented 4 years ago

Thanks for this report! Indeed, I only have Ubuntu and Windows boxes... However, I noticed a bug two weeks ago and did not take time to fix it. Could you please check out the latest version and try and reinstall, with the exact installation procedure given in the docs? Thanks!

EDIT: please use the following setup.cfg for the time being (do NOT place fftw3_mpi in the [fftw] section, as this might be the cause of your troubles).

[fftw]
libraries = fftw3
[fftw_mpi]
libraries = fftw3, fftw3_mpi

On a Linux box, you should not need to define include_dirs and library_dirs. Also note that mpi4py must be properly installed for the MPI version to be installed, too.

mrambausek commented 4 years ago

Thank you for the fast reply. I tried with the updated code and the suggested setup.cfg. At this point I should mention that I have been trying things on our cluster for now, but this should not change a lot. As far as I can tell, this boils down to setting include and library paths, which is okay for now. However, with the suggested instructions everything builds, but when running the tests I still encounter "undefined symbol: fftw_mpi_local_size_2d". I compared the gcc command lines printed during the build. They are identical to those of my working approach (see above). I am not an expert in Python setup matters. Are there other files produced by setup.py that I can have a look at?

sbrisard commented 4 years ago

Hi, could you run python setup.py clean (or python3 setup.py clean), then rebuild completely and post the whole output:

python setup.py develop --user > build.log

That would help me a lot. I am sorry that the install is not so smooth, but your contribution is extremely valuable.

mrambausek commented 4 years ago

I am happy to help here. I understand that build/installation across platforms is not trivial.

'----------------------------------------------------------------------------------------------------------------

Compiling janus/utils/checkarray.pyx because it changed.
Compiling janus/operators.pyx because it changed.
Compiling janus/material/elastic/linear/isotropic.pyx because it changed.
Compiling janus/green.pyx because it changed.
Compiling janus/fft/serial/_serial_fft.pyx because it changed.
Compiling janus/fft/parallel/_parallel_fft.pyx because it changed.
[1/6] Cythonizing janus/fft/parallel/_parallel_fft.pyx
[2/6] Cythonizing janus/fft/serial/_serial_fft.pyx
[3/6] Cythonizing janus/green.pyx
[4/6] Cythonizing janus/material/elastic/linear/isotropic.pyx
[5/6] Cythonizing janus/operators.pyx
[6/6] Cythonizing janus/utils/checkarray.pyx
running develop
running egg_info
writing Janus.egg-info/PKG-INFO
writing dependency_links to Janus.egg-info/dependency_links.txt
writing top-level names to Janus.egg-info/top_level.txt
reading manifest file 'Janus.egg-info/SOURCES.txt'
writing manifest file 'Janus.egg-info/SOURCES.txt'
running build_ext
building 'janus.utils.checkarray' extension
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/janus
creating build/temp.linux-x86_64-3.7/janus/utils
gcc -pthread -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include -fPIC -I/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/include/python3.7m -c janus/utils/checkarray.c -o build/temp.linux-x86_64-3.7/janus/utils/checkarray.o
gcc -pthread -shared -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -L/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,-rpath=/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,--no-as-needed -Wl,--sysroot=/ -L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include build/temp.linux-x86_64-3.7/janus/utils/checkarray.o -o build/lib.linux-x86_64-3.7/janus/utils/checkarray.cpython-37m-x86_64-linux-gnu.so
building 'janus.operators' extension
gcc -pthread -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include -fPIC -I/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/include/python3.7m -c janus/operators.c -o build/temp.linux-x86_64-3.7/janus/operators.o
gcc -pthread -shared -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -L/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,-rpath=/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,--no-as-needed -Wl,--sysroot=/ -L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include build/temp.linux-x86_64-3.7/janus/operators.o -o build/lib.linux-x86_64-3.7/janus/operators.cpython-37m-x86_64-linux-gnu.so
building 'janus.material.elastic.linear.isotropic' extension
creating build/temp.linux-x86_64-3.7/janus/material
creating build/temp.linux-x86_64-3.7/janus/material/elastic
creating build/temp.linux-x86_64-3.7/janus/material/elastic/linear
gcc -pthread -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include -fPIC -I/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/include/python3.7m -c janus/material/elastic/linear/isotropic.c -o build/temp.linux-x86_64-3.7/janus/material/elastic/linear/isotropic.o
gcc -pthread -shared -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -L/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,-rpath=/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,--no-as-needed -Wl,--sysroot=/ -L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include build/temp.linux-x86_64-3.7/janus/material/elastic/linear/isotropic.o -o build/lib.linux-x86_64-3.7/janus/material/elastic/linear/isotropic.cpython-37m-x86_64-linux-gnu.so
building 'janus.green' extension
gcc -pthread -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include -fPIC -I/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/include/python3.7m -c janus/green.c -o build/temp.linux-x86_64-3.7/janus/green.o
gcc -pthread -shared -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -L/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,-rpath=/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,--no-as-needed -Wl,--sysroot=/ -L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include build/temp.linux-x86_64-3.7/janus/green.o -lfftw3 -o build/lib.linux-x86_64-3.7/janus/green.cpython-37m-x86_64-linux-gnu.so
building 'janus.fft.serial._serial_fft' extension
creating build/temp.linux-x86_64-3.7/janus/fft
creating build/temp.linux-x86_64-3.7/janus/fft/serial
gcc -pthread -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include -fPIC -I/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/include/python3.7m -c janus/fft/serial/_serial_fft.c -o build/temp.linux-x86_64-3.7/janus/fft/serial/_serial_fft.o
gcc -pthread -shared -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -L/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,-rpath=/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,--no-as-needed -Wl,--sysroot=/ -L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include build/temp.linux-x86_64-3.7/janus/fft/serial/_serial_fft.o -lfftw3 -o build/lib.linux-x86_64-3.7/janus/fft/serial/_serial_fft.cpython-37m-x86_64-linux-gnu.so
building 'janus.fft.parallel._parallel_fft' extension
creating build/temp.linux-x86_64-3.7/janus/fft/parallel
gcc -pthread -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include -fPIC -I./janus/fft/parallel -I/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/include -I/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib/python3.7/site-packages/mpi4py/include -I/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/include/python3.7m -c janus/fft/parallel/_parallel_fft.c -o build/temp.linux-x86_64-3.7/janus/fft/parallel/_parallel_fft.o -pthread
gcc -pthread -shared -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -L/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,-rpath=/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,--no-as-needed -Wl,--sysroot=/ -L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include build/temp.linux-x86_64-3.7/janus/fft/parallel/_parallel_fft.o -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -o build/lib.linux-x86_64-3.7/janus/fft/parallel/_parallel_fft.cpython-37m-x86_64-linux-gnu.so -Wl,-rpath -Wl,/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -Wl,--enable-new-dtags -lmpi -pthread
copying build/lib.linux-x86_64-3.7/janus/utils/checkarray.cpython-37m-x86_64-linux-gnu.so -> janus/utils
copying build/lib.linux-x86_64-3.7/janus/operators.cpython-37m-x86_64-linux-gnu.so -> janus
copying build/lib.linux-x86_64-3.7/janus/material/elastic/linear/isotropic.cpython-37m-x86_64-linux-gnu.so -> janus/material/elastic/linear
copying build/lib.linux-x86_64-3.7/janus/green.cpython-37m-x86_64-linux-gnu.so -> janus
copying build/lib.linux-x86_64-3.7/janus/fft/serial/_serial_fft.cpython-37m-x86_64-linux-gnu.so -> janus/fft/serial
copying build/lib.linux-x86_64-3.7/janus/fft/parallel/_parallel_fft.cpython-37m-x86_64-linux-gnu.so -> janus/fft/parallel
Creating /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib/python3.7/site-packages/Janus.egg-link (link to .)
Adding Janus 0.1 to easy-install.pth file

Installed /tmp/build-janus-JANUS-test-build/janus
Processing dependencies for Janus==0.1
Finished processing dependencies for Janus==0.1

mrambausek commented 4 years ago

The test output

'-------------------------------------------------------------------------------------------------

running test
running egg_info
writing Janus.egg-info/PKG-INFO
writing dependency_links to Janus.egg-info/dependency_links.txt
writing top-level names to Janus.egg-info/top_level.txt
reading manifest file 'Janus.egg-info/SOURCES.txt'
writing manifest file 'Janus.egg-info/SOURCES.txt'
running build_ext
copying build/lib.linux-x86_64-3.7/janus/utils/checkarray.cpython-37m-x86_64-linux-gnu.so -> janus/utils
copying build/lib.linux-x86_64-3.7/janus/operators.cpython-37m-x86_64-linux-gnu.so -> janus
copying build/lib.linux-x86_64-3.7/janus/material/elastic/linear/isotropic.cpython-37m-x86_64-linux-gnu.so -> janus/material/elastic/linear
copying build/lib.linux-x86_64-3.7/janus/green.cpython-37m-x86_64-linux-gnu.so -> janus
copying build/lib.linux-x86_64-3.7/janus/fft/serial/_serial_fft.cpython-37m-x86_64-linux-gnu.so -> janus/fft/serial
copying build/lib.linux-x86_64-3.7/janus/fft/parallel/_parallel_fft.cpython-37m-x86_64-linux-gnu.so -> janus/fft/parallel
parallel (unittest.loader._FailedTest) ... ERROR
janus.fft.parallel (unittest.loader._FailedTest) ... ERROR

======================================================================
ERROR: parallel (unittest.loader._FailedTest)

ImportError: Failed to import test module: parallel
Traceback (most recent call last):
  File "/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib/python3.7/unittest/loader.py", line 154, in loadTestsFromName
    module = __import__(module_name)
  File "/tmp/build-janus-JANUS-test-build/janus/janus/fft/parallel/__init__.py", line 7, in <module>
    from ._parallel_fft import create_real
ImportError: /tmp/build-janus-JANUS-test-build/janus/janus/fft/parallel/_parallel_fft.cpython-37m-x86_64-linux-gnu.so: undefined symbol: fftw_mpi_local_size_2d

======================================================================
ERROR: janus.fft.parallel (unittest.loader._FailedTest)

ImportError: Failed to import test module: janus.fft.parallel
Traceback (most recent call last):
  File "/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib/python3.7/unittest/loader.py", line 470, in _find_test_path
    package = self._get_module_from_name(name)
  File "/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib/python3.7/unittest/loader.py", line 377, in _get_module_from_name
    __import__(name)
  File "/tmp/build-janus-JANUS-test-build/janus/janus/fft/parallel/__init__.py", line 7, in <module>
    from ._parallel_fft import create_real
ImportError: /tmp/build-janus-JANUS-test-build/janus/janus/fft/parallel/_parallel_fft.cpython-37m-x86_64-linux-gnu.so: undefined symbol: fftw_mpi_local_size_2d

Ran 2 tests in 0.000s

FAILED (errors=2)
Test failed:
error: Test failed:

sbrisard commented 4 years ago

Could you please post the command line that triggered these error messages (in the tests)? Thanks.

sbrisard commented 4 years ago

Could it be that fftw_mpi is not on your LD_LIBRARY_PATH (or equivalent)? See ldconfig
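As a rough first check of that question, the standard library can report which shared libraries Python's own lookup is able to locate. This is only a sketch: the helper name `report_libraries` is hypothetical, and `ctypes.util.find_library` uses its own search strategy, so its answer does not necessarily match what the compiler sees at link time or what the dynamic loader does at import time.

```python
from ctypes.util import find_library


def report_libraries(names=("fftw3", "fftw3_mpi", "mpi")):
    """Map each library name to the file name found by Python's lookup.

    A value of None means that find_library could not locate the library.
    """
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in report_libraries().items():
        print(f"{name}: {found or 'not found'}")
```

A library that is found here can still be missing from the link command of a given extension, which is a separate problem from the search path.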

sbrisard commented 4 years ago

Hey, I'm sorry to ask this silly question, but are you sure that you checked out the latest version of Janus? The reason I am asking is that the compiler options are very weird in the output you sent (which was extremely useful, thanks!). See here, for checkarray, which is a very simple C extension, with no dependencies other than Cython

building 'janus.utils.checkarray' extension
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/janus
creating build/temp.linux-x86_64-3.7/janus/utils
gcc -pthread -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include -fPIC -I/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/include/python3.7m -c janus/utils/checkarray.c -o build/temp.linux-x86_64-3.7/janus/utils/checkarray.o

The -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include option should not be there. Did you hard-code it? Also

gcc -pthread -shared -B /var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/compiler_compat -L/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,-rpath=/var/tmp/JANUS-JANUS-test-build/miniconda/envs/janus/lib -Wl,--no-as-needed -Wl,--sysroot=/ -L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi -I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include build/temp.linux-x86_64-3.7/janus/utils/checkarray.o -o build/lib.linux-x86_64-3.7/janus/utils/checkarray.cpython-37m-x86_64-linux-gnu.so

the option -L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib should not be there, either. In the latest version of Janus, checkarray is indeed defined as follows

    utils = Extension('janus.utils.checkarray', sources=['janus/utils/checkarray.pyx'])

Note that there are no include_dirs, library_dirs or libraries arguments, which means that this module should be built with standard paths only.
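The same kind of inspection can be applied to the link commands in the log posted above. The sketch below uses a hypothetical helper, `linked_libraries`, and two link commands abbreviated from that log (most path and -Wl options are dropped for readability). It lists the libraries requested with -l for the serial and the parallel FFT extensions. In the posted log, the link line for _parallel_fft carries -lmpi but no -lfftw3_mpi, which would be consistent with the "undefined symbol: fftw_mpi_local_size_2d" import error.

```python
import shlex


def linked_libraries(command):
    """Return the library names passed as -l<name> on a compiler command line."""
    return [token[2:] for token in shlex.split(command)
            if token.startswith("-l") and len(token) > 2]


# Abbreviated from the build log above; only options relevant to linking are kept.
SERIAL_LINK = (
    "gcc -pthread -shared "
    "-L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib "
    "-L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi "
    "build/temp.linux-x86_64-3.7/janus/fft/serial/_serial_fft.o -lfftw3 "
    "-o build/lib.linux-x86_64-3.7/janus/fft/serial/"
    "_serial_fft.cpython-37m-x86_64-linux-gnu.so"
)
PARALLEL_LINK = (
    "gcc -pthread -shared "
    "-L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib "
    "-L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi "
    "build/temp.linux-x86_64-3.7/janus/fft/parallel/_parallel_fft.o "
    "-o build/lib.linux-x86_64-3.7/janus/fft/parallel/"
    "_parallel_fft.cpython-37m-x86_64-linux-gnu.so "
    "-Wl,--enable-new-dtags -lmpi -pthread"
)

serial_libs = linked_libraries(SERIAL_LINK)
parallel_libs = linked_libraries(PARALLEL_LINK)

# True when the parallel extension was linked without the MPI flavour of FFTW.
missing_fftw_mpi = "fftw3_mpi" not in parallel_libs
```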

sbrisard commented 4 years ago

EDIT: while the above comment is valid, I do get the same error message on an Ubuntu box. I'll keep you posted!

mrambausek commented 4 years ago

> Could it be that fftw_mpi is not on your LD_LIBRARY_PATH (or equivalent)? See ldconfig

It is in LD_LIBRARY_PATH, but for some reason I have to specify it via LDFLAGS. Otherwise, it is not detected. It seems as if setup.py does not take my LD_LIBRARY_PATH into account.

> Could you please reproduce the command line that triggered these error message (in the tests)? Thanks.

(Within my miniconda environment, python points to python3.)

python setup.py test

> Hey, I'm sorry to ask this silly question, but are you sure that you checked out the latest version of Janus. The reason why I am asking is that the compiler options are very weird in the output you sent (and which was extremely useful, thanks!)

No worries, the reason for these weird compiler options is probably related to some problem in the build system. I hard-code these flags via

export CPPFLAGS="-I/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include"
export LDFLAGS="-L/opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi"

I did not find another way to make things work. Specifying 'include_dirs' etc. in setup.cfg did not do the job for me. I just dug deeper: 'include_dirs' is handled properly for the serial build but not for the parallel build, for which I receive "fatal error: fftw3.h: No such file or directory".

sbrisard commented 4 years ago

OK, I understand. The problem is that, by doing so, you are linking every single module against mpi. You should not encounter any problem with the serial tests (I have none on an Ubuntu box). I do get the error message with the parallel tests, though. I need to understand why include_dirs and library_dirs are ignored. Your comments are extremely valuable. I will dig deeper into that problem, and will probably get back to you to ask you to kindly try the modified version, if you have the time.

sbrisard commented 4 years ago

Hopefully rev a6196a0 solves this issue. You might need to use the following setup.cfg

[fftw]
include_dirs = /opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include
library_dirs = /opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib
libraries = fftw3
[fftw_mpi]
include_dirs = /opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/include
library_dirs = /opt/ohpc/pub/libs/gnu8/openmpi3/fftw/3.3.8/lib
libraries = fftw3, fftw3_mpi

For MPI, you should be fine, provided that mpicc is on your path, so that mpicc --show can be run. Otherwise, add the relevant paths to the fftw_mpi section. This is not the recommended way, though, since mpicc --show might return more elaborate compiler directives (that are captured by setup.cfg)
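To illustrate why the wrapper's output is preferable to hand-written paths, here is a minimal sketch of how such a command line could be split into build options. It is not the code used in Janus: the function name `parse_compiler_flags` is hypothetical, and `SAMPLE` is an illustrative string assembled from paths that appear in this thread, not actual mpicc output. The exact flag that makes the wrapper print its command line depends on the MPI implementation.

```python
import shlex


def parse_compiler_flags(output):
    """Split a compiler command line into build options.

    -I, -L and -l options go to include_dirs, library_dirs and libraries;
    every other option is kept verbatim in `extra`, since it may matter too
    (e.g. -pthread or -Wl,... linker directives).
    """
    prefixes = {"-I": "include_dirs", "-L": "library_dirs", "-l": "libraries"}
    flags = {"include_dirs": [], "library_dirs": [], "libraries": [], "extra": []}
    tokens = shlex.split(output)
    for token in tokens[1:]:  # tokens[0] is the underlying compiler name
        key = prefixes.get(token[:2])
        if key is not None and len(token) > 2:
            flags[key].append(token[2:])
        else:
            flags["extra"].append(token)
    return flags


# Illustrative only: a hypothetical wrapper output built from paths in this thread.
SAMPLE = (
    "gcc -I/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/include -pthread "
    "-Wl,-rpath -Wl,/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib "
    "-Wl,--enable-new-dtags "
    "-L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.3/lib -lmpi"
)

mpi_flags = parse_compiler_flags(SAMPLE)
```

The `extra` entry is what a hand-written fftw_mpi section would most likely miss, which is presumably why specifying the paths manually is not the recommended route.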

mrambausek commented 4 years ago

Nice! I happily confirm that it passes the tests with 'a6196a0' and the given setup.cfg.

sbrisard commented 4 years ago

Thank you very much for your help! I will close this issue, then.