geodynamics / axisem

AxiSEM is a parallel spectral-element method to solve 3D wave propagation in a sphere with axisymmetric or spherically symmetric visco-elastic, acoustic, anisotropic structures.

SOLVER fails on ppc64le: Depends on SSE intrinsics (xmmintrin.h) #43

Closed jlost closed 7 years ago

jlost commented 8 years ago

Following the README's instructions to run axisem on a ppc64le (IBM POWER8) machine, I encounter the following error when compiling the SOLVER:

[u0017592@sys-82824 SOLVER]$ ./submit.csh testrun1
Using mesh  MESHES/testmesh
copying mesh_params.h from  MESHES/testmesh
mpif90 -O3 -fopenmp         -Dsolver -c  global_parameters.f90
mpif90 -O3 -fopenmp         -Dsolver -c  data_proc.f90
mpif90 -O3 -fopenmp         -Dsolver -c  clocks.f90
mpif90 -O3 -fopenmp         -Dsolver -c  kdtree2.f90
gcc -O3                   -c -o ftz.o ftz.c
gcc -O3                   -c -o pthread.o pthread.c
cd UTILS; make
ftz.c:23:23: fatal error: xmmintrin.h: No such file or directory
#include <xmmintrin.h>
                    ^
compilation terminated.
mpif90 -O3 -fopenmp         -Dsolver -c  data_io.f90
mpif90 -O3 -fopenmp         -Dsolver -c  data_spec.f90
make[1]: Entering directory `/home/u0017592/projects/axisem/SOLVER/UTILS'
mpif90 -O3 -fopenmp          -c nc_postroutines.F90
mpif90 -O3 -fopenmp          -c field_transform.F90
mpif90 -O3 -fopenmp         -Dsolver -c  interpolation.f90
mpif90 -O3 -fopenmp         -Dsolver -c  list.f90
mpif90 -O3 -fopenmp         -Dsolver -c  data_time.f90
make: *** [ftz.o] Error 1
make: *** Waiting for unfinished jobs....
mpif90 -O3 -fopenmp          -c post_processing.F90
mpif90 -O3 -fopenmp       field_transform.o -o xfield_transform
mpif90 -O3 -fopenmp       post_processing.o nc_postroutines.o -o xpost_processing
make[1]: Leaving directory `/home/u0017592/projects/axisem/SOLVER/UTILS'
ERROR: Compilation failed, please check the errors.

To support ppc64le machines, ftz.c needs either an optional AltiVec code path or some other alternative to the SSE intrinsics.

Note: I also had to manually remove -march=native, as it isn't recognized by my platform's GCC (4.8.3 20140911 (Red Hat 4.8.3-9)), but that was a fairly obvious fix.

dmiller423 commented 7 years ago

Pull request pending: https://github.com/dmiller423/axisem

sstaehler commented 7 years ago

@jlost: Sorry that everybody seems to have overlooked this issue so far. @dmiller423 Can you confirm that this fix solved the problem on ppc64le machines? If so, could you create a pull request?

dmiller423 commented 7 years ago

Yes, the pull request has been opened, and the fix works properly on a POWER8 little-endian machine. Note that subnormal values only create a performance bottleneck and are normally rare enough to go unnoticed. In any case, you may want to fix up the preprocessor checks if you want to disable FTZ (flush-to-zero) handling and/or split the code per architecture. Really, fesetenv should have a generic flag for this, since flush-to-zero is possible on more than just Intel hardware (x86/ARM/MIPS/PPC).
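
For readers following along, one way such a preprocessor check could look is sketched below. This is not the code from the pull request: the function name `set_ftz_portable` and the macro `HAVE_SSE_FTZ` are hypothetical, and the sketch assumes that skipping FTZ on non-SSE targets is acceptable, since subnormals cost only performance. It includes `xmmintrin.h` only when the compiler reports SSE support and otherwise compiles to a no-op, so the build no longer fails on ppc64le.

```c
/* Hypothetical sketch: enable flush-to-zero only where SSE is available. */
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* provides _MM_SET_FLUSH_ZERO_MODE */
#define HAVE_SSE_FTZ 1
#else
#define HAVE_SSE_FTZ 0   /* e.g. ppc64le: header not available */
#endif

/* Returns 1 if flush-to-zero was enabled, 0 if this build has no
 * FTZ path for the target and the call did nothing. */
int set_ftz_portable(void)
{
#if HAVE_SSE_FTZ
    /* Set the FTZ bit in MXCSR so subnormal results are flushed to zero. */
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    return 1;
#else
    /* No architecture-specific path here; subnormals stay IEEE-compliant. */
    return 0;
#endif
}
```

An architecture-specific branch (for example an AltiVec one) could be added as a further `#elif`; the no-op fallback keeps every other platform compiling in the meantime.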

jlost commented 7 years ago

@sstaehler This fixed my problem.

It seems there are also some (unrelated) issues compiling with xlf, for which I can open up a new issue.

Can the PR be accepted?