nerscadmin / IPM

Integrated Performance Monitoring for High Performance Computing
http://ipm-hpc.org
GNU Lesser General Public License v2.1

Problem with Fortran #16

Closed: ghost closed this issue 8 years ago

ghost commented 8 years ago

Hello, dear Scott!

It's me again :( Now I have a problem when linking IPM-2.0.5 (or using LD_PRELOAD) with Intel MPI Fortran (mpiifort).

Error with dynamic linking:

/bin/sh: symbol lookup error: /home/ordi/inst/IPM1/lib/libipmf.so: undefined symbol: ipm_in_fortran_pmpi

With static linking, during compilation:

/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Rsend'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `ipm_in_fortran_pmpi'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Gatherv'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Bsend'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Bsend_init'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Ssend'
etc...

Additional information: the application is the NAS Parallel Benchmarks, NPB3.3-MPI (the FT test, for example).

## Dynamic link

./configure --prefix=/home/ordi/inst/IPM1 --with-papi=/usr --enable-share MPICC=mpiicc MPIF77=mpif77 MPIFC=mpiifor

IPM configuration:
MPI profiling enabled       : yes
POSIX-I/O profiling enabled : no
PAPI enabled                : yes
CFLAGS                      : -DHAVE_DYNLOAD -I/usr/include -DOS_LINUX
LDFLAGS                     : -L/usr/lib -Wl,-rpath=/usr/lib
LIBS                        : -lpapi
MPI_STATUS_COUNT            : count_lo
Fortran underscore          : -funderscore_post
Building IPM Parser         : no

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/ordi/inst/IPM1/lib

LD_PRELOAD=/home/ordi/inst/IPM1/lib/libipmf.so mpirun -np 4 ./ft.B.4
/bin/sh: symbol lookup error: /home/ordi/inst/IPM1/lib/libipmf.so: undefined symbol: ipm_in_fortran_pmpi

## Static link

Configuration options:

./configure --prefix=/home/ordi/inst/IPM1 --with-papi=/usr --enable-static MPICC=mpiicc MPIF77=mpif77 MPIFC=mpiifort

IPM configuration:
MPI profiling enabled       : yes
POSIX-I/O profiling enabled : no
PAPI enabled                : yes
CFLAGS                      : -DHAVE_DYNLOAD -I/usr/include -DOS_LINUX
LDFLAGS                     : -L/usr/lib -Wl,-rpath=/usr/lib
LIBS                        : -lpapi
MPI_STATUS_COUNT            : count_lo
Fortran underscore          : -funderscore_post
Building IPM Parser         : no

Application Makefile:

MPIFC      = mpiifort
FMPI_LIB   = -L/opt/intel/impi/5.0.1.035/lib64/ -lmpi
FMPI_INC   = -I/opt/intel/impi/5.0.1.035/include64/ -I/home/ordi/inst/IPM1/include
FFLAGS     = -O3
FLINKFLAGS = -O3 -L/home/ordi/inst/IPM1/lib -lipmf

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/ordi/inst/IPM1/lib

Error during compilation: make ft CLASS=B NPROCS=4

= NAS Parallel Benchmarks 3.3 =
= MPI/F77/C =

cd FT; make NPROCS=4 CLASS=B
make[1]: Entering directory `/home/ordi/src/NPB3.3.1/NPB3.3-MPI/FT'
make[2]: Entering directory `/home/ordi/src/NPB3.3.1/NPB3.3-MPI/sys'
cc -g -o setparams setparams.c
make[2]: Leaving directory `/home/ordi/src/NPB3.3.1/NPB3.3-MPI/sys'
../sys/setparams ft 4 B
mpiifort -c -I/opt/intel/impi/5.0.1.035/include64/ -I/home/ordi/inst/IPM1/include -O3 ft.f
cd ../common; mpiifort -c -I/opt/intel/impi/5.0.1.035/include64/ -I/home/ordi/inst/IPM1/include -O3 randi8.f
cd ../common; mpiifort -c -I/opt/intel/impi/5.0.1.035/include64/ -I/home/ordi/inst/IPM1/include -O3 print_results.f
cd ../common; mpiifort -c -I/opt/intel/impi/5.0.1.035/include64/ -I/home/ordi/inst/IPM1/include -O3 timers.f
mpiifort -O3 -L/home/ordi/inst/IPM1/lib -lipmf -o ../bin/ft.B.4 ft.o ../common/randi8.o ../common/print_results.o ../common/timers.o -L/opt/intel/impi/5.0.1.035/lib64/ -lmpi
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Rsend'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `ipm_in_fortran_pmpi'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Gatherv'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Bsend'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Bsend_init'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Ssend'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Irecv'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Send_init'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Buffer_detach'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Recv_init'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Reduce'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Barrier'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Ssend_init'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Allgather'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Recv'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Iprobe'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Sendrecv'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Issend'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Alltoallv'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Ibsend'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `ipm_state'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Alltoall'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Comm_compare'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Irsend'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Isend'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Scatter'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Reduce_scatter'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Send'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Comm_rank'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Buffer_attach'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Start'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Gather'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Scatterv'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Comm_free'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Sendrecv_replace'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Allreduce'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Comm_split'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Comm_group'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Rsend_init'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Comm_create'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Scan'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Bcast'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Allgatherv'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Probe'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Comm_size'
/home/ordi/inst/IPM1/lib/libipmf.so: undefined reference to `IPM_MPI_Comm_dup'
make[1]: Leaving directory `/home/ordi/src/NPB3.3.1/NPB3.3-MPI/FT'

PS: If I link the application with the -lipm flag instead of -lipmf, it links successfully, but there is no IPM effect at runtime: no XML, no statistics, nothing.

ghost commented 8 years ago

By the way, it works with LD_PRELOAD=/home/ordi/inst/IPM1/lib/libipm.so instead of libipmf.so, but it lacks a lot of stats...

ghost commented 8 years ago

No, I will open it again, because profiling with Fortran is missing a lot of information (attached). There are no GFLOP/sec and no MPI stats, even though I exported IPM_log=full and IPM_REPORT=full. There are only PAPI and memory stats.

Please help me. What am I doing so wrong?

Thank you for your time. ft.B.4_4_ordi.1462639367.ipm.xml_ipm_unknown.zip

azrael417 commented 8 years ago

To get GFLOP/sec you need FLOP counters, but on modern Intel architectures there are no FLOP counters any more. Obtaining FLOP counts is therefore not easy; you probably need to count them by hand. I am not sure about the MPI stats, though; they should be there.


swfrench commented 8 years ago

Hi Ordi,

Sorry the documentation in README.md is not so clear on this. For IPM version 2.0.5 and later, you need to link in both libipm and libipmf for Fortran programs. This change was due to some refactoring that allows us to more easily support Fortran for both OpenMPI and MPICH (and derivatives of the latter).

So, with that said, you will need to link your Fortran binaries with -L${PREFIX}/lib -lipmf -lipm (see the example in README.md) or LD_PRELOAD both shared object files - e.g.

LD_PRELOAD="${PREFIX}/lib/libipmf.so ${PREFIX}/lib/libipm.so" mpirun -n ...
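
Applied to the NPB Makefile fragment quoted earlier in this thread, a minimal sketch of the change would be the following (paths are the ones reported above; the only new part is appending -lipm after -lipmf):

```make
# Sketch based on the Makefile fragment reported earlier in this thread.
# Only the trailing -lipm in FLINKFLAGS is new; everything else is unchanged.
MPIFC      = mpiifort
FMPI_LIB   = -L/opt/intel/impi/5.0.1.035/lib64/ -lmpi
FMPI_INC   = -I/opt/intel/impi/5.0.1.035/include64/ -I/home/ordi/inst/IPM1/include
FFLAGS     = -O3
FLINKFLAGS = -O3 -L/home/ordi/inst/IPM1/lib -lipmf -lipm
```

and, for the preload route with the concrete paths from this thread:

LD_PRELOAD="/home/ordi/inst/IPM1/lib/libipmf.so /home/ordi/inst/IPM1/lib/libipm.so" mpirun -np 4 ./ft.B.4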

@azrael417 is correct that some PAPI performance counters (particularly FLOP counts) are unlikely to be accurate on recent Intel architectures (generally Sandy Bridge or later).

Hope that helps!

Cheers

azrael417 commented 8 years ago

Hi Ordi, Scott,

FLOP counters are completely removed on Haswell and more recent architectures:

[tkurth@cori07 ~]$ papi_avail

Available PAPI preset and user defined events plus hardware information.

PAPI Version             : 5.4.1.3
Vendor string and code   : GenuineIntel (1)
Model string and code    : Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz (63)
CPU Revision             : 2.000000
CPUID Info               : Family: 6  Model: 63  Stepping: 2
CPU Max Megahertz        : 2300
CPU Min Megahertz        : 2300
Hdw Threads per core     : 1
Cores per Socket         : 16
Sockets                  : 2
NUMA Nodes               : 2
CPUs per Node            : 16
Total CPUs               : 32
Running in a VM          : no
Number Hardware Counters : 11
Max Multiplex Counters   : 192
--------------------------------------------------------------------------------
...

and down below:

PAPI_SP_OPS  0x80000067  No  No  Floating point operations; optimized to count scaled single precision vector operations
PAPI_DP_OPS  0x80000068  No  No  Floating point operations; optimized to count scaled double precision vector operations
PAPI_VEC_SP  0x80000069  No  No  Single precision vector/SIMD instructions
PAPI_VEC_DP  0x8000006a  No  No  Double precision vector/SIMD instructions

No/No means that the event is neither available nor a derived counter, so there is no easy way to count flops. The easiest approach here is to go through the sections of the code that do most of the work (use VTune to find them), count their flops by hand, and measure the execution time separately (as VTune might slow things down a little bit).
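
To make the "count by hand, time separately" idea concrete, here is a minimal standalone sketch (not part of IPM or NPB; the kernel, its per-iteration flop count, and the problem size are made up purely for illustration). You time the kernel with MPI_Wtime and divide your hand-counted flop total by the measured time:

```fortran
      program flops_by_hand
c     Minimal sketch: hand-counted flops for one kernel, timed with MPI_Wtime.
c     The kernel and its flop count are illustrative, not measured by hardware.
      implicit none
      include 'mpif.h'
      integer ierr, i, n
      parameter (n = 50000000)
      double precision t0, t1, s, x, nflops
      call MPI_Init(ierr)
      s = 0.0d0
      x = 1.0d0
      t0 = MPI_Wtime()
      do i = 1, n
c        one multiply and one add per iteration = 2 flops, counted by hand
         x = x*1.0000000001d0
         s = s + x
      end do
      t1 = MPI_Wtime()
      nflops = 2.0d0*dble(n)
      write(*,*) 'time (s)       :', t1 - t0
      write(*,*) 'GFLOP/s (hand) :', nflops/(t1 - t0)/1.0d9
      write(*,*) 'checksum       :', s
      call MPI_Finalize(ierr)
      end
```

For a real code you would bracket each hot section the same way (compiled with mpiifort, for example) and use the flop count you derived by hand for that section.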

Best, Thorsten


ghost commented 8 years ago

Dear Scott, Thorsten,

Thank you very much for your help. I understand now about GFLOPs on Haswell (I actually run IPM on a Haswell machine). And yes, preloading both libipm and libipmf worked! Thank you very much!