ParaToolsInc / taucmdr

Performance engineering for the rest of us.
http://www.taucommander.com

mpi_(all)gather fortran reproducer fails with tau commander #92

Closed: wohlbier closed this issue 8 years ago

wohlbier commented 8 years ago

`mpirun -n 4 ./a.out` runs without error. The error below occurs when the same binary is run as `tau mpirun -n 4 ./a.out`.

[wohlbier@riptide03 tau_mpiallgatherv]$ module list
Currently Loaded Modulefiles:
  1) costinit              4) mpi/openmpi/1.6.5     7) cmake/intel/3.2.3
  2) gcc/4.7.2             5) python/2.7.6
  3) intel/14.0.4          6) nesm/icc/ompi/4.0.7
[wohlbier@riptide03 tau_mpiallgatherv]$ cat main.f90
program main
  use mpi
  implicit none

  integer :: i, ierr = 0, rank, size
  integer, dimension(:), allocatable :: data

  call mpi_init(ierr)

  call mpi_comm_size(MPI_COMM_WORLD, size, ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)

  ! allgather
  allocate(data(size))
  data = 0
  data(rank + 1) = (rank+1)*1000
  call mpi_allgather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, data, 1, &
       MPI_INTEGER, MPI_COMM_WORLD, ierr)
  print*, "rank: ", rank, " data: ", data

  ! gather
  data = 0
  data(rank + 1) = -(rank+1)*2000
  if (rank == 0) then
     call mpi_gather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, &
          data, 1, MPI_INTEGER, &
          0, MPI_COMM_WORLD, ierr)
  else
     call mpi_gather(data(rank+1), 1, MPI_INTEGER, &
          data, 1, MPI_INTEGER, &
          0, MPI_COMM_WORLD, ierr)
  end if
  print*, "rank: ", rank, " data: ", data

  call mpi_finalize(ierr)

end program main
[wohlbier@riptide03 tau_mpiallgatherv]$ tau init --use-mpi T
[TAU] Created a new project named 'tau_mpiallgatherv'.
[TAU] Probing System MPI C++ compiler 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpic++'
[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpic++'
[TAU]      wraps 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/icpc'
[TAU] Probing System MPI C++ compiler 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpicxx'
[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpicxx'
[TAU]      wraps 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/icpc'
[TAU] Probing System MPI C++ compiler 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpiCC'
[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpiCC' 
[TAU]     wraps 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/icpc'
[TAU] Probing System MPI C compiler 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpicc'
[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpicc' 
[TAU]     wraps 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/icc'
[TAU] Probing System MPI Fortran compiler 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpif90'
[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpif90'
[TAU]      wraps 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/ifort'
[TAU] Probing System MPI Fortran compiler 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpif77'
[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpif77'
[TAU]      wraps 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/ifort'
[TAU] Added target 'riptide03' to project configuration 'tau_mpiallgatherv'.
[TAU] Added application 'tau_mpiallgatherv' to project configuration 
[TAU]     'tau_mpiallgatherv'.
[TAU] Added measurement 'sample' to project configuration 'tau_mpiallgatherv'.
[TAU] Added measurement 'profile' to project configuration 'tau_mpiallgatherv'.
[TAU] Added measurement 'trace' to project configuration 'tau_mpiallgatherv'.
[TAU] Created a new experiment named 'riptide03-tau_mpiallgatherv-sample'.
[TAU] Selected experiment 'riptide03-tau_mpiallgatherv-sample'.
[TAU] Experiment may be performed without application rebuild.

== Project Configuration (/gpfs/home/wohlbier/devel/tau/tau_mpiallgatherv/.tau/project.json) ==

+-------------+-------------+-------------+-------------+-------------+
|    Name     |   Targets   | Application | Measurement |      #      |
|             |             |      s      |      s      | Experiments |
+=============+=============+=============+=============+=============+
| tau_mpiallg |  riptide03  | tau_mpiallg |   sample,   |      1      |
|      atherv |             |   atherv    |  profile,   |             |
|             |             |             |    trace    |             |
+-------------+-------------+-------------+-------------+-------------+

== Targets in project 'tau_mpiallgatherv' ================================

+-----------+-----------+-----------+-----------+-----------+-----------+
|   Name    |  Host OS  | Host Arch |   Host    |    MPI    |   SHMEM   |
|           |           |           | Compilers | Compilers | Compilers |
+===========+===========+===========+===========+===========+===========+
| riptide03 |   Linux   |  x86_64   |   Intel   |  System   |   None    |
+-----------+-----------+-----------+-----------+-----------+-----------+

== Applications in project 'tau_mpiallgatherv' ===========================

+--------+--------+--------+--------+--------+--------+--------+--------+
|  Name  | OpenMP | Pthrea |  MPI   |  CUDA  | OpenCL | SHMEM  |  MPC   |
|        |        |   ds   |        |        |        |        |        |
+========+========+========+========+========+========+========+========+
| tau_mp |   No   |   No   |  Yes   |   No   |   No   |   No   |   No   |
| iallga |        |        |        |        |        |        |        |
|  therv |        |        |        |        |        |        |        |
+--------+--------+--------+--------+--------+--------+--------+--------+

== Measurements in project 'tau_mpiallgatherv' ===========================

+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| Name  | Profi | Trace | Sampl | Sourc | Compi | OpenM |  I/O  | Wrap  |
|       |  le   |       |   e   |   e   |  ler  |   P   | Inst. |  MPI  |
|       |       |       |       | Inst. | Inst. | Inst. |       |       |
+=======+=======+=======+=======+=======+=======+=======+=======+=======+
| sampl |  tau  | none  |  Yes  | never | never | none  |  No   |  Yes  |
|     e |       |       |       |       |       |       |       |       |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| profi |  tau  | none  |  No   | autom | fallb | none  |  No   |  Yes  |
|    le |       |       |       | atic  |  ack  |       |       |       |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| trace | none  | otf2  |  No   | autom | fallb | none  |  No   |  Yes  |
|       |       |       |       | atic  |  ack  |       |       |       |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+

== Experiments in project 'tau_mpiallgatherv' ============================

+-----------+-----------+-----------+-----------+-----------+-----------+
|   Name    |  Trials   | Data Size |  Target   | Applicati | Measureme |
|           |           |           |           |    on     |    nt     |
+===========+===========+===========+===========+===========+===========+
| riptide03 |     0     |   0.0B    | riptide03 | tau_mpial |  sample   |
| -tau_mpia |           |           |           | lgatherv  |           |
| llgatherv |           |           |           |           |           |
|   -sample |           |           |           |           |           |
+-----------+-----------+-----------+-----------+-----------+-----------+

Selected experiment: riptide03-tau_mpiallgatherv-sample

[wohlbier@riptide03 tau_mpiallgatherv]$ tau mpif90 main.f90 
[TAU] Probing System MPI Fortran compiler 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpif90'
[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpif90'
[TAU]      wraps 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/ifort'
[TAU] TAU_OPTIONS=-optRevert -optNoCompInst
[TAU] TAU_MAKEFILE=/gpfs/home/wohlbier/devel/packages/taucmdr-0.1/.system/tau/5e53a88b777391c52c6195213f36c428/x86_64/lib/Makefile.tau-icpc-mpi
[TAU] /gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpif90 
[TAU]     -g main.f90
[wohlbier@riptide03 tau_mpiallgatherv]$ mpirun -n 4 ./a.out 
 rank:            0  data:         1000        2000        3000        4000
 rank:            0  data:        -2000       -4000       -6000       -8000
 rank:            1  data:         1000        2000        3000        4000
 rank:            1  data:            0       -4000           0           0
 rank:            2  data:         1000        2000        3000        4000
 rank:            2  data:            0           0       -6000           0
 rank:            3  data:         1000        2000        3000        4000
 rank:            3  data:            0           0           0       -8000
[wohlbier@riptide03 tau_mpiallgatherv]$ tau mpirun -n 4 ./a.out 
[TAU] 
[TAU] == BEGIN Experiment at 2016-09-20 18:24:21.169059 ========================
[TAU] 
[TAU] mpirun -n 4 tau_exec -T mpi,icpc -ebs ./a.out
[riptide03:9836] *** An error occurred in MPI_Allgather
[riptide03:9836] *** on communicator MPI_COMM_WORLD
[riptide03:9836] *** MPI_ERR_TYPE: invalid datatype
[riptide03:9836] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 9664 on
node riptide03 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[riptide03:09661] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[riptide03:09661] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[TAU] **************************************************************************
[TAU] 
[TAU] WARNING
[TAU] 
[TAU] Return code 3 from 'mpirun -n 4 tau_exec -T mpi,icpc -ebs ./a.out'
[TAU] 
[TAU] **************************************************************************
[TAU] 
[TAU] == END Experiment at 2016-09-20 18:24:22.984226 ==========================
[TAU] 
[TAU] **************************************************************************
[TAU] 
[TAU] WARNING
[TAU] 
[TAU] Program exited with nonzero status code: 3
[TAU] 
[TAU] **************************************************************************
[TAU] Trial 0 produced 3 profile files.
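
For context: the reproducer's use of MPI_IN_PLACE is valid MPI. When MPI_IN_PLACE is the send buffer of MPI_Allgather (on every rank) or MPI_Gather (on the root), the send count and send datatype are ignored, so passing MPI_DATATYPE_NULL there is legal. A minimal Python sketch, a plain simulation of the expected per-rank results with no MPI involved, shows what a correct run prints:

```python
# Plain-Python simulation of the reproducer's in-place collectives.
# Each "rank" owns slot r of its own buffer; the collective fills in
# (or gathers) the rest. No MPI library is used here.

def allgather_in_place(buffers):
    """Every rank ends up with every rank's contribution."""
    n = len(buffers)
    gathered = [buffers[r][r] for r in range(n)]  # slot r comes from rank r
    return [list(gathered) for _ in range(n)]

def gather_in_place(buffers, root=0):
    """Only the root's buffer is filled; other ranks keep their own."""
    n = len(buffers)
    result = [list(b) for b in buffers]
    result[root] = [buffers[r][r] for r in range(n)]
    return result

size = 4
# allgather: data(rank+1) = (rank+1)*1000 on each rank
bufs = [[(r + 1) * 1000 if i == r else 0 for i in range(size)]
        for r in range(size)]
print(allgather_in_place(bufs)[0])   # [1000, 2000, 3000, 4000] on every rank

# gather to root 0: data(rank+1) = -(rank+1)*2000 on each rank
bufs = [[-(r + 1) * 2000 if i == r else 0 for i in range(size)]
        for r in range(size)]
print(gather_in_place(bufs)[0])      # [-2000, -4000, -6000, -8000] on root
```

These match the correct `mpirun -n 4 ./a.out` output above, which makes the `MPI_ERR_TYPE: invalid datatype` under `tau mpirun` look like a wrapper-layer problem rather than an application bug.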
wohlbier commented 8 years ago

Note that this is for the default sample measurement, with no source or compiler instrumentation. When I select the profile measurement it works fine.

Also, I'm finding for larger examples that when I try to use the default sample measurement, my tau mpirun run doesn't do anything; it just sits there.
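
One plausible failure mode, offered purely as a hypothesis (nothing in this thread confirms it is TAU's actual code path): a Fortran-to-C translation layer in an MPI wrapper that converts every Fortran datatype handle through a lookup, without special-casing sentinels like MPI_DATATYPE_NULL, which must be passed through unconverted. A toy Python sketch of that pitfall, with all names and handle values invented for illustration:

```python
# Hypothetical sketch of a sentinel-handling bug in an MPI wrapper
# layer. All handle values and names are invented; this is NOT TAU's
# actual implementation.

F_MPI_DATATYPE_NULL = 0              # assumed Fortran-side null handle
C_MPI_DATATYPE_NULL = object()       # assumed C-side null sentinel
HANDLE_TABLE = {7: "c_MPI_INTEGER"}  # toy Fortran-to-C handle table

def f2c_datatype_buggy(f_handle):
    # Bug: every handle goes through the table, so the null sentinel
    # comes out as an invalid datatype (here: a KeyError).
    return HANDLE_TABLE[f_handle]

def f2c_datatype_fixed(f_handle):
    # Fix: recognize the sentinel and pass it through unconverted.
    if f_handle == F_MPI_DATATYPE_NULL:
        return C_MPI_DATATYPE_NULL
    return HANDLE_TABLE[f_handle]
```

If this hypothesis held, the allgather call that passes MPI_DATATYPE_NULL as its (ignored) send type would fail only on the code path that performs the flawed translation, which would be consistent with plain mpirun succeeding while the wrapped run reports an invalid datatype.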

jlinford commented 8 years ago

It seems to be working with the latest unstable commit on Riptide; will you please check? You can also try the TAU Commander global installation at $PET_HOME/pkgs/taucmdr-0.1, where I've already installed TAU and its dependencies with intel/14.0 and openmpi/intel/1.8.0.

$ module rm openmpi intel impi
$ module load intel/14.0 openmpi/intel/1.8.0
$ module list
Currently Loaded Modulefiles:
  1) costinit              2) intel/14.0            3) openmpi/intel/1.8.0

[jlinford@r2n13 x]$ tau init --mpi-compilers System
[TAU] Created a new project named 'x'.
[TAU] Could not find a Universal Parallel C compiler.
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/icpc' wrapped by System MPI C++ compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpic++'
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/icpc' wrapped by System MPI C++ compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpicxx'
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/icpc' wrapped by System MPI C++ compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpiCC'
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/icc' wrapped by System MPI C compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpicc'
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/ifort' wrapped by System MPI Fortran compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpif90'
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/ifort' wrapped by System MPI Fortran compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpif77'
[TAU] Could not find a SHMEM C++ compiler.
[TAU] ***********************************************************************************************************************************************************************************************************
[TAU] 
[TAU] WARNING
[TAU] 
[TAU] Configured with compilers from different families:
[TAU]   - Intel C compiler '/gpfs/pkgs/mhpcc/intel/wrapper/icc'
[TAU]   - Intel C compiler '/gpfs/pkgs/mhpcc/intel/wrapper/icc' wrapped by System MPI C compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpicc'
[TAU]   - Intel C++ compiler '/gpfs/pkgs/mhpcc/intel/wrapper/icpc'
[TAU]   - Intel C++ compiler '/gpfs/pkgs/mhpcc/intel/wrapper/icpc' wrapped by System MPI C++ compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpic++'
[TAU]   - Intel Fortran compiler '/gpfs/pkgs/mhpcc/intel/wrapper/ifort'
[TAU]   - Intel Fortran compiler '/gpfs/pkgs/mhpcc/intel/wrapper/ifort' wrapped by System MPI Fortran compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpif90'
[TAU]   - OpenSHMEM SHMEM C compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/oshcc'
[TAU]   - OpenSHMEM SHMEM Fortran compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/oshfort'
[TAU] 
[TAU] ***********************************************************************************************************************************************************************************************************
[TAU] Added target 'r2n13' to project configuration 'x'.
[TAU] Added application 'x' to project configuration 'x'.
[TAU] Added measurement 'sample' to project configuration 'x'.
[TAU] Added measurement 'profile' to project configuration 'x'.
[TAU] Added measurement 'trace' to project configuration 'x'.
[TAU] Created a new experiment named 'r2n13-x-sample'.
[TAU] Selected experiment 'r2n13-x-sample'.
[TAU] Experiment may be performed without application rebuild.

== Project Configuration (/gpfs/home/jlinford/taucmdr-test/examples/x/.tau/project.json) ==================================================================================================================

+------+---------+--------------+------------------------+---------------+
| Name | Targets | Applications |      Measurements      | # Experiments |
+======+=========+==============+========================+===============+
|    x |  r2n13  |      x       | sample, profile, trace |       1       |
+------+---------+--------------+------------------------+---------------+

== Targets in project 'x' =================================================================================================================================================================================

+-------+---------+-----------+----------------+---------------+-----------------+
| Name  | Host OS | Host Arch | Host Compilers | MPI Compilers | SHMEM Compilers |
+=======+=========+===========+================+===============+=================+
| r2n13 |  Linux  |  x86_64   |     Intel      |    System     |    OpenSHMEM    |
+-------+---------+-----------+----------------+---------------+-----------------+

== Applications in project 'x' ============================================================================================================================================================================

+------+--------+----------+-----+------+--------+-------+-----+
| Name | OpenMP | Pthreads | MPI | CUDA | OpenCL | SHMEM | MPC |
+======+========+==========+=====+======+========+=======+=====+
|    x |   No   |    No    | No  |  No  |   No   |  No   | No  |
+------+--------+----------+-----+------+--------+-------+-----+

== Measurements in project 'x' ============================================================================================================================================================================

+---------+---------+-------+--------+--------------+----------------+--------------+-----------+----------+
|  Name   | Profile | Trace | Sample | Source Inst. | Compiler Inst. | OpenMP Inst. | I/O Inst. | Wrap MPI |
+=========+=========+=======+========+==============+================+==============+===========+==========+
|  sample |   tau   | none  |  Yes   |    never     |     never      |     none     |    No     |    No    |
+---------+---------+-------+--------+--------------+----------------+--------------+-----------+----------+
| profile |   tau   | none  |   No   |  automatic   |    fallback    |     none     |    No     |    No    |
+---------+---------+-------+--------+--------------+----------------+--------------+-----------+----------+
|   trace |  none   | otf2  |   No   |  automatic   |    fallback    |     none     |    No     |    No    |
+---------+---------+-------+--------+--------------+----------------+--------------+-----------+----------+

== Experiments in project 'x' =============================================================================================================================================================================

+----------------+--------+-----------+--------+-------------+-------------+
|      Name      | Trials | Data Size | Target | Application | Measurement |
+================+========+===========+========+=============+=============+
| r2n13-x-sample |   0    |   0.0B    | r2n13  |      x      |   sample    |
+----------------+--------+-----------+--------+-------------+-------------+

Selected experiment: r2n13-x-sample

[jlinford@r2n13 x]$ tau mpif90 main.f90 
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/ifort' wrapped by System MPI Fortran compiler '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpif90'
[TAU] TAU_OPTIONS=-optRevert -optNoCompInst
[TAU] TAU_MAKEFILE=/gpfs/home/jlinford/taucmdr-test/.system/tau/87c01952ef804be9ee0087a75d01be08/x86_64/lib/Makefile.tau-icpc
[TAU] /gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/bin/mpif90 -g main.f90

[jlinford@r2n13 x]$ tau mpirun -np 4 ./a.out 
[TAU] 
[TAU] == BEGIN Experiment at 2016-09-20 21:09:45.745609 =========================================================================================================================================================
[TAU] 
[TAU] mpirun -np 4 tau_exec -T serial,icpc -ebs ./a.out
 rank:            0  data:         1000        2000        3000        4000
 rank:            0  data:        -2000       -4000       -6000       -8000
 rank:            1  data:         1000        2000        3000        4000
 rank:            1  data:            0       -4000           0           0
 rank:            2  data:         1000        2000        3000        4000
 rank:            2  data:            0           0       -6000           0
 rank:            3  data:         1000        2000        3000        4000
 rank:            3  data:            0           0           0       -8000
[TAU] 
[TAU] == END Experiment at 2016-09-20 21:09:46.735105 ===========================================================================================================================================================
[TAU] 
[TAU] Trial 0 produced 1 profile files.

[jlinford@r2n13 x]$ tau show --profile-tool=pprof
[TAU] Opening /gpfs/home/jlinford/taucmdr-test/examples/x/.tau/x/r2n13-x-sample/0 in pprof
Reading Profile files in profile.*

NODE 0;CONTEXT 0;THREAD 0:
---------------------------------------------------------------------------------------
%Time    Exclusive    Inclusive       #Call      #Subrs  Inclusive Name
              msec   total msec                          usec/call 
---------------------------------------------------------------------------------------
100.0          432          432           1           0     432308 .TAU application
 99.5            0          430          43           0      10000 .TAU application => [CONTEXT] .TAU application
 99.5            0          430          43           0      10000 [CONTEXT] .TAU application
 41.7          180          180          18           0      10008 .TAU application => [CONTEXT] .TAU application => [SAMPLE] UNRESOLVED /lib64/libpthread-2.12.so
 41.7          180          180          18           0      10008 [SAMPLE] UNRESOLVED /lib64/libpthread-2.12.so
 32.4          139          139          14           0      10000 .TAU application => [CONTEXT] .TAU application => [SAMPLE] UNRESOLVED /lib64/libc-2.12.so
 32.4          139          139          14           0      10000 [SAMPLE] UNRESOLVED /lib64/libc-2.12.so
 11.6           50           50           5           0      10030 .TAU application => [CONTEXT] .TAU application => [SAMPLE] UNRESOLVED /gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/lib/libmpi.so.1.5.0
 11.6           50           50           5           0      10030 [SAMPLE] UNRESOLVED /gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/lib/libmpi.so.1.5.0
  4.6           19           19           2           0       9998 .TAU application => [CONTEXT] .TAU application => [SAMPLE] UNRESOLVED /gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/lib/libopen-pal.so.6.1.1
  4.6           19           19           2           0       9998 [SAMPLE] UNRESOLVED /gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.0/lib/libopen-pal.so.6.1.1
  4.6           19           19           2           0       9926 .TAU application => [CONTEXT] .TAU application => [SAMPLE] UNRESOLVED [vdso]
  4.6           19           19           2           0       9926 [SAMPLE] UNRESOLVED [vdso]
  2.3           10           10           1           0      10000 .TAU application => [CONTEXT] .TAU application => [SAMPLE] __intel_ssse3_rep_memcpy 
  2.3           10           10           1           0      10000 [SAMPLE] __intel_ssse3_rep_memcpy 
  2.3            9            9           1           0       9872 .TAU application => [CONTEXT] .TAU application => [SAMPLE] UNRESOLVED /lib64/librt-2.12.so
  2.3            9            9           1           0       9872 [SAMPLE] UNRESOLVED /lib64/librt-2.12.so
[jlinford@r2n13 x]$ 
wohlbier commented 8 years ago

I tried from the tip of unstable and it still does not work for me; same errors as first reported. I do notice that you're running from a compute node, whereas I'm doing it from a login node, so I will try from a compute node.

I also tried using your installation in $PET_HOME. See below.

[wohlbier@riptide03 tau_mpiallgatherv]$ /gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/bin/tau init --use-mpi
[TAU] Created a new project named 'tau_mpiallgatherv'.
[TAU] Could not find a Universal Parallel C compiler.
[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/icc'
[TAU]      wrapped by System MPI C compiler 
[TAU]     '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.5/bin/mpicc'
[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/icpc'
[TAU]      wrapped by System MPI C++ compiler 
[TAU]     '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.5/bin/mpic++'
[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/ifort'
[TAU]      wrapped by System MPI Fortran compiler 
[TAU]     '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.5/bin/mpif90'
[TAU] Could not find a SHMEM C compiler.
[TAU] Could not find a SHMEM C++ compiler.
[TAU] Could not find a SHMEM Fortran compiler.
[TAU] Added target 'riptide03' to project configuration 'tau_mpiallgatherv'.
[TAU] Added application 'tau_mpiallgatherv' to project configuration 
[TAU]     'tau_mpiallgatherv'.
[TAU] Added measurement 'sample' to project configuration 'tau_mpiallgatherv'.
[TAU] Added measurement 'profile' to project configuration 'tau_mpiallgatherv'.
[TAU] Added measurement 'trace' to project configuration 'tau_mpiallgatherv'.
[TAU] Created a new experiment named 'riptide03-tau_mpiallgatherv-sample'.
[TAU] Installing libunwind from 
[TAU]     'http://www.cs.uoregon.edu/research/paracomp/tau/tauprofile/dist/libunwind-1.1.tar.gz'
[TAU]      to 
[TAU]     '/gpfs/home/wohlbier/.tau/libunwind/a580b6bd50ea9cf31a862ec603f58ead'
[TAU] Using libunwind source archive 
[TAU]     '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/libunwind-1.1.tar.gz'
[TAU] Extracting 
[TAU]     '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/libunwind-1.1.tar.gz'
[TAU]      to create '/dev/shm/tmpVCCQ2Z/libunwind-1.1'
[TAU] Configuring libunwind...
[TAU] Compiling libunwind...
[TAU] Installing libunwind...
[TAU] Verifying libunwind installation...
[TAU] Installing GNU Binutils from 
[TAU]     'http://www.cs.uoregon.edu/research/paracomp/tau/tauprofile/dist/binutils-2.23.2.tar.gz'
[TAU]      to 
[TAU]     '/gpfs/home/wohlbier/.tau/binutils/28f7f1ac613b5d561a5f50b2d2e4af95'
[TAU] Using GNU Binutils source archive 
[TAU]     '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/binutils-2.23.2.tar.gz'
[TAU] Extracting 
[TAU]     '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/binutils-2.23.2.tar.gz'
[TAU]      to create '/dev/shm/tmpVCCQ2Z/binutils-2.23.2'
[TAU] Configuring GNU Binutils...
[TAU] Compiling GNU Binutils...
[TAU] Installing GNU Binutils...
[TAU] Verifying GNU Binutils installation...
[TAU] Installing TAU Performance System at 
[TAU]     '/gpfs/home/wohlbier/.tau/tau/19754e34d61198228cad12218d1d8db9' from 
[TAU]     'http://tau.uoregon.edu/tau.tgz'
[TAU] Using TAU Performance System source archive 
[TAU]     '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/tau.tgz'
[TAU] Extracting '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/tau.tgz' 
[TAU]     to create '/dev/shm/tmpVCCQ2Z/./tau-2.25.2'
[TAU] Configuring TAU...
[TAU] Compiling and installing TAU...
[TAU] Verifying TAU Performance System installation...
[TAU] Selected experiment 'riptide03-tau_mpiallgatherv-sample'.
[TAU] Experiment may be performed without application rebuild.

== Project Configuration (/gpfs/home/wohlbier/devel/tau/tau_mpiallgatherv/.tau/project.json) ==

+-------------+-------------+-------------+-------------+-------------+
|    Name     |   Targets   | Application | Measurement |      #      |
|             |             |      s      |      s      | Experiments |
+=============+=============+=============+=============+=============+
| tau_mpiallg |  riptide03  | tau_mpiallg |   sample,   |      1      |
|      atherv |             |   atherv    |  profile,   |             |
|             |             |             |    trace    |             |
+-------------+-------------+-------------+-------------+-------------+

== Targets in project 'tau_mpiallgatherv' ================================

+-----------+-----------+-----------+-----------+-----------+-----------+
|   Name    |  Host OS  | Host Arch |   Host    |    MPI    |   SHMEM   |
|           |           |           | Compilers | Compilers | Compilers |
+===========+===========+===========+===========+===========+===========+
| riptide03 |   Linux   |  x86_64   |   Intel   |  System   |   None    |
+-----------+-----------+-----------+-----------+-----------+-----------+

== Applications in project 'tau_mpiallgatherv' ===========================

+--------+--------+--------+--------+--------+--------+--------+--------+
|  Name  | OpenMP | Pthrea |  MPI   |  CUDA  | OpenCL | SHMEM  |  MPC   |
|        |        |   ds   |        |        |        |        |        |
+========+========+========+========+========+========+========+========+
| tau_mp |   No   |   No   |  Yes   |   No   |   No   |   No   |   No   |
| iallga |        |        |        |        |        |        |        |
|  therv |        |        |        |        |        |        |        |
+--------+--------+--------+--------+--------+--------+--------+--------+

== Measurements in project 'tau_mpiallgatherv' ===========================

+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| Name  | Profi | Trace | Sampl | Sourc | Compi | OpenM |  I/O  | Wrap  |
|       |  le   |       |   e   |   e   |  ler  |   P   | Inst. |  MPI  |
|       |       |       |       | Inst. | Inst. | Inst. |       |       |
+=======+=======+=======+=======+=======+=======+=======+=======+=======+
| sampl |  tau  | none  |  Yes  | never | never | none  |  No   |  Yes  |
|     e |       |       |       |       |       |       |       |       |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| profi |  tau  | none  |  No   | autom | fallb | none  |  No   |  Yes  |
|    le |       |       |       | atic  |  ack  |       |       |       |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| trace | none  | otf2  |  No   | autom | fallb | none  |  No   |  Yes  |
|       |       |       |       | atic  |  ack  |       |       |       |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+

== Experiments in project 'tau_mpiallgatherv' ============================

+-----------+-----------+-----------+-----------+-----------+-----------+
|   Name    |  Trials   | Data Size |  Target   | Applicati | Measureme |
|           |           |           |           |    on     |    nt     |
+===========+===========+===========+===========+===========+===========+
| riptide03 |     0     |   0.0B    | riptide03 | tau_mpial |  sample   |
| -tau_mpia |           |           |           | lgatherv  |           |
| llgatherv |           |           |           |           |           |
|   -sample |           |           |           |           |           |
+-----------+-----------+-----------+-----------+-----------+-----------+

Selected experiment: riptide03-tau_mpiallgatherv-sample

[wohlbier@riptide03 tau_mpiallgatherv]$ module list
Currently Loaded Modulefiles:
  1) costinit              4) mpi/openmpi/1.6.5     7) cmake/intel/3.2.3
  2) gcc/4.7.2             5) python/2.7.6
  3) intel/14.0.4          6) nesm/icc/ompi/4.0.7
[wohlbier@riptide03 tau_mpiallgatherv]$ cat main.f90 
program main
  use mpi
  implicit none

  integer :: i, ierr = 0, rank, size
  integer, dimension(:), allocatable :: data

  call mpi_init(ierr)

  call mpi_comm_size(MPI_COMM_WORLD, size, ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)

  ! allgather
  allocate(data(size))
  data = 0
  data(rank + 1) = (rank+1)*1000
  call mpi_allgather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, data, 1, &
       MPI_INTEGER, MPI_COMM_WORLD, ierr)
  print*, "rank: ", rank, " data: ", data

  ! gather
  data = 0
  data(rank + 1) = -(rank+1)*2000
  if (rank == 0) then
     call mpi_gather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, &
          data, 1, MPI_INTEGER, &
          0, MPI_COMM_WORLD, ierr)
  else
     call mpi_gather(data(rank+1), 1, MPI_INTEGER, &
          data, 1, MPI_INTEGER, &
          0, MPI_COMM_WORLD, ierr)
  end if
  print*, "rank: ", rank, " data: ", data

  call mpi_finalize(ierr)

end program main
[wohlbier@riptide03 tau_mpiallgatherv]$ /gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/bin/tau mpif90 main.f90 
[TAU] XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[TAU] 
[TAU] CRITICAL
[TAU] 
[TAU] No compiler in target 'riptide03' matches 
[TAU]     '/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpif90'.
[TAU] The known compiler commands are:
[TAU]   
[TAU]     /gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/ifort
[TAU]      (Intel Fortran compiler)
[TAU]   
[TAU]     /gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/icpc
[TAU]      (Intel C++ compiler)
[TAU]   /gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.5/bin/mpicc (System MPI C 
[TAU]     compiler)
[TAU]   /gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.5/bin/mpic++ (System MPI C++ 
[TAU]     compiler)
[TAU]   
[TAU]     /gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/icc
[TAU]      (Intel C compiler)
[TAU]   /gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.5/bin/mpif90 (System MPI Fortran 
[TAU]     compiler)
[TAU]   
[TAU]     /gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/ifort
[TAU]      (Intel Fortran compiler)
[TAU]   
[TAU]     /gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/icpc
[TAU]      (Intel C++ compiler)
[TAU]   
[TAU]     /gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/icc
[TAU]      (Intel C compiler)
[TAU] 
[TAU] Hints:
[TAU]   * Try one of the valid compiler commands
[TAU]   * Create and select a new target configuration that uses the 'mpif90' 
[TAU]     compiler
[TAU]   * Check loaded modules and the PATH environment variable
[TAU] 
[TAU] TAU cannot proceed with the given inputs.
[TAU] Please check the selected configuration for errors or contact 
[TAU]     <support@paratools.com> for assistance.
[TAU] 
[TAU] XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[wohlbier@riptide03 tau_mpiallgatherv]$ module list
Currently Loaded Modulefiles:
  1) costinit              4) mpi/openmpi/1.6.5     7) cmake/intel/3.2.3
  2) gcc/4.7.2             5) python/2.7.6
  3) intel/14.0.4          6) nesm/icc/ompi/4.0.7
jlinford commented 8 years ago

Hi John,

Is there any way I can have access to the intel/14.0.4 and mpi/openmpi/1.6.5 modules? This is starting to look like an MPI version issue, since it works for me with OpenMPI 1.8.0.

For this new error message, can you please try initializing with /gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/bin/tau init --use-mpi --mpi-compilers System? Without the --mpi-compilers flag, the central installation defaults to the MPI compilers that were used to build TAU in that location (this keeps TAU recompilation to a minimum, e.g. during tutorials). You can see this in the initialization message:

[TAU] '/gpfs/pkgs/hpcmp/create/sh/nesm/.dev/intel/2013/composer_xe_2013_sp1.4.211/bin/intel64/ifort'
[TAU]     wrapped by System MPI Fortran compiler
[TAU] '/gpfs/pkgs/mhpcc/openmpi/tm/intel/1.8.5/bin/mpif90'

You can change these defaults on a per-user and per-project basis. To set the default compiler commands for your user account:

tau configure -@user Target.MPI_CC=/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpicc
tau configure -@user Target.MPI_CXX=/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpicxx
tau configure -@user Target.MPI_FC=/gpfs/pkgs/hpcmp/create/sh/nesm/.mpi/openmpi/1.6.5/intel-14.0/bin/mpif90

Now any time TAU creates a target it will use these as the default compilers. To set them as the defaults for new targets in the project, replace -@user with -@project in the commands above.

You can see all configured defaults at storage level LEVEL by calling tau configure -@LEVEL, e.g. tau conf -@system.

To summarize how defaults are determined:

  1. Check the command line.
  2. If no argument on the command line, walk up the storage hierarchy (project, user, system) looking for a configured default value.
  3. If no default value found, try to calculate one from the system environment or use the hard coded default.
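The three-step resolution above can be sketched roughly as follows. This is an illustrative model only; the function and parameter names are hypothetical, not TAU Commander's actual internals:

```python
def resolve_default(key, cli_args, project, user, system, probe_environment):
    """Sketch of the default-resolution order described above
    (hypothetical names, not TAU Commander's real API)."""
    # 1. The command line always wins.
    if key in cli_args:
        return cli_args[key]
    # 2. Walk up the storage hierarchy: project, then user, then system.
    for level in (project, user, system):
        if key in level:
            return level[key]
    # 3. Fall back to probing the environment or a hard-coded default.
    return probe_environment(key)
```

So a Target.MPI_FC configured at the user level becomes the default for new targets, but is still overridden by an explicit command-line argument.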

Thanks!

wohlbier commented 8 years ago

I get the same errant results running it on a compute node. Do note that I'm using OpenMPI 1.6.5, by necessity, and not 1.8.

I will ask the project leader about getting you added to their project. I think that's the easiest way to get you access to the modules.

jlinford commented 8 years ago

It is version dependent: I built openmpi 1.6.5 from source and can replicate the error!

[jlinford@riptide04 x]$ module list
Currently Loaded Modulefiles:
  1) costinit     2) intel/14.0   3) gcc/4.8.2
[jlinford@riptide04 x]$ which mpicc
~/opt/openmpi-1.6.5/bin/mpicc
[jlinford@riptide04 x]$ which tau
/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/bin/tau

[jlinford@riptide04 x]$ tau init --mpi-compilers System --use-mpi
[TAU] Created a new project named 'x'.
[TAU] Could not find a Universal Parallel C compiler.
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/ifort' wrapped by System MPI Fortran compiler '/gpfs/home/jlinford/opt/openmpi-1.6.5/bin/mpif90'
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/ifort' wrapped by System MPI Fortran compiler '/gpfs/home/jlinford/opt/openmpi-1.6.5/bin/mpif77'
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/icc' wrapped by System MPI C compiler '/gpfs/home/jlinford/opt/openmpi-1.6.5/bin/mpicc'
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/icpc' wrapped by System MPI C++ compiler '/gpfs/home/jlinford/opt/openmpi-1.6.5/bin/mpic++'
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/icpc' wrapped by System MPI C++ compiler '/gpfs/home/jlinford/opt/openmpi-1.6.5/bin/mpicxx'
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/icpc' wrapped by System MPI C++ compiler '/gpfs/home/jlinford/opt/openmpi-1.6.5/bin/mpiCC'
[TAU] Could not find a SHMEM C compiler.
[TAU] Could not find a SHMEM C++ compiler.
[TAU] Could not find a SHMEM Fortran compiler.
[TAU] Added target 'riptide04' to project configuration 'x'.
[TAU] Added application 'x' to project configuration 'x'.
[TAU] Added measurement 'sample' to project configuration 'x'.
[TAU] Added measurement 'profile' to project configuration 'x'.
[TAU] Added measurement 'trace' to project configuration 'x'.
[TAU] Created a new experiment named 'riptide04-x-sample'.
[TAU] Installing libunwind from 'http://www.cs.uoregon.edu/research/paracomp/tau/tauprofile/dist/libunwind-1.1.tar.gz' to 
[TAU]     '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/libunwind/1485ff1108e3552d701b391882b5f9ec'
[TAU] Using libunwind source archive '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/libunwind-1.1.tar.gz'
[TAU] Extracting '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/libunwind-1.1.tar.gz' to create '/dev/shm/tmpONJtdB/libunwind-1.1'
[TAU] Configuring libunwind...
[TAU] Compiling libunwind...
[TAU] Installing libunwind...
[TAU] Verifying libunwind installation...
[TAU] Installing GNU Binutils from 'http://www.cs.uoregon.edu/research/paracomp/tau/tauprofile/dist/binutils-2.23.2.tar.gz' to 
[TAU]     '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/binutils/248bccb4d033a2d5a8492673567c7015'
[TAU] Using GNU Binutils source archive '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/binutils-2.23.2.tar.gz'
[TAU] Extracting '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/binutils-2.23.2.tar.gz' to create '/dev/shm/tmpONJtdB/binutils-2.23.2'
[TAU] Configuring GNU Binutils...
[TAU] Compiling GNU Binutils...
[TAU] Installing GNU Binutils...
[TAU] Verifying GNU Binutils installation...
[TAU] Installing TAU Performance System at '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/tau/9073766156ae0aaf27e005a761a34f32' from 'http://tau.uoregon.edu/tau.tgz'
[TAU] Using TAU Performance System source archive '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/tau.tgz'
[TAU] Extracting '/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/src/tau.tgz' to create '/dev/shm/tmpONJtdB/./tau-2.25.2'
[TAU] Configuring TAU...
[TAU] Compiling and installing TAU...
[TAU] Verifying TAU Performance System installation...
[TAU] Selected experiment 'riptide04-x-sample'.
[TAU] Experiment may be performed without application rebuild.

== Project Configuration (/gpfs/home/jlinford/taucmdr-test/examples/x/.tau/project.json) ==========================================================================================

+------+-----------+--------------+------------------------+---------------+
| Name |  Targets  | Applications |      Measurements      | # Experiments |
+======+===========+==============+========================+===============+
|    x | riptide04 |      x       | sample, profile, trace |       1       |
+------+-----------+--------------+------------------------+---------------+

== Targets in project 'x' =========================================================================================================================================================

+-----------+---------+-----------+----------------+---------------+-----------------+
|   Name    | Host OS | Host Arch | Host Compilers | MPI Compilers | SHMEM Compilers |
+===========+=========+===========+================+===============+=================+
| riptide04 |  Linux  |  x86_64   |     Intel      |    System     |      None       |
+-----------+---------+-----------+----------------+---------------+-----------------+

== Applications in project 'x' ====================================================================================================================================================

+------+--------+----------+-----+------+--------+-------+-----+
| Name | OpenMP | Pthreads | MPI | CUDA | OpenCL | SHMEM | MPC |
+======+========+==========+=====+======+========+=======+=====+
|    x |   No   |    No    | Yes |  No  |   No   |  No   | No  |
+------+--------+----------+-----+------+--------+-------+-----+

== Measurements in project 'x' ====================================================================================================================================================

+---------+---------+-------+--------+--------------+----------------+--------------+-----------+----------+
|  Name   | Profile | Trace | Sample | Source Inst. | Compiler Inst. | OpenMP Inst. | I/O Inst. | Wrap MPI |
+=========+=========+=======+========+==============+================+==============+===========+==========+
|  sample |   tau   | none  |  Yes   |    never     |     never      |     none     |    No     |   Yes    |
+---------+---------+-------+--------+--------------+----------------+--------------+-----------+----------+
| profile |   tau   | none  |   No   |  automatic   |    fallback    |     none     |    No     |   Yes    |
+---------+---------+-------+--------+--------------+----------------+--------------+-----------+----------+
|   trace |  none   | otf2  |   No   |  automatic   |    fallback    |     none     |    No     |   Yes    |
+---------+---------+-------+--------+--------------+----------------+--------------+-----------+----------+

== Experiments in project 'x' =====================================================================================================================================================

+--------------------+--------+-----------+-----------+-------------+-------------+
|        Name        | Trials | Data Size |  Target   | Application | Measurement |
+====================+========+===========+===========+=============+=============+
| riptide04-x-sample |   0    |   0.0B    | riptide04 |      x      |   sample    |
+--------------------+--------+-----------+-----------+-------------+-------------+

Selected experiment: riptide04-x-sample

[jlinford@riptide04 x]$ tau mpif90 main.f90 
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/ifort' wrapped by System MPI Fortran compiler '/gpfs/home/jlinford/opt/openmpi-1.6.5/bin/mpif90'
[TAU] TAU_OPTIONS=-optRevert -optNoCompInst
[TAU] TAU_MAKEFILE=/gpfs/pkgs/hpcmp/PTOOLS/pkgs/taucmdr-0.1/.system/tau/9073766156ae0aaf27e005a761a34f32/x86_64/lib/Makefile.tau-icpc-mpi
[TAU] /gpfs/home/jlinford/opt/openmpi-1.6.5/bin/mpif90 -g main.f90

[jlinford@r2n65 x]$ mpif90 main.f90 
[jlinford@r2n65 x]$ mpirun -np 4 ./a.out 
 rank:            0  data:         1000        2000        3000        4000
 rank:            0  data:        -2000       -4000       -6000       -8000
 rank:            1  data:         1000        2000        3000        4000
 rank:            1  data:            0       -4000           0           0
 rank:            2  data:         1000        2000        3000        4000
 rank:            2  data:            0           0       -6000           0
 rank:            3  data:         1000        2000        3000        4000
 rank:            3  data:            0           0           0       -8000

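For reference, the in-place allgather semantics the reproducer relies on (matching the correct output above) can be modeled in plain Python. This is an illustrative simulation of what MPI_Allgather with MPI_IN_PLACE should do, not MPI code:

```python
def allgather_in_place(bufs):
    """Model MPI_Allgather with MPI_IN_PLACE on every rank: slot r of
    rank r's buffer already holds that rank's contribution, and after
    the call every rank holds all contributions."""
    contributions = [bufs[rank][rank] for rank in range(len(bufs))]
    return [list(contributions) for _ in bufs]

# Reproduce the allgather half of main.f90 on 4 simulated ranks.
nranks = 4
bufs = [[0] * nranks for _ in range(nranks)]
for rank in range(nranks):
    bufs[rank][rank] = (rank + 1) * 1000
result = allgather_in_place(bufs)
# Every simulated rank now sees [1000, 2000, 3000, 4000],
# matching the correct output printed above.
```

The failure under tau_exec suggests TAU's MPI wrapper mishandles the MPI_IN_PLACE/MPI_DATATYPE_NULL arguments when passed from Fortran with this OpenMPI version, rather than the program itself being incorrect.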
tau mpirun -np 4 ./a.out 
[TAU] 
[TAU] == BEGIN Experiment at 2016-09-21 13:59:25.350092 =================================================================================================================================
[TAU] 
[TAU] mpirun -np 4 tau_exec -T mpi,icpc -ebs ./a.out
[r2n65:7126] *** An error occurred in MPI_Allgather
[r2n65:7126] *** on communicator MPI_COMM_WORLD
[r2n65:7126] *** MPI_ERR_TYPE: invalid datatype
[r2n65:7126] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 6956 on
node r2n65 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[r2n65:06953] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[r2n65:06953] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[TAU] ***********************************************************************************************************************************************************************************
[TAU] 
[TAU] WARNING
[TAU] 
[TAU] Return code 3 from 'mpirun -np 4 tau_exec -T mpi,icpc -ebs ./a.out'
[TAU] 
[TAU] ***********************************************************************************************************************************************************************************
[TAU] 
[TAU] == END Experiment at 2016-09-21 13:59:25.521992 ===================================================================================================================================
[TAU] 
[TAU] ***********************************************************************************************************************************************************************************
[TAU] 
[TAU] WARNING
[TAU] 
[TAU] Program exited with nonzero status code: 3
[TAU] 
[TAU] ***********************************************************************************************************************************************************************************
[TAU] Trial 3 produced 3 profile files.
wohlbier commented 8 years ago

Phew! I was beginning to think I was crazy.

I wonder if this is related to why my application won't run the sample experiment. When I tau select sample and run it, it just hangs. Any idea?

jlinford commented 8 years ago

Appears to work on Riptide as of commit 469f0b9343fa6825a7e91845d33cd7186fae36da.

[jlinford@riptide04 x]$ module list
Currently Loaded Modulefiles:
  1) costinit        2) gcc/gnu/4.9.3   3) intel/14.0
[jlinford@riptide04 x]$ which mpicc
~/opt/openmpi-1.6.5/bin/mpicc
[jlinford@riptide04 x]$ tau init --mpi-compilers System --use-mpi
....
Selected experiment: riptide04-x-sample

[jlinford@riptide04 x]$ tau mpif90 main.f90 
[TAU] '/gpfs/pkgs/mhpcc/intel/wrapper/ifort' wrapped by System MPI Fortran compiler '/gpfs/home/jlinford/opt/openmpi-1.6.5/bin/mpif90'
[TAU] TAU_OPTIONS=-optRevert -optNoCompInst
[TAU] TAU_MAKEFILE=/gpfs/home/jlinford/workspace/taucmdr/.system/tau/9073766156ae0aaf27e005a761a34f32/x86_64/lib/Makefile.tau-icpc-mpi
[TAU] /gpfs/home/jlinford/opt/openmpi-1.6.5/bin/mpif90 -g main.f90
[jlinford@riptide04 x]$ tau mpirun -np 4 ./a.out 
[TAU] 
[TAU] == BEGIN Experiment at 2016-09-22 22:01:48.526429 =================================================================================================================================
[TAU] 
[TAU] mpirun -np 4 tau_exec -T mpi,icpc -ebs ./a.out
--------------------------------------------------------------------------
WARNING: There are more than one active ports on host 'riptide04', but the
default subnet GID prefix was detected on more than one of these
ports.  If these ports are connected to different physical IB
networks, this configuration will fail in Open MPI.  This version of
Open MPI requires that every physically separate IB subnet that is
used between connected MPI processes must have different subnet ID
values.

Please see this FAQ entry for more details:

  http://www.open-mpi.org/faq/?category=openfabrics#ofa-default-subnet-gid

NOTE: You can turn off this warning by setting the MCA parameter
      btl_openib_warn_default_gid_prefix to 0.
--------------------------------------------------------------------------
 rank:            3  data:         1000        2000        3000        4000
 rank:            3  data:            0           0           0       -8000
 rank:            0  data:         1000        2000        3000        4000
 rank:            0  data:        -2000       -4000       -6000       -8000
 rank:            1  data:         1000        2000        3000        4000
 rank:            1  data:            0       -4000           0           0
 rank:            2  data:         1000        2000        3000        4000
 rank:            2  data:            0           0       -6000           0
[riptide04:24843] 3 more processes have sent help message help-mpi-btl-openib.txt / default subnet prefix
[riptide04:24843] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[TAU] 
[TAU] == END Experiment at 2016-09-22 22:01:51.455543 ===================================================================================================================================
[TAU] 
[TAU] Trial 0 produced 4 profile files.