Open GoogleCodeExporter opened 9 years ago
Waiting on further comment. We had a possible solution over an email chat but
it turns out that it doesn't cover all MPI implementations.
Original comment by chapp...@gmail.com
on 4 May 2012 at 12:51
The attached patch includes the "MPI rank" in the output filename, at least for
the openmpi MPI implementation.
Background: MPI provides methods for parallel computing beyond
multithreading, i.e. parallel processing without shared memory. A single
job may run on many processors distributed across many nodes simultaneously,
so getpid() is not guaranteed to return an id that is unique within the job.
In the attached patch, I explicitly include the MPI rank (as provided by the
openmpi MPI implementation) when generating file names for the profiler.
A method independent of the specific MPI implementation would require explicit
calls to the MPI library in order to obtain the process rank, which in turn
requires initializing MPI. This has two disadvantages: for one, libtcmalloc
would then depend on MPI (which is not that common outside the high-performance
computing (HPC) context); for another, interacting with the MPI implementation
from within libtcmalloc will violate assumptions made by the program being
debugged.
My patch evaluates an environment variable defined by the openmpi
implementation, which according to their FAQ [1] is guaranteed to be stable in
future releases.
The patch is written in such a way that it is easy to add environment variables
related to other MPI implementations; Intel MPI, for example, provides
$PMI_RANK [2]. I implemented this only for openmpi, as that's the
implementation I can test on our HPC cluster.
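
For readers without the attachment, here is a rough sketch of the general idea, not the patch itself. It assumes the openmpi variable is OMPI_COMM_WORLD_RANK (the name listed in their FAQ [1]) and uses PMI_RANK for Intel MPI as mentioned above. The function names and the "<prefix>.rank<N>.<pid>" naming scheme are hypothetical and only serve as illustration; the profiler's real naming may differ.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Environment variables that an MPI launcher may set to the process rank.
// Assumption: OMPI_COMM_WORLD_RANK for openmpi, PMI_RANK for Intel MPI.
// Supporting another implementation means adding one entry here.
static const char* const kRankEnvVars[] = {"OMPI_COMM_WORLD_RANK", "PMI_RANK"};

// Returns the MPI rank taken from the first usable variable, or -1 if
// none is set or none holds a valid non-negative integer.
// No MPI library call is made, so MPI is never initialized from here.
long MpiRankFromEnv() {
  for (const char* name : kRankEnvVars) {
    const char* value = std::getenv(name);
    if (value == nullptr || *value == '\0') continue;
    errno = 0;
    char* end = nullptr;
    const long rank = std::strtol(value, &end, 10);
    if (errno == 0 && *end == '\0' && rank >= 0) return rank;
  }
  return -1;
}

// Hypothetical naming scheme: "<prefix>.rank<N>.<pid>" when a rank is
// known, otherwise "<prefix>.<pid>" as a pid-only fallback.
std::string ProfileFileName(const std::string& prefix, long rank, long pid) {
  std::string name = prefix;
  if (rank >= 0) name += ".rank" + std::to_string(rank);
  name += "." + std::to_string(pid);
  return name;
}
```

A caller would pass the result of getpid() as the pid argument. Outside of an MPI launcher neither variable is set, so the file name falls back to the pid-only form and non-MPI users see no change.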
[1] http://www.open-mpi.org/faq/?category=running#mpi-environmental-variables
[2] https://software.intel.com/de-de/forums/topic/284007
--
c u
henning
Original comment by henning....@googlemail.com
on 4 Aug 2015 at 6:35
Original issue reported on code.google.com by
niuqingp...@gmail.com
on 2 Apr 2012 at 6:35