darshan-hpc / darshan

Darshan I/O characterization tool

issue with darshan interface for mpif90 of hpc-sdk-22.5 #973

Open pankajd-57 opened 8 months ago

pankajd-57 commented 8 months ago

Hi,

I have darshan version 3.4.4 and hpc-sdk-22.5.

I need to generate mpicc and mpif90 wrapper interfaces for Darshan.

For mpif90, I have:

```
$ mpif90 --version
nvfortran 22.5-0 64-bit target on x86-64 Linux -tp zen2
NVIDIA Compilers and Tools
Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
```

I had to change a line in "darshan-gen-fortran.pl" from

```perl
my $version_out = `$input_file -V 2>/dev/null |head -n 1`;
```

to

```perl
my $version_out = `$input_file --version 2>/dev/null |head -n 1`;
```

in order to generate the dmpif90 interface using:

```
./darshan-gen-fortran.pl /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/bin/mpif90 --output dmpif90
```
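For reference, the version probe that darshan-gen-fortran.pl performs can be sketched in shell. `probe_version` is a hypothetical helper name, not part of Darshan; the assumption is that the compiler answers `--version` (nvfortran from hpc-sdk rejects `-V`, which is why the one-line change above is needed):

```shell
# Sketch of the compiler-version probe in darshan-gen-fortran.pl.
# probe_version is a hypothetical helper name for illustration only.
probe_version() {
    # $1: compiler command; print only the first line of its version banner
    "$1" --version 2>/dev/null | head -n 1
}

# Usage (assumes mpif90 resolves to the hpc-sdk nvfortran wrapper):
#   probe_version mpif90
```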

This generates the interface, but it produces errors:

```
$ dmpif90 --version
nm: '/tmp/tmp.o9LB8Yb13K': No such file
nvfortran 22.5-0 64-bit target on x86-64 Linux -tp zen2
NVIDIA Compilers and Tools
Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/compilers/lib/f90main.o: in function `main': nvcsGlv2yRa0bPl.ll:(.text+0x2f): undefined reference to `MAIN_'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_write_at_all_begin'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_iwrite_at'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/compilers/lib/libnvf.so: undefined reference to `__wrap_aio_write'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_read_at_all_begin'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_write_all_begin'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_read'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/compilers/lib/libnvf.so: undefined reference to `__wrap_aio_read'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_read_ordered'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_read_at_all'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_write_ordered_begin'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_Init'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_open'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_iread_at'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_iwrite'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_read_all'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_write_all'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_iwrite_shared'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_set_view'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_write'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_Finalize'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_read_at'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_iread'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_read_all_begin'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_write_ordered'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_Init_thread'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_read_shared'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_write_shared'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/compilers/lib/libnvf.so: undefined reference to `__wrap_aio_return'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_read_ordered_begin'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_iread_shared'
/usr/bin/ld: /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib/libmpi_mpifh.so: undefined reference to `__wrap_PMPI_File_sync'
```

whereas the mpicc interface (dmpicc) seems to be OK:

```
$ dmpicc --version
nm: '/tmp/tmp.4NHWUedf4i': No such file
nvc 22.5-0 64-bit target on x86-64 Linux -tp zen2
NVIDIA Compilers and Tools
Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/crt1.o: in function `_start': (.text+0x24): undefined reference to `main'
```

This causes a compilation failure for QE-7.2 when using Darshan's mpif90 interface. Please help.

pankajd-57 commented 8 months ago

This is seen in the Darshan interface (wrapper) script:

```shell
$FC "${noshrargs[@]}" -I/opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/include -I/opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib -L/opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib -rpath /opt/hpc-sdk-22.5/Linux_x86_64/22.5/comm_libs/hpcx/hpcx-2.11/ompi/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -Wl,-Map,$tmpfile $LDFLAGS -o $binfile >& /dev/null
```

shanedsnyder commented 8 months ago

To be honest, we don't really use these wrapper generator scripts anymore and haven't for quite some time. They are pretty difficult to generalize across MPI implementations, and we eventually found other options that are easier to use (e.g., using software modules for the Cray environments we are deployed in). I realize that's not super helpful, but I wanted to start off by mentioning that these aren't really supported and we should maybe just consider removing them.

Would something like LD_PRELOAD work for your use cases? That's a pretty foolproof way of having Darshan interpose itself that doesn't rely on us hacking up a compiler wrapper, but it does require that your executables be dynamically linked.
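As a hedged sketch of the LD_PRELOAD approach (the library path and application name are assumptions; adjust them to your darshan-runtime install):

```shell
# Sketch: interpose Darshan at run time instead of at link time.
# The path below is an assumption; point it at your installed libdarshan.so.
DARSHAN_LIB=${DARSHAN_LIB:-/opt/darshan/lib/libdarshan.so}

# Open MPI's mpirun can export the variable to every rank with -x, e.g.:
#   mpirun -np 4 -x LD_PRELOAD="$DARSHAN_LIB" ./pw.x
# or, for a serial sanity check:
#   LD_PRELOAD="$DARSHAN_LIB" ./pw.x
echo "would preload: $DARSHAN_LIB"
```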

I don't really have much to suggest on "fixing" the compiler wrapper script, but from the example output you shared, it doesn't look like the script you're ending up with is even doing the most basic task of attempting to link Darshan in (-ldarshan). You might be able to fiddle more with the script to see if you can get the desired behavior -- it does have lots of comments about what it's trying to do.

carns commented 8 months ago

For a little more background: those wrapper generator scripts were designed for MPICH-based MPI builds, but it looks like you probably have an Open MPI-based build. That corresponds to this section of the documentation (basically, your options are to use explicit link arguments at application compile time or LD_PRELOAD at application run time):

https://www.mcs.anl.gov/research/projects/darshan/docs/darshan-runtime.html#_linux_clusters_using_open_mpi
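A sketch of the explicit-link alternative mentioned above (the prefix, object, and executable names are assumptions for illustration, not a tested recipe):

```shell
# Sketch: link Darshan into the executable explicitly at build time.
# DARSHAN_PREFIX is an assumption; set it to your darshan-runtime prefix.
DARSHAN_PREFIX=${DARSHAN_PREFIX:-/opt/darshan}

# e.g., add libdarshan to the link line:
#   mpif90 pw.o -o pw.x -L"$DARSHAN_PREFIX/lib" -ldarshan
# and make sure the loader can find it at run time:
#   export LD_LIBRARY_PATH="$DARSHAN_PREFIX/lib:$LD_LIBRARY_PATH"
echo "would link against: $DARSHAN_PREFIX/lib"
```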

If you were using an MPICH derivative we would probably recommend trying a profile configuration rather than the wrapper generators (see https://www.mcs.anl.gov/research/projects/darshan/docs/darshan-runtime.html#_using_a_profile_configuration). That's more likely to work with recent MPICH releases (and we should update the documentation accordingly soon). I don't think there is an analogous configuration option for OpenMPI though.