pmodels / pilgrim

Logger for MPI communication

multiple definition build errors #22

Closed jczhang07 closed 1 year ago

jczhang07 commented 1 year ago

@wangvsa @yfguo

I followed the build instructions, but the build ended with many "multiple definition of xxx" errors:

cd ~/pilgrim
./autogen.sh
./configure
make

...
CCLD     pilgrim_app_generator
/usr/bin/ld: src/decoder/pilgrim_app_generator-pilgrim_metadata_decoder.o:/home/jczhang/pilgrim/./include/pilgrim_logger.h:17: multiple definition of `g_mpi_rank'; src/decoder/pilgrim_app_generator-pilgrim_app_generator.o:/home/jczhang/pilgrim/./include/pilgrim_logger.h:17: first defined here
/usr/bin/ld: src/decoder/pilgrim_app_generator-pilgrim_metadata_decoder.o:/home/jczhang/pilgrim/./include/pilgrim_logger.h:18: multiple definition of `g_mpi_size'; src/decoder/pilgrim_app_generator-pilgrim_app_generator.o:/home/jczhang/pilgrim/./include/pilgrim_logger.h:18: first defined here
/usr/bin/ld: src/decoder/pilgrim_app_generator-pilgrim_metadata_decoder.o:/home/jczhang/pilgrim/./include/pilgrim_logger.h:19: multiple definition of `g_program_start_time'; src/decoder/pilgrim_app_generator-pilgrim_app_generator.o:/home/jczhang/pilgrim/./include/pilgrim_logger.h:19: first defined here
...

wangvsa commented 1 year ago

Hi @jczhang07 I couldn't reproduce the error on my machines. Which compiler (and version) did you use?

jczhang07 commented 1 year ago

@wangvsa I used openmpi-4.1.3 + gcc-11.3.0 and nothing special. I attached the whole output here. output.txt

jczhang07 commented 1 year ago

It also failed on my Mac, with errors like:

duplicate symbol '_g_mpi_size' in:
    src/.libs/pilgrim_wrappers.o
    src/.libs/pilgrim_mpi_objects.o
duplicate symbol '_g_program_start_time' in:
    src/.libs/pilgrim_wrappers.o
    src/.libs/pilgrim_mpi_objects.o
ld: 111 duplicate symbols for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

wangvsa commented 1 year ago

Thank you. I believe the error is caused by a recent version of gcc. Since GCC 10, the default has changed from -fcommon to -fno-common, which rejects such duplicate definitions. A temporary fix is to add -fcommon at configure time:

CFLAGS=-fcommon ./configure

I will also fix it in the next PR.
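For context, here is a minimal sketch of the usual long-term alternative to -fcommon. The variable names come from the linker output above; their types, and the split into a "header part" and a "source part", are assumptions for illustration and not Pilgrim's actual code. A header line such as "int g_mpi_rank;" is a tentative definition, so every .c file that includes it emits its own copy. Declaring the variable extern in the header and defining it in exactly one .c file avoids the clash under -fno-common.

```c
/* Sketch only: the types are guessed, and both "files" are shown in one
 * translation unit so that the snippet compiles on its own. */

/* --- what the header would contain: declarations, no storage --- */
extern int g_mpi_rank;
extern int g_mpi_size;
extern double g_program_start_time;

/* --- what exactly one .c file would contain: the definitions --- */
int g_mpi_rank = 0;
int g_mpi_size = 1;
double g_program_start_time = 0.0;

/* Hypothetical helper, present only to show the variables in use. */
int is_last_rank(void) {
    return g_mpi_rank == g_mpi_size - 1;
}
```

With this layout, any number of source files can include the header, because only one object file carries the actual storage.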

jczhang07 commented 1 year ago

@wangvsa OK, with that I got further, but there were new errors:

In file included from src/pilgrim_wrappers.c:5:
/nfs/gce/software/custom/linux-ubuntu22.04-x86_64/spack/opt/spack/linux-ubuntu22.04-x86_64/gcc-11.3.0/openmpi-4.1.3-qrpnszy/include/mpi.h:1291:47: error: expected declaration specifiers or '...' before '(' token
 1291 | #define MPI_Aint_add(base, disp) ((MPI_Aint) ((char *) (base) + (disp)))
      |                                               ^
src/pilgrim_wrappers.c:153:10: note: in expansion of macro 'MPI_Aint_add'
  153 | MPI_Aint MPI_Aint_add(MPI_Aint base, MPI_Aint disp) { return imp_MPI_Aint_add(base, disp); }
      |          ^~~~~~~~~~~~
src/pilgrim_wrappers.c: In function 'imp_MPI_Keyval_free':
src/pilgrim_wrappers.c:1139:9: warning: 'PMPI_Keyval_free' is deprecated: PMPI_Keyval_free was deprecated in MPI-2.0; PMPI_Comm_free_keyval instead. [-Wdeprecated-declarations]
 1139 |         PILGRIM_TRACING_1(int, MPI_Keyval_free, (keyval));
      |         ^~~~~~~~~~~~~~~~~
In file included from src/pilgrim_wrappers.c:5:

wangvsa commented 1 year ago

Interesting, I haven't tested with openmpi before. It seems that openmpi implements MPI_Aint_add and MPI_Aint_diff as macros instead of functions, which broke my wrappers for them. A quick (and ugly) fix is to simply comment out lines 153 and 6027 of src/pilgrim_wrappers.c. I'll need to come up with a permanent solution though.
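One possible direction for a permanent solution, sketched under assumptions: in C, wrapping a function name in parentheses stops the preprocessor from expanding a function-like macro of the same name, so a real function can coexist with the macro. The demo_* names below are hypothetical stand-ins for MPI_Aint and MPI_Aint_add so that the snippet compiles without mpi.h; this is not necessarily the fix adopted in Pilgrim. An alternative would be an #undef of the macro before the wrapper definition.

```c
#include <stddef.h>

/* Stand-ins for MPI_Aint and for an MPI_Aint_add that is a macro, as in
 * the mpi.h excerpt above. All demo_* names are hypothetical. */
typedef ptrdiff_t demo_aint;
#define demo_aint_add(base, disp) ((demo_aint)((base) + (disp)))

static int wrapper_calls = 0; /* counts calls that reach the real function */

/* The parentheses around the name keep the preprocessor from treating
 * this as a macro invocation, so the function definition compiles. */
demo_aint (demo_aint_add)(demo_aint base, demo_aint disp) {
    wrapper_calls++;
    return base + disp;
}
```

A plain call such as demo_aint_add(40, 2) still expands the macro, while (demo_aint_add)(40, 2) calls the function. In this sketch, the macro form therefore bypasses the wrapper, so calls made through the macro would not be traced.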

jczhang07 commented 1 year ago

@wangvsa Commenting out the two lines works! I was able to build. I ran my test:

$ mpirun -n 1 -x LD_PRELOAD=/home/jczhang/pilgrim/lib/libpilgrim.so ./ex50 -bs 8 -pc_type pbjacobi
$ cd pilgrim-logs
$ ls
funcs.dat  grammars.dat  pilgrim.mt

How can I view the binary funcs.dat? I want to see which MPI functions were executed in the run, but I do not care about their parameters, execution time, or other details.

wangvsa commented 1 year ago

Oh, I have a simple post-processing tool for that, called pilgrim2text, which you should have already built. Run ./pilgrim2text /path/to/pilgrim-logs. It will write out one text file per MPI rank under /path/to/pilgrim-logs/_text.

For now it only prints the function names. If you want more information, let me know; it's easy to add more.
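To answer "which MPI functions were executed" from such a text file, a small helper along these lines might be enough. It is a sketch that assumes each output line holds exactly one function name; the thread does not confirm the file layout, and collect_unique_names, MAX_NAMES, and MAX_LEN are hypothetical names that are not part of Pilgrim.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAMES 512  /* assumed upper bound on distinct function names */
#define MAX_LEN   128  /* assumed upper bound on one line, including '\0' */

/* Reads one name per line from `in` and stores each distinct name once in
 * `names`. Returns the number of distinct names (at most MAX_NAMES).
 * Lines longer than MAX_LEN - 1 characters are split by fgets and would
 * be treated as several names. */
size_t collect_unique_names(FILE *in, char names[][MAX_LEN]) {
    char line[MAX_LEN];
    size_t count = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';  /* strip the line ending */
        if (line[0] == '\0')
            continue;                         /* skip blank lines */

        /* Linear search is fine for a few hundred names. */
        size_t i = 0;
        while (i < count && strcmp(names[i], line) != 0)
            i++;
        if (i == count && count < MAX_NAMES) {
            strcpy(names[count], line);       /* fits: both are MAX_LEN */
            count++;
        }
    }
    return count;
}
```

A caller would open one of the per-rank files with fopen, pass the stream to collect_unique_names, and print the first `count` entries of the array.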

jczhang07 commented 1 year ago

Using pilgrim2text I got what I want. Thanks a lot.

wangvsa commented 1 year ago

I have added -fcommon to the default makefile. PR