open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

When I run an MPI program under ASan, it reports some memory leaks #12584

Open helin1995 opened 3 months ago

helin1995 commented 3 months ago

Background information

x86, CentOS 7.6; Open MPI version: 5.0.3; GCC version: 7.3.0

CMake config: I want to use ASan in my code, so I add -fsanitize=address to my CMake settings:

set(CMAKE_CXX_FLAGS_DEBUG "$ENV{CXXFLAGS} -ffunction-sections -O0 -Wall -g2 -ggdb -fsanitize=address -fsanitize-recover=address,all -fno-omit-frame-pointer -fno-stack-protector")
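For reference, the same build can be reproduced outside CMake with the Open MPI wrapper compiler; a minimal sketch, keeping only the essential flags (the `main.cpp` and `repro` names here are assumptions):

```
# compile and link with ASan enabled; -fno-omit-frame-pointer keeps stack traces readable
$ mpicxx -O0 -g -fsanitize=address -fno-omit-frame-pointer main.cpp -o repro
# run a single rank; LeakSanitizer prints its report at process exit
$ mpirun -n 1 ./repro
```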

and my code is:

#include <mpi.h>

int main(int argc, char *argv[])
{
    int result = 0;
    int provided;
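    /* request MPI_THREAD_MULTIPLE; the level actually granted is returned in provided */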
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Finalize();

    return result;
}

When I compile the code and then run the program, it reports the following:

==51012==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 5040 byte(s) in 7 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f044829decf in ompi_op_base_op_select (/usr/local/openmpi/lib/libmpi.so.40+0x1f5ecf)

Direct leak of 5040 byte(s) in 7 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f044829eb48 in avx_component_op_query (/usr/local/openmpi/lib/libmpi.so.40+0x1f6b48)

Direct leak of 1984 byte(s) in 2 object(s) allocated from:
    #0 0x7f044a55c488 in __interceptor_calloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:70
    #1 0x7f0443a93bbd in opal_hash_table_init2 (/usr/local/openmpi/lib/libopen-pal.so.80+0x24bbd)

Direct leak of 384 byte(s) in 12 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f044372c0a4 in pmix_hash_fetch (/usr/local/openmpi/lib/libpmix.so.2+0x1020a4)

Direct leak of 144 byte(s) in 2 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f04481043b5 in ompi_comm_init_mpi3 (/usr/local/openmpi/lib/libmpi.so.40+0x5c3b5)

Direct leak of 96 byte(s) in 3 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f04437a643e in fetch_nodeinfo (/usr/local/openmpi/lib/libpmix.so.2+0x17c43e)

Direct leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f04481280b0 in ompi_group_allocate_plist_w_procs (/usr/local/openmpi/lib/libmpi.so.40+0x800b0)

Direct leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f044813fedf in ompi_group_from_pset (/usr/local/openmpi/lib/libmpi.so.40+0x97edf)

Direct leak of 64 byte(s) in 2 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f04437a6bbe in fetch_appinfo (/usr/local/openmpi/lib/libpmix.so.2+0x17cbbe)

Direct leak of 64 byte(s) in 2 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f04437a5f69 in fetch_sessioninfo (/usr/local/openmpi/lib/libpmix.so.2+0x17bf69)

Direct leak of 56 byte(s) in 4 object(s) allocated from:
    #0 0x7f044a4f7100 in __interceptor___strdup ../../.././libsanitizer/asan/asan_interceptors.cc:576
    #1 0x7f044368cbec in get_data (/usr/local/openmpi/lib/libpmix.so.2+0x62bec)

Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f0443759504 in PMIx_Value_create (/usr/local/openmpi/lib/libpmix.so.2+0x12f504)

Direct leak of 15 byte(s) in 1 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f044810435a in ompi_comm_init_mpi3 (/usr/local/openmpi/lib/libmpi.so.40+0x5c35a)

Direct leak of 14 byte(s) in 1 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f04481045c4 in ompi_comm_init_mpi3 (/usr/local/openmpi/lib/libmpi.so.40+0x5c5c4)

Direct leak of 14 byte(s) in 1 object(s) allocated from:
    #0 0x7f044a4f7100 in __interceptor___strdup ../../.././libsanitizer/asan/asan_interceptors.cc:576
    #1 0x7f044368c4fd in get_data (/usr/local/openmpi/lib/libpmix.so.2+0x624fd)

Indirect leak of 992 byte(s) in 1 object(s) allocated from:
    #0 0x7f044a55c488 in __interceptor_calloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:70
    #1 0x7f0443a93bbd in opal_hash_table_init2 (/usr/local/openmpi/lib/libopen-pal.so.80+0x24bbd)

Indirect leak of 320 byte(s) in 8 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f04481011c5 in ompi_attr_set_fint (/usr/local/openmpi/lib/libmpi.so.40+0x591c5)

Indirect leak of 239 byte(s) in 11 object(s) allocated from:
    #0 0x7f044a4f7100 in __interceptor___strdup ../../.././libsanitizer/asan/asan_interceptors.cc:576
    #1 0x7f044375a601 in pmix_bfrops_base_tma_value_xfer.constprop.110 (/usr/local/openmpi/lib/libpmix.so.2+0x130601)

Indirect leak of 14 byte(s) in 1 object(s) allocated from:
    #0 0x7f044a4f7100 in __interceptor___strdup ../../.././libsanitizer/asan/asan_interceptors.cc:576
    #1 0x7f0443734f97 in pmix_bfrops_base_value_load (/usr/local/openmpi/lib/libpmix.so.2+0x10af97)

Indirect leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7f044a55c2b0 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:62
    #1 0x7f044812d62d in ompi_proc_self (/usr/local/openmpi/lib/libmpi.so.40+0x8562d)

Indirect leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7f044a55c488 in __interceptor_calloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:70
    #1 0x7f04481281ca in ompi_group_allocate (/usr/local/openmpi/lib/libmpi.so.40+0x801ca)

SUMMARY: AddressSanitizer: 14688 byte(s) leaked in 70 allocation(s).

Problem

My questions are:

1. Are these memory leaks attributable to Open MPI? Will they affect my program?
2. If they are false positives, is there a way to suppress them?

Thanks.
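On the suppression question: LeakSanitizer reads a suppressions file named in LSAN_OPTIONS and matches each `leak:` pattern against the function, source file, or module names in a leak's stack. A minimal sketch for the report above (the `lsan.supp` and `repro` names are assumptions; note that suppressing whole libraries also hides any genuine leaks inside them):

```
$ cat <<EOF > lsan.supp
# hypothetical suppressions: hide one-time allocations inside Open MPI / PMIx
leak:libmpi.so
leak:libopen-pal.so
leak:libpmix.so
EOF
$ LSAN_OPTIONS=suppressions=lsan.supp ./repro
```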

hppritcha commented 2 months ago

Thanks for trying this. It looks like some of these may well be attributable to Open MPI. Most of the functions caught will only run once, during the call to MPI_Init, but I'm not sure about the others. Not sure about your second question.
I'll first see if I can reproduce this on the main branch.

jedbrown commented 1 month ago

I don't even get this far when using address sanitizer.

$ cat <<EOF >> test_mpi.c
#include <mpi.h>
int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);
  MPI_Finalize();

  return 0;
}
EOF

$ mpicc -fsanitize=address test_mpi.c
$ gdb -ex r ./a.out
[...]
Thread 1 "a.out" received signal SIGILL, Illegal instruction.
0x00007ffff7852237 in mprotect () from /usr/lib/libasan.so.8
(gdb) bt
#0  0x00007ffff7852237 in mprotect () from /usr/lib/libasan.so.8
#1  0x00007ffff71d18af in mca_base_patcher_patch_apply_binary () from /usr/lib/libopen-pal.so.80
#2  0x00007ffff71d1997 in ?? () from /usr/lib/libopen-pal.so.80
#3  0x00007ffff71cc759 in ?? () from /usr/lib/libopen-pal.so.80
#4  0x00007ffff716e42a in mca_base_framework_components_open () from /usr/lib/libopen-pal.so.80
#5  0x00007ffff71cc228 in ?? () from /usr/lib/libopen-pal.so.80
#6  0x00007ffff716f640 in mca_base_framework_open () from /usr/lib/libopen-pal.so.80
#7  0x00007ffff71a6a3a in opal_common_ucx_mca_register () from /usr/lib/libopen-pal.so.80
#8  0x00007ffff763d60e in ?? () from /usr/lib/libmpi.so.40
#9  0x00007ffff716e42a in mca_base_framework_components_open () from /usr/lib/libopen-pal.so.80
#10 0x00007ffff76365b0 in ?? () from /usr/lib/libmpi.so.40
#11 0x00007ffff716f640 in mca_base_framework_open () from /usr/lib/libopen-pal.so.80
#12 0x00007ffff748c05c in ?? () from /usr/lib/libmpi.so.40
#13 0x00007ffff748d599 in ompi_mpi_instance_init () from /usr/lib/libmpi.so.40
#14 0x00007ffff747cd15 in ompi_mpi_init () from /usr/lib/libmpi.so.40
#15 0x00007ffff74c0364 in PMPI_Init () from /usr/lib/libmpi.so.40
#16 0x0000555555555250 in main ()
```
Package: Open MPI builduser@buildhost Distribution
Open MPI: 5.0.3
Open MPI repo revision: v5.0.3
Open MPI release date: Apr 08, 2024
MPI API: 3.1.0
Ident string: 5.0.3
Prefix: /usr
Configured architecture: x86_64-pc-linux-gnu
Configured by: builduser
Configured on: Sun Apr 28 16:52:11 UTC 2024
Configure host: buildhost
Configure command line: '--prefix=/usr' '--enable-builtin-atomics' '--enable-memchecker' '--enable-mpi-fortran=all' '--enable-pretty-print-stacktrace' '--libdir=/usr/lib' '--sysconfdir=/etc/openmpi' '--with-hwloc=external' '--with-libevent=external' '--with-pmix=external' '--with-prrte=external' '--with-valgrind' '--with-ucc=/usr' '--with-ucx=/usr' '--with-cuda=/opt/cuda' '--with-cuda-libdir=/usr/lib' '--with-rocm=/opt/rocm' '--enable-mca-dso=accelerator_cuda,accelerator_rocm,btl_smcuda,rcache_gpusm,rcache_rgpusm,coll_ucc,scoll_ucc' '--with-show-load-errors=^accelerator,rcache,coll/ucc'
Built by: builduser
Built on: Sun Apr 28 16:52:11 UTC 2024
Built host: buildhost
C bindings: yes
Fort mpif.h: yes (all)
Fort use mpi: yes (full: ignore TKR)
Fort use mpi size: deprecated-ompi-info-value
Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to limitations in the gfortran compiler and/or Open MPI, does not support the following: array subsections, direct passthru (where possible) to underlying Open MPI's C functionality
Fort mpi_f08 subarrays: no
Java bindings: no
Wrapper compiler rpath: runpath
C compiler: gcc
C compiler absolute: /opt/cuda/bin/gcc
C compiler family name: GNU
C compiler version: 13.2.1
C++ compiler: g++
C++ compiler absolute: /opt/cuda/bin/g++
Fort compiler: gfortran
Fort compiler abs: /usr/bin/gfortran
Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
Fort 08 assumed shape: yes
Fort optional args: yes
Fort INTERFACE: yes
Fort ISO_FORTRAN_ENV: yes
Fort STORAGE_SIZE: yes
Fort BIND(C) (all): yes
Fort ISO_C_BINDING: yes
Fort SUBROUTINE BIND(C): yes
Fort TYPE,BIND(C): yes
Fort T,BIND(C,name="a"): yes
Fort PRIVATE: yes
Fort ABSTRACT: yes
Fort ASYNCHRONOUS: yes
Fort PROCEDURE: yes
Fort USE...ONLY: yes
Fort C_FUNLOC: yes
Fort f08 using wrappers: yes
Fort MPI_SIZEOF: yes
C profiling: yes
Fort mpif.h profiling: yes
Fort use mpi profiling: yes
Fort use mpi_f08 prof: yes
Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes, OMPI progress: no, Event lib: yes)
Sparse Groups: no
Internal debug support: no
MPI interface warnings: yes
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
dl support: yes
Heterogeneous support: no
MPI_WTIME support: native
Symbol vis. support: yes
Host topology support: yes
IPv6 support: no
MPI extensions: affinity, cuda, ftmpi, rocm, shortfloat
Fault Tolerance support: yes
FT MPI support: yes
MPI_MAX_PROCESSOR_NAME: 256
MPI_MAX_ERROR_STRING: 256
MPI_MAX_OBJECT_NAME: 64
MPI_MAX_INFO_KEY: 36
MPI_MAX_INFO_VAL: 256
MPI_MAX_PORT_NAME: 1024
MPI_MAX_DATAREP_STRING: 128
MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.0.3)
MCA accelerator: cuda (MCA v2.1.0, API v1.0.0, Component v5.0.3)
MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA btl: self (MCA v2.1.0, API v3.3.0, Component v5.0.3)
MCA btl: ofi (MCA v2.1.0, API v3.3.0, Component v5.0.3)
MCA btl: sm (MCA v2.1.0, API v3.3.0, Component v5.0.3)
MCA btl: tcp (MCA v2.1.0, API v3.3.0, Component v5.0.3)
MCA btl: uct (MCA v2.1.0, API v3.3.0, Component v5.0.3)
MCA btl: smcuda (MCA v2.1.0, API v3.3.0, Component v5.0.3)
MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v5.0.3)
MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA memchecker: valgrind (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA mpool: hugepage (MCA v2.1.0, API v3.1.0, Component v5.0.3)
MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component v5.0.3)
MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v5.0.3)
MCA rcache: gpusm (MCA v2.1.0, API v3.3.0, Component v5.0.3)
MCA rcache: rgpusm (MCA v2.1.0, API v3.3.0, Component v5.0.3)
MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA smsc: cma (MCA v2.1.0, API v1.0.0, Component v5.0.3)
MCA threads: pthreads (MCA v2.1.0, API v1.0.0, Component v5.0.3)
MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA bml: r2 (MCA v2.1.0, API v2.1.0, Component v5.0.3)
MCA coll: adapt (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: basic (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: han (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: inter (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: libnbc (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: self (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: sync (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: tuned (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: cuda (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: ftagree (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: monitoring (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA coll: sm (MCA v2.1.0, API v2.4.0, Component v5.0.3)
MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA hook: comm_method (MCA v2.1.0, API v1.0.0, Component v5.0.3)
MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA io: romio341 (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA mtl: ofi (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA op: avx (MCA v2.1.0, API v1.0.0, Component v5.0.3)
MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v5.0.3)
MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component v5.0.3)
MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v5.0.3)
MCA osc: ucx (MCA v2.1.0, API v3.0.0, Component v5.0.3)
MCA part: persist (MCA v2.1.0, API v4.0.0, Component v5.0.3)
MCA pml: cm (MCA v2.1.0, API v2.1.0, Component v5.0.3)
MCA pml: monitoring (MCA v2.1.0, API v2.1.0, Component v5.0.3)
MCA pml: ob1 (MCA v2.1.0, API v2.1.0, Component v5.0.3)
MCA pml: ucx (MCA v2.1.0, API v2.1.0, Component v5.0.3)
MCA pml: v (MCA v2.1.0, API v2.1.0, Component v5.0.3)
MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v5.0.3)
MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v5.0.3)
MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component v5.0.3)
MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component v5.0.3)
```
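The SIGILL above is raised inside Open MPI's memory patcher (frame #1, mca_base_patcher_patch_apply_binary), which binary-patches allocator entry points at startup and can collide with ASan's own interceptors. A hedged workaround sketch, assuming the usual MCA `^` component-exclusion syntax applies to the memory framework (whether this avoids the crash here is untested):

```
# assumption: excluding the patcher component skips the binary-patching step
$ export OMPI_MCA_memory=^patcher
$ ./a.out
```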