open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

Petsc test failing: possible MPI_REQUEST_FREE issue #1875

Closed jsquyres closed 8 years ago

jsquyres commented 8 years ago

According to Eric Chamberland in http://www.open-mpi.org/community/lists/devel/2016/07/19210.php, he's getting a failure in a PETSc test. Here's the backtrace:

*** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.opt': free(): invalid pointer: 0x00007f9ab09c6020 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7277f)[0x7f9ab019b77f]
/lib64/libc.so.6(+0x78026)[0x7f9ab01a1026]
/lib64/libc.so.6(+0x78d53)[0x7f9ab01a1d53]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x172a1)[0x7f9aa3df32a1]
/opt/openmpi-2.x_opt/lib/libmpi.so.0(MPI_Request_free+0x4c)[0x7f9ab0761dac]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adaf9)[0x7f9ab7fa2af9]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f9ab7f9dc35]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574e7)[0x7f9ab7f4c4e7]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f9ab7ef28ca]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_opt_Petsc.so(_Z15GIREFVecDestroyRP6_p_Vec+0xe)[0x7f9abc9746de]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_opt_Petsc.so(_ZN12VecteurPETScD1Ev+0x31)[0x7f9abca8bfa1]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_opt_Petsc.so(_ZN10SolveurGCPD2Ev+0x20c)[0x7f9abc9a013c]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_opt_Petsc.so(_ZN10SolveurGCPD0Ev+0x9)[0x7f9abc9a01f9]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_opt_Formulation.so(_ZN10ProblemeGDD2Ev+0x42)[0x7f9abeeb94e2]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.opt[0x4159b9]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f9ab014ab25]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.opt[0x4084dc]

@hjelmn @bosilca Could you have a look?

ericch1 commented 8 years ago

Ok, thanks for your reply.

I will report the bugs I hit here against the 2.0.x/master branch (from the ompi.git repo)... Maybe once things stabilize, I should go back to the ompi-release.git repo?

Thanks, Eric

jsquyres commented 8 years ago

@ericch1 Just to be clear on our versioning system:

  1. The ompi repo contains our master development branch. It is currently marked as version 3.0.0a1.
  2. The ompi-release repo contains all of our release branches (e.g., v2.x, ...etc.).

We did it this way because GitHub didn't use to have per-branch ACLs. Now it does, and we anticipate merging ompi-release back into ompi sometime soon. However, we're also in the middle of moving all of our infrastructure from one hosting provider to another, and that has a deadline associated with it, so merging ompi-release back into ompi may take a back seat for a little while.

If we can figure out this problem on master and merge the fix over to the release branch for the v2.0.1 release, that would be great. I know @hjelmn was poking into this today.

ericch1 commented 8 years ago

Hi Jeff,

please have a look at the following exchange with Matthew Knepley from PETSc:

http://lists.mcs.anl.gov/pipermail/petsc-users/2016-July/029911.html

In this report, I used the ompi-release/v2.x branch... But it looks like the same MPI_Request_free problem...

Eric

ggouaillardet commented 8 years ago

@ericch1 per the message on the ML

On Mon, Jul 25, 2016 at 12:44 PM, Eric Chamberland <
Eric.Chamberland at giref.ulaval.ca> wrote:

> Ok,
>
> here is the 2 points answered:
>
> #1) got valgrind output... here is the fatal free operation:
>

Okay, this is not the MatMult scatter, this is for local representations of
ghosted vectors. However, to me
it looks like OpenMPI mistakenly frees its built-in type for MPI_DOUBLE.

> ==107156== Invalid free() / delete / delete[] / realloc()
> ==107156==    at 0x4C2A37C: free (in
> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==107156==    by 0x1E63CD5F: opal_free (malloc.c:184)
> ==107156==    by 0x27622627: mca_pml_ob1_recv_request_fini
> (pml_ob1_recvreq.h:133)
> ==107156==    by 0x27622C4F: mca_pml_ob1_recv_request_free
> (pml_ob1_recvreq.c:90)
> ==107156==    by 0x1D3EF9DC: ompi_request_free (request.h:362)
> ==107156==    by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59)
> ==107156==    by 0x14AE3B9C: VecScatterDestroy_PtoP (vpscat.c:219)
> ==107156==    by 0x14ADEB74: VecScatterDestroy (vscat.c:1860)
> ==107156==    by 0x14A8D426: VecDestroy_MPI (pdvec.c:25)
> ==107156==    by 0x14A33809: VecDestroy (vector.c:432)
> ==107156==    by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&)
> (girefConfigurationPETSc.h:115)
> ==107156==    by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc()
> (VecteurPETSc.cc:2292)
> ==107156==    by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc()
> (VecteurPETSc.cc:287)
> ==107156==    by 0x10BA9F48: VecteurPETSc::~VecteurPETSc()
> (VecteurPETSc.cc:281)
> ==107156==    by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D()
> (PPReactionsAppuiEL3D.cc:216)
> ==107156==    by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in
> /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so)
> ==107156==    by 0x435702: main (Test.ProblemeGD.icc:381)
> ==107156==  Address 0x1d6acbc0 is 0 bytes inside data symbol
> "ompi_mpi_double"
> --107156-- REDIR: 0x1dda2680 (libc.so.6:__GI_stpcpy) redirected to
> 0x4c2f330 (__GI_stpcpy)
> ==107156==
> ==107156== Process terminating with default action of signal 6 (SIGABRT):
> dumping core
> ==107156==    at 0x1DD520C7: raise (in /lib64/libc-2.19.so)
> ==107156==    by 0x1DD53534: abort (in /lib64/libc-2.19.so)
> ==107156==    by 0x1DD4B145: __assert_fail_base (in /lib64/libc-2.19.so)
> ==107156==    by 0x1DD4B1F1: __assert_fail (in /lib64/libc-2.19.so)
> ==107156==    by 0x27626D12: mca_pml_ob1_send_request_fini
> (pml_ob1_sendreq.h:221)
> ==107156==    by 0x276274C9: mca_pml_ob1_send_request_free
> (pml_ob1_sendreq.c:117)
> ==107156==    by 0x1D3EF9DC: ompi_request_free (request.h:362)
> ==107156==    by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59)
> ==107156==    by 0x14AE3C3C: VecScatterDestroy_PtoP (vpscat.c:225)
> ==107156==    by 0x14ADEB74: VecScatterDestroy (vscat.c:1860)
> ==107156==    by 0x14A8D426: VecDestroy_MPI (pdvec.c:25)
> ==107156==    by 0x14A33809: VecDestroy (vector.c:432)
> ==107156==    by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&)
> (girefConfigurationPETSc.h:115)
> ==107156==    by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc()
> (VecteurPETSc.cc:2292)
> ==107156==    by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc()
> (VecteurPETSc.cc:287)
> ==107156==    by 0x10BA9F48: VecteurPETSc::~VecteurPETSc()
> (VecteurPETSc.cc:281)
> ==107156==    by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D()
> (PPReactionsAppuiEL3D.cc:216)
> ==107156==    by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in

Do you have any idea how the MPI_Request was created? e.g. MPI_Send_init, MPI_Isend, MPI_Ibcast or something else?

knepley commented 8 years ago

MPI_Isend() and MPI_Irecv()

ggouaillardet commented 8 years ago

Got it, thanks! I can try to build a simple reproducer with that.

By the way, can you confirm the destructor is invoked before MPI_Finalize()?

ericch1 commented 8 years ago

Yes, I can confirm this:

yes the destructor is invoked before MPI_Finalize...

Eric

hjelmn commented 8 years ago

Are you sure about that? From what I can tell the to->requests and from->requests arrays only hold requests allocated through MPI_*_init.

-Nathan

On Jul 26, 2016, at 08:12 AM, Matthew Knepley notifications@github.com wrote:

MPI_Isend() and MPI_Irecv()

ericch1 commented 8 years ago

100% sure: no more, no less.

Eric

knepley commented 8 years ago

Nathan,

I think you are right and I was wrong. The Isend and Irecv calls seem to be just calculating setup information; the stored pattern is then set up by the _init calls. Here is a pointer to the code: https://bitbucket.org/petsc/petsc/src/e99b8dcdee95b77e1530d3c4d61c134af21db400/src/vec/vec/utils/vpscat.c?at=master&fileviewer=file-view-default#vpscat.c-2654
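
For reference, a minimal sketch of that persistent-request pattern (names, counts and ranks are purely illustrative, not PETSc's actual code): the *_init calls build the stored pattern, MPI_Startall/MPI_Waitall reuse it, and MPI_Request_free is the call that later shows up in the backtraces at VecScatterDestroy time.

#include <mpi.h>

/* Illustrative only: one persistent send and one persistent receive. */
void scatter_pattern_sketch(double *sendbuf, double *recvbuf, int n,
                            int to_rank, int from_rank, MPI_Comm comm)
{
    MPI_Request reqs[2];

    /* Build the stored pattern once. */
    MPI_Send_init(sendbuf, n, MPI_DOUBLE, to_rank,   0, comm, &reqs[0]);
    MPI_Recv_init(recvbuf, n, MPI_DOUBLE, from_rank, 0, comm, &reqs[1]);

    /* Reuse it for each scatter. */
    MPI_Startall(2, reqs);
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    /* Destroy time: the MPI_Request_free calls seen in the backtraces above. */
    MPI_Request_free(&reqs[0]);
    MPI_Request_free(&reqs[1]);
}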

Sorry about that.

Thanks,

 Matt

On Tue, Jul 26, 2016 at 7:58 AM, Eric Chamberland notifications@github.com wrote:

100% sure: no more, no less.

Eric


hjelmn commented 8 years ago

We did fix a leak in MPI_Start/MPI_Startall in 2.0.0. The initial request passed to MPI_Start/MPI_Startall was being leaked. This was leaving an extra retain on both the communicator and datatype. It's very possible this bug is in Open MPI and has been for many years but was masked by the leak. It will probably take us at least a couple of days to figure it out.

-Nathan


ggouaillardet commented 8 years ago

I tried to be creative but was unable to build a reproducer. What I did find is a memory leak: if we call MPI_Recv_init(...) followed by MPI_Request_free(...), the datatype is not OBJ_RELEASE'd (and I guess the same thing happens for the communicator).

At this stage, I cannot exclude that the extra RELEASE or the missing RETAIN occurs "somewhere else" in the code and that the crash in the destructor is just a consequence of that.

Note you can run with mpirun --mca mpi_param_check 1 to be 100% sure the destructor is not invoked after MPI_Finalize().
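
A minimal loop of the kind described above would look something like this (my own illustrative sketch of the reported leak pattern, not a confirmed reproducer for the crash):

#include <mpi.h>

int main(int argc, char **argv)
{
    double buf[16];

    MPI_Init(&argc, &argv);

    /* Create and immediately free persistent receive requests: if the
       datatype is not OBJ_RELEASE'd in MPI_Request_free, its reference
       count keeps growing with every iteration. */
    for (int i = 0; i < 100000; i++) {
        MPI_Request req;
        MPI_Recv_init(buf, 16, MPI_DOUBLE, MPI_ANY_SOURCE, 0,
                      MPI_COMM_WORLD, &req);
        MPI_Request_free(&req);
    }

    MPI_Finalize();
    return 0;
}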

ggouaillardet commented 8 years ago

By the way, is MPI_THREAD_MULTIPLE involved when the crash occurs? I clearly did not test that path...

ericch1 commented 8 years ago

Ok, I will test --mca mpi_param_check 1.

MPI_Init_thread is called with MPI_THREAD_FUNNELED.

Are there other configure/MCA options I can turn on to get a better "trace" or log of what happens?

ggouaillardet commented 8 years ago

--mca mpi_param_check 1 simply adds extra checks (that MPI_* is called between MPI_Init and MPI_Finalize, that types are committed, ...) and aborts with a friendly error message if MPI is used incorrectly. It is on by default, but could have been tuned at your site.

If not already done, you can configure Open MPI with --enable-debug --enable-picky, which adds some extra sanity checks at runtime.

Do you always get the same error at the same place? Or do you suspect a race condition?

You can also try mpirun --mca btl tcp,self ...; it will run slowly, but if it still fails with that, it will be easier to hunt.

Ideally, you would be able to write a trimmed-down version of your test that evidences the bug, but I fully understand that might be difficult.

ericch1 commented 8 years ago

If not already done, you can configure Open MPI with --enable-debug --enable-picky, which adds some extra sanity checks at runtime.

OK, I will activate them after testing --mca mpi_param_check 1.

Do you always get the same error at the same place? Or do you suspect a race condition?

Good question. Since I checked only some of the 20 tests that fail, I can't tell for sure they all fail at exactly the same place. Ok, picking a single test and checking the past 5 nights: hooke_3d_pen_8Hexa8_parallele: the backtrace is the same for all 5 nights:

*** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': free(): invalid pointer: 0x00007fd693e65bc0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7277f)[0x7fd6935bc77f]
/lib64/libc.so.6(+0x78026)[0x7fd6935c2026]
/lib64/libc.so.6(+0x78d53)[0x7fd6935c2d53]
/opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7fd692d9fd60]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7fd687548dae]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7fd6875494ca]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7fd693ba89dd]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7fd693ba8ad6]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7fd69b44cb09]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7fd69b447c45]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7fd69b3f64f7]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7fd69b39c8da]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_Petsc.so(_Z15GIREFVecDestroyRP6_p_Vec+0x1c)[0x7fd6a0e305bc]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_Petsc.so(_ZN12VecteurPETSc17detruitObjetPETScEv+0xa5)[0x7fd6a0faff25]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_Petsc.so(_ZN12VecteurPETScD1Ev+0xbe)[0x7fd6a0fafd1e]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_Petsc.so(_ZN12VecteurPETScD0Ev+0x19)[0x7fd6a0faff59]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_PrePostTraitement.so(_ZN20PPReactionsAppuiEL3DD1Ev+0x9c)[0x7fd6a072358c]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_Formulation.so(_ZN10ProblemeGDD1Ev+0x240b)[0x7fd6a4a9c1fb]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev[0x432013]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fd69356bb25]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev[0x41858e]

You can also try mpirun --mca btl tcp,self ...; it will run slowly, but if it still fails with that, it will be easier to hunt.

I am running on a single node but launching many processes... Ah! By the way, I run multiple tests at the same time! I can run up to 8 tests that use 2 processes each at the same time. I am also using:

export OMPI_MCA_mpi_yield_when_idle=1
export OMPI_MCA_hwloc_base_binding_policy=none

Ideally, you would be able to write a trimmed-down version of your test that evidences the bug, but I fully understand that might be difficult.

It will take a lot of time to trim... but I hope we find something before I have to do that...

ericch1 commented 8 years ago

Ok, maybe have a look at:

http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.27.08h19m26s_ompi_info_all.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.27.08h19m26s_config.log

looks like mpi_param_check is already "true" by default...

And I just noticed there is a strange error during make install:

WARNING! Common symbols found:
btl_openib_lex.o: 0000000000000008 C btl_openib_ini_yyleng
btl_openib_lex.o: 0000000000000008 C btl_openib_ini_yytext
keyval_lex.o: 0000000000000008 C opal_util_keyval_yyleng
keyval_lex.o: 0000000000000008 C opal_util_keyval_yytext
show_help_lex.o: 0000000000000008 C opal_show_help_yyleng
show_help_lex.o: 0000000000000008 C opal_show_help_yytext
rmaps_rank_file_lex.o: 0000000000000008 C orte_rmaps_rank_file_leng
rmaps_rank_file_lex.o: 0000000000000008 C orte_rmaps_rank_file_text
hostfile_lex.o: 0000000000000008 C orte_util_hostfile_leng
hostfile_lex.o: 0000000000000008 C orte_util_hostfile_text
Makefile:2206: recipe for target 'install-exec-hook' failed
make[3]: [install-exec-hook] Error 1 (ignored)

(see the full log here: http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.27.08h19m26s.script.log)

ggouaillardet commented 8 years ago

Thanks. You can safely ignore the warning about common symbols; it is intended for developers only.

ericch1 commented 8 years ago

Ah! By the way, I run multiple tests at the same time! I can run up to 8 tests that use 2 processes each at the same time.

Hmmm, I don't think the problem is related to running multiple programs at the same time, since running the test alone gives the same error...

jladd-mlnx commented 8 years ago

@ericch1 It might be worthwhile doing a git bisect rather than trying to chase a memory-corruption/double-free type of issue. Just a thought.

ericch1 commented 8 years ago

export OMPI_MCA_mpi_yield_when_idle=1
export OMPI_MCA_hwloc_base_binding_policy=none

forget these, I ran without them and got the same bug.

ericch1 commented 8 years ago

You can also try mpirun --mca btl tcp,self ...

changes nothing, same problem.

hjelmn commented 8 years ago

@jladd-mlnx Probably not. This issue was probably triggered by a leak fix in MPI_Start. It is highly likely that this issue has been in the code base for years but was masked by leaking fragments. The leaked fragments hold extra retains on the comm and datatype. See https://github.com/open-mpi/ompi/commit/e968ddfe641fec7e3a350e0cd38c4e581cc314b7.
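
A toy illustration of that masking effect (plain C, not Open MPI's actual object code): as long as something is leaking an extra retain, a spurious extra release never drives the count to zero; once the leak is fixed, the same extra release frees a built-in object such as MPI_DOUBLE that was never meant to be freed.

#include <assert.h>
#include <stdio.h>

struct refobj { int refcount; };

static void retain(struct refobj *o)  { o->refcount++; }
static void release(struct refobj *o) { assert(o->refcount > 0); o->refcount--; }

int main(void)
{
    struct refobj builtin_type = { 1 };   /* stands in for ompi_mpi_double */

    retain(&builtin_type);                /* request creation retains the datatype */
    release(&builtin_type);               /* legitimate release at request free */
    release(&builtin_type);               /* the suspected extra release: with the old
                                             leak (an unbalanced extra retain) this went
                                             unnoticed; without it, the count hits zero */
    printf("refcount = %d\n", builtin_type.refcount);  /* 0: object would be freed */
    return 0;
}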

hjelmn commented 8 years ago

Sorry, didn't mean to close. Wrong button :)

jladd-mlnx commented 8 years ago

@ericch1 A simple way to test @hjelmn 's hypothesis would be to revert e968ddfe641fec7e3a350e0cd38c4e581cc314b7 and see if the bug persists. If it does, then the hypothesis is false. If it "fixes" the issue, then next steps can be taken.

hjelmn commented 8 years ago

@jladd-mlnx Really not worth checking. Without that commit there will be 1 extra retain per request. This problem occurs when the reference count of MPI_DOUBLE reaches 0. It will never reach 0 with e968ddf reverted. We are working on locating the bug now. This isn't happening with MTT start tests but we don't test every path.

ericch1 commented 8 years ago

ok, recompiled with "--enable-debug --enable-picky":

http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.27.12h00_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.27.12h00_ompi_info_all.txt

but I have the same exact error:

*** Error in `//pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': free(): invalid pointer: 0x00007f70a4176bc0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7277f)[0x7f70a38cd77f]
/lib64/libc.so.6(+0x78026)[0x7f70a38d3026]
/lib64/libc.so.6(+0x78d53)[0x7f70a38d3d53]
/opt/openmpi-2.x_debug/lib64/libopen-pal.so.20(opal_free+0x1f)[0x7f70a30b0d60]
/opt/openmpi-2.x_debug/lib64/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f70977dfdae]
/opt/openmpi-2.x_debug/lib64/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f70977e04ca]
/opt/openmpi-2.x_debug/lib64/libmpi.so.20(+0x9f9ed)[0x7f70a3eb99ed]
/opt/openmpi-2.x_debug/lib64/libmpi.so.20(MPI_Request_free+0xf7)[0x7f70a3eb9ae6]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f70ab75db09]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f70ab758c45]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f70ab7074f7]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f70ab6ad8da]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_Petsc.so(_Z15GIREFVecDestroyRP6_p_Vec+0x1c)[0x7f70b114566c]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_Petsc.so(_ZN12VecteurPETSc17detruitObjetPETScEv+0xa5)[0x7f70b12c55a5]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_Petsc.so(_ZN12VecteurPETScD1Ev+0xbe)[0x7f70b12c539e]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_Petsc.so(_ZN12VecteurPETScD0Ev+0x19)[0x7f70b12c55d9]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_PrePostTraitement.so(_ZN20PPReactionsAppuiEL3DD1Ev+0x9c)[0x7f70b0a3858c]
/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/lib/libgiref_dev_Formulation.so(_ZN10ProblemeGDD1Ev+0x240b)[0x7f70b4db31fb]
//pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev[0x42f87e]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f70a387cb25]
//pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev[0x41802e]

@ericch1 A simple way to test @hjelmn 's hypothesis would be to revert e968ddf and see if the bug persists. If it does, then the hypothesis is false. If it "fixes" the issue, then next steps can be taken.

Ok, I will re-launch the whole compilation... but what is the right SHA in the ompi-release repository? I am at:

commit c71996ea8310e8bddfb8f10f3278f0cf36f32c1f
Merge: 9f365c5 54674e8
Author: Jeff Squyres jsquyres@users.noreply.github.com
Date: Wed Jul 27 10:01:21 2016 -0400

and git revert e968ddf gives:

fatal: bad revision 'e968ddf'

should I revert c464f9f? Doh, just saw your newer message telling me to forget this... :/

_EDIT: Added verbatim block_

hjelmn commented 8 years ago

@ericch1 The problem is that if this is a double request free, I don't think we have a way to detect that even in a debug build. It might be possible to add a check to see if a request is returned to the free list twice.

jladd-mlnx commented 8 years ago

@hjelmn This has proven useful in the past. http://linux.die.net/man/3/efence

jsquyres commented 8 years ago

@hjelmn You could temporarily change it to not actually free the memory, but instead, scribble some well-known pattern on the memory (so that the contents are invalid / detectable, but won't cause a segv/bus error). Just my $0.02...
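
A generic sketch of that scribbling idea (not the actual ob1 code; the name is made up): instead of handing the memory back, overwrite it with a recognizable pattern so any later use is obvious in a debugger or core file.

#include <string.h>

#define POISON_BYTE 0xDB   /* arbitrary value that is easy to spot in a hex dump */

/* Call in place of the real free/return path while debugging. */
static void poison_instead_of_free(void *ptr, size_t size)
{
    memset(ptr, POISON_BYTE, size);
}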

hjelmn commented 8 years ago

@jsquyres Problem is it isn't really a free. It's a return of a request to the free list :-/. We can kind of detect that by a loop in the LIFO. We can assert on req_free_called, I think.

hjelmn commented 8 years ago

@ericch1 Yeah, that should do it. Try this and see whether it changes anything:

diff --git a/ompi/mca/pml/ob1/pml_ob1_sendreq.c b/ompi/mca/pml/ob1/pml_ob1_sendreq.c
index 57ff6fd..90ea180 100644
--- a/ompi/mca/pml/ob1/pml_ob1_sendreq.c
+++ b/ompi/mca/pml/ob1/pml_ob1_sendreq.c
@@ -98,6 +98,7 @@ void mca_pml_ob1_send_request_process_pending(mca_bml_base_btl_t *bml_btl)
 static int mca_pml_ob1_send_request_free(struct ompi_request_t** request)
 {
     mca_pml_ob1_send_request_t* sendreq = *(mca_pml_ob1_send_request_t**)request;
+    assert (false == sendreq->req_send.req_base.req_free_called);
     if(false == sendreq->req_send.req_base.req_free_called) {

         sendreq->req_send.req_base.req_free_called = true;
ericch1 commented 8 years ago

@jladd-mlnx Isn't valgrind output sufficient in this case?

see it at https://github.com/open-mpi/ompi/issues/1875#issuecomment-235267969

hjelmn commented 8 years ago

Now that I have had a chance to evaluate the request free code in ob1 I don't think it is possible for a double MPI_Request_free to cause this sort of problem. Still digging.

jladd-mlnx commented 8 years ago

@hjelmn So, you're saying that reverting e968ddf would have no effect?

hjelmn commented 8 years ago

@jladd-mlnx Not at all. Just saying a double free from the user side won't trigger it. Means this is internal to ob1.

ericch1 commented 8 years ago

@hjelmn If I applied the patch correctly and ran make and make install correctly, it didn't make a difference, sorry.

ggouaillardet commented 8 years ago

If I understand correctly, with opal_free_list_t it is up to the developer to return an item that is in an OBJ_CONSTRUCT'ed state.

Here are two patches that force that for the ob1 send and recv requests.

You can start with the first patch (the request is zero'ed and constructed before being returned to the list), and if that still does not work, try the second patch (the request is zero'ed and constructed after being retrieved from the list; this is really overkill though...).

diff --git a/ompi/mca/pml/ob1/pml_ob1_recvreq.h b/ompi/mca/pml/ob1/pml_ob1_recvreq.h
index 6d57569..1111f32 100644
--- a/ompi/mca/pml/ob1/pml_ob1_recvreq.h
+++ b/ompi/mca/pml/ob1/pml_ob1_recvreq.h
 #define MCA_PML_OB1_RECV_REQUEST_RETURN(recvreq)                        \
     {                                                                   \
         mca_pml_ob1_recv_request_fini (recvreq);                        \
+        memset(recvreq, 0, ((opal_object_t *)recvreq)->obj_class->cls_sizeof); \
+        OBJ_CONSTRUCT_INTERNAL((opal_free_list_item_t*)recvreq, mca_pml_base_recv_requests.fl_frag_class); \
         opal_free_list_return (&mca_pml_base_recv_requests,             \
                                (opal_free_list_item_t*)(recvreq));      \
     }
diff --git a/ompi/mca/pml/ob1/pml_ob1_sendreq.h b/ompi/mca/pml/ob1/pml_ob1_sendreq.h
index d9fa0c8..cfb569b 100644
--- a/ompi/mca/pml/ob1/pml_ob1_sendreq.h
+++ b/ompi/mca/pml/ob1/pml_ob1_sendreq.h
 #define MCA_PML_OB1_SEND_REQUEST_RETURN(sendreq)                        \
     do {                                                                \
         mca_pml_ob1_send_request_fini (sendreq);                        \
+        memset(sendreq, 0, ((opal_object_t *)sendreq)->obj_class->cls_sizeof); \
+        OBJ_CONSTRUCT_INTERNAL((opal_free_list_item_t*)sendreq, mca_pml_base_send_requests.fl_frag_class); \
         opal_free_list_return ( &mca_pml_base_send_requests,            \
                                 (opal_free_list_item_t*)sendreq);       \
     } while(0)
diff --git a/ompi/mca/pml/ob1/pml_ob1_recvreq.h b/ompi/mca/pml/ob1/pml_ob1_recvreq.h
index 6d57569..1111f32 100644
--- a/ompi/mca/pml/ob1/pml_ob1_recvreq.h
+++ b/ompi/mca/pml/ob1/pml_ob1_recvreq.h
@@ -82,6 +82,8 @@ static inline bool unlock_recv_request(mca_pml_ob1_recv_request_t *recvreq)
 do {                                                               \
     recvreq = (mca_pml_ob1_recv_request_t *)                          \
         opal_free_list_get (&mca_pml_base_recv_requests);             \
+    memset(recvreq, 0, ((opal_object_t *)recvreq)->obj_class->cls_sizeof); \
+    OBJ_CONSTRUCT_INTERNAL((opal_free_list_item_t*)recvreq, mca_pml_base_recv_requests.fl_frag_class); \
 } while(0)

diff --git a/ompi/mca/pml/ob1/pml_ob1_sendreq.h b/ompi/mca/pml/ob1/pml_ob1_sendreq.h
index d9fa0c8..cfb569b 100644
--- a/ompi/mca/pml/ob1/pml_ob1_sendreq.h
+++ b/ompi/mca/pml/ob1/pml_ob1_sendreq.h
@@ -127,6 +127,8 @@ get_request_from_send_pending(mca_pml_ob1_send_pending_t *type)
         if( OPAL_LIKELY(NULL != proc) ) {                               \
             sendreq = (mca_pml_ob1_send_request_t*)                     \
                 opal_free_list_wait (&mca_pml_base_send_requests);      \
+            memset(sendreq, 0, ((opal_object_t *)sendreq)->obj_class->cls_sizeof); \
+            OBJ_CONSTRUCT_INTERNAL((opal_free_list_item_t*)sendreq, mca_pml_base_send_requests.fl_frag_class); \
             sendreq->req_send.req_base.req_proc = proc;                 \
         }                                                               \
     }
ericch1 commented 8 years ago

I feel so stupid: I am unable to apply the first patch; where am I wrong? The patch "applies" but does nothing... OK, I will do it by hand, but again, what's wrong?

see:

patch --verbose -p1 < patch1.txt
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/ompi/mca/pml/ob1/pml_ob1_recvreq.h b/ompi/mca/pml/ob1/pml_ob1_recvreq.h
|index 6d57569..1111f32 100644
|--- a/ompi/mca/pml/ob1/pml_ob1_recvreq.h
|+++ b/ompi/mca/pml/ob1/pml_ob1_recvreq.h
| #define MCA_PML_OB1_RECV_REQUEST_RETURN(recvreq)                        \
|     {                                                                   \
|         mca_pml_ob1_recv_request_fini (recvreq);                        \
|+        memset(recvreq, 0, ((opal_object_t *)recvreq)->obj_class->cls_sizeof); \
|+        OBJ_CONSTRUCT_INTERNAL((opal_free_list_item_t*)recvreq, mca_pml_base_recv_requests.fl_frag_class); \
|         opal_free_list_return (&mca_pml_base_recv_requests,             \
|                                (opal_free_list_item_t*)(recvreq));      \
|     }
--------------------------
patching file ompi/mca/pml/ob1/pml_ob1_recvreq.h
Using Plan A...
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/ompi/mca/pml/ob1/pml_ob1_sendreq.h b/ompi/mca/pml/ob1/pml_ob1_sendreq.h
|index d9fa0c8..cfb569b 100644
|--- a/ompi/mca/pml/ob1/pml_ob1_sendreq.h
|+++ b/ompi/mca/pml/ob1/pml_ob1_sendreq.h
| #define MCA_PML_OB1_SEND_REQUEST_RETURN(sendreq)                        \
|     do {                                                                \
|         mca_pml_ob1_send_request_fini (sendreq);                        \
|+        memset(sendreq, 0, ((opal_object_t *)sendreq)->obj_class->cls_sizeof); \
|+        OBJ_CONSTRUCT_INTERNAL((opal_free_list_item_t*)sendreq, mca_pml_base_send_requests.fl_frag_class); \
|         opal_free_list_return ( &mca_pml_base_send_requests,            \
|                                 (opal_free_list_item_t*)sendreq);       \
|     } while(0)
--------------------------
patching file ompi/mca/pml/ob1/pml_ob1_sendreq.h
Using Plan A...
done

but git status gives:

On branch v2.x
Your branch is up-to-date with 'origin/v2.x'.
Untracked files:
  (use "git add ..." to include in what will be committed)

    patch1.txt
    petsc-3.7.2-debug/
    petsc-3.7.2.tar.gz

nothing added to commit but untracked files present (use "git add" to track)

and verifying the files manually shows no difference...

ggouaillardet commented 8 years ago

Maybe something went wrong when you did the copy/paste into patch1.txt. It could be a Windows end-of-line issue... you can try dos2unix patch1.txt.

It could be easier to manually add the 4 lines contained in the first patch.

ericch1 commented 8 years ago

Hmmm, after editing the files by hand, I did a git diff and got:

diff --git a/ompi/mca/pml/ob1/pml_ob1_recvreq.h b/ompi/mca/pml/ob1/pml_ob1_recvreq.h
index 6d57569..16e8cb6 100644
--- a/ompi/mca/pml/ob1/pml_ob1_recvreq.h
+++ b/ompi/mca/pml/ob1/pml_ob1_recvreq.h
@@ -143,6 +143,8 @@ static inline void mca_pml_ob1_recv_request_fini (mca_pml_ob1_recv_request_t *re
 #define MCA_PML_OB1_RECV_REQUEST_RETURN(recvreq)                        \
     {                                                                   \
         mca_pml_ob1_recv_request_fini (recvreq);                        \
+                         memset(recvreq, 0, ((opal_object_t *)recvreq)->obj_class->cls_sizeof); \
+                         OBJ_CONSTRUCT_INTERNAL((opal_free_list_item_t*)recvreq, mca_pml_base_recv_requests.fl_frag_class); \
         opal_free_list_return (&mca_pml_base_recv_requests,             \
                                (opal_free_list_item_t*)(recvreq));      \
     }
diff --git a/ompi/mca/pml/ob1/pml_ob1_sendreq.h b/ompi/mca/pml/ob1/pml_ob1_sendreq.h
index d9fa0c8..2569ca6 100644
--- a/ompi/mca/pml/ob1/pml_ob1_sendreq.h
+++ b/ompi/mca/pml/ob1/pml_ob1_sendreq.h
@@ -232,6 +232,8 @@ static inline void mca_pml_ob1_send_request_fini (mca_pml_ob1_send_request_t *se
 #define MCA_PML_OB1_SEND_REQUEST_RETURN(sendreq)                        \
     do {                                                                \
         mca_pml_ob1_send_request_fini (sendreq);                        \
+                         memset(sendreq, 0, ((opal_object_t *)sendreq)->obj_class->cls_sizeof); \
+                         OBJ_CONSTRUCT_INTERNAL((opal_free_list_item_t*)sendreq, mca_pml_base_send_requests.fl_frag_class); \
         opal_free_list_return ( &mca_pml_base_send_requests,            \
                                 (opal_free_list_item_t*)sendreq);       \
     } while(0)

so the patch was just missing the @@ lines...???

ericch1 commented 8 years ago

Ok, tested both patches and got the same result. :(

And just to be sure, I added a "printf" into the patched code and saw it in my output...

hjelmn commented 8 years ago

Ok, that suggests that the request is not being returned twice internally. It's looking more like there is an extra OBJ_RELEASE on a datatype somewhere. It can't be coming from the user, as it would have aborted earlier with the parameter check turned on. Now the question is where the extra release is on the MPI_*_init/MPI_Start path.

ggouaillardet commented 8 years ago

@hjelmn From my observations, persistent requests free the datatype only when MPI_Request_free is invoked, so the extra release/missing retain could be somewhere else.

ggouaillardet commented 8 years ago

@ericch1 Very sorry about that, I screwed up the copy/paste and split the patch. Do you have a debugger such as DDT you can use to set tracepoints? It would be interesting to print ompi_mpi_double.dt.super.super.obj_reference_count, from->n and to->n before and after the persistent requests are created and destroyed.

If you do not have a debugger, you can add something like this to your code:

struct dummy_object_t {
    long pad1;  // only present if ompi was configured with --enable-debug
    void *pad2;
    int count;  // the object reference count
};

printf("count = %d\n", ((struct dummy_object_t*)MPI_DOUBLE)->count);

jsquyres commented 8 years ago

@ericch1 (after further internal discussions) We are running into problems trying to reproduce this issue. Can you send us some kind of small reproducer? We realize your code is internal, but a) we can't find the issue with a code review, and b) we can't reproduce the error to identify what the underlying issue is.

hjelmn commented 8 years ago

@ericch1 Can you try #1935 and see if that helps? That fixes a bug that was masked by request leaking. For me it produces a SEGV but it might have other implications.

ericch1 commented 8 years ago

Sorry, I was "out of town" for the last 12 days... I will try the last 2 proposed patches as soon as I can, but we will be back at work on Aug 22.

jsquyres commented 8 years ago

Given the time delay and the fact that we can't replicate the issue, this is slipping to v2.0.2.

ericch1 commented 8 years ago

ok, testing master, which includes the commit from PR #1935...

I have a question: to help people build a "small" reproducer when they are in a situation like mine, or simply using MPI through 3rd-party libs (like PETSc), is it feasible to link against an MPI lib which, at execution time, records all the calls (and data) made to the MPI library and emits them as "C code" that becomes the reproducer? Do you think this is feasible?

If yes, it would be very useful!

Thanks,

Eric
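
For what it is worth, the standard MPI profiling interface (PMPI) allows exactly this kind of interposition: every MPI_Foo entry point can be overridden and forwarded to PMPI_Foo. Below is a minimal sketch that only logs MPI_Isend calls; a real call-recording tool that emits a compilable reproducer would also have to capture buffers, datatypes and communicators, which is considerably more work.

#include <mpi.h>
#include <stdio.h>

/* Link this into the application (ahead of the MPI library) to intercept
   MPI_Isend; all other MPI calls go straight to the library. */
int MPI_Isend(const void *buf, int count, MPI_Datatype datatype, int dest,
              int tag, MPI_Comm comm, MPI_Request *request)
{
    int rank;
    PMPI_Comm_rank(comm, &rank);
    fprintf(stderr, "[rank %d] MPI_Isend(count=%d, dest=%d, tag=%d)\n",
            rank, count, dest, tag);
    return PMPI_Isend(buf, count, datatype, dest, tag, comm, request);
}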