Closed mpichbot closed 7 years ago
Originally by suhuang on 2013-08-02 12:12:35 -0500
Attachment added: 0001-Trac-665.-Memory-leak-fix-for-MPI-RMA.patch
(6.8 KiB)
Originally by balaji on 2013-08-06 12:26:48 -0500
Is there a branch on mpich-ibm for this, or should I commit the attachment on this ticket?
Originally by blocksom on 2013-08-06 12:34:31 -0500
No branch for this one, just the attached patch which has been reviewed internally at ibm and signed-off.
Originally by balaji on 2013-08-06 12:38:52 -0500
Pushed to mpich/master
in [7ddfa17b].
FWIW, the attachment didn't have a signoff. Also, it had a bunch of random white-space changes, trailing white-space on a bunch of lines, etc. I've pushed it in, since it only affected the pamid device.
Originally by suhuang on 2013-08-02 12:11:52 -0500
The problem was found and fixed by Sameer Kumar:
commit 91102a978077a01a81ad6285712e6cf26c0cdc06
The put request is only freed when all previous puts have completed. It was probably added to free the request after the last put in a non-contiguous set of puts completes. However it will lead to a memory leak when several MPI_Put calls are issued without a fence, unlock or win_complete between them. Then only the last put request is freed.
The fix avoids use of sync state to check when state is freed. Instead it uses counters in the request.
void
26 MPIDI_Win_DoneCB(pami_context_t context,
27 void * cookie,
28 pami_result_t result)
29 {
30 MPIDI_Win_request _req = (MPIDI_Winrequest)cookie;
31 ++req->win->mpid.sync.complete;
32 33 if ((req->buffer_free) && (req->type == MPIDI_WIN_REQUEST_GET))
34 {
35 ++req->origin.completed;
36 if (req->origin.completed == req->target.dt.num_contig)
37 {
38 int mpi_errno;
39 mpi_errno = MPIR_Localcopy(req->buffer,
40 req->origin.dt.size,
41 MPI_CHAR,
42 req->origin.addr,
43 req->origin.count,
44 req->origin.datatype);
45 MPID_assert(mpi_errno == MPI_SUCCESS);
46 MPID_Datatype_release(req->origin.dt.pointer);
47 MPIU_Free(req->buffer);
48 req->buffer_free = 0;
49 }
50 }
51
52 if (req->win->mpid.sync.total == req->win->mpid.sync.complete)
53 {
54 if (req->buffer_free)
55 MPIU_Free(req->buffer);
56 if (req->accum_headers)
57 MPIU_Free(req->accum_headers);
58 MPIU_Free(req);
59 }
60 MPIDI_Progress_signal();
61 }