chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Dist/domain/array leaks #7891

Closed benharsh closed 4 years ago

benharsh commented 6 years ago

This issue exists to track distribution, domain, and array leaks. Hopefully we can split this into separate issues as we begin to understand the cause of the leaks. Until then, let's use a single issue because dists/domains/arrays are closely linked and a bug in one type can lead to leaks in the others.

I have attempted to create some rough categories below, but there are likely multiple issues involved in each area.

DefaultRectangular arrays/domains:

Private:

DefaultDist:

DefaultAssociative arrays/domains:

Block arrays/domains:

Array Views:

Sparse arrays/domains:

Stencil arrays/domains:

DimensionalDist:

ben-albrecht commented 6 years ago

The following program demonstrates a leak that likely explains the majority of the LayoutCS-related leaks:

use LayoutCS;

var dom = {1..100, 1..100};
var csrDom: sparse subdomain(dom) dmapped CS();

./leak --memLeaks
=================================================================================================================
Allocated Memory (Bytes)         Number   Size     Total    Description                      Address
=================================================================================================================
leak.chpl:4                      1        144      144      DefaultSparseDom(2,int(64),domain(2,int(64),false))0x00007faab1c03d40
leak.chpl:4                      1        112      112      [domain(1,int(64),false)] 2*int(64)0x00007faab1c03e80
leak.chpl:4                      1        80       80       domain(1,int(64),false)          0x00007faab1c03e00
leak.chpl:4                      1        96       96       domain(2,int(64),false)          0x00007faab1c03f70
leak.chpl:4                      1        24       24       listNode(BaseArr)                0x00007faab1c03f20
=================================================================================================================

bradcray commented 6 years ago

Nice minimal example! It appears that neither DefaultSparse.chpl nor LayoutCS.chpl contain any of the dsiDestroy*() routines, which could definitely explain that...

ben-albrecht commented 6 years ago

> Nice minimal example! It appears that neither DefaultSparse.chpl nor LayoutCS.chpl contain any of the dsiDestroy*() routines, which could definitely explain that...

I observed this as well. However, the following example does not leak:

var dom = {1..100, 1..100};
var sparseDom: sparse subdomain(dom);

I would expect both examples to leak if the lack of dsiDestroy*() methods was the cause. It's probably not a bad idea to still implement these methods, however.

bradcray commented 6 years ago

Oh, that is weird. @benharsh also noted that he didn't think it was so surprising that these didn't have destroy calls because everything should be cleaned up automatically (or, at least, it's plausible that they would be...). I was thinking that because DefaultRectangular had them, others likely should as well, but he noted that DefaultRectangular's implementation is more primitive (it has to allocate/free ddatas).

benharsh commented 6 years ago

I think the leak is coming from here:

https://github.com/chapel-lang/chapel/blob/master/modules/internal/ChapelArray.chpl#L771

Here we create a temporary that is never destroyed. I think what we really want is to access the runtime type information of domainType directly, instead of creating a temporary. If I manually edit the generated code to do so the leak goes away (very limited use-case though).
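As a rough analogy only (this is not the code in ChapelArray.chpl, and the names `Descriptor`, `liveDescriptors`, `rankViaLeakedTemp`, and `rankViaOwnedTemp` are hypothetical), the sketch below shows how a temporary built only to read one value can outlive its use. An unmanaged class instance stands in for the never-destroyed temporary, and an owned instance shows the cleaned-up behavior for comparison.

```chapel
// Hypothetical illustration; none of these names come from the Chapel modules.
var liveDescriptors = 0;   // counts instances that have not been destroyed

class Descriptor {
  var rank: int;

  proc init(rank: int) {
    this.rank = rank;
    liveDescriptors += 1;
  }

  proc deinit() {
    liveDescriptors -= 1;
  }
}

// Builds a temporary just to read one field and never deletes it.
proc rankViaLeakedTemp(): int {
  var tmp = new unmanaged Descriptor(2);
  return tmp.rank;
}

// Same query, but the temporary is owned and is destroyed at scope exit.
proc rankViaOwnedTemp(): int {
  var tmp = new owned Descriptor(2);
  return tmp.rank;
}

const r1 = rankViaLeakedTemp();
writeln("after leaked temp: rank = ", r1, ", live = ", liveDescriptors);

const r2 = rankViaOwnedTemp();
writeln("after owned temp:  rank = ", r2, ", live = ", liveDescriptors);
```

Both calls return the same value, but only the first leaves an instance alive afterwards. Running a program like this with `--memLeaks`, as in the earlier comment, should list the one `Descriptor` that was never deleted.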

benharsh commented 6 years ago

To avoid duplication of effort: I think I have a branch that fixes this sparse leak. It adds a compiler primitive that can get a field from the runtime type struct, and is used in the module code instead of a temporary.

With this branch, the leaks in npb/cg/bradc/cg-sparse.chpl are eliminated entirely. I'm also seeing a major reduction in leaks for library/packages/LinearAlgebra/correctness/no-dependencies/correctness.chpl: from 26K to 480B.

I'm going to run this through memleaks and valgrind to see what comes back.

benharsh commented 6 years ago

Opened #11140, which significantly reduces the number of sparse-related leaks. With this branch I observed zero leaks of DefaultSparseDom.

bradcray commented 5 years ago

I split the Dimensional and Private cases out into their own issues (#12731 and #12732) because they seem relatively straightforward and separable from this bigger issue. I also started going through to see which of these are still open, but timed out before completing.

e-kayrakli commented 4 years ago

As far as I can see, only a small portion of these leaks remain and they are due to https://github.com/chapel-lang/chapel/issues/15611

https://github.com/chapel-lang/chapel/issues/15623 has the summary of all memory leaks we have as of today.