chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org
Other
1.8k stars 422 forks source link

Use-after-free error when returning a distributed domain type #6614

Open vasslitvinov opened 7 years ago

vasslitvinov commented 7 years ago

The problem

Scenario: the Chapel program calls a function returning a type that's a dmapped domain.

In the generated code / at run time, that function returns a chpl___RuntimeTypeInfo struct, which points to the distribution class allocated by the dmapped clause.

Before returning, however, that function deletes this distribution class.

Accessing the distribution class through the runtime type after the function returns results in a use-after-free / invalid memory access error. This happens, for example, when we create a variable of the type returned by the function. The error is detectable with valgrind.

The function can be an "if" function (see an example below).

Original user code

This issue was encountered by @ben-albrecht on an earlier version of the test studies/prk/Sparse/sparse.chpl developed in #6573. The essence of that code is:

use LayoutCSR;
param distributed = false;
var DDD = {1..10, 1..10};
const matrixDom: if distributed then sparse subdomain(DDD)
                 else sparse subdomain(DDD) dmapped CSR();

Here the culprit function is the "if" function that the compiler generates implicitly for the expression if distributed then ... else .... Here it is a type function.

Other versions

We can make an explicit function like makedomain here:

use LayoutCSR;
var DDD = {1..10, 1..10};
proc makedomain() type
  return sparse subdomain(DDD) dmapped CSR();
const matrixDom: makedomain();

We can further simplify the "sparse subdomain" thing like so:

use LayoutCSR;
proc makedomain() type
  return domain(2) dmapped CSR();
const matrixDom: makedomain();

This requires wrestling CSR into acting as a rectangular (not sparse) domain map. Use this patch.

Some details

Consider again this snippet:

proc makedomain() type {
  return domain(2) dmapped CSR();
}

The clause dmapped CSR() is converted by the compiler into:

var temp7 = new _distribution(new CSR());

Here, temp7 "owns" the newly-allocated CSR class. So, when makedomain() exits, temp7 goes out of scope and deallocates the class.

The expression (some domain type) dmapped (some _distribution) is converted by the compiler into:

var temp9 = chpl__distributed((the _distribution), (the domain type));

temp9 is a runtime-type record. It contains the _distribution in one of its fields. chpl_distributed() is currently defined at ChapelArray.chpl:744.

Alas, temp9 is returned by the compiled version of makedomain().

bradcray commented 7 years ago

I'm hoping @mppf will weigh in on this since it seems similar to other cases he's already had to handle such as supporting functions that return arrays over domains that are about to go out of scope.

mppf commented 7 years ago

IIRC we don't treat runtime types the same as normal records, for the purposes of copy/deinitialize etc. I think we would need to do so in order to correct this kind of error in a principled way.