cornell-zhang / heterocl

HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Heterogeneous Computing
https://cornell-zhang.github.io/heterocl/
Apache License 2.0

Unexpected tensor allocation #39

Closed · Blaok closed this issue 6 years ago

Blaok commented 6 years ago

When testing the stencil backend, I found that in the IR generated for the Gaussian benchmark, the output tensor is explicitly allocated. I believe this is incorrect because the interface already generates an implicit tensor allocation by calling tvm_struct_get. The blur benchmark works fine.

This unexpected tensor allocation breaks the SODA code generation. More specifically, it invalidates the VarExpr comparison, because the IR uses the newly generated Variable, which is not linked to the interface. This results in incorrect detection of output or local tensors in SODA. As a workaround, I now compare by name_hint, but as the name suggests, it is only a hint and may not work in other situations.
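
To make the failure mode concrete, here is a small, purely illustrative Python sketch. The Variable class below is a hypothetical stand-in, not HeteroCL or TVM IR code: it only shows why an identity comparison fails once a second variable is allocated for the same logical tensor, and why comparing by name_hint works only as long as names happen to stay unique.

class Variable:
    """Hypothetical stand-in for an IR variable node (not the real class)."""
    def __init__(self, name_hint):
        self.name_hint = name_hint

# The interface argument and the unexpectedly allocated output tensor
# end up referring to two distinct Variable nodes that share a name.
interface_var = Variable("output")
allocated_var = Variable("output")

# Identity-based comparison, which the SODA codegen relies on, now fails:
print(interface_var is allocated_var)                       # False

# Comparing by name_hint "works" here, but only by accident, since
# nothing guarantees that name hints are unique:
print(interface_var.name_hint == allocated_var.name_hint)   # True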

The IR is printed in the test_soda.py unit test and can be reproduced by running python -m unittest test_soda in heterocl/heterocl/tests.

seanlatias commented 6 years ago

In our current scenario, you cannot just return a freshly computed output, because the output is passed in as an argument. What you need to do instead is "update" the output in place. The following is an example.

def top(input_, output_):
  # compute body
  # compute the final output
  return hcl.update(output_, lambda x: input_[x] + 1)

The following is the incorrect way: hcl.compute allocates a new tensor instead of writing into the output argument.

def top(input_, output_):
  # compute body
  return hcl.compute(output_.shape, lambda x: input_[x] + 1)
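
For completeness, here is a minimal end-to-end sketch of the update-based pattern, assuming the usual HeteroCL flow with hcl.init, hcl.placeholder, hcl.create_schedule, and hcl.build; the shape and the x + 1 computation are made up for illustration, so adapt them to the actual benchmark.

import heterocl as hcl
import numpy as np

hcl.init()

def top(input_, output_):
    # Write the result into the existing output_ tensor instead of
    # allocating a new one with hcl.compute.
    return hcl.update(output_, lambda x: input_[x] + 1)

input_ = hcl.placeholder((10,), "input_")
output_ = hcl.placeholder((10,), "output_")

s = hcl.create_schedule([input_, output_], top)
f = hcl.build(s)

hcl_in = hcl.asarray(np.arange(10))
hcl_out = hcl.asarray(np.zeros(10))
f(hcl_in, hcl_out)

# The buffer passed by the caller is the one that gets written,
# which is exactly why returning a freshly computed tensor is wrong here.
print(hcl_out.asnumpy())

Because output_ is updated in place, no separate output tensor should be allocated in the body, matching what the interface already provides.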