Closed · Blaok closed this issue 6 years ago
In the current design, you cannot simply return the output. The output is passed in as an argument, so what you need to do is "update" the output in place. Following is an example:

```python
def top(input_, output_):
    # compute body
    # write the final result into the caller-provided output tensor
    return hcl.update(output_, lambda x: input_[x] + 1)
```
Following is the incorrect way:

```python
def top(input_, output_):
    # compute body
    # WRONG: allocates a new tensor instead of updating output_
    return hcl.compute(output_.shape, lambda x: input_[x] + 1)
```
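The distinction mirrors ordinary Python argument semantics: mutating the caller-provided buffer is visible to the caller, while binding a new object is not. A minimal sketch in plain Python (not HeteroCL; the function names are hypothetical):

```python
def update_in_place(input_, output_):
    # write results into the caller-provided buffer (like hcl.update)
    for i in range(len(input_)):
        output_[i] = input_[i] + 1

def return_new(input_, output_):
    # builds a fresh list (like hcl.compute); the caller's buffer is untouched
    return [x + 1 for x in input_]

src = [1, 2, 3]

dst = [0, 0, 0]
update_in_place(src, dst)
# dst is now [2, 3, 4]

dst2 = [0, 0, 0]
result = return_new(src, dst2)
# result is [2, 3, 4], but dst2 is still [0, 0, 0]
```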
When testing the stencil backend, I found that in the IR generated for the Gaussian benchmark, the output tensor is explicitly allocated. I believe this is incorrect because the interface already generates an implicit tensor allocation by calling `tvm_struct_get`. The blur benchmark works fine.

This unexpected tensor allocation breaks SODA code generation. More specifically, it invalidates the `VarExpr` comparison, because the newly generated `Variable` used in the IR is not linked to the interface. This results in incorrect detection of `output` or `local` tensors in SODA. As a workaround, I had to compare by `name_hint`, but it may not work in other situations, as the name suggests.

The IR is printed in the `test_soda.py` unit test and can be reproduced by running `python -m unittest test_soda` in `heterocl/heterocl/tests`.
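Why the `name_hint` workaround is fragile can be shown with a toy sketch (plain Python stand-ins, not the actual TVM IR classes): two distinct variable nodes fail an identity comparison even when they carry the same name, and conversely a name-based comparison would falsely match any two tensors that happen to share a name.

```python
class Variable:
    # stand-in for an IR variable node; node identity, not the name,
    # is what links a use site back to the interface
    def __init__(self, name_hint):
        self.name_hint = name_hint

iface_var = Variable("output")  # variable created for the interface
ir_var = Variable("output")     # freshly allocated variable in the IR

# Identity comparison fails: these are distinct nodes
same_node = iface_var is ir_var          # False

# Name-based workaround matches here, but would also match any
# unrelated variable that reuses the name "output"
same_name = iface_var.name_hint == ir_var.name_hint  # True
```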