Open chhzh123 opened 4 years ago
I found the problem. The simplification logic only works when the loop range is smaller than the array size.
https://github.com/cornell-zhang/heterocl/blob/d3173471e877c32fd9327e882575499c46f10f69/tvm/src/pass/simple_passes.cc#L174-L176
I think this check is unnecessary? The correctness of the program should be guaranteed by the user (e.g., by avoiding invalid memory accesses).
Even with this comparison added, the program can still access memory out of bounds. For example, change the compute function to hcl.compute((1,1,10,10), lambda i, c, x, y: A[i, c, x, y], "B", dtype); then (y / 8 + x)/8 can be 1 when x=y=9, which is outside the valid index range of the 1st dimension.
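To make the arithmetic concrete, here is a minimal Python sketch that enumerates the index expression over the 10x10 loop range. It is only an illustration: the helper name `first_dim_index` is hypothetical, and it assumes the first dimension of A has size 1 and that the division is integer (floor) division on non-negative values.

```python
# Loop extents taken from the (1, 1, 10, 10) compute shape in the comment above.
X_EXTENT, Y_EXTENT = 10, 10
# Assumption: the first dimension of A has size 1, so only index 0 is valid.
FIRST_DIM_SIZE = 1

def first_dim_index(x, y):
    # Hypothetical helper: integer (floor) division, mirroring (y / 8 + x) / 8.
    return (y // 8 + x) // 8

# Collect every (x, y) pair whose computed index falls outside the first dimension.
out_of_bounds = [
    (x, y)
    for x in range(X_EXTENT)
    for y in range(Y_EXTENT)
    if first_dim_index(x, y) >= FIRST_DIM_SIZE
]

print(first_dim_index(9, 9))  # the x = y = 9 case from the comment
print(len(out_of_bounds), "of", X_EXTENT * Y_EXTENT, "iterations are out of bounds")
```

Under these assumptions, x = y = 9 is not the only offending iteration; any pair with y // 8 + x >= 8 produces a non-zero index.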
@seanlatias
This is a legacy from Halide. I think maybe we can add a tag or something to turn this feature on/off.
> I found the problem. The simplification logic only works when the loop range is smaller than the array size. https://github.com/cornell-zhang/heterocl/blob/d3173471e877c32fd9327e882575499c46f10f69/tvm/src/pass/simple_passes.cc#L174-L176
> I think it's unnecessary?
These lines should not simply be removed; otherwise, the correctness of the program can no longer be ensured. As an example,
    uint32 A[4][4];
    uint32 B[16];
    for (uint32 i = 0; i < 16; ++i)
        B[i] = A[i / 4][i % 4];
would be simplified to
    for (uint32 i = 0; i < 16; ++i)
        B[i] = A[0][i];
which is incorrect and may cause a segfault.
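The same point can be checked with a small Python sketch, where nested lists are bounds-checked per dimension. This is a hypothetical illustration, not the actual logic in simple_passes.cc; the names `original`, `simplified`, and `rewrite_is_safe` are made up for the example.

```python
# Hypothetical illustration; not the actual pass in simple_passes.cc.
ROWS, COLS = 4, 4
# A[r][c] holds its own flattened position so mismatches are easy to spot.
A = [[r * COLS + c for c in range(COLS)] for r in range(ROWS)]

def original(i):
    # Access pattern as written in the loop body.
    return A[i // COLS][i % COLS]

def simplified(i):
    # Rewrite that assumes i // COLS == 0 and i % COLS == i.
    return A[0][i]

def rewrite_is_safe(extent):
    # The rewrite only holds if every i in [0, extent) is below COLS.
    return extent <= COLS

# Record every loop index where the rewritten access differs or is invalid.
mismatches = []
for i in range(ROWS * COLS):
    try:
        same = original(i) == simplified(i)
    except IndexError:
        same = False  # simplified(i) indexes past the end of row 0
    if not same:
        mismatches.append(i)

print(mismatches)
```

The two forms agree only while i stays below the inner dimension, which is the kind of condition the loop-range comparison guards.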
@seanlatias we should first support multi-dimensional arrays in our internal IR.
I'm not sure why the access pattern in the generated VHLS code of the following example becomes so complex. It seems to happen only when the output array is larger than the input array.
I think accessing A[0][0][x][y] is okay. Is the simplification logic somewhere not working properly?