The existing HeteroCL IR only supports flattened tensor expressions, which can cause problems during code generation and makes some analyses difficult. To address this, we propose introducing two new IR nodes for multi-dimensional load and store. The original Halide IR uses Provide and Call for multi-dimensional tensor accesses, while TVM IR removes them. However, those names are not self-explanatory, so we propose the names NDLoad and NDStore instead, following the naming convention in NumPy, where ND stands for N-dimensional.
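To illustrate why flattened accesses are harder to analyze, here is a minimal Python sketch. The classes `FlatLoad` and `NDLoadSketch` are hypothetical stand-ins invented for this example; they are not the actual HeteroCL, TVM, or Halide node definitions. The sketch assumes a row-major 2-D buffer.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FlatLoad:
    """Hypothetical flattened access: a single linear index."""
    buffer: str
    index: int


@dataclass(frozen=True)
class NDLoadSketch:
    """Hypothetical multi-dimensional access: one index per dimension."""
    buffer: str
    indices: Tuple[int, ...]


shape = (4, 8)  # assumed buffer shape: 4 rows, 8 columns (row-major)
i, j = 2, 5

# Flattened form: the per-dimension indices are folded into one expression.
flat = FlatLoad("A", i * shape[1] + j)

# ND form: the per-dimension indices stay visible to later passes.
nd = NDLoadSketch("A", (i, j))

# Recovering (i, j) from the flat index needs the shape plus div/mod,
# which is the kind of reconstruction an analysis would otherwise have to do.
recovered = divmod(flat.index, shape[1])
```

With the ND form, a pass can read `nd.indices` directly instead of pattern-matching index arithmetic.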
Introducing these nodes also requires changes to the front end and to some of our passes:
The current HeteroCL Python front end automatically flattens tensor accesses. It needs to preserve the ND information instead.
The current StorageFlattening pass only performs buffer binding. We need to implement the actual flattening.
For the order of the passes, everything before StorageFlattening should be kept as ND. We should also add a configuration option for whether to skip this pass. In general, StorageFlattening should only be used for CPU execution.
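The flattening step and the skip option described above could look roughly like the following Python sketch. It assumes row-major layout with static shapes; `flatten_index`, `lower_access`, and the `flatten` flag are hypothetical names chosen for illustration and do not reflect the real StorageFlattening implementation.

```python
from typing import Sequence, Tuple


def flatten_index(indices: Sequence[int], shape: Sequence[int]) -> int:
    """Convert an ND index into a row-major linear offset."""
    if len(indices) != len(shape):
        raise ValueError("indices and shape must have the same rank")
    offset = 0
    for idx, extent in zip(indices, shape):
        if not 0 <= idx < extent:
            raise IndexError(f"index {idx} out of range for extent {extent}")
        # Horner-style accumulation: scale by this dimension's extent, then add.
        offset = offset * extent + idx
    return offset


def lower_access(indices: Sequence[int], shape: Sequence[int],
                 flatten: bool = True) -> Tuple[int, ...]:
    """Return a 1-tuple with the flat offset, or the ND indices unchanged.

    The `flatten` flag stands in for the proposed configuration: a CPU
    target would flatten, while a target such as HLS could keep ND indices.
    """
    if flatten:
        return (flatten_index(indices, shape),)
    return tuple(indices)
```

For example, `lower_access((2, 5), (4, 8))` gives a single linear offset, while `lower_access((2, 5), (4, 8), flatten=False)` leaves the two indices as they are for the code generator.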
This change should fix the following problem(s):
Incorrect/Weird index generation in HLS code (#276).