Closed by gysit 4 years ago
A possible solution for the loop unrolling may be to extend the return op with an optional unrolling attribute. For example, a stencil returning two values has the following return:
stencil.return %a %b : f64
If we want to unroll along the j-dimension, we could add the optional attribute:
stencil.return unroll[1,2,1] %a1 %a2 %b1 %b2 : f64
which specifies an unroll factor of 2 along the j-dimension. In a first iteration we may only support unrolling in one dimension (either j or k).
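To make the intended semantics concrete, here is a minimal Python sketch of how the unrolled return values could map onto the output. It is not part of the stencil dialect. The function names, the example stencil body, and the reading that the operands are grouped per result (`%a1 %a2`, then `%b1 %b2`) are assumptions for illustration. It also assumes the j-extent is divisible by the unroll factor.

```python
# Hypothetical illustration only; none of these names exist in the stencil dialect.

def stencil_body(inp, i, j, k):
    """Compute the two results (a, b) of a single stencil iteration."""
    a = inp[i][j][k] * 2.0
    b = inp[i][j][k] + 1.0
    return a, b

def make_field(shape):
    ni, nj, nk = shape
    return [[[0.0] * nk for _ in range(nj)] for _ in range(ni)]

def run_plain(inp, shape):
    """Reference: one iteration per (i, j, k), like `stencil.return %a %b`."""
    ni, nj, nk = shape
    out_a, out_b = make_field(shape), make_field(shape)
    for i in range(ni):
        for j in range(nj):
            for k in range(nk):
                out_a[i][j][k], out_b[i][j][k] = stencil_body(inp, i, j, k)
    return out_a, out_b

def run_unrolled(inp, shape, unroll=(1, 2, 1)):
    """Unrolled along j, like `stencil.return unroll[1,2,1] %a1 %a2 %b1 %b2`."""
    ni, nj, nk = shape
    ui, uj, uk = unroll
    # Assumption: only j-unrolling, and the j-extent is a multiple of the factor.
    assert (ui, uk) == (1, 1) and nj % uj == 0
    out_a, out_b = make_field(shape), make_field(shape)
    for i in range(ni):
        for j in range(0, nj, uj):
            for k in range(nk):
                results = [stencil_body(inp, i, j + u, k) for u in range(uj)]
                # Assumed operand order: a1, a2, ..., then b1, b2, ...
                returned = [r[0] for r in results] + [r[1] for r in results]
                for u in range(uj):
                    out_a[i][j + u][k] = returned[u]
                    out_b[i][j + u][k] = returned[uj + u]
    return out_a, out_b
```

Both variants should produce identical fields; the unrolled one simply visits half as many j positions and writes two outputs per visit.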
Introduce a pass at the stencil dialect level that unrolls multiple loop iterations along one or two dimensions and uses CSE to remove redundant computation. This pass may already introduce loops, and potentially writes, before the actual conversion to the standard dialect. The new parallel loop op may be a good candidate for this refactoring. Alternatively, we can introduce a new op or data structure to represent multiple output iterations (e.g. a special version of the return op).
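As a rough illustration of why unrolling combined with CSE can remove redundant computation, the Python sketch below builds symbolic expressions for two adjacent j iterations and counts arithmetic operations with and without sharing common subexpressions. The stencil (`out(j) = lap(j) + lap(j+1)`), the tuple encoding, and the function names are hypothetical and chosen only for this example; this is not the proposed pass.

```python
# Hypothetical illustration; expressions are nested tuples, not MLIR ops.

def unroll_j(factor):
    """Build the output expressions for `factor` adjacent j iterations."""
    def lap(dj):
        # lap(j) = in(j-1) + in(j+1) - 2 * in(j), at offset dj from the base j
        return ("sub", ("add", ("in", dj - 1), ("in", dj + 1)), ("mul2", ("in", dj)))
    # out(j) = lap(j) + lap(j+1); adjacent iterations share one lap term.
    return [("add", lap(u), lap(u + 1)) for u in range(factor)]

def count_ops(exprs, cse):
    """Count arithmetic ops; with cse=True identical subtrees are counted once."""
    seen = set()
    count = 0

    def visit(expr):
        nonlocal count
        if expr[0] == "in":
            return  # loads are not counted as arithmetic
        if cse and expr in seen:
            return  # already computed, reuse the earlier value
        seen.add(expr)
        for child in expr[1:]:
            visit(child)
        count += 1

    for expr in exprs:
        visit(expr)
    return count

exprs = unroll_j(2)
print(count_ops(exprs, cse=False))  # 14
print(count_ops(exprs, cse=True))   # 11
```

In this toy case the shared `lap(j+1)` term is computed once instead of twice, which saves three of the fourteen operations.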