Open rizwanashraf opened 1 year ago
The for-loop construct currently accumulates on top of the output values from previous iterations, which may not be what the user intends. For example, the following:
for index in range(1,5,2): # start, end, step
    C[i, k] = A[i, j] * B[j, k];
end
produces the following SCF code:
// %c1 = 1; %c5 = 5; %c2 = 2;
scf.for %arg0 = %c1 to %c5 step %c2 {
  // the matrix multiplication
  scf.for %arg1 = %c0 to %c8 step %c1 {
    scf.for %arg2 = %c0 to %c4 step %c1 {
      scf.for %arg3 = %c0 to %c2 step %c1 {
        %0 = memref.load %alloc[%arg1, %arg2] : memref<8x4xf64>
        %1 = memref.load %alloc_0[%arg2, %arg3] : memref<4x2xf64>
        %2 = memref.load %alloc_1[%arg1, %arg3] : memref<8x2xf64>
        %3 = arith.mulf %0, %1 : f64
        %4 = arith.addf %2, %3 : f64
        memref.store %4, %alloc_1[%arg1, %arg3] : memref<8x2xf64>
      }
    }
  }
}
To capture the user's intent, we would need to initialize the output inside the outermost loop, before the matrix multiplication. The question is: should we support something like this?
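To make the two behaviours concrete, here is a small plain-Python model of the loop nest above. It is only a sketch of the semantics under discussion, not COMET's implementation: the function `matmul_in_loop` and its `reinitialize` flag are hypothetical names introduced for illustration, and the 2x2 inputs are made up.

```python
def matmul_in_loop(A, B, start, end, step, reinitialize):
    """Model of the generated loop nest: an outer for-loop wrapping C += A * B.

    `reinitialize` is a hypothetical switch: when True, the output is
    zeroed at the top of every outer iteration (the proposed behaviour);
    when False, each iteration accumulates on the previous result
    (the behaviour described in this issue).
    """
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0.0] * cols for _ in range(rows)]  # output allocated once, outside the loop

    for _ in range(start, end, step):  # outer user-written for-loop
        if reinitialize:
            for i in range(rows):
                for k in range(cols):
                    C[i][k] = 0.0
        # the matrix multiplication: load C, multiply-add, store C
        for i in range(rows):
            for j in range(inner):
                for k in range(cols):
                    C[i][k] = C[i][k] + A[i][j] * B[j][k]
    return C


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

# range(1, 5, 2) runs the body twice (index = 1, 3)
accumulated = matmul_in_loop(A, B, 1, 5, 2, reinitialize=False)
overwritten = matmul_in_loop(A, B, 1, 5, 2, reinitialize=True)

print(accumulated)  # [[38.0, 44.0], [86.0, 100.0]], i.e. twice A*B
print(overwritten)  # [[19.0, 22.0], [43.0, 50.0]], i.e. A*B
```

With two outer iterations the accumulating version returns twice the product, while the re-initializing version returns the product itself, which is what the `=` in the DSL statement suggests.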
We are adding basic support for loop programming constructs to enable users to write iterative algorithms. At this time, we do not support single-element access of a tensor through the loop iterator; we plan to add it in the future.
Some examples of the COMET DSL with for-loops are given below:
We also support the following: