kaivenlong commented 3 years ago

Hi, after building polymer-opt at the commit listed below, I tried to run the following command (matmul.mlir is from /test/archive/polymer-translate/export-scop/matmul.mlir):

./bin/polymer-opt -pluto-opt matmul.mlir

It produces the following error:

./matmul.mlir:4:8: error: custom op 'alloc' is unknown
  %A = alloc() : memref<64x64xf32>
       ^

Then I changed all "alloc" to "memref.alloc".
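For files with many allocations, the rename described above could be scripted. The following is only a minimal sketch: the helper name `migrate_to_memref` is hypothetical, and it assumes that `alloc` and `dim` (the two ops named in this thread) are the only legacy ops that need the `memref.` prefix.

```python
import re

# Assumption: only these two legacy ops need the "memref." prefix here
# (they are the ones mentioned in this thread).
LEGACY_OPS = ("alloc", "dim")

# Match the bare op name, but skip names that are already prefixed
# ("memref.alloc") or that belong to an SSA value or symbol ("%alloc", "@alloc").
_PATTERN = re.compile(r"(?<![\w.%@#])(" + "|".join(LEGACY_OPS) + r")\b(?=[\s(])")


def migrate_to_memref(mlir_text: str) -> str:
    """Prefix legacy alloc/dim ops with the memref dialect name."""
    return _PATTERN.sub(r"memref.\1", mlir_text)


if __name__ == "__main__":
    line = "  %A = alloc() : memref<64x64xf32>"
    print(migrate_to_memref(line))
    # prints "  %A = memref.alloc() : memref<64x64xf32>"
```

Running the helper twice on the same text leaves it unchanged, because the lookbehind rejects names that already follow a dot.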

After that, running the command as the README describes has no effect: the output is identical to the source.

The Pluto transform only seems to be applied when I run the command like this:

polymer-opt -debug -extract-scop-stmt -pluto-opt matmul.mlir

which gives:

#map0 = affine_map<(d0) -> (d0 * 32)>
#map1 = affine_map<(d0) -> (d0 * 32 + 32)>
module {
  func private @S0(%arg0: index, %arg1: index, %arg2: memref<64x64xf32>, %arg3: index, %arg4: memref<64x64xf32>, %arg5: memref<64x64xf32>) attributes {scop.stmt} {
    %0 = affine.load %arg5[symbol(%arg0), symbol(%arg3)] : memref<64x64xf32>
    %1 = affine.load %arg4[symbol(%arg3), symbol(%arg1)] : memref<64x64xf32>
    %2 = mulf %0, %1 : f32
    %3 = affine.load %arg2[symbol(%arg0), symbol(%arg1)] : memref<64x64xf32>
    %4 = addf %2, %3 : f32
    affine.store %4, %arg2[symbol(%arg0), symbol(%arg1)] : memref<64x64xf32>
    return
  }
  func @matmul() {
    %0 = memref.alloc() : memref<64x64xf32>
    %1 = memref.alloc() : memref<64x64xf32>
    %2 = memref.alloc() : memref<64x64xf32>
    affine.for %arg0 = 0 to 2 {
      affine.for %arg1 = 0 to 2 {
        affine.for %arg2 = 0 to 2 {
          affine.for %arg3 = #map0(%arg0) to #map1(%arg0) {
            affine.for %arg4 = #map0(%arg2) to #map1(%arg2) {
              affine.for %arg5 = #map0(%arg1) to #map1(%arg1) {
                call @S0(%arg3, %arg5, %0, %arg4, %1, %2) : (index, index, memref<64x64xf32>, index, memref<64x64xf32>, memref<64x64xf32>) -> ()
              }
            }
          }
        }
      }
    }
    return
  }
}
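To read the loop bounds in this output: the two affine maps give the lower and the exclusive upper bound of each 32-wide tile, since the upper bound of `affine.for` is exclusive. The sketch below simply evaluates the two maps in Python; the helper names are illustrative and not part of polymer.

```python
TILE = 32  # tile size that appears in both affine maps


def map0(d0: int) -> int:
    """affine_map<(d0) -> (d0 * 32)>: inclusive lower bound of tile d0."""
    return d0 * TILE


def map1(d0: int) -> int:
    """affine_map<(d0) -> (d0 * 32 + 32)>: exclusive upper bound of tile d0."""
    return d0 * TILE + TILE


def tile_ranges(num_tiles: int) -> list:
    """Half-open index range covered by each tile, like the inner affine.for loops."""
    return [range(map0(t), map1(t)) for t in range(num_tiles)]


if __name__ == "__main__":
    # The outer loops run from 0 to 2, so there are two tiles per dimension.
    for tile, indices in enumerate(tile_ranges(2)):
        print(f"tile {tile}: [{indices.start}, {indices.stop})")
```

With two tiles per dimension, the ranges are [0, 32) and [32, 64), which together cover the original 0 to 64 loop exactly once.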

Maybe there is some mismatch between the source code and the test, so I am filing this issue.

commit 76d9c2c98c11eecdeb0695ce0b052aca338e85ef (origin/main, origin/HEAD)
Merge: 4b8e998 7756df4
Author: Ruizhe Zhao vincentzhaorz@gmail.com
Date: Tue May 25 16:32:48 2021 +0000

    Merge branch 'main' of github.com:kumasento/polymer into main

YangWang92 commented 3 years ago

I guess that polymer uses legacy LLVM/MLIR that does not support memref.alloc/memref.dim.

kumasento commented 3 years ago

Hi @kaivenlong Thank you so much for pointing this out :)

Indeed, there is a mismatch between the README and the actual implementation. I think your solution is correct (maybe -debug is not necessary). And the complete compile pipeline would be something like this:

https://github.com/kumasento/polymer/blob/d5a49055aca0ac1ab059a06609ea485037c1e284/example/polybench/eval-perf#L203-L210

I will improve the doc later.

kumasento commented 3 years ago

> I guess that polymer uses legacy LLVM/MLIR that does not support memref.alloc/memref.dim.

Indeed, that example was older than the memref dialect :)

YangWang92 commented 3 years ago

> > I guess that polymer uses legacy LLVM/MLIR that does not support memref.alloc/memref.dim.
>
> Indeed, that example was older than the memref dialect :)

Hi @kumasento ,

Thanks for sharing such a helpful tool.

After fixing the memref errors like this

func @matmul() {
  %A = memref.alloc() : memref<64x64xf32>
  %B = memref.alloc() : memref<64x64xf32>
  %C = memref.alloc() : memref<64x64xf32>

  affine.for %i = 0 to 64 {
    affine.for %j = 0 to 64 {
      affine.for %k = 0 to 64 {
        %0 = affine.load %A[%i, %k] : memref<64x64xf32>
        %1 = affine.load %B[%k, %j] : memref<64x64xf32>
        %2 = mulf %0, %1 : f32
        %3 = affine.load %C[%i, %j] : memref<64x64xf32>
        %4 = addf %2, %3 : f32
        affine.store %4, %C[%i, %j] : memref<64x64xf32>
      }
    }
  }

  return
}

I cannot get optimized results from polymer-opt.

Did I do something wrong?

YangWang92 commented 3 years ago

Sorry about my previous reply. After updating the arguments I got the output below, and it works!

$ ./polymer-opt \
    -reg2mem \
    -insert-redundant-load \
    -extract-scop-stmt \
    -canonicalize \
    -pluto-opt="dump-clast-after-pluto=${POLYMER_CLAST_FILE}" \
    -canonicalize \
    ./matmul.mlir

 # [File generated by the OpenScop Library 0.9.2]

<OpenScop>

# =============================================== Global
# Language
C

# Context
CONTEXT
0 2 0 0 0 0

# Parameters are provided
1
<strings>
# NULL strings
</strings>

# Number of statements
1

# =============================================== Statement 1
# Number of relations describing the statement:
6

# ----------------------------------------------  1.1 Domain
DOMAIN
12 8 6 0 0 0
# e/i| fk0  fk1  fk2  i0   i1   i2 |  1  
   1    0    0    0    1    0    0    0    ## i0 >= 0
   1    0    0    0   -1    0    0   63    ## -i0+63 >= 0
   1    0    0    0    0    1    0    0    ## i1 >= 0
   1    0    0    0    0   -1    0   63    ## -i1+63 >= 0
   1    0    0    0    0    0    1    0    ## i2 >= 0
   1    0    0    0    0    0   -1   63    ## -i2+63 >= 0
   1  -32    0    0    1    0    0    0    ## -32*fk0+i0 >= 0
   1   32    0    0   -1    0    0   31    ## 32*fk0-i0+31 >= 0
   1    0  -32    0    0    1    0    0    ## -32*fk1+i1 >= 0
   1    0   32    0    0   -1    0   31    ## 32*fk1-i1+31 >= 0
   1    0    0  -32    0    0    1    0    ## -32*fk2+i2 >= 0
   1    0    0   32    0    0   -1   31    ## 32*fk2-i2+31 >= 0

# ----------------------------------------------  1.2 Scattering
SCATTERING
6 14 6 6 0 0
# e/i| c1   c2   c3   c4   c5   c6 | fk0  fk1  fk2  i0   i1   i2 |  1  
   0   -1    0    0    0    0    0    1    0    0    0    0    0    0    ## c1 == fk0
   0    0   -1    0    0    0    0    0    1    0    0    0    0    0    ## c2 == fk1
   0    0    0   -1    0    0    0    0    0    1    0    0    0    0    ## c3 == fk2
   0    0    0    0   -1    0    0    0    0    0    1    0    0    0    ## c4 == i0
   0    0    0    0    0   -1    0    0    0    0    0    0    1    0    ## c5 == i2
   0    0    0    0    0    0   -1    0    0    0    0    1    0    0    ## c6 == i1

# ----------------------------------------------  1.3 Access
READ
3 11 3 6 0 0
# e/i| Arr  [1]  [2]| fk0  fk1  fk2  i0   i1   i2 |  1  
   0    0   -1    0    0    0    0    1    0    0    0    ## [1] == i0
   0    0    0   -1    0    0    0    0    0    1    0    ## [2] == i2
   0   -1    0    0    0    0    0    0    0    0    1    ## Arr == A1

READ
3 11 3 6 0 0
# e/i| Arr  [1]  [2]| fk0  fk1  fk2  i0   i1   i2 |  1  
   0    0   -1    0    0    0    0    0    0    1    0    ## [1] == i2
   0    0    0   -1    0    0    0    0    1    0    0    ## [2] == i1
   0   -1    0    0    0    0    0    0    0    0    2    ## Arr == A2

READ
3 11 3 6 0 0
# e/i| Arr  [1]  [2]| fk0  fk1  fk2  i0   i1   i2 |  1  
   0    0   -1    0    0    0    0    1    0    0    0    ## [1] == i0
   0    0    0   -1    0    0    0    0    1    0    0    ## [2] == i1
   0   -1    0    0    0    0    0    0    0    0    3    ## Arr == A3

WRITE
3 11 3 6 0 0
# e/i| Arr  [1]  [2]| fk0  fk1  fk2  i0   i1   i2 |  1  
   0    0   -1    0    0    0    0    1    0    0    0    ## [1] == i0
   0    0    0   -1    0    0    0    0    1    0    0    ## [2] == i1
   0   -1    0    0    0    0    0    0    0    0    3    ## Arr == A3

# ----------------------------------------------  1.4 Statement Extensions
# Number of Statement Extensions
1
<body>
# Number of original iterators
6
# List of original iterators
fk0 fk1 fk2 i0 i1 i2
# Statement body expression
S0(i0, i1, i2)
</body>
#
# =============================================== Extensions
<arrays>
# Number of arrays
3
# Mapping array-identifiers/array-names
1 A1
2 A2
3 A3
</arrays>

<comment>
matmul</comment>

<scatnames>
t1 t2 t3 t4 t5 t6
</scatnames>

</OpenScop>

for (t1=0;t1<=1;t1++) {
  for (t2=0;t2<=1;t2++) {
    for (t3=0;t3<=1;t3++) {
      for (t4=32*t1;t4<=32*t1+31;t4++) {
        for (t5=32*t3;t5<=32*t3+31;t5++) {
          for (t6=32*t2;t6<=32*t2+31;t6++) {
            S0(t4, t6, t5)
          }
        }
      }
    }
  }
}
#map0 = affine_map<(d0) -> (d0 * 32)>
#map1 = affine_map<(d0) -> (d0 * 32 + 32)>
module  {
  func private @S0(%arg0: index, %arg1: index, %arg2: memref<64x64xf32>, %arg3: index, %arg4: memref<64x64xf32>, %arg5: memref<64x64xf32>) attributes {scop.stmt} {
    %0 = affine.load %arg5[symbol(%arg0), symbol(%arg3)] : memref<64x64xf32>
    %1 = affine.load %arg4[symbol(%arg3), symbol(%arg1)] : memref<64x64xf32>
    %2 = mulf %0, %1 : f32
    %3 = affine.load %arg2[symbol(%arg0), symbol(%arg1)] : memref<64x64xf32>
    %4 = addf %2, %3 : f32
    affine.store %4, %arg2[symbol(%arg0), symbol(%arg1)] : memref<64x64xf32>
    return
  }
  func @matmul() {
    %0 = memref.alloc() : memref<64x64xf32>
    %1 = memref.alloc() : memref<64x64xf32>
    %2 = memref.alloc() : memref<64x64xf32>
    affine.for %arg0 = 0 to 2 {
      affine.for %arg1 = 0 to 2 {
        affine.for %arg2 = 0 to 2 {
          affine.for %arg3 = #map0(%arg0) to #map1(%arg0) {
            affine.for %arg4 = #map0(%arg2) to #map1(%arg2) {
              affine.for %arg5 = #map0(%arg1) to #map1(%arg1) {
                call @S0(%arg3, %arg5, %0, %arg4, %1, %2) : (index, index, memref<64x64xf32>, index, memref<64x64xf32>, memref<64x64xf32>) -> ()
              }
            }
          }
        }
      }
    }
    return
  }
}
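One way to sanity-check output like this is to confirm that the tiled nest visits exactly the same (i, j, k) points as the original 64x64x64 loops. The sketch below mirrors the generated loop structure (t1 to t3 walk tiles, t4 to t6 walk points, and the statement is called as S0(t4, t6, t5)) in plain Python. The function names are illustrative only, and the check does not replace running the compiled code.

```python
def original_order(n: int) -> list:
    """Iteration points of the untiled i, j, k loop nest."""
    return [(i, j, k) for i in range(n) for j in range(n) for k in range(n)]


def tiled_order(n: int, tile: int) -> list:
    """Iteration points in the order of the generated tiled loop nest."""
    assert n % tile == 0, "this sketch assumes the tile size divides n"
    points = []
    for t1 in range(n // tile):
        for t2 in range(n // tile):
            for t3 in range(n // tile):
                for t4 in range(tile * t1, tile * t1 + tile):
                    for t5 in range(tile * t3, tile * t3 + tile):
                        for t6 in range(tile * t2, tile * t2 + tile):
                            # Same argument order as S0(t4, t6, t5): (i, j, k).
                            points.append((t4, t6, t5))
    return points


def k_order_preserved(points: list) -> bool:
    """True if every (i, j) pair sees its k values in increasing order."""
    last_k = {}
    for i, j, k in points:
        if last_k.get((i, j), -1) >= k:
            return False
        last_k[(i, j)] = k
    return True


if __name__ == "__main__":
    tiled = tiled_order(64, 32)
    print("same iteration set:", sorted(tiled) == original_order(64))
    print("k order preserved:", k_order_preserved(tiled))
```

The second check matters for the reduction: each C[i][j] still accumulates its k terms in increasing order, so tiling changes only the traversal order across different (i, j) pairs.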

kumasento commented 3 years ago

Yeah your new output looks good to me. Let me know if you need further support :)
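For anyone who wants to script the pass sequence reported as working above, here is a small Python wrapper sketch. The pass flags are copied from that command; the function names, the binary path, and the `out.cloog` file name are placeholders and not part of polymer.

```python
import shlex
import subprocess

# Passes that run before -pluto-opt, in the order reported above in this thread.
PRE_PASSES = ["-reg2mem", "-insert-redundant-load", "-extract-scop-stmt", "-canonicalize"]


def build_command(polymer_opt: str, mlir_file: str, clast_file: str = "") -> list:
    """Assemble the polymer-opt argument list; clast_file is optional."""
    pluto = "-pluto-opt"
    if clast_file:
        # Equivalent to -pluto-opt="dump-clast-after-pluto=FILE" in the shell.
        pluto += f"=dump-clast-after-pluto={clast_file}"
    return [polymer_opt, *PRE_PASSES, pluto, "-canonicalize", mlir_file]


def run_pipeline(polymer_opt: str, mlir_file: str, clast_file: str = "") -> str:
    """Run polymer-opt and return the text it writes to stdout."""
    cmd = build_command(polymer_opt, mlir_file, clast_file)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Only print the command here; call run_pipeline() once polymer-opt is built.
    print(shlex.join(build_command("./polymer-opt", "./matmul.mlir", "out.cloog")))
```

Passing the arguments as a list avoids shell quoting issues around the `-pluto-opt=...` option.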

kaivenlong commented 3 years ago

> Hi @kaivenlong Thank you so much for pointing this out :)
>
> Indeed, there is a mismatch between the README and the actual implementation. I think your solution is correct (maybe -debug is not necessary). And the complete compile pipeline would be something like this:
>
> https://github.com/kumasento/polymer/blob/d5a49055aca0ac1ab059a06609ea485037c1e284/example/polybench/eval-perf#L203-L210
>
> I will improve the doc later.

Thank you, Ruizhe^_^.

kumasento / polymer

The instruction in readme seems to mismatch with the source code. #94
