kumasento / polymer

Bridging polyhedral analysis tools to the MLIR framework
MIT License
99 stars 20 forks source link

Handling local variables #73

Open kumasento opened 3 years ago

kumasento commented 3 years ago

This issue was not properly dealt with in previous versions. Since I'm currently doing the refactorization, we should have a comprehensive solution for that.

Background

A local variable will be introduced when dealing with semi-affine maps. For example, a code as follows:

#map = affine_map<(d0) -> (d0 floordiv 2)>

func @load_store_local_vars_floordiv(%A : memref<?xf32>) -> () {
  %c0 = constant 0 : index
  %N = dim %A, %c0 : memref<?xf32>
  %M = affine.apply #map(%N)

  affine.for %i = 0 to %M {
    %0 = affine.load %A[%i] : memref<?xf32>
    affine.store %0, %A[%i + %M] : memref<?xf32>
  }

  return
}

%M will be a local variable for the domain of the internal affine.for resolved. The constraints around %N and %M can be derived from their floordiv 2 relation.

%i | %N | %M | 1 
 1    0    0   0 >= 0
 0    1   -2   0 >= 0
 0   -1    2   1 >= 0
-1    0    1  -1 >= 0

And these constraints can be translated into a proper OpenScop relation.

DOMAIN
4 5 1 0 1 1
# e/i| i1 | l1 | P1 |  1  
   1    1    0    0    0    ## i1 >= 0
   1    0   -2    1    0    ## -2*l1+P1 >= 0
   1    0    2   -1    1    ## 2*l1-P1+1 >= 0
   1   -1    1    0   -1    ## -i1+l1-1 >= 0

l1 is %M and P1 is %N.

The problem is around the access relation.

Getting the access relation

The access relation is derived from a AffineValueMap that we can get from MemRefAccess (ref). An AffineValueMap is an AffineMap together with a list of operands it applies. The issue of local variables is about these operands: we need to make sure every operand can be traced back to the values held by the domain, but a local variable is not there.

Specifically, given the code above, we can get a polyhedral statement as follows:

  func @load_store_local_vars_floordiv(%arg0: memref<?xf32>) {
    %c0 = constant 0 : index
    %0 = dim %arg0, %c0 : memref<?xf32>
    %1 = affine.apply #map(%0)
    affine.for %arg1 = 0 to %1 {
      call @S0(%arg0, %arg1, %1) : (memref<?xf32>, index, index) -> ()
    }
    return
  }
  func @S0(%arg0: memref<?xf32>, %arg1: index, %arg2: index) attributes {scop.stmt} {
    %0 = affine.load %arg0[%arg1] : memref<?xf32>
    affine.store %0, %arg0[%arg1 + %arg2] : memref<?xf32>
    return
  }

Because %1 (%M) is a symbol based on (isValidSymbol), we will directly pass it into @S0 in a form as %arg2. It will later become an operand of the AffineValueMap for the affine.store. That AffineValueMap would look like:

affine_map<()[s0, s1] -> (s0 + s1)>

and s0 and s1 will be replaced by %arg1 (%i) and %arg2 (%M). However, %arg2, or more specifically, %M, cannot be found in the domain we resolved. %M is a local variable, and a local variable is value-less (or internal, cannot find the correct wording) in the FlatAffineConstraints. Go back to the domain representation above (which is of type FlatAffineConstraints:

%i | %N | %M | 1 
 1    0    0   0 >= 0
 0    1   -2   0 >= 0
 0   -1    2   1 >= 0
-1    0    1  -1 >= 0

%M is not actually stored in the domain.

Solutions

Before we dive into the solutions, there are several things from the Polymer design that cannot be broken:

  1. A domain is immutable. Once we figure out what a domain looks like, especially what its dims and symbols are, we should not further alter it.
  2. An access relation should not use values other than those in the domain, excluding local values. That is, if an access relation needs to use a value marked as local in the domain, it should instantiate again that value as a local one inside the access relation.
  3. The statement interface should only include arrays, IVs, and parameters, e.g., including %M as an argument of @S0 in the previous example violates this rule. This is just a convention that can keep the design clean.

Our solution is:

  1. In -extract-scop-stmt, we will only stop searching when seeing an operand that is included in the current domain constraints (as well as the global parameter set). Such that all the local variable calculation will be included internally to each callee.
  2. When exporting OpenScop, we just build access constraints with all the affine.apply results flattened. This is done automatically, we just need to make sure that these affine constraints have the same columns as the domain (after merging with the context for their symbols) excluding all the local variables.

Result

Given input code:


#map = affine_map<(d0) -> (d0 floordiv 2)>

func @load_store_local_vars_floordiv(%A : memref<?xf32>) -> () {
  %c0 = constant 0 : index
  %N = dim %A, %c0 : memref<?xf32>
  %M = affine.apply #map(%N)

  affine.for %i = 0 to %M {
    %0 = affine.load %A[%i] : memref<?xf32>
    %j = affine.apply #map(%i)
    %1 = addf %0, %0 : f32
    affine.store %0, %A[%i + %M] : memref<?xf32>
    affine.store %1, %A[%j + %M] : memref<?xf32>
  }

  return
}

We will get:

#map = affine_map<(d0) -> (d0 floordiv 2)>
#set0 = affine_set<()[s0] : (s0 floordiv 2 - 1 >= 0, s0 floordiv 2 - 1 >= 0, s0 mod 2 >= 0, -s0 + (s0 floordiv 2) * 2 + 1 >= 0, s0 mod 2 >= 0, -s0 + (s0 floordiv 2) * 2 + 1 >= 0)>
#set1 = affine_set<(d0)[s0] : (d0 >= 0, s0 mod 2 >= 0, -s0 + (s0 floordiv 2) * 2 + 1 >= 0, -d0 + s0 floordiv 2 - 1 >= 0)>
#set2 = affine_set<(d0)[s0] : (d0 >= 0, s0 mod 2 >= 0, -s0 + (s0 floordiv 2) * 2 + 1 >= 0, -d0 + s0 floordiv 2 - 1 >= 0)>
#set3 = affine_set<(d0, d1, d2)[s0] : (d1 - d2 == 0, d0 - 1 == 0)>
#set4 = affine_set<(d0, d1, d2)[s0] : (d1 - d2 - s0 floordiv 2 == 0, d0 - 1 == 0, s0 mod 2 >= 0, -s0 + (s0 floordiv 2) * 2 + 1 >= 0)>
#set5 = affine_set<(d0, d1, d2)[s0] : (d1 - d2 floordiv 2 - s0 floordiv 2 == 0, d0 - 1 == 0, d2 mod 2 >= 0, -d2 + (d2 floordiv 2) * 2 + 1 >= 0, s0 mod 2 >= 0, -s0 + (s0 floordiv 2) * 2 + 1 >= 0)>
module  {
  func @load_store_local_vars_floordiv(%arg0: memref<?xf32>) attributes {scop.arg_names = ["A1"], scop.ctx = #set0, scop.ctx_params = ["P1"]} {
    %c0 = constant 0 : index
    %0 = dim {scop.param_names = ["P1"]} %arg0, %c0 : memref<?xf32>
    %1 = affine.apply #map(%0)
    affine.for %arg1 = 0 to %1 {
      call @S0(%arg0, %arg1, %0) {scop.domain = #set1, scop.domain_symbols = ["i1", "P1"], scop.scats = [0, 0]} : (memref<?xf32>, index, index) -> ()
      call @S1(%arg0, %0, %arg1) {scop.domain = #set2, scop.domain_symbols = ["i1", "P1"], scop.scats = [0, 1]} : (memref<?xf32>, index, index) -> ()
    } {scop.iv_name = "i1"}
    return
  }
  func @S0(%arg0: memref<?xf32>, %arg1: index, %arg2: index) attributes {scop.stmt} {
    %0 = affine.load %arg0[%arg1] {scop.access = #set3, scop.access_symbols = ["A1", "", "i1", "P1"]} : memref<?xf32>
    %1 = affine.apply #map(%arg2)
    affine.store %0, %arg0[%arg1 + %1] {scop.access = #set4, scop.access_symbols = ["A1", "", "i1", "P1"]} : memref<?xf32>
    return
  }
  func @S1(%arg0: memref<?xf32>, %arg1: index, %arg2: index) attributes {scop.stmt} {
    %0 = affine.load %arg0[%arg2] {scop.access = #set3, scop.access_symbols = ["A1", "", "i1", "P1"]} : memref<?xf32>
    %1 = addf %0, %0 : f32
    %2 = affine.apply #map(%arg2)
    %3 = affine.apply #map(%arg1)
    affine.store %1, %arg0[%2 + %3] {scop.access = #set5, scop.access_symbols = ["A1", "", "i1", "P1"]} : memref<?xf32>
    return
  }
}