Open riftEmber opened 3 years ago
Latest commit ae22717d7b9163346f6627893782788003cd82ac finalizes major bug fixes in Nested UFs and Infinite nesting.
I will move this issue to done
what's the branch for this feature?
It is on the main branch. Unit tests can be found in Compuationtest.NesedUF, ComputationTest.InfinitNesting and ComptationTest.NestedUFComputationTest
On second thought, there is one more thing I need to do for this to be considered done. It has to do with making sure that added tuple variables get reflected in other statements during code generation.
When some new tuple variable is introduced to flatten nesting, the tuple variable gets added in unique positions to the tuple declaration. This however could affect the overall execution schedule in the computation that has already been padded. While this might not be a problem for generating code for simple execution schedule, it might pose a problem with complex schedules
Can't we just say that padding a Computation that hasn't been completed yet is undefined behavior? I can't think of any good reason we'd want to do that
Yeah, that makes sense. But you have to reason about this as to what happens when the applied execution schedules are sent to Omega.
Also, code generation is the final phase of iegenlib, and padding always happens before code generation. However, here is a more detailed example of the problem.
Lets say there are two statements that exhibit nesting inside of a computation.
S0
domain := "{ [n, i, k] : i - P0(row1(n), col1(n)) = 0 &&"
" i - row1(n) = 0 && k - P1(row1(n), col1(n)) = 0 &&"
" col1(n) - col2(k) = 0 && n >= 0 && i >= 0 &&"
" col1(n) >= 0 && row1(n) >= 0 && k - rowptr(i) >= 0 &&"
" -n + NNZ - 1 >= 0 && -i + NR - 1 >= 0 &&"
" -k + rowptr(i + 1) - 1 >= 0 && NC - col1(n) - 1 >= 0"
" && NR - row1(n) - 1 >= 0 }"
ExectionSchedule := {[n,i,k] -> [0,n,1,i,0,k,0]}
And another statement that exhibits less complexity
S1
domain := "{[n,p]: 0 <= n < NNZ && 0 <= p < NNZ}"
Execution Schedule:= {[n,p] -> [0,n,0,p,0,0]}
According to the execution schedule, S0 and S1 share the same outer loop n, and the third tuple location in the rhs of the execution schedule specifies where both computations diverge during polyhedral scanning.
Lets now go ahead and apply our execution schedules first, just like how it is currently done on CodeGen
S0
"{ [0,n,1,i,0,k,0] : i - P0(row1(n), col1(n)) = 0 &&"
" i - row1(n) = 0 && k - P1(row1(n), col1(n)) = 0 &&"
" col1(n) - col2(k) = 0 && n >= 0 && i >= 0 &&"
" col1(n) >= 0 && row1(n) >= 0 && k - rowptr(i) >= 0 &&"
" -n + NNZ - 1 >= 0 && -i + NR - 1 >= 0 &&"
" -k + rowptr(i + 1) - 1 >= 0 && NC - col1(n) - 1 >= 0"
" && NR - row1(n) - 1 >= 0 }"
S1
"{[0,n,0,p,0,0,0]: 0 <= n < NNZ && 0 <= p < NNZ}"
Note the [0,n,0....]
and [0,n,1....]
Now let's go ahead and flatten UF nesting on S0.
S0
"{ [0, n, tv1, tv2, 1, i, 0, k,0] : tv1 - row1(n) = 0 &&"
" tv2 - col1(n) = 0 && i - P0(tv1, tv2) = 0 &&"
" i - row1(n) = 0 && k - P1(tv1, tv2) = 0 &&"
" col1(n) - col2(k) = 0 && n >= 0 && i >= 0 &&"
" col1(n) >= 0 && row1(n) >= 0 && k - rowptr(i) >= 0 &&"
" -n + NNZ - 1 >= 0 && -i + NR - 1 >= 0 && -k +"
" rowptr(i + 1) - 1 >= 0 && NC - col1(n) - 1 >= 0 &&"
" NR - row1(n) - 1 >= 0 }"
S1
"{[0,n,0,p,0,0,0]: 0 <= n < NNZ && 0 <= p < NNZ}"
Don't forget all of this is happening within codeGen's function
Now what we want to happen instead should be another step that ensures other statements in Computation reflects this new tuple variable location
What is the problem with the end results, is it just the different arity of the sets?
There is that and also the compromise in the execution schedule. You know how we put a constant after every tuple variable to indicate positions/order, that can be compromised as well. I might be wrong and should probably run that through Omega to see if polyhedral scanning can handle this case.
IEGenLib supports nested UF calls (a parameter to a UFCallTerm is a UFCallTerm itself), but Omega does not. An example of this issue would be the set
{[i,j,k]: A(i, B(j)) = 0}
. The relevant restriction here is that parameters to a UF must be a prefix (in correct order) of either the input or output tuple, or simply the (set) tuple in the case of a set.One possible solution is to replace the nested call with an existential variable and substitute out, like this:
{[i,j,k]: A(i, z) = 0 \exists z=B(j)}
. However, this fails asz
is not in the set tuple. A potential fix is addingz
to the set tuple.Another proposed solution was a replacement like so:
{[i,j,k]: A_0(i, j) = 0}
, with the accompanying macro#define A_0(p0, p1) A(p0, B(p1))
. This has the issue of hidingA_0
's (A
's) connection toB
from our representation, only applying it in a macro later.