Open weiya711 opened 3 years ago
There is a simple fix to this which is to define the iteration algebra as:
struct xorAndAlgebra {
  IterationAlgebra operator()(const std::vector<IndexExpr>& regions) {
    // regions[0] = A, regions[1] = B, regions[2] = C
    auto m1 = Intersect(regions[0], regions[2]);
    auto m2 = Intersect(regions[1], regions[2]);
    auto noIntersect = Complement(Intersect(Intersect(regions[0], regions[1]), regions[2]));
    return Intersect(noIntersect, Union(m1, m2));
  }
};
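As a sanity check on the algebra above, the sketch below reads each region as a boolean "this coordinate is stored in the tensor" flag and compares the result against xor(and(A, C), and(B, C)) over all eight combinations. It does not use taco; the function names xorAndReference, xorAndRegion, and algebraMatchesReference are made up for this illustration.

```cpp
// Reference semantics: xor(and(A, C), and(B, C)) on presence flags.
bool xorAndReference(bool a, bool b, bool c) {
  return (a && c) != (b && c);
}

// Set-membership reading of the iteration algebra:
// Intersect(Complement(A n B n C), Union(A n C, B n C)).
bool xorAndRegion(bool a, bool b, bool c) {
  const bool m1 = a && c;
  const bool m2 = b && c;
  const bool noIntersect = !((a && b) && c);
  return noIntersect && (m1 || m2);
}

// Returns true when both formulations agree on all 8 input combinations.
bool algebraMatchesReference() {
  for (int bits = 0; bits < 8; ++bits) {
    const bool a = (bits & 1) != 0;
    const bool b = (bits & 2) != 0;
    const bool c = (bits & 4) != 0;
    if (xorAndReference(a, b, c) != xorAndRegion(a, b, c)) {
      return false;
    }
  }
  return true;
}
```

In particular, xorAndRegion(true, true, true) is false, which is the case where A, B, and C all have an entry at the same coordinate and the fused iteration space should produce nothing.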
The bug is fundamentally caused by the iteration lattice construction algorithm, which is incorrect when it takes the cartesian product of the points of the two input iteration lattices. Discussed this with @rohany @fredrikbk and @RawnH. The intersection rule in intersectLattices(MergeLattice left, MergeLattice right) and the union rule in unionLattices(MergeLattice left, MergeLattice right) are not correct in taking the cartesian product of all points. If the left and right lattices share an input tensor (either as an iterator or locator), taking the cartesian product of points to merge the two lattices will try to merge some points that should NEVER intersect.
As an example, left: (~A U ~C) merged with right: (~B U ~C) will give the input lattices
left: (i | A, C | O), (i | C | P), (i | A | P), (i | | P), ( | | )
right: (i | B, C | O), (i | C | P), (i | B | P), (i | | P), ( | | )
When merging to find point (i | A, B, C | ?) we get these three point products:
Solution: to check for non-overlapping points when taking the cartesian product, follow the root point of each input lattice to see which tensor inputs are NOT in the product points. In product 2 from the example above, the left root has A, C but the left point (i | A | P) is missing C, while the right point (i | B, C | O) has C, so the two points do not overlap in the iteration space. The same justification applies to product 3. So the fix is to add a check (in pseudocode below):
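product_points = {}
for each left_pt in left_points, right_pt in right_points:
  hasIntersection = true
  for each tensor in left_root or right_root:
    if ((tensor in left_root && !(tensor in left_pt) && tensor in right_pt) ||
        (tensor in right_root && !(tensor in right_pt) && tensor in left_pt)) {
      hasIntersection = false
    }
  if (hasIntersection)
    product_points += (left_pt, right_pt)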
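The check above can be sketched in standalone C++ as follows. This is only an illustration of the rule, not the code in taco: lattice points are reduced to sets of tensor names, and TensorSet, excludes, pointsOverlap, and filteredProduct are hypothetical names that do not correspond to taco's MergeLattice or MergePoint API.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lattice point: the names of its input tensors.
using TensorSet = std::set<std::string>;

// True when `tensor` is an input of `root`, was dropped from `pt`,
// but is still present in the point from the other lattice.
bool excludes(const TensorSet& root, const TensorSet& pt,
              const TensorSet& otherPt, const std::string& tensor) {
  return root.count(tensor) > 0 && pt.count(tensor) == 0 &&
         otherPt.count(tensor) > 0;
}

// Two points overlap only if no tensor is excluded in either direction.
bool pointsOverlap(const TensorSet& leftRoot, const TensorSet& leftPt,
                   const TensorSet& rightRoot, const TensorSet& rightPt) {
  TensorSet all(leftRoot);
  all.insert(rightRoot.begin(), rightRoot.end());
  for (const auto& tensor : all) {
    if (excludes(leftRoot, leftPt, rightPt, tensor) ||
        excludes(rightRoot, rightPt, leftPt, tensor)) {
      return false;
    }
  }
  return true;
}

// Cartesian product of points, keeping only pairs that can overlap.
std::vector<std::pair<TensorSet, TensorSet>> filteredProduct(
    const TensorSet& leftRoot, const std::vector<TensorSet>& leftPts,
    const TensorSet& rightRoot, const std::vector<TensorSet>& rightPts) {
  std::vector<std::pair<TensorSet, TensorSet>> product;
  for (const auto& l : leftPts) {
    for (const auto& r : rightPts) {
      if (pointsOverlap(leftRoot, l, rightRoot, r)) {
        product.emplace_back(l, r);
      }
    }
  }
  return product;
}
```

With left root {A, C} and right root {B, C}, the pair ({A}, {B, C}) is rejected because C was dropped on the left but is still present on the right, and ({A, C}, {B}) is rejected for the mirrored reason; ({A, C}, {B, C}) is kept.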
Therefore, the quick fix works because the iteration algebra no longer needs to merge two iteration lattices that have the same input tensor: the computation is (~A U ~C) merged with (~B) only.
The algorithm fix for this in taco can be found in the branch: https://github.com/tensor-compiler/taco/tree/array_algebra_iter_const
When generating code for xor(and(A, C), and(B, C)) in taco, certain implementations do not match numpy.
1) The statement below produces correct code that matches dense numpy and pydata/sparse. For data/image/no/image1.jpg it produces 816 nnzs:
2) This (supposedly) equivalent statement produces too many zeros. For data/image/no/image1.jpg it produces 2646 nnzs:
3) This (supposedly) equivalent statement also produces too many zeros and gives the same result as 2). For data/image/no/image1.jpg it produces 2646 nnzs:
For case 3) the generated code from the lowerer is:
where the case statement
if ((jA0 == j && jC0 == j) && jB0 == j)
does not seem correct, since the intersection of all three tensors (A, B, and C) should not be included in the fused iteration space of xor(and(A, C), and(B, C)).
The above ops are defined as: