paul-tqh-nguyen opened 3 years ago
@jim22k and I came up with a plan to address this. I'll describe our proposed approach below.
We'll want to change the standard passes we use:

- `--graphblas-structuralize`: This pass will make our code more optimization-friendly. It'll add `graphblas.convert_layout` ops where necessary (but won't in cases where the given types are handled properly by `--graphblas-lower`). It will add the simplest set of `graphblas.convert_layout` ops needed to get things working; the next optimization pass will fix any slow op sequences introduced.
- `--graphblas-optimize`: This will perform additional optimizations besides those already present in `--graphblas-optimize`, e.g.:
  - `graphblas.convert_layout` + `graphblas.apply` + `graphblas.convert_layout` => `graphblas.apply` + `graphblas.convert_layout` (or just `graphblas.apply`? Including every `graphblas.apply`?)
  - `graphblas.convert_layout` + `graphblas.convert_layout` => `graphblas.convert_layout`
  - `graphblas.transpose` + `graphblas.transpose` => `graphblas.transpose`
  - `a_CSC @ b_CSR` normally requires 2 `graphblas.convert_layout` ops; we can instead do `(b_CSR @ a_CSC).transpose` with at most 1 `graphblas.convert_layout` op.
- `--graphblas-lower`: This pass will handle a very specific set of cases; e.g., code with `a_CSC @ b_CSR` that is not lowered by the previous two passes will be untouched by this pass.
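To make the `--graphblas-structuralize` / `--graphblas-optimize` interaction concrete, here is a rough sketch of the `graphblas.convert_layout` folding described above. The exact textual syntax and the `#CSR64`/`#CSC64` encoding aliases are illustrative, not verbatim from the dialect:

```mlir
// Before --graphblas-optimize: structuralize has inserted two
// back-to-back layout conversions (a CSR -> CSC -> CSR round trip).
%a = graphblas.convert_layout %m : tensor<?x?xf64, #CSR64> to tensor<?x?xf64, #CSC64>
%b = graphblas.convert_layout %a : tensor<?x?xf64, #CSC64> to tensor<?x?xf64, #CSR64>

// After --graphblas-optimize: the pair folds. A round trip folds away
// entirely (%b is replaced by %m); a non-round-trip chain
// X -> Y -> Z folds to a single convert_layout X -> Z.
```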
Implementation details (and answers to misc. questions we had during our discussion):
Ops mixing layouts (e.g. `CSC @ CSR`) will still be valid. However, this does not necessarily mean the rewrite patterns in `--graphblas-lower` will lower them. We've previously been writing our rewrite patterns by overriding the `matchAndRewrite` method. We should instead override the `match` method and the `rewrite` method independently (see here). The `match` methods will be stricter and will limit the cases handled by each rewrite pattern used in each pass. CC @eriknw
After reading Dan Gohman's article on canonicalization (as recommended by MLIR), it turns out there's no agreed-upon clear line differentiating optimization/transformation from canonicalization.
Since MLIR's canonicalization functionality is implemented as a pass, there's currently no meaningful difference for us between implementing the passes described above directly or via canonicalization. Thus, I'm choosing to avoid canonicalization at this time. If we decide to use it in the future, it'll be a trivial refactoring since canonicalization is implemented via rewrite patterns.
Currently, in our lowering passes, we add `graphblas.convert_layout` ops when we're given a CSC matrix but the "real" lowering expects a CSR matrix. It might be cleaner to use canonicalization passes to do this sort of thing.
It might also be a good idea to add tensor casts to convert fixed-shape tensors to tensors using `?` in the shape. This would help with https://github.com/metagraph-dev/mlir-graphblas/issues/129.

For `graphblas.comment` ops, it's unclear whether these should be handled during lowering or canonicalization.
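For the fixed-shape-to-dynamic casts mentioned above, MLIR's standard `tensor.cast` op already expresses exactly this (the sparse encoding is elided here for brevity):

```mlir
// Cast a statically shaped tensor to one with dynamic (`?`) dimensions.
%dynamic = tensor.cast %fixed : tensor<4x4xf64> to tensor<?x?xf64>
```

Inserting such casts early would let the rest of the pipeline match only on the dynamic-shaped form instead of duplicating patterns per static shape.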