There aren't too many examples of this; the primary two I've encountered are:
```
// onnx.add(onnx.matmul(%X, %Y), %Z) = onnx.Gemm(%X, %Y, %Z)
```
This implicitly assumes that the underlying hardware is better suited for Gemms (and it is also a fairly restrictive transform, given that the ONNX spec only defines Gemm for rank-2 matmuls).
The other example has since been removed; it rewrote Convolutions with padding into an ONNXPadOp -> ONNXConvOp sequence.
The proposal here is to move the ONNXMatMulOp + ONNXAddOp -> ONNXGemmOp rewrite out of the canonicalization pass, since canonicalization is a global pass that projects building on top of onnx-mlir cannot change.
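To make the proposal concrete, below is a minimal sketch of what the fusion could look like as a standalone, opt-in rewrite pass. The op accessors (`getA()`/`getB()`), the generated `ONNXGemmOp` builder signature, and the pass name `onnx-fuse-gemm` are assumptions made for illustration, not the actual onnx-mlir code:

```cpp
// A minimal sketch, assuming onnx-mlir's generated op classes with prefixed
// accessors. Not the upstream implementation.
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/IR/PatternMatch.h"
#include "mlir/Pass/Pass.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
#include "src/Dialect/ONNX/ONNXOps.hpp"

using namespace mlir;

namespace {
// Rewrites onnx.Add(onnx.MatMul(%X, %Y), %Z) into onnx.Gemm(%X, %Y, %Z),
// but only for rank-2 MatMul operands, since the ONNX spec restricts Gemm
// to 2-D inputs.
struct FuseMatMulAddIntoGemm : public OpRewritePattern<ONNXAddOp> {
  using OpRewritePattern<ONNXAddOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(ONNXAddOp addOp,
                                PatternRewriter &rewriter) const override {
    // For brevity only the (MatMul, bias) operand order is matched, and the
    // MatMul must have no other users.
    auto matmul = addOp.getA().getDefiningOp<ONNXMatMulOp>();
    if (!matmul || !matmul->hasOneUse())
      return failure();

    auto lhsType = dyn_cast<RankedTensorType>(matmul.getA().getType());
    auto rhsType = dyn_cast<RankedTensorType>(matmul.getB().getType());
    if (!lhsType || !rhsType || lhsType.getRank() != 2 ||
        rhsType.getRank() != 2)
      return failure();

    // A real implementation would also verify that the bias satisfies Gemm's
    // unidirectional broadcasting rules (Add's broadcasting is wider).
    // Gemm defaults: alpha = beta = 1.0, no transposes.
    auto si64 = rewriter.getIntegerType(64, /*isSigned=*/true);
    rewriter.replaceOpWithNewOp<ONNXGemmOp>(
        addOp, addOp.getResult().getType(),
        /*A=*/matmul.getA(), /*B=*/matmul.getB(), /*C=*/addOp.getB(),
        /*alpha=*/rewriter.getF32FloatAttr(1.0f),
        /*beta=*/rewriter.getF32FloatAttr(1.0f),
        /*transA=*/rewriter.getIntegerAttr(si64, 0),
        /*transB=*/rewriter.getIntegerAttr(si64, 0));
    return success();
  }
};

// A separate, opt-in pass: downstream projects schedule it (or don't) in
// their own pipelines, unlike canonicalization patterns, which run wherever
// the canonicalizer runs.
struct FuseGemmPass
    : public PassWrapper<FuseGemmPass, OperationPass<func::FuncOp>> {
  StringRef getArgument() const override { return "onnx-fuse-gemm"; }

  void runOnOperation() override {
    RewritePatternSet patterns(&getContext());
    patterns.add<FuseMatMulAddIntoGemm>(&getContext());
    if (failed(applyPatternsAndFoldGreedily(getOperation(),
                                            std::move(patterns))))
      signalPassFailure();
  }
};
} // namespace
```

With the fusion behind its own pass, a project building on onnx-mlir whose hardware has no fast Gemm path can simply leave it out of its pipeline, while the default pipeline can keep enabling it.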