Open · zjgarvey opened 1 month ago
> We are currently pre-applying torch-mlir passes in our test suite before handing off linalg IR to `iree-compile`.
Oof. That's not good. Test suites should be using the same tools that we recommend to users. If tests fail when using those tools, the tools should be fixed instead of changing the tests.
> The current ONNX-specific intake logic in `PluginRegistration.cpp` is insufficient for `iree-compile`: the single conversion pass applied there is not enough to get the torch IR into an acceptable state to hand off to the `torch-to-iree` pipeline that follows.
Generally:

- `iree-import-onnx` should handle `.onnx` --> `.mlir` that `iree-compile` can use.
- `iree-compile` with `--iree-input-type=onnx` (or the default `auto`) should be able to compile any `.mlir` produced by `iree-import-onnx` for all backends (see the sketch after this list).
- We can put options in the torch (onnx) plugin if there are decisions for users to make, like the existing `--iree-torch-use-strict-symbolic-shapes` option.
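As a concrete illustration of those expectations, here is a minimal end-to-end flow. The model filename and the `llvm-cpu` target are illustrative; the flags are the ones discussed in this thread:

```shell
# Import an ONNX model into MLIR (torch-onnx dialect ops).
iree-import-onnx model.onnx -o model.mlir

# Compile the imported MLIR directly; --iree-input-type=onnx (or the default
# "auto") selects the ONNX intake path. The target backend is illustrative.
iree-compile model.mlir \
  --iree-input-type=onnx \
  --iree-hal-target-backends=llvm-cpu \
  -o model.vmfb
```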
If the passes and pipelines in torch-mlir change, IREE should adapt as well. See also https://github.com/iree-org/iree/issues/17021.
I don't have specific feedback about the proposed concrete changes. They all look pretty standard for converting from one representation to another.
I have added the WIP patch here: https://github.com/llvm/torch-mlir/pull/3801 for adding the torch-onnx-to-torch-backend pipeline in Torch-MLIR.
> Oof. That's not good. Test suites should be using the same tools that we recommend to users. If tests fail when using those tools, the tools should be fixed instead of changing the tests.
I only disagree based on my incompetence. It took some trial and error to figure out which passes we actually need and where to put them in the pipeline; converging on that upstream would have taken several iterations of sloppy IREE commits. Instead, we tried different combinations of pre-processing passes in the test suite, and we now have a good understanding of which passes actually need to be applied to onnx-importer-generated IR. It's time to upstream the correct pipeline.
Fair points. It's definitely the end state that matters, not necessarily the journey. I think there are some process / development workflow changes we could make that would make that convergence less sloppy. Manually specifying pass pipelines and adding extra flags is a crutch that we've been leaning on in many places for far too long. The tools need to work (correctly and with high performance) out of the box.
For the first proposed change, I have added a PR here: https://github.com/llvm/torch-mlir/pull/3801
CC: @zjgarvey @ScottTodd
Request description
The current ONNX-specific intake logic in `PluginRegistration.cpp` is insufficient for `iree-compile`: the single conversion pass applied there is not enough to get the torch IR into an acceptable state to hand off to the `torch-to-iree` pipeline that follows.
I'd like to request that we do the following.
Use the following passes when lowering `torch-onnx-to-torch`:

Remove the following pass from the `torch-to-iree` pipeline:
Recommended:
I'd recommend rolling the "torch-onnx-to-torch-backend passes" into a pass pipeline in torch-mlir; it would then be much easier to include that pipeline in the `iree-compile` pipeline for ONNX models. Maybe call it `--torch-onnx-to-torch-backend-pipeline`.

If reasonable, it would also be a good idea to re-apply cycles of shape refinement, shape scalarization, and possibly complex-op decomposition until the IR reaches a fixed point (or a maximum number of iterations). If doing this, please disable the `unflatten.int` and `flatten.using_ints` decompositions, since these might create a cycle with the `view` op.
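If the pipeline lands under that name, it could be exercised directly with `torch-mlir-opt` before being wired into `iree-compile`. A minimal sketch, assuming the pipeline gets registered at the module level under the proposed name (the filenames are illustrative, and the pipeline does not exist yet):

```shell
# Run the proposed (not-yet-existing) pipeline over onnx-importer output
# to reach the torch backend contract.
torch-mlir-opt \
  --pass-pipeline='builtin.module(torch-onnx-to-torch-backend-pipeline)' \
  model.torch_onnx.mlir -o model.torch.mlir
```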
What component(s) does this issue relate to?
Frontends, Compiler, MLIR
Additional context
We are currently pre-applying torch-mlir passes in our test suite before handing off linalg IR to `iree-compile`. Making the suggested changes would mean we could just run `iree-compile` on the ONNX IR and get the same results (or better, since `torch-to-iree` might handle the lowering to linalg more carefully).
Removing this extra pre-compile processing could also resolve some compilation issues. For example, the `affine.apply` ops that seem relevant to https://github.com/iree-org/iree/issues/18386#issuecomment-2414985422 evidently don't get created when using `torch-to-iree`, but do get created when using `torch-backend-to-linalg-on-tensors-backend-pipeline`. A sketch of comparing the two paths follows.
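This is a rough sketch of how the two lowering paths could be compared for the `affine.apply` behavior described above. It assumes `torch-backend-to-linalg-on-tensors-backend-pipeline` is run via `torch-mlir-opt` (where it is registered) and that `iree-compile` accepts torch-dialect input via `--iree-input-type=torch`; the filenames and target backend are illustrative:

```shell
# Path A: the test suite's current pre-processing through torch-mlir.
torch-mlir-opt \
  --pass-pipeline='builtin.module(torch-backend-to-linalg-on-tensors-backend-pipeline)' \
  model.torch.mlir -o model.linalg.mlir
grep -c 'affine.apply' model.linalg.mlir

# Path B: let iree-compile's torch-to-iree path do the lowering, dumping IR
# after every pass so the equivalent stage can be inspected.
iree-compile model.torch.mlir \
  --iree-input-type=torch \
  --iree-hal-target-backends=llvm-cpu \
  --mlir-print-ir-after-all \
  -o model.vmfb 2> after_all.log
grep -c 'affine.apply' after_all.log
```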