iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

Constant fold of tensor ops #9750

Open okkwon opened 2 years ago

okkwon commented 2 years ago

Request description

We may be able to fold constants further by evaluating tensor ops at compile time. Here is an example from person-detect in the IREE benchmark suite.

image

dispatch_71 is purely dependent on the constant inputs and can be folded into the <2xi32> constant value.

The implementation would be costly because we would need to implement the arith operations as library function calls. (I am not aware of such a feature in other dialects such as tosa or mhlo. If they provide it, we could fold the ops at that level.)

After folding the constant, we may further fuse dispatch 70 and 72.
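To make the request concrete, here is a minimal Python sketch of folding an op whose operands are all constants. It is a toy model, not IREE or MLIR code: the `Const`, `Input`, and `Op` node types, the `KERNELS` table, and the `fold` function are hypothetical names invented for illustration, and the "tensors" are plain tuples of ints.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    """A tensor whose value is known at compile time (hypothetical)."""
    value: Tuple[int, ...]


@dataclass(frozen=True)
class Input:
    """A tensor that is only known at run time (hypothetical)."""
    name: str


@dataclass(frozen=True)
class Op:
    """An elementwise op applied to operand nodes (hypothetical)."""
    kind: str
    operands: Tuple["Node", ...]


Node = Union[Const, Input, Op]

# The "library" of kernels the folder needs in order to evaluate ops.
KERNELS = {
    "add": lambda a, b: tuple(x + y for x, y in zip(a, b)),
    "mul": lambda a, b: tuple(x * y for x, y in zip(a, b)),
}


def fold(node: Node) -> Node:
    """Recursively replace ops that depend only on constants with a Const."""
    if not isinstance(node, Op):
        return node
    operands = tuple(fold(o) for o in node.operands)
    if all(isinstance(o, Const) for o in operands):
        return Const(KERNELS[node.kind](*(o.value for o in operands)))
    return Op(node.kind, operands)


# A constant-only subgraph (standing in for something like dispatch_71)
# feeding an op that also reads a runtime input.
const_subgraph = Op("mul", (Const((2, 3)), Op("add", (Const((1, 1)), Const((4, 5))))))
graph = Op("add", (Input("x"), const_subgraph))
folded = fold(graph)
```

In this sketch the constant-only subgraph collapses into a single two-element `Const`, while the op that reads `Input("x")` is left in place. The `KERNELS` table is the costly part mentioned above: every op that should be foldable needs an evaluator available at compile time.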

What component(s) does this issue relate to?

Compiler

Additional context

No response

okkwon commented 2 years ago

The pattern repeats throughout the graph: dispatch.pdf

benvanik commented 2 years ago

This should already be getting folded - we have a whole thing that does this constant evaluation and hoisting - it's likely just getting missed.

If the hoist into globals pass is running and that isn't getting hoisted then that's likely the issue: https://github.com/iree-org/iree/blob/06bd50a6c69dde29a1fdb1525f21166b897f9d4e/compiler/src/iree/compiler/Dialect/Flow/Transforms/Passes.cpp#L139-L145

okkwon commented 2 years ago

Thanks Ben. I will take a look.

stellaraccident commented 2 years ago

We don't have those flags on by default as they need more testing and tuning, and we were waiting for good opportunities: https://github.com/iree-org/iree/blob/main/docs/website/docs/reference/optimization-options.md

You need both --iree-opt-const-expr-hoisting and --iree-opt-const-eval for maximum effect, but they are useful in isolation (especially for debugging how things get done). The first will move all of the constant expressions it finds to module init time. The second will JIT the initializers and replace them with static constants.
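As a rough mental model of how the two steps compose, here is a small Python sketch. It is not how IREE implements either pass: `Module`, `hoist`, `const_eval`, and `load` are hypothetical names, and a Python callable stands in for a constant expression.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Value = Tuple[int, ...]


@dataclass
class Module:
    """Toy compiled module (hypothetical, for illustration only)."""
    # Globals whose values are baked into the binary.
    constants: Dict[str, Value] = field(default_factory=dict)
    # Globals computed by running a thunk when the module is loaded.
    initializers: Dict[str, Callable[[], Value]] = field(default_factory=dict)


def hoist(module: Module, name: str, const_expr: Callable[[], Value]) -> None:
    """Hoisting step: move a constant expression to module init time."""
    module.initializers[name] = const_expr


def const_eval(module: Module) -> None:
    """Const-eval step: run each initializer at compile time and store the result."""
    for name, thunk in list(module.initializers.items()):
        module.constants[name] = thunk()
        del module.initializers[name]


def load(module: Module) -> Dict[str, Value]:
    """Runtime module init: run whatever initializers are still left."""
    values = dict(module.constants)
    for name, thunk in module.initializers.items():
        values[name] = thunk()
    return values


m = Module()
hoist(m, "shape", lambda: tuple(a * b for a, b in zip((2, 3), (5, 6))))

hoisted_only = load(m)   # the expression runs at init time
const_eval(m)
hoisted_and_evaluated = load(m)  # nothing left to run at init time
```

With hoisting alone, the expression is computed once per module load instead of on every invocation. Adding const-eval removes the init-time work as well, at the cost of storing the computed value in the module.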

benvanik commented 2 years ago

Good point Stella! iree-benchmark-module doesn't measure init time, so just enabling const-expr-hoisting and running the benchmarks should show whether it's worth potentially const-evaluating something. So start by making sure the hoisting is working; the tradeoff of binary size vs. runtime init cost is then an orthogonal question.

(really love that you built it separable like that!)