iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

Improve tile size computation for vectorization/unrolling in quantized models using LinalgOpInfo #10363

Open dcaballe opened 2 years ago

dcaballe commented 2 years ago

Tile size computation in LLVMCPU is crying out for a refresh. The current approach is getting difficult to maintain and debug, even for those familiar with the code. The goal is to refactor all the incremental tile size computation for vectorization/unrolling, which currently happens across multiple functions in KernelDispatch.cpp, into a single place, and to extend and use the LinalgOpInfo analysis to make a more informed decision on the tile sizes needed.

Some requirements/steps/suggestions:

There are plenty of other things we can do but I think this would be a good starting point. Other suggestions are welcome!

dcaballe commented 2 years ago

https://github.com/iree-org/iree/pull/10287#issuecomment-1241256362 shows the magnitude of the problem. When we lower a tosa.rescale operation to Arith before the vectorizer and expose its mixed-length types to it (i8, i32 and i64), we get massive regressions on x86 (85%, 63%, etc.). I think that lowering tosa.rescale before the vectorizer without getting any regression could be a good metric for success here.
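
To make the mixed-length concern concrete, here is a minimal sketch of one plausible heuristic: size the vector tile by the widest element type in the op, so that no value needs more than one native register. This is not IREE's actual logic and not necessarily the cause of the regressions above. `pickVectorTileSize` and `registersPerValue` are hypothetical names, and the 256-bit native vector width used in the note below is an assumption (AVX2-like).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of IREE: picks a vector tile size (in
// elements) for an op that mixes element types.
// `nativeVectorBits` is the assumed width of a vector register in bits.
// `elementBitWidths` lists the bit widths of the element types the op uses.
inline int64_t pickVectorTileSize(int64_t nativeVectorBits,
                                  const std::vector<int64_t> &elementBitWidths) {
  assert(!elementBitWidths.empty() && "expected at least one element type");
  int64_t widest =
      *std::max_element(elementBitWidths.begin(), elementBitWidths.end());
  assert(widest > 0 && nativeVectorBits >= widest &&
         "element type must fit in a native vector");
  // Sizing by the widest type means a vector of any of the element types
  // fits in a single native register.
  return nativeVectorBits / widest;
}

// Hypothetical helper: number of native registers needed to hold one vector
// of `tileSize` elements of `elementBits` bits each (rounded up).
inline int64_t registersPerValue(int64_t tileSize, int64_t elementBits,
                                 int64_t nativeVectorBits) {
  return (tileSize * elementBits + nativeVectorBits - 1) / nativeVectorBits;
}
```

With 256-bit vectors and element types i8, i32 and i64, `pickVectorTileSize(256, {8, 32, 64})` yields 4 elements. A tile of 32 elements, sized for i8 alone, would instead make every i64 vector span `registersPerValue(32, 64, 256)`, that is 8 native registers. An analysis such as LinalgOpInfo could supply the list of element types to a heuristic of this kind.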