plaidml / tpp-mlir

TPP experimentation on MLIR for linear algebra
https://arxiv.org/abs/2404.15204

Allow tiling dimensions equal to tile size #883

Closed adam-smnk closed 7 months ago

adam-smnk commented 7 months ago

Adds a new option to the tile-and-fuse pass that exposes control over the dimension-to-tile-size factor (ratio) required when selecting operations eligible for tiling. The chosen factor must be satisfied by every dimension.

This change allows tiling when dimensions are equal to their corresponding tile sizes, or restricts tiling to larger workloads. For example, a tile factor of 1 creates more opportunities for kernel outlining with wide and tall workloads, e.g., tiling memref<128x1024> into 128x128 tiles.
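To illustrate the eligibility rule described above, here is a minimal sketch in Python rather than the pass's actual C++ code. The names `meets_tile_factor`, `tile_grid`, and `min_factor` are hypothetical, and treating the factor as a minimum ratio (dimension >= factor * tile size) is an assumption inferred from the description; the real pass may apply further checks.

```python
from typing import Sequence, Tuple


def meets_tile_factor(shape: Sequence[int],
                      tile_sizes: Sequence[int],
                      min_factor: int) -> bool:
    """Return True if every dimension is at least min_factor times its tile size.

    Assumption: the factor acts as a minimum dimension-to-tile-size ratio.
    """
    if len(shape) != len(tile_sizes):
        raise ValueError("shape and tile_sizes must have the same rank")
    # Compare dim >= factor * tile to avoid floating-point division.
    return all(tile > 0 and dim >= min_factor * tile
               for dim, tile in zip(shape, tile_sizes))


def tile_grid(shape: Sequence[int], tile_sizes: Sequence[int]) -> Tuple[int, ...]:
    """Number of tiles along each dimension (ceiling division)."""
    return tuple(-(-dim // tile) for dim, tile in zip(shape, tile_sizes))


# The memref<128x1024> example with 128x128 tiles:
shape, tiles = (128, 1024), (128, 128)
print(meets_tile_factor(shape, tiles, min_factor=1))  # eligible: 128 >= 1 * 128
print(meets_tile_factor(shape, tiles, min_factor=2))  # rejected: 128 < 2 * 128
print(tile_grid(shape, tiles))                        # one row of eight tiles
```

With a factor of 1 the first dimension (128) is allowed to match its tile size exactly, so the workload is tiled into a 1x8 grid; with a factor of 2 the same workload would be skipped.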

adam-smnk commented 7 months ago

A 2-4% regression on some MHA benchmarks due to more fine-grained tiling after relaxing the heuristic. No impact on simple GEMM and MLP benchmarks.

The main motivation for the change is improving GPU outlining, where tile-and-fuse will be used to map workloads into block tiles and thread subtiles. Long term, we might need a more flexible way to manage the tiling heuristic.

rengolin commented 7 months ago

> The main motivation for the change is improving GPU outlining, where tile-and-fuse will be used to map workloads into block tiles and thread subtiles. Long term, we might need a more flexible way to manage the tiling heuristic.

You mean, like this? 😄 https://discourse.llvm.org/t/rfc-target-description-and-cost-model-in-mlir/76990

adam-smnk commented 7 months ago

> > The main motivation for the change is improving GPU outlining, where tile-and-fuse will be used to map workloads into block tiles and thread subtiles. Long term, we might need a more flexible way to manage the tiling heuristic.
>
> You mean, like this? 😄 https://discourse.llvm.org/t/rfc-target-description-and-cost-model-in-mlir/76990

This will definitely help 🔥 Tile-and-fuse makes quite a few decisions internally; some of them could perhaps be exposed through optional user callbacks.

For now, the performance penalty is small, and I think the greater flexibility is worth it.

rengolin commented 7 months ago

> This will definitely help 🔥 Tile-and-fuse makes quite a few decisions internally; some of them could perhaps be exposed through optional user callbacks.

@nhasabni is working on that right now. We'll try to push it upstream, but we may test it in tpp-mlir first; not sure yet.

adam-smnk commented 7 months ago

Reverted changes to the default tiling validation. Instead, added it as an option with the default value corresponding to the previous behavior.

TODO: add tiling tests for the options and test tiling for GPU.

adam-smnk commented 7 months ago

Added the missing test. The default tiling behavior is unchanged, so there are no more regressions in the benchmarks.