tenstorrent / tt-mlir

Tenstorrent MLIR compiler
https://tenstorrent.github.io/tt-mlir/
Apache License 2.0

Emit correct tile_shapes for tensors #272

Open jnie-TT opened 1 month ago

jnie-TT commented 1 month ago

Currently, the tile_shape in the flatbuffer executable is hardcoded to 0x0. At runtime, we want to use the tile_shape to determine whether to tilize or untilize a tensor, since some ops (like relu) can only run on tilized data.

We need compiler support that can emit the correct tile_shapes for input/output tensors.
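To illustrate the runtime decision described above, here is a minimal sketch in Python. It is not taken from the tt-mlir runtime: `TensorDesc`, `is_tilized`, and `layout_action` are hypothetical names, and treating any non-zero tile_shape as "tilized" is an assumption made only for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TensorDesc:
    """Hypothetical stand-in for a tensor descriptor read from the flatbuffer."""
    shape: Tuple[int, int]
    tile_shape: Tuple[int, int]  # (0, 0) mirrors the currently hardcoded 0x0


def is_tilized(desc: TensorDesc) -> bool:
    # Assumption: a non-zero tile shape means the tensor is stored tilized.
    height, width = desc.tile_shape
    return height > 0 and width > 0


def layout_action(desc: TensorDesc, required_layout: str) -> str:
    """Return the conversion needed before an op that requires `required_layout`.

    `required_layout` is either "tile" or "row_major" (hypothetical labels).
    """
    if required_layout not in ("tile", "row_major"):
        raise ValueError(f"unknown layout: {required_layout}")
    if required_layout == "tile" and not is_tilized(desc):
        return "tilize"
    if required_layout == "row_major" and is_tilized(desc):
        return "untilize"
    return "none"
```

With tile_shape stuck at 0x0, every tensor looks row major in this sketch, so an op like relu would always trigger a tilize even when the data is already tilized. That is why the emitted value needs to be correct.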

nsmithtt commented 1 month ago

Hey everyone, we're already facing our first set of op constraints :). Since we don't have an interface with TTNN yet, perhaps there is an interim thing we could do?

TTIR has an op interface that provides getOperandConstraints, which already captures tilized vs. row major. I'm thinking that there are a few options we could pursue in the short term:

Let me know what you all think. We can schedule a follow up meeting if need be.

mbezuljTT commented 1 month ago

Hello,

I had a chat with @derdeljanTT and he reminded me that our goal is to maintain a cross-platform compiler. Today this means any linking to TTNN will break cross-platform compilation. While TTNN might be cross-platform in the future, we should be careful about picking up this dependency. @nsmithtt what do you think about it?

Going with the analysis path means we have to hardcode op requirements manually.

The last time we discussed op interface constraints, the suggestion from the TTNN folks was to build an op constraints cache and establish a straightforward way of rebuilding it. @nsmithtt are you aware of any development on this front, or is it still up for discussion?

I think if we have a cache of op constraints, we could include it in the analysis path. This sounds easier than expanding TTNN.

I see this issue is blocking P0. @jnie-TT in the short term, is it reasonable to assume all ops can run on tilized data? If yes, are we OK to assume this in the compiler as well? It will come at a perf cost.

Thanks, Marko

nsmithtt commented 1 month ago

I think we should shoot for the op constraints cache (or config file) approach, as that solves the cross-platform issue too. @xanderchin, are you using this approach for ttnn sweep tests?
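As a rough illustration of what a config-file-based op constraints cache could look like, here is a minimal Python sketch. Everything in it is an assumption for discussion: the JSON schema, the `operand_layouts` key, and the helper names are hypothetical, and `hypothetical_op` is a placeholder entry rather than a real op. Only the relu entry reflects a constraint stated earlier in this thread.

```python
import json

# Hypothetical cache contents. The idea is that a file like this is generated
# offline (where TTNN is available) and then read by the compiler, so the
# compiler itself never links against TTNN.
OP_CONSTRAINTS_JSON = """
{
  "relu": {"operand_layouts": ["tile"]},
  "hypothetical_op": {"operand_layouts": ["tile", "row_major"]}
}
"""


def load_constraints(text: str) -> dict:
    """Parse the cache into a mapping from op name to its allowed layouts."""
    raw = json.loads(text)
    return {op: frozenset(entry["operand_layouts"]) for op, entry in raw.items()}


def allowed_layouts(cache: dict, op_name: str) -> frozenset:
    """Look up an op's allowed operand layouts.

    Raises KeyError for ops missing from the cache instead of guessing.
    """
    return cache[op_name]


def supports(cache: dict, op_name: str, layout: str) -> bool:
    """Check whether an op accepts operands in the given layout."""
    return layout in allowed_layouts(cache, op_name)
```

Failing loudly on an unknown op is a design choice in this sketch: it surfaces a stale cache early rather than silently assuming a layout.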