tenstorrent / tt-mlir

Tenstorrent MLIR compiler
https://tenstorrent.github.io/tt-mlir/
Apache License 2.0

Choosing conv2d sharding specification in runtime #830

Open LPanosTT opened 3 days ago

LPanosTT commented 3 days ago

Conv2d will shard the input tensor on its own, provided a sharding specification. By default it will attempt to use HEIGHT_SHARDED, but some convolutions require BLOCK_SHARDED or WIDTH_SHARDED. Currently, ttnn::conv2d is not able to determine which sharding specification to use on its own. There's an issue for this: https://github.com/tenstorrent/tt-metal/issues/13107. When #818 is merged, the following metric will be used to determine whether to use BLOCK_SHARDED or stick with the default HEIGHT_SHARDED:

if (op->in_channels() / device.grid_size().y >= 32) {
  config.shard_layout = TensorMemoryLayout::BLOCK_SHARDED;
}

See runtime/lib/ttnn/operations/conv/conv2d.cpp

sdjordjevicTT commented 3 days ago

Instead of doing this hack in the runtime, can we lower this sharding option from Forge as an override?

LPanosTT commented 3 days ago

Only some of the convolutions in ResNet actually hit this if statement, so I'm not sure how an override would work. There's also an issue to get this modelled in ttnn: https://github.com/tenstorrent/tt-metal/issues/13107