tenstorrent / tt-metal

TT-NN operator library, and TT-Metalium low-level kernel programming model.
Apache License 2.0

[MCW] MM/Conv Sharding and perf improvements of Swin_S #12081

Open saichandax opened 2 weeks ago

saichandax commented 2 weeks ago

Shard the MM (matmul) and Conv ops in Swin_S to improve the performance and utilisation of these ops.

punithsekar commented 1 week ago

I have used sharded MM wherever possible in the mlp and patch_merging sub-modules. (In some places I was not able to use sharded MM with core_grid 8x8; I will unit-test those cases separately.)

I am not able to convert the input tensor of linear from interleaved to sharded; I get an "Invalid sharding core_grid" error. The respective commit adds a unit test for the conversion from interleaved to sharded tensors.
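For anyone reproducing this, one possible cause worth ruling out is a shard shape that does not line up with the tile size. This is an assumption, not a diagnosis of the error above. The sketch below is plain Python and does not call ttnn. The helper name `block_shard_shape`, the 32-element tile edge, and the example shapes are all illustrative. The real validation in ttnn may differ, for example by padding.

```python
TILE = 32  # assumed tile edge; check the value your layout actually uses


def block_shard_shape(height, width, grid_rows, grid_cols, tile=TILE):
    """Hypothetical helper: per-core shard shape for a 2D block-sharded tensor.

    Splits `height` across the grid rows and `width` across the grid columns,
    and raises ValueError if the split is uneven or not tile-aligned.
    """
    if height % grid_rows != 0 or width % grid_cols != 0:
        raise ValueError(
            f"({height}, {width}) does not divide evenly over a "
            f"{grid_rows}x{grid_cols} grid"
        )
    shard_h = height // grid_rows
    shard_w = width // grid_cols
    if shard_h % tile != 0 or shard_w % tile != 0:
        raise ValueError(
            f"shard ({shard_h}, {shard_w}) is not a multiple of the "
            f"{tile}x{tile} tile"
        )
    return shard_h, shard_w


if __name__ == "__main__":
    # Illustrative shapes only; they are not taken from the model in this issue.
    print(block_shard_shape(512, 256, 8, 8))
    try:
        block_shard_shape(3136, 96, 8, 8)
    except ValueError as err:
        print("rejected:", err)
```

Running a check like this over the shapes that fail on the 8x8 grid would show whether a smaller grid, or padding the tensor first, is worth trying.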

I am attaching the perf sheet of the swin_s model for the pipeline available in the branch punith/ttnn_swin_s_on_wh.

swin_s_perf.csv

punithsekar commented 1 week ago

Using sharding for MM decreased the whole-model PCC from 0.99 to 0.82.
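To find where the PCC drops, it can help to compare each sub-module's output against the reference output, not only the final model output. Below is a minimal sketch of the metric, assuming PCC here means the Pearson correlation coefficient over the flattened tensors. The function name `pcc` and the sample values are illustrative and are not the repository's own comparison helper.

```python
import math


def pcc(expected, actual):
    """Pearson correlation coefficient of two equal-length flat sequences."""
    if len(expected) != len(actual):
        raise ValueError("inputs must have the same length")
    n = len(expected)
    if n < 2:
        raise ValueError("need at least two elements")
    mean_e = sum(expected) / n
    mean_a = sum(actual) / n
    # Covariance and variances without the 1/n factor (it cancels in the ratio).
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(expected, actual))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_e == 0 or var_a == 0:
        raise ValueError("PCC is undefined for constant input")
    return cov / math.sqrt(var_e * var_a)


if __name__ == "__main__":
    reference = [0.10, 0.52, -0.31, 0.88, 0.05]  # made-up reference output
    device = [0.11, 0.50, -0.30, 0.90, 0.02]     # made-up device output
    print(f"PCC = {pcc(reference, device):.4f}")
```

Applying the same comparison after each sharded MM would narrow the drop down to the first op whose PCC falls noticeably below that of its interleaved counterpart.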

saichandax commented 6 days ago

We have now applied the optimisations that were possible for the MM/Conv ops in the Swin_S model. We need further input to proceed on this model.