Open saichandax opened 2 weeks ago
I have kept sharded MM wherever possible in the mlp and patch_merging sub-modules. (In some places I was not able to use sharded MM with core_grid 8x8; I will unit-test those cases separately.)
I am not able to convert the linear input tensor from interleaved to sharded; I get an "Invalid sharding core_grid" error. The respective commit contains a unit test for the conversion from interleaved tensors to sharded tensors.
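As a quick sanity check before attempting the conversion, the sketch below tests whether a 2D tensor splits into whole tiles across a core grid. It is only an assumption-based helper: the name `block_shard_shape`, the 32x32 tile size, and the "height over grid rows, width over grid columns" rule are illustrative and do not reproduce ttnn's actual validation, so a passing check does not guarantee the conversion will succeed. The shapes used are examples, not values taken from the perf sheet.

```python
TILE = 32  # assumed tile edge length (assumption, not read from ttnn)


def block_shard_shape(height, width, grid_y, grid_x, tile=TILE):
    """Return the per-core (shard_height, shard_width) for a block-style split,
    or None when the tensor does not divide into whole tiles on every core.

    Hypothetical helper for illustration; not part of ttnn.
    """
    if height <= 0 or width <= 0 or grid_y <= 0 or grid_x <= 0:
        raise ValueError("all dimensions must be positive")
    # Each core must receive a whole number of tiles along both axes.
    if height % (grid_y * tile) != 0 or width % (grid_x * tile) != 0:
        return None
    return height // grid_y, width // grid_x


if __name__ == "__main__":
    # Illustrative shapes only.
    for shape in [(512, 768), (3136, 96)]:
        print(shape, "on 8x8 ->", block_shard_shape(*shape, grid_y=8, grid_x=8))
```

If the helper returns None for a given shape, trying a smaller grid (for example 4x4) or padding the tensor may be worth testing in the unit test.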
I am attaching the perf sheet of the swin_s model for the pipeline available in the branch punith/ttnn_swin_s_on_wh.
Using sharding for MM decreased the whole-model PCC from 0.99 to 0.82.
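For reference, PCC here is the Pearson correlation coefficient between the reference output and the device output. Below is a minimal pure-Python sketch of the metric; the helper name `pcc` and the flat-list inputs are assumptions for illustration, not the comparison utility used in the branch.

```python
import math


def pcc(expected, actual):
    """Pearson correlation coefficient between two equal-length sequences.

    Hypothetical helper for illustration: both outputs are assumed to be
    flattened to 1D lists of floats before the comparison.
    """
    if len(expected) != len(actual) or len(expected) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(expected)
    mean_e = sum(expected) / n
    mean_a = sum(actual) / n
    # Centered sums: covariance term and the two variance terms.
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(expected, actual))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    denom = math.sqrt(var_e * var_a)
    if denom == 0.0:
        raise ValueError("PCC is undefined when either input is constant")
    return cov / denom


if __name__ == "__main__":
    reference = [0.1, 0.5, -0.3, 0.9, 0.0]
    device = [0.12, 0.48, -0.25, 0.85, 0.03]
    print(f"PCC = {pcc(reference, device):.4f}")
```

Computing this per sub-module (for example after each mlp or patch_merging block) rather than only on the final output may help locate where the drop from 0.99 to 0.82 is introduced.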
We have now applied the optimisations that were possible for the MM/Conv ops in the Swin_S model. Further input is needed to proceed on this model.
MM/Conv sharding to improve the perf and utilisation of the ops.