tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

fold op on WH -- enable and opt #10288

Closed. mywoodstock closed this issue 2 weeks ago

mywoodstock commented 1 month ago

Currently the fold op does not work on WH (Wormhole). We need to get it working so it can be used in RN50 (ResNet-50). Current perf on GS (Grayskull) is 450 ns for half fold (from the unit test).

mywoodstock commented 1 month ago

Since the fold operation is essentially a combination of permute and reshape operations, we will instead look into optimizing permute on device and not use the fold op. I will try out the reshape/permute combination in the ResNet model to make sure it is functional.
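For reference, here is a rough host-side sketch of how a fold can be expressed as reshape, permute, reshape. It assumes the fold takes an NHWC tensor and packs each `stride_h x stride_w` spatial patch into the channel dimension; the exact semantics and layout of the device op may differ. The function name `fold_nhwc` and the flat-list representation are hypothetical and for illustration only, not the tt-metal API.

```python
def fold_nhwc(data, shape, stride_h, stride_w):
    """Fold a flat row-major NHWC buffer (assumed semantics, not the device op).

    Equivalent to viewing the data as (N, H/sh, sh, W/sw, sw, C), permuting the
    axes to (0, 1, 3, 2, 4, 5), and reshaping to (N, H/sh, W/sw, sh*sw*C).
    Returns (flat_output, output_shape).
    """
    n, h, w, c = shape
    if h % stride_h != 0 or w % stride_w != 0:
        raise ValueError("H and W must be divisible by the strides")
    out_h, out_w = h // stride_h, w // stride_w

    out = []
    # Walk the output in permuted order: (n, out_h, out_w, sh, sw, c).
    for b in range(n):
        for i in range(out_h):
            for j in range(out_w):
                for di in range(stride_h):
                    for dj in range(stride_w):
                        row = i * stride_h + di
                        col = j * stride_w + dj
                        # Offset of the C contiguous channels at (b, row, col).
                        base = ((b * h + row) * w + col) * c
                        out.extend(data[base:base + c])
    return out, (n, out_h, out_w, c * stride_h * stride_w)


# Usage: a 1x4x4x1 input folded with stride 2x2 becomes 1x2x2x4.
folded, folded_shape = fold_nhwc(list(range(16)), (1, 4, 4, 1), 2, 2)
print(folded_shape)  # (1, 2, 2, 4)
print(folded[:4])    # [0, 1, 4, 5]
```

Only the permute step actually moves data; both reshapes are pure reinterpretations of a row-major buffer, which is why optimizing permute covers the expensive part of the fold.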

cc: @davorchap