Open eyonland opened 3 months ago
Our implementation for swapping N/C with H/W relies on an implementation of transpose_hc. The cases where we move N into H/W require nested calls of transpose_hc/transpose_wh/transpose_nc. If we change that implementation to call pad before the transpose and unpad after it when necessary, it should avoid the pad-related logic in autoformat.
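As a sketch of why nested calls suffice: any rank-4 axis permutation can be decomposed into adjacent-axis swaps, which is exactly what transpose_nc (axes 0/1), transpose_hc (axes 1/2), and transpose_wh (axes 2/3) provide. A minimal illustration using bubble sort (the function name and the mapping to the three ops are assumptions for illustration, not the actual implementation):

```python
def adjacent_swaps(perm):
    """Decompose a rank-4 permutation into adjacent-axis swaps via
    bubble sort. Swap position 0 ~ transpose_nc, 1 ~ transpose_hc,
    2 ~ transpose_wh (this mapping is an assumption, for illustration).
    Returns positions i such that applying swap(i, i+1) in order
    sorts `perm` back to (0, 1, 2, 3)."""
    order = list(perm)
    swaps = []
    for _ in range(len(order)):
        for i in range(len(order) - 1):
            if order[i] > order[i + 1]:
                order[i], order[i + 1] = order[i + 1], order[i]
                swaps.append(i)
    return swaps
```

For example, `adjacent_swaps((2, 0, 1, 3))` returns `[0, 1]`: two adjacent swaps undo the permutation, so two nested transpose calls realize it.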
If we change transpose_hc so that, starting from `[N, C, H[Hp], W[Wp]]`:

- pad C (if needed) -> `[N, C[Cp], H[Hp], W[Wp]]`
- transpose HC -> `[N, H[Hp], C[Cp], W[Wp]]`
- slice H (the new dim 1) -> `[N, H, C[Cp], W[Wp]]`
then that item is done. We can just call the pre-existing pad/unpad (slice) implementations. We will need to move the ttlib versions of pad and unpad to ttnn to do this as part of the ttnn migration.
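The steps above can be sketched in numpy to check the shape logic (TILE and the helper names are assumptions; the real op works on tiled device tensors, so this only illustrates the pad → transpose → slice sequence):

```python
import numpy as np

TILE = 32  # assumed tile size; the actual value comes from the device layout


def pad_to_tile(n):
    """Round n up to the next multiple of TILE."""
    return -(-n // TILE) * TILE


def transpose_hc_with_pad(t, h_logical):
    """Sketch of the proposed flow. `t` is [N, C, Hp, Wp] with H/W
    already tile-padded; `h_logical` is the unpadded H size."""
    n, c, hp, wp = t.shape
    cp = pad_to_tile(c)
    # pad C (if needed) -> [N, Cp, Hp, Wp]
    padded = np.pad(t, ((0, 0), (0, cp - c), (0, 0), (0, 0)))
    # transpose HC -> [N, Hp, Cp, Wp]
    swapped = padded.transpose(0, 2, 1, 3)
    # slice the new dim 1 (old H) back to its logical size -> [N, H, Cp, Wp]
    return swapped[:, :h_logical, :, :]
```

Note that the old H padding is sliced away (it would otherwise sit in the middle of the new C dimension), while the new C padding is kept, since dim 2 must stay tile-aligned.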
Are there any known cases where things spill over to host? It seems like the permute op already handles padding and unpadding when the tensor is already on device.
This work means that the reshaping of the tensor with padding will happen within the permute op itself. The goal here is to update the code so that the new height and width dimensions are pre-padded before the transpose is called. After the transpose is called, any padding left on the N and C (batch, channel) dimensions needs to be removed.
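A shape-level sketch of that flow in numpy (the function name, TILE, and the `logical_shape` argument are assumptions; this is not the device implementation): pre-pad whichever input axes will land on the output H/W, permute, then strip whatever padding ends up on the output N/C axes.

```python
import numpy as np

TILE = 32  # assumed tile size


def ceil_tile(n):
    return -(-n // TILE) * TILE


def permute_with_pad(t, perm, logical_shape):
    """Hypothetical sketch: `t` is rank 4 with its last two axes already
    tile-padded; `logical_shape` holds the unpadded sizes per input axis."""
    pads = [(0, 0)] * 4
    for dst in (2, 3):  # output H and W positions must be tile-aligned
        src = perm[dst]
        pads[src] = (0, ceil_tile(t.shape[src]) - t.shape[src])
    out = np.pad(t, pads).transpose(perm)
    # any padding now sitting on the output N/C axes is removed
    return out[:logical_shape[perm[0]], :logical_shape[perm[1]], :, :]
```

For example, with logical shape (2, 3, 5, 7) stored padded as (2, 3, 32, 32) and perm (2, 0, 1, 3), C is pre-padded to 32 before it moves into the H slot, and the old H padding (now on the leading axis) is sliced back to 5.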