TTNN `to_torch` fails on HC `transpose` output tensor

Summary

The following test raises a `RuntimeError` when `ttnn.to_torch` converts the output of `ttnn.transpose` to a torch tensor:

import pytest
import torch
import ttnn

# Helpers from the tt-metal repository
from models.utility_functions import nearest_32
from tests.ttnn.utils_for_testing import assert_with_pcc


def test_untilize_hc_transpose(device, use_program_cache):
    B, C, H, W = 1, 32, 1, 32
    # Values 0 .. B*C*H*W - 1 as float32, shaped [B, C, H, W]
    input_tensor = torch.arange(B * C * H * W, dtype=torch.float32).reshape([B, C, H, W])
    # Height-sharded on a single core; H and W are rounded up to a multiple of 32
    memory_config = ttnn.create_sharded_memory_config(
        [B, C, nearest_32(H), nearest_32(W)],
        ttnn.CoreGrid(x=1, y=1),
        ttnn.ShardStrategy.HEIGHT,
    )
    x = ttnn.from_torch(
        input_tensor,
        dtype=ttnn.bfloat16,
        layout=ttnn.TILE_LAYOUT,
        device=device,
    )
    x = ttnn.to_memory_config(x, memory_config)
    x = ttnn.untilize(x)  # [1, 32, 1[32], 32]: logical H = 1, padded to 32
    x = ttnn.transpose(x, 1, 2)  # [1, 1, 32, 32]
    # ERROR: RuntimeError: shape '[1, 1, 32, 32]' is invalid for input of size 32768
    actual = ttnn.to_torch(x)
    expected = torch.transpose(input_tensor, 1, 2)
    assert_with_pcc(expected, actual, 0.99999)

The error reported is:

RuntimeError: shape '[1, 1, 32, 32]' is invalid for input of size 32768
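
The numbers in the message can be checked with a little arithmetic. The sketch below is plain Python and does not call TTNN. `padded_shape` is an assumption for illustration (H padded from 1 to 32, as the `1[32]` comment in the test suggests), not a confirmed diagnosis of the bug.

```python
from math import prod

# Shape that to_torch tries to view the buffer as (taken from the error message)
logical_shape = [1, 1, 32, 32]
# Assumption: the buffer still holds the padded data, with H padded from 1 to 32
padded_shape = [1, 32, 32, 32]

logical_elems = prod(logical_shape)
padded_elems = prod(padded_shape)

print(f"logical elements: {logical_elems}")
print(f"padded elements:  {padded_elems}")
```

The padded element count equals the 32768 in the error message, while the logical shape needs only 1024 elements. That is consistent with the output buffer carrying padding that the reported shape does not account for.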

tenstorrent/tt-metal #14251
