tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
https://docs.tenstorrent.com/ttnn/latest/index.html
Apache License 2.0

conv2d function help #14140

Open twist-vector opened 1 month ago

twist-vector commented 1 month ago

I'm trying to understand the functionality of conv2d and how it implements a 2D convolution. I think I have the correct tensor shapes (at least no shape or dimension errors are thrown), but I don't understand what is being computed. For a simple 3x3 kernel on a 32x32 matrix (batches=1, channels=1), I expected the following to generate a 1x1x32x32 tensor. Instead, the result shape is 1x1x1024x1[32] and, given an input matrix and kernel of ones, the result tensor does not have the expected values. What are the required input geometries and values for conv2d?

import torch
import ttnn

device_params = {"l1_small_size": 24576}
device = ttnn.CreateDevice(device_id=0, **device_params)

BATCH_SIZE = 1
CHANNELS = 1
MAT_SIZE = 32
KERN_SIZE = 3
a = torch.ones((BATCH_SIZE, CHANNELS, MAT_SIZE, MAT_SIZE), dtype=torch.float32)
b = torch.ones((BATCH_SIZE, CHANNELS, KERN_SIZE, KERN_SIZE), dtype=torch.float32)
input_tensor  = ttnn.from_torch(a, layout=ttnn.TILE_LAYOUT, device=device)
weight_tensor = ttnn.from_torch(b, layout=ttnn.TILE_LAYOUT, device=device)

res = ttnn.conv2d(input_tensor=input_tensor, 
                  weight_tensor=weight_tensor, 
                  device=device,
                  in_channels=CHANNELS,
                  out_channels=CHANNELS,
                  batch_size=BATCH_SIZE,
                  input_height=MAT_SIZE,
                  input_width=MAT_SIZE, 
                  kernel_size=(KERN_SIZE,KERN_SIZE),
                  padding=(int(KERN_SIZE/2),int(KERN_SIZE/2)),
                  stride=(1,1),
                  dilation=(1,1),
                  groups=1)
out, out_height, out_width, conv_weight_tensor, conv_bias_tensor = res

print(out_height)
print(out_width)
print(out)
print(conv_weight_tensor)
print(conv_bias_tensor)
print("")

out_torch = ttnn.to_torch(out)
print(out_torch.shape)

ttnn.close_device(device)

The printed result is

32
32
ttnn.Tensor([[[[ 3.00000,  0.00000,  ...,  0.00000,  0.00000],
               [ 3.00000,  0.00000,  ...,  0.00000,  0.00000],
               ...,
               [ 0.00000,  0.00000,  ...,  0.00000,  0.00000],
               [ 0.00000,  0.00000,  ...,  0.00000,  0.00000]]]], shape=Shape([1, 1, 1024, 1[32]]), dtype=DataType::BFLOAT16, layout=Layout::TILE)
ttnn.Tensor([[[[ 1.00000,  1.00000,  ...,  0.00000,  0.00000],
               [ 1.00000,  1.00000,  ...,  0.00000,  0.00000],
               ...,
               [ 0.00000,  0.00000,  ...,  0.00000,  0.00000],
               [ 0.00000,  0.00000,  ...,  0.00000,  0.00000]]]], shape=Shape([1, 1, 3[32], 3[32]]), dtype=DataType::FLOAT32, layout=Layout::TILE)
None

torch.Size([1, 1, 1024, 1])
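
For reference, here is a minimal pure-Python sketch of what a zero-padded, stride-1 convolution should produce for this all-ones case. It is not TTNN code; the helper names `conv_out_size` and `conv2d_ones_reference` are made up for illustration, and the size formula is the standard one PyTorch documents for Conv2d. With an all-ones input and kernel, each output value is just the number of kernel taps that fall inside the image, so the expected values are 9 in the interior, 6 along the edges, and 4 in the corners, not the 3s and 0s printed above.

```python
def conv_out_size(size, kernel, padding=0, stride=1, dilation=1):
    """Output length along one spatial axis (standard convolution size formula)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def conv2d_ones_reference(size, kernel, padding):
    """Zero-padded, stride-1 convolution of an all-ones image with an all-ones kernel.

    Each output value equals the number of kernel taps that land inside the image.
    """
    out = conv_out_size(size, kernel, padding)
    result = []
    for y in range(out):
        row = []
        for x in range(out):
            taps = 0
            for ky in range(kernel):
                for kx in range(kernel):
                    iy, ix = y - padding + ky, x - padding + kx
                    if 0 <= iy < size and 0 <= ix < size:
                        taps += 1
            row.append(taps)
        result.append(row)
    return result


expected = conv2d_ones_reference(size=32, kernel=3, padding=1)
print(conv_out_size(32, 3, padding=1))                  # 32
print(expected[0][0], expected[0][1], expected[1][1])   # 4 6 9
```

Note also that 1024 = 32 x 32, which suggests the returned tensor holds the 32x32 output pixels flattened along the third dimension rather than a different number of results.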

dvartaniansTT commented 1 month ago

@twist-vector thanks for filing this issue. Is this built from the same commit you shared on discord? 0d1eb7d95983ad7f42faadf86dc66e186f53f264 ?

twist-vector commented 1 month ago

Yes.


dvartaniansTT commented 1 month ago

import torch
import ttnn
import torch.nn.functional as F

# Device setup
device_params = {"l1_small_size": 24576}
device = ttnn.CreateDevice(device_id=0, **device_params)

# Define the parameters
BATCH_SIZE = 1
CHANNELS = 1
MAT_SIZE = 32
KERN_SIZE = 3

# Create the input tensor with all ones
input_tensor = torch.ones((BATCH_SIZE, CHANNELS, MAT_SIZE, MAT_SIZE), dtype=torch.float32)

# Define the weight tensor with out_channels = in_channels (CHANNELS) and all ones
weight_tensor = torch.ones((CHANNELS, CHANNELS, KERN_SIZE, KERN_SIZE), dtype=torch.float32)

# Perform 2D convolution with PyTorch (no bias is passed)
output = F.conv2d(input_tensor, weight_tensor, stride=1, padding=1)

# TTNN
# Convert tensors to TTNN format
input_tensor_ttnn = ttnn.from_torch(input_tensor, layout=ttnn.TILE_LAYOUT, device=device)
weight_tensor_ttnn = ttnn.from_torch(weight_tensor, layout=ttnn.TILE_LAYOUT, device=device)

print("Torch output size:", output.shape)

# Perform 2D convolution with TTNN
res = ttnn.conv2d(
    input_tensor=input_tensor_ttnn,
    weight_tensor=weight_tensor_ttnn,
    device=device,
    in_channels=CHANNELS,
    out_channels=CHANNELS,
    batch_size=BATCH_SIZE,
    input_height=MAT_SIZE,
    input_width=MAT_SIZE,
    kernel_size=(KERN_SIZE, KERN_SIZE),
    padding=(1, 1),
    stride=(1, 1),
    dilation=(1, 1),
    groups=1,
)
out, out_height, out_width, conv_weight_tensor, conv_bias_tensor = res

# Convert TTNN output to PyTorch tensor
out_torch = ttnn.to_torch(out).squeeze(-1).reshape(1, 1, MAT_SIZE, MAT_SIZE)

# Compare the outputs
out_torch = out_torch.to(torch.float32)
print("Do we get the same output from torch and TTNN?", torch.allclose(out_torch, output, atol=1e-6))

# Output tensors
print("TTNN output:", out_torch)
print("Torch output:", output)

# Close TTNN device
ttnn.close_device(device)

@mywoodstock I have commented and slightly modified the test example above. I'm not sure why the TTNN and torch outputs are different here. Could you please take a look?
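
A side note on the `.squeeze(-1).reshape(1, 1, MAT_SIZE, MAT_SIZE)` step: it only works because there is a single output channel. Assuming the output is laid out as one row per pixel, ordered batch, then height, then width, with the channel values in the last dimension (which matches the 1x1x1024x1 shape reported above but is an assumption here, not something confirmed in this thread), the general rearrangement to NCHW looks like the pure-Python sketch below. `rows_to_nchw` is a hypothetical helper for illustration only.

```python
def rows_to_nchw(rows, batch, height, width):
    """Rearrange batch*height*width rows of per-pixel channel values into nested [N][C][H][W] lists."""
    if len(rows) != batch * height * width:
        raise ValueError("row count must equal batch * height * width")
    channels = len(rows[0])
    out = [[[[None] * width for _ in range(height)]
            for _ in range(channels)]
           for _ in range(batch)]
    for index, pixel in enumerate(rows):
        # Recover (n, h, w) from the flat row index.
        n, rest = divmod(index, height * width)
        h, w = divmod(rest, width)
        for c in range(channels):
            out[n][c][h][w] = pixel[c]
    return out


# Toy example: 1 image, 2x3 pixels, 2 channels.
# Channel 0 holds the flat pixel index, channel 1 its negative.
rows = [[i, -i] for i in range(6)]
nchw = rows_to_nchw(rows, batch=1, height=2, width=3)
print(nchw[0][0])  # [[0, 1, 2], [3, 4, 5]]
print(nchw[0][1])  # [[0, -1, -2], [-3, -4, -5]]
```

Under the same layout assumption, the torch equivalent would be `ttnn.to_torch(out).reshape(N, H, W, C).permute(0, 3, 1, 2)`.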

dvartaniansTT commented 1 month ago

@mywoodstock perhaps the issue is that we need this kind of pre-processing?

def preprocess_linear_weight(weight, *, dtype, layout=ttnn.TILE_LAYOUT):
    weight = weight.T.contiguous()
    weight = ttnn.from_torch(weight, dtype=dtype, layout=layout)
    return weight
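
For context on what that helper does: a linear layer in the torch convention stores its weight as (out_features, in_features) and computes y = x @ W^T, so transposing once up front lets the kernel run a plain matrix product. The sketch below shows this on nested lists; `transpose` and `matmul` are illustrative helpers, not TTNN functions. A 4-D convolution weight has no single 2-D transpose that plays the same role, so whether conv2d needs an analogous preparation step remains an open question in this thread.

```python
def transpose(matrix):
    """Swap rows and columns of a 2-D nested list."""
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    """Plain (rows x inner) @ (inner x cols) matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Weight in the torch.nn.Linear convention: (out_features=2, in_features=3)
weight = [[1, 2, 3],
          [4, 5, 6]]
x = [[1, 0, 1]]                 # (batch=1, in_features=3)

weight_t = transpose(weight)    # (in_features=3, out_features=2), done once up front
print(matmul(x, weight_t))      # [[4, 10]]
```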