tenstorrent / pytorch2.0_ttnn

⭐️ TTNN Compiler for PyTorch 2.0 ⭐️ It enables running PyTorch 2.0 models on Tenstorrent hardware
https://tenstorrent.github.io/tt-metal/latest/ttnn/

`aten.constant_pad_nd.default` does not support pads with negative values #516

Open swimdi opened 5 days ago

swimdi commented 5 days ago

aten.constant_pad_nd.default currently cannot be lowered to ttnn.pad if its pad list contains a negative value;

otherwise it raises TypeError: __call__(): incompatible function arguments. The following argument types are supported: ...

The following input variations appear in models and are affected by this constraint:

["Tensor<[1, 240, 29, 29]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 144, 59, 59]> self = ?", "List[int] pad = [-1, -2, -1, -2]"],
["Tensor<[1, 96, 113, 113]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 3, 225, 225]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 240, 31, 31]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 144, 63, 63]> self = ?", "List[int] pad = [-1, -2, -1, -2]"],
["Tensor<[1, 96, 121, 121]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 3, 241, 241]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 96, 131, 131]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 3, 261, 261]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 288, 39, 39]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 144, 151, 151]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 3, 301, 301]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 960, 27, 27]> self = ?", "List[int] pad = [-1, -2, -1, -2]"],
["Tensor<[1, 336, 49, 49]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 144, 191, 191]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
["Tensor<[1, 3, 381, 381]> self = ?", "List[int] pad = [0, -1, 0, -1]"],
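For context, a negative entry in the pad list of `aten.constant_pad_nd` crops that edge rather than padding it, so each of the variations above is effectively a slice. A minimal demonstration in plain PyTorch, using the shape from the first entry above:

```python
import torch

# A negative pad entry crops that edge instead of padding it.
x = torch.rand(1, 240, 29, 29)
y = torch.nn.functional.pad(x, [0, -1, 0, -1])  # crop one row and one column

assert y.shape == (1, 240, 28, 28)
assert torch.equal(y, x[:, :, :28, :28])  # identical to a plain slice
```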
ayerofieiev-tt commented 5 days ago

@swimdi, what's the exact full error? Can you share matching ttnn code here in the ticket so it's straightforward to reproduce?

swimdi commented 5 days ago

OK, the reproduction script is below:

import torch
import ttnn

device = ttnn.open_device(device_id=0)

input_shape = [1, 240, 29, 29]
pads = [(0, -1), (0, -1)]  # negative pad values trigger the error below
torch_input_tensor = torch.rand(input_shape, dtype=torch.bfloat16)
tt_input_tensor = ttnn.from_torch(torch_input_tensor, layout=ttnn.ROW_MAJOR_LAYOUT, device=device)
tt_output_tensor = ttnn.pad(tt_input_tensor, pads, 0)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 12
     10 torch_input_tensor = torch.rand(input_shape, dtype=torch.bfloat16)
     11 tt_input_tensor = ttnn.from_torch(torch_input_tensor, layout=ttnn.ROW_MAJOR_LAYOUT, device=device)
---> 12 tt_output_tensor = ttnn.pad(tt_input_tensor, pads, 0)

File ~/venv_pt/lib/python3.8/site-packages/ttnn/decorators.py:329, in FastOperation.__call__(self, *function_args, **function_kwargs)
    328 def __call__(self, *function_args, **function_kwargs):
--> 329     return self.function(*function_args, **function_kwargs)

TypeError: __call__(): incompatible function arguments. The following argument types are supported:
    1. (self: ttnn._ttnn.operations.data_movement.pad_t, input_tensor: ttnn._ttnn.tensor.Tensor, padding: List[Tuple[int, int]], value: float, *, use_multicore: bool = True, memory_config: Optional[ttnn._ttnn.tensor.MemoryConfig] = None, queue_id: int = 0) -> ttnn._ttnn.tensor.Tensor
    2. (self: ttnn._ttnn.operations.data_movement.pad_t, input_tensor: ttnn._ttnn.tensor.Tensor, output_padded_shape: List[int[1]], input_tensor_start: List[int[1]], value: float, *, use_multicore: bool = False, memory_config: Optional[ttnn._ttnn.tensor.MemoryConfig] = None, queue_id: int = 0) -> ttnn._ttnn.tensor.Tensor
    3. (self: ttnn._ttnn.operations.data_movement.pad_t, input_tensor: ttnn._ttnn.tensor.Tensor, output_padded_shape: List[int[2]], input_tensor_start: List[int[2]], value: float, *, use_multicore: bool = False, memory_config: Optional[ttnn._ttnn.tensor.MemoryConfig] = None, queue_id: int = 0) -> ttnn._ttnn.tensor.Tensor
    4. (self: ttnn._ttnn.operations.data_movement.pad_t, input_tensor: ttnn._ttnn.tensor.Tensor, output_padded_shape: List[int[3]], input_tensor_start: List[int[3]], value: float, *, use_multicore: bool = False, memory_config: Optional[ttnn._ttnn.tensor.MemoryConfig] = None, queue_id: int = 0) -> ttnn._ttnn.tensor.Tensor
    5. (self: ttnn._ttnn.operations.data_movement.pad_t, input_tensor: ttnn._ttnn.tensor.Tensor, output_padded_shape: List[int[4]], input_tensor_start: List[int[4]], value: float, *, use_multicore: bool = False, memory_config: Optional[ttnn._ttnn.tensor.MemoryConfig] = None, queue_id: int = 0) -> ttnn._ttnn.tensor.Tensor
    6. (self: ttnn._ttnn.operations.data_movement.pad_t, input_tensor: ttnn._ttnn.tensor.Tensor, output_padded_shape: List[int[5]], input_tensor_start: List[int[5]], value: float, *, use_multicore: bool = False, memory_config: Optional[ttnn._ttnn.tensor.MemoryConfig] = None, queue_id: int = 0) -> ttnn._ttnn.tensor.Tensor
    7. (self: ttnn._ttnn.operations.data_movement.pad_t, input_tensor: ttnn._ttnn.tensor.Tensor, output_padded_shape: List[int[6]], input_tensor_start: List[int[6]], value: float, *, use_multicore: bool = False, memory_config: Optional[ttnn._ttnn.tensor.MemoryConfig] = None, queue_id: int = 0) -> ttnn._ttnn.tensor.Tensor
    8. (self: ttnn._ttnn.operations.data_movement.pad_t, input_tensor: ttnn._ttnn.tensor.Tensor, output_padded_shape: List[int[7]], input_tensor_start: List[int[7]], value: float, *, use_multicore: bool = False, memory_config: Optional[ttnn._ttnn.tensor.MemoryConfig] = None, queue_id: int = 0) -> ttnn._ttnn.tensor.Tensor
    9. (self: ttnn._ttnn.operations.data_movement.pad_t, input_tensor: ttnn._ttnn.tensor.Tensor, output_padded_shape: List[int[8]], input_tensor_start: List[int[8]], value: float, *, use_multicore: bool = False, memory_config: Optional[ttnn._ttnn.tensor.MemoryConfig] = None, queue_id: int = 0) -> ttnn._ttnn.tensor.Tensor

Invoked with: <ttnn._ttnn.operations.data_movement.pad_t object at 0x7f02bc2332f0>, ttnn.Tensor([[[[ 0.56641,  0.28906,  ...,  0.71875,  0.62109],
               [ 0.64844,  0.96875,  ...,  0.69531,  0.80078],
               ...,
               [ 0.31250,  0.72266,  ...,  0.41016,  0.38672],
               [ 0.78906,  0.83984,  ...,  0.50781,  0.66797]],

              [[ 0.76953,  0.51953,  ...,  0.15625,  0.76562],
               [ 0.74609,  0.19141,  ...,  0.95703,  0.70312],
               ...,
               [ 0.38281,  0.41016,  ...,  0.55859,  0.58984],
               [ 0.66797,  0.43750,  ...,  0.86719,  0.70312]],

              ...,

              [[ 0.34766,  0.44531,  ...,  0.83203,  0.67578],
               [ 0.41406,  0.57812,  ...,  0.24609,  0.85156],
               ...,
               [ 0.10938,  0.51953,  ...,  0.60547,  0.58203],
               [ 0.28125,  0.11719,  ...,  0.91797,  0.98438]],

              [[ 0.37109,  0.44922,  ...,  0.62891,  0.02734],
               [ 0.52344,  0.68750,  ...,  0.97266,  0.88281],
               ...,
               [ 0.87891,  0.47656,  ...,  0.13672,  0.66016],
               [ 0.41797,  0.16797,  ...,  0.43750,  0.85156]]]], shape=Shape([1, 240, 29, 29]), dtype=DataType::BFLOAT16, layout=Layout::ROW_MAJOR), [(0, -1), (0, -1)], 0
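One way a lowering could sidestep this limitation is to split the pad list into its non-negative part (true padding) and its negative part (cropping), emitting a pad op for the former and a slice for the latter. A sketch of the decomposition in plain PyTorch; the helper names here are illustrative, not part of ttnn or this repo:

```python
import torch

def split_pad(pad):
    """Split a constant_pad_nd-style pad list into its non-negative
    (padding) and negative (cropping) components."""
    pos = [max(p, 0) for p in pad]
    neg = [min(p, 0) for p in pad]
    return pos, neg

def pad_with_negative(x, pad, value=0.0):
    """Emulate constant_pad_nd for pad lists that mix positive and
    negative entries: pad first, then crop the negative entries."""
    pos, neg = split_pad(pad)
    y = torch.nn.functional.pad(x, pos, value=value)
    # pad entries come in (left, right) pairs, last dimension first
    for i in range(len(pad) // 2):
        dim = y.dim() - 1 - i
        left, right = -neg[2 * i], -neg[2 * i + 1]
        y = y.narrow(dim, left, y.size(dim) - left - right)
    return y

# Matches the reference behavior on one of the failing variations
x = torch.rand(1, 240, 29, 29)
out = pad_with_negative(x, [0, -1, 0, -1])
assert torch.equal(out, torch.nn.functional.pad(x, [0, -1, 0, -1]))
```

A compiler pass could presumably apply the same decomposition, lowering the non-negative portion to ttnn.pad and the negative portion to a slice.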
swimdi commented 9 hours ago

I found that the negative-value case appears only in training-mode model tests, so this issue can be given lower priority