tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

Failed to lower some tensor manipulation ops from `aten` to `ttnn` #12853

Open jdh8 opened 2 months ago

jdh8 commented 2 months ago

The following ops refuse to lower to ttnn and stay in aten:

============================================================================== short test summary info ===============================================================================
FAILED tests/lowering/tensor_manipulation/test_concat.py::test_concat[input_shapes0-1] - AssertionError: assert 0 == 1
FAILED tests/lowering/tensor_manipulation/test_expand.py::test_expand[input_shape0-new_shape0] - AssertionError: assert 0 == 1
FAILED tests/lowering/tensor_manipulation/test_expand.py::test_expand_after_op[input_shape0-new_shape0] - AssertionError: assert 0 == 1
FAILED tests/lowering/tensor_manipulation/test_expand.py::test_expand_before_op[input_shape0-new_shape0] - AssertionError: assert 0 == 1
FAILED tests/lowering/tensor_manipulation/test_expand.py::test_expand_between_ops[input_shape0-new_shape0] - AssertionError: assert 0 == 1
FAILED tests/lowering/tensor_manipulation/test_repeat.py::test_repeat[input_shape0-sizes0] - AssertionError: assert 0 == 1
FAILED tests/lowering/tensor_manipulation/test_reshape.py::test_reshape[input_shape0-new_shape0] - AssertionError: assert 0 == 1
============================================================================ 7 failed, 9 passed in 8.42s =============================================================================

ayerofieiev-tt commented 2 months ago

@jdh8, we need more details so we can pass the info to the Op team.

Can you please specify the individual problems with each op? Ideally, we should have a single ticket per issue, with each issue categorized as follows:

Op Category:

Issue Category:

Specific input: [params]. Link to an xfail test.

ayerofieiev-tt commented 1 month ago

@jdh8, please provide more details so we can file individual tickets with the Op lead.

boris-drazic commented 1 month ago

@jdh8 For reshape, your test (tests/lowering/tensor_manipulation/test_reshape.py) is not actually dispatching aten.reshape to the compiler, but rather aten.view.

Your test module:

def forward(self, x, new_shape):
    return torch.reshape(x, new_shape)

is passed to the compiler as

def forward(self, arg0_1):
    view = torch.ops.aten.view.default(arg0_1, [21, 5]);  arg0_1 = None
    return (view,)

which yields the same result, but without using aten.reshape.
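A rough sketch of why the two ops are interchangeable in this case: for a contiguous input, both reshape and view only reinterpret the same flat buffer, so the one hard requirement is that the element count matches. The helper name `resolve_view_shape` and the input shape `(5, 21)` are assumptions made for illustration; only the target shape `[21, 5]` appears in the traced graph above.

```python
from math import prod


def resolve_view_shape(input_shape, new_shape):
    """Return the concrete target shape, inferring at most one -1 dimension.

    Hypothetical helper: it mirrors the shape rule shared by reshape and
    view, not any code from PyTorch or torch_ttnn.
    """
    numel = prod(input_shape)
    if list(new_shape).count(-1) > 1:
        raise ValueError("only one dimension can be inferred")

    # Product of the explicitly given dimensions (1 if there are none).
    known = prod(d for d in new_shape if d != -1)

    if -1 in new_shape:
        if known == 0 or numel % known != 0:
            raise ValueError(f"shape {new_shape} is invalid for {numel} elements")
        return [numel // known if d == -1 else d for d in new_shape]

    if known != numel:
        raise ValueError(f"shape {new_shape} is invalid for {numel} elements")
    return list(new_shape)


# Assumed input shape (5, 21): 105 elements, same as the [21, 5] target.
target = resolve_view_shape((5, 21), [21, 5])
inferred = resolve_view_shape((5, 21), [-1, 5])
```

Because the rule depends only on shapes, a test that wants to exercise a reshape lowering has to check which aten op actually reaches the compiler, not just which Python function the module calls.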

boris-drazic commented 1 month ago

I don't see any code for processing the aten ops reshape, concat, and repeat in torch_ttnn/passes/lowering/to_tt_pass.py. That would explain why they are not being lowered to ttnn during compilation.
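To illustrate how a missing handler would produce the `assert 0 == 1` failures in the summary, here is a minimal sketch of a table-driven lowering pass. Everything in it (`Node`, `LOWERING_TABLE`, `lower`, `count_target`, and the string targets) is hypothetical and only stands in for the real pass; it is not the code in to_tt_pass.py.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Op name as a plain string, e.g. "aten.add" or "ttnn.add" (illustrative).
    target: str


# Hypothetical lowering table. Ops without an entry are left untouched,
# which is the situation described for reshape, concat, and repeat.
LOWERING_TABLE = {
    "aten.add": "ttnn.add",
}


def lower(nodes):
    """Rewrite each node to its ttnn counterpart when the table has one."""
    return [Node(LOWERING_TABLE.get(n.target, n.target)) for n in nodes]


def count_target(nodes, target):
    """Count the nodes whose target equals the given op name."""
    return sum(n.target == target for n in nodes)


graph = [Node("aten.repeat"), Node("aten.add")]
lowered = lower(graph)

# A test expecting exactly one ttnn.repeat node would see 0 here,
# while the op with a table entry is lowered as expected.
repeat_count = count_target(lowered, "ttnn.repeat")
add_count = count_target(lowered, "ttnn.add")
```

In this sketch, adding an `"aten.repeat"` entry to the table is all it takes to make the count go from 0 to 1, which is consistent with the failing tests pointing at missing conversions and not at ttnn itself.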

jdh8 commented 1 month ago

@boris-drazic, I removed the conversions for these ops earlier to keep tenstorrent/pytorch2.0_ttnn#54 at a reviewable size. I am now making a separate PR for each op.

ntarafdar commented 1 month ago

@jdh8, for each sub-issue, can you give us a ttnn unit test that's failing? CC @ayerofieiev-tt

jdh8 commented 1 month ago

@tarafdarTT, I've turned each sub-issue into a PR, so we can experiment on tests without affecting other ops. In each PR, I 'unlocked' some tests by removing @pytest.mark.xfail, and these are the failing tests.

ntarafdar commented 1 month ago

@jdh8 Oh okay. Do you need anything from me (or the TM team) right now, or only after you lower to ttnn?