tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.

Mobilenetv2 fails with "Grid is invalid for mcast matmul!" #8486

Closed keerthana-r-mcw closed 3 months ago

keerthana-r-mcw commented 4 months ago

Mobilenetv2 pipeline fails with the error "Grid is invalid for mcast matmul!"

To Reproduce

Steps to reproduce the behavior:

  1. Check out the branch keerthanar/mobilenetv2_ttnn.
  2. Run the following command: pytest tests/ttnn/integration_tests/mobilenetv2/test_ttnn_mobilenetv2.py
  3. The following error occurs: RuntimeError: TT_FATAL @ tt_eager/tt_dnn/op_library/bmm/multi_core_reuse_mcast_2d_optimized/bmm_op_multi_core_reuse_mcast_2d_optimized.cpp:1037: false, info: Grid is invalid for mcast matmul!
  4. For the unit test, check out the branch keerthanar/mobilenetv2_grid_invalid_issue.
  5. Run the following command: pytest tests/ttnn/unit_tests/operations/test_conv2d.py::test_mobilenetv2_conv_ttnn_grid_issue_case
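The two runs above could be scripted roughly as follows. This is only a sketch: the helper names (REPRO_CASES, pytest_command, run_case) are hypothetical, it assumes the relevant branch is already checked out and that pytest is installed in the current interpreter, and it writes the unit-test path with a tests/ prefix and no space after "::" on the assumption that it should match the integration-test path.

```python
import subprocess
import sys

# Branch -> pytest target, as quoted in the steps above.
# The branch names are keys for reference only; this script does not switch branches.
REPRO_CASES = {
    "keerthanar/mobilenetv2_ttnn": (
        "tests/ttnn/integration_tests/mobilenetv2/test_ttnn_mobilenetv2.py"
    ),
    "keerthanar/mobilenetv2_grid_invalid_issue": (
        "tests/ttnn/unit_tests/operations/test_conv2d.py"
        "::test_mobilenetv2_conv_ttnn_grid_issue_case"
    ),
}


def pytest_command(target):
    """Build the pytest invocation for one target, using the current interpreter."""
    return [sys.executable, "-m", "pytest", target]


def run_case(branch):
    """Run the test for `branch` and return pytest's exit code.

    Assumes `branch` is already checked out in the working directory.
    """
    return subprocess.run(pytest_command(REPRO_CASES[branch])).returncode


if __name__ == "__main__":
    # Print the commands rather than running them, since each needs its own branch.
    for branch, target in REPRO_CASES.items():
        print(branch, "->", " ".join(pytest_command(target)))
```

Calling run_case("keerthanar/mobilenetv2_grid_invalid_issue") after checking out that branch would run only the single conv unit test, which is the faster of the two ways to hit the error.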

dvartaniansTT commented 4 months ago

@keerthana-r-mcw and @saichandax, please find out which op is causing the error, and add a unit test for that specific op to this issue.

keerthana-r-mcw commented 4 months ago

Hi, the error occurs in Conv2d(576, 160, 1, 1, bias=False), at models/experimental/functional_mobilenetv2/tt/ttnn_mobilenetv2.py:277, in __call__: output_tensor_c42 = self.c42(output_tensor). I have also pushed the unit test for that case.
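For context on why a Conv2d layer can surface a matmul error: a 1x1 convolution without bias is mathematically a matrix multiply over the channel dimension. The pure-Python sketch below illustrates that equivalence on tiny integer inputs. It is not the ttnn implementation, and the issue does not say how ttnn lowers this op. The function names are hypothetical, and the 32-wide tile used in tiles_needed is an assumption about the device layout, not something stated in this thread.

```python
def conv1x1(x, w):
    """1x1 convolution, no bias. x: [H][W][C_in], w: [C_out][C_in] -> [H][W][C_out]."""
    return [
        [[sum(wc[i] * px[i] for i in range(len(px))) for wc in w] for px in row]
        for row in x
    ]


def matmul(a, b):
    """Plain matrix product. a: [M][K], b: [K][N] -> [M][N]."""
    return [
        [sum(a_row[k] * b[k][n] for k in range(len(b))) for n in range(len(b[0]))]
        for a_row in a
    ]


def conv1x1_as_matmul(x, w):
    """Same result as conv1x1, computed as (H*W x C_in) @ (C_in x C_out)."""
    height, width = len(x), len(x[0])
    flat = [px for row in x for px in row]      # [H*W][C_in]
    w_t = [list(col) for col in zip(*w)]        # [C_in][C_out]
    out = matmul(flat, w_t)                     # [H*W][C_out]
    return [out[r * width:(r + 1) * width] for r in range(height)]


def tiles_needed(dim, tile=32):
    """Number of tiles covering `dim` (ceiling division); tile=32 is an assumption."""
    return -(-dim // tile)


if __name__ == "__main__":
    x = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [1, 0, 1]]]  # H=2, W=2, C_in=3
    w = [[1, 0, 1], [2, 1, 0]]                            # C_out=2, C_in=3
    print(conv1x1(x, w) == conv1x1_as_matmul(x, w))
    # Channel sizes taken from Conv2d(576, 160, 1, 1) in the comment above.
    print(tiles_needed(576), tiles_needed(160))
```

Under this view, the failing layer corresponds to a matmul with inner dimension 576 and output width 160. The number of rows depends on the batch size and spatial size, which this thread does not give.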

saichandax commented 3 months ago

The issue is resolved on the latest main.