tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

Failed conv in Yolov7 #12785

Closed HariniMohan0102 closed 1 day ago

HariniMohan0102 commented 1 month ago

Describe the bug: While unit testing the Conv2d ops of Yolov7, one conv failed with the following error:

E       RuntimeError: TT_THROW @ ../tt_metal/impl/allocator/allocator.cpp:143: tt::exception
E       info:
E       Out of Memory: Not enough space to allocate 819200 B L1_SMALL buffer across 50 banks, where each bank needs to store 16384 B
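
As a quick sanity check on the figures in this message, the requested total is exactly the per-bank size multiplied by the number of banks. The snippet below is only arithmetic on the numbers printed above; the variable names are illustrative and nothing here calls into tt-metal.

```python
# Figures copied from the error message above.
total_bytes = 819200      # size of the requested L1_SMALL buffer
num_banks = 50            # banks the buffer is spread across
per_bank_bytes = 16384    # bytes each bank must hold

# The total is the per-bank share times the bank count.
assert num_banks * per_bank_bytes == total_bytes

per_bank_kib = per_bank_bytes / 1024
total_kib = total_bytes / 1024
print(f"{per_bank_kib:.0f} KiB per bank, {total_kib:.0f} KiB in total")
# prints "16 KiB per bank, 800 KiB in total"
```

So the allocator was asked for 16 KiB in each of the 50 banks and could not find that much free space.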

To Reproduce Steps to reproduce the behavior:

  1. Check out the branch harini/yolov7_failing_convs
  2. Run the command: pytest tests/ttnn/unit_tests/operations/test_new_conv2d.py::test_yolov7_failing_convs

Expected behavior: The failing conv is expected to pass.

Please complete the following environment information:

dvartaniansTT commented 1 month ago

@HariniMohan0102 Why is use shallow conv variant set to False in the unit test? This is a 3-input-channel conv, so it is shallow.

dvartaniansTT commented 1 month ago

@HariniMohan0102 I see you listed N300. Are you running data parallel or using a single chip on the N300?

HariniMohan0102 commented 1 month ago

@dvartaniansTT The unit test was run on an N300 using a single chip.

shwetankTT commented 3 days ago

@HariniMohan0102 I checked the issue on mainline. The given test case passes if you override the config with {"act_block_h": 64}

mywoodstock commented 3 days ago

@HariniMohan0102 Please try the config @shwetankTT mentioned above and let us know if this can be closed.

HariniMohan0102 commented 3 days ago

Sure, I'll test the config and update you.

HariniMohan0102 commented 1 day ago

The test case passed with "act_block_h": 64 in the config override. Closing the issue as resolved. Thank you.
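
For readers unfamiliar with the override pattern, here is a minimal sketch of how a dictionary such as {"act_block_h": 64} could be merged into a conv configuration. The class `ConvConfigSketch` and the helper `apply_override` are hypothetical stand-ins written for illustration; they are not the TT-NN API, and the real test on the branch may apply the override differently.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class ConvConfigSketch:
    """Hypothetical stand-in for a conv config; NOT the TT-NN class."""
    act_block_h: Optional[int] = None  # None means "let the op choose"


def apply_override(config: ConvConfigSketch, override: dict) -> ConvConfigSketch:
    """Return a copy of config with the override keys applied (hypothetical helper)."""
    known = {f.name for f in fields(config)}
    unknown = set(override) - known
    if unknown:
        # Reject misspelled keys instead of silently ignoring them.
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    return replace(config, **override)


# The override value reported to work in this thread.
config = apply_override(ConvConfigSketch(), {"act_block_h": 64})
print(config.act_block_h)
```

Rejecting unknown keys is a design choice of this sketch: a typo in the override would otherwise leave the default in place and the test would fail with the same out-of-memory error.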