tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

yolo v4 conv ops bringup #5079

Open dvartaniansTT opened 5 months ago

dvartaniansTT commented 5 months ago

Describe the bug

  1. Some conv variants with groups=1 for YOLOv4 are failing at 480x640 resolution (12 failed, 113 passed).

  2. We also need to add support for groups>1, which 46 of the convs in YOLOv4 require (46 failed).

Please prioritize enabling groups>1, since the failing groups=1 tests pass at the lower 240x320 resolution. The groups>1 work is tracked in a separate issue: 6580. We eventually need to go as high as 960x1280 resolution, but to expedite the bringup process we can start with the lower resolutions.
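For a rough sense of how the activation sizes scale across the three resolutions mentioned above, here is a minimal sketch of the standard conv output-size arithmetic. The kernel, stride, and padding values are illustrative assumptions, not parameters taken from the actual YOLOv4 layers or from the test file.

```python
def conv_out_size(size: int, kernel: int, stride: int = 1,
                  padding: int = 0, dilation: int = 1) -> int:
    """Standard output-size formula for one spatial dimension of a conv."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Resolutions from this issue; kernel=3, stride=2, padding=1 are assumed
# purely for illustration.
for height, width in [(240, 320), (480, 640), (960, 1280)]:
    out_h = conv_out_size(height, kernel=3, stride=2, padding=1)
    out_w = conv_out_size(width, kernel=3, stride=2, padding=1)
    print(f"{height}x{width} -> {out_h}x{out_w} ({out_h * out_w} output positions)")
```

Each doubling of the resolution quadruples the number of output positions per channel, which may be part of why a config that passes at 240x320 does not necessarily carry over to 480x640.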

To Reproduce

From dvartanians/yolov4, run: pytest tests/ttnn/unit_tests/operations/test_conv2d.py::test_yolov4_conv

I also have a separate test for the groups>1 convs, ready for whenever we add support for them. You may run: pytest tests/ttnn/unit_tests/operations/test_conv2d.py::test_yolov4_conv_groups_larger_than_one

Expected behavior

  1. Figure out the sharding configs or other parameters that make the failing groups=1 conv tests pass.
  2. Figure out the optimal sharding configs for the convs. Ideally, all convs should pass with block sharding, or at least with minimal reshards between the convs.
  3. Add support for groups>1 convs.
  4. Make all convs with groups>1 pass.
  5. Make them pass with the most optimal configs.
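To make the groups>1 ask concrete, here is a minimal sketch of the channel bookkeeping for a grouped convolution, following the usual (PyTorch-style) convention. The helper names are hypothetical and are not part of ttnn or tt-metal; they only show how a grouped conv splits into independent groups=1 convs over channel slices.

```python
from typing import List, Tuple


def grouped_weight_shape(in_channels: int, out_channels: int, groups: int,
                         kernel_h: int, kernel_w: int) -> Tuple[int, int, int, int]:
    """Weight shape of a grouped conv: (C_out, C_in / groups, kH, kW)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    return (out_channels, in_channels // groups, kernel_h, kernel_w)


def group_channel_slices(in_channels: int, out_channels: int,
                         groups: int) -> List[Tuple[range, range]]:
    """For each group, the input channels it reads and the output channels it writes."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    in_per_group = in_channels // groups
    out_per_group = out_channels // groups
    return [
        (range(g * in_per_group, (g + 1) * in_per_group),
         range(g * out_per_group, (g + 1) * out_per_group))
        for g in range(groups)
    ]


# Hypothetical example: 8 input channels, 16 output channels, 4 groups.
for in_chs, out_chs in group_channel_slices(8, 16, 4):
    print(f"inputs {list(in_chs)} -> outputs {list(out_chs)}")
```

Each group can be treated as its own groups=1 conv on its slice of channels, with the per-group outputs concatenated along the channel dimension. That is one possible way to reason about the feature, not necessarily how it would be implemented here.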


Additional context

Customer feature!

jliangTT commented 3 months ago

Assigning to @nsmithtt to triage - putting this as p2 while we discuss the priority of these items offline.

dvartaniansTT commented 3 months ago

I understand there are several asks in this one. I will make a separate issue for convs with groups > 1 and link it to this one.

jliangTT commented 3 months ago

@nsmithtt, should this land in the conv generality bucket?