tenstorrent / tt-metal

TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

Maxpool Config Support Needed for YOLOX #8923

Open ankitmcw opened 5 months ago

ankitmcw commented 5 months ago

Three maxpool variants in the YOLOX Dark5 submodule fail:

  1. Maxpool variant ((1, 384, 20, 20), (5, 5), (2, 2), (1, 1)) gives a low PCC (0.04816194863771736).
  2. Maxpool variant ((1, 384, 20, 20), (9, 9), (4, 4), (1, 1)) hangs.
  3. Maxpool variant ((1, 384, 20, 20), (13, 13), (6, 6), (1, 1)) throws the following error:

Statically allocated circular buffers in program 10 clash with L1 buffers on core range [(x=0,y=0) - (x=11,y=7)]. L1 buffer allocated at 702464 and static circular buffer region ends at 846656
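For reference, here is a rough sketch of the output shape each variant should produce. It assumes the tuples are ordered (input shape in NCHW, kernel size, padding, stride), which the issue does not state explicitly, and it uses the usual floor-mode pooling formula with no dilation. The helper names are made up for illustration and are not part of tt-metal.

```python
# Assumption: each variant is (input shape NCHW, kernel, padding, stride).
# The issue does not spell out the field order; this is an inference.
VARIANTS = [
    ((1, 384, 20, 20), (5, 5), (2, 2), (1, 1)),
    ((1, 384, 20, 20), (9, 9), (4, 4), (1, 1)),
    ((1, 384, 20, 20), (13, 13), (6, 6), (1, 1)),
]


def pool_output_size(in_size: int, kernel: int, padding: int, stride: int) -> int:
    """Spatial output size of a pooling window (floor mode, no dilation)."""
    return (in_size + 2 * padding - kernel) // stride + 1


def output_shape(variant):
    """Expected NCHW output shape for one (shape, kernel, padding, stride) tuple."""
    (n, c, h, w), (kh, kw), (ph, pw), (sh, sw) = variant
    out_h = pool_output_size(h, kh, ph, sh)
    out_w = pool_output_size(w, kw, pw, sw)
    return (n, c, out_h, out_w)


if __name__ == "__main__":
    for variant in VARIANTS:
        print(variant[1], "->", output_shape(variant))
```

Under that assumed ordering, the padding is half the kernel size (rounded down) and the stride is 1, so all three variants keep the 20 x 20 spatial size and differ only in window size.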

To reproduce, check out the branch here and run:

cd $TT_METAL_HOME
pytest tests/ttnn/unit_tests/operations/test_max_pool2d.py::test_run_max_pool_yolox

Variant ((1, 384, 20, 20), (9, 9), (4, 4), (1, 1)) is currently commented out in the branch because it hangs.

punithsekar commented 4 months ago

On the latest main, all three test cases currently fail with the following error:

>       output = ttl.tensor.max_pool2d_v2(
            haloed_act,
            reader_indices,
            in_n,
            in_h,
            in_w,
            window_h,
            window_w,
            stride_h,
            stride_w,
            pad_h,
            pad_w,
            output_mem_config=self.output_sharded_memory_config,
        )
E       RuntimeError: TT_FATAL @ ../tt_eager/tt_dnn/op_library/pool/max_pool.cpp:30: is_pow2
E       info:
E       Row size (nchannels * bytes = 768) should be power of 2 (false).
E       backtrace:
E        --- tt::tt_metal::MaxPool::validate(std::__1::vector<tt::tt_metal::Tensor, std::__1::allocator<tt::tt_metal::Tensor>> const&) const
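To make the arithmetic behind the `is_pow2` failure concrete, here is a small sketch of the row-size check. It assumes 2 bytes per element (for example bfloat16), which is consistent with 384 channels giving the reported 768 bytes. The functions are illustrative stand-ins, not the actual code in max_pool.cpp.

```python
def is_pow2(n: int) -> bool:
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def row_size_bytes(nchannels: int, bytes_per_element: int = 2) -> int:
    """Row size as in the error message: nchannels * bytes.

    bytes_per_element=2 is an assumption (e.g. bfloat16); the issue only
    reports the product, 768.
    """
    return nchannels * bytes_per_element


if __name__ == "__main__":
    for channels in (256, 384, 512):
        size = row_size_bytes(channels)
        print(f"channels={channels} row_size={size} is_pow2={is_pow2(size)}")
```

With 384 channels the row size is 768 bytes, which is not a power of two, so the validation rejects all three YOLOX variants before the kernel runs. Channel counts such as 256 or 512 would pass this particular check.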