tenstorrent / tt-metal

TT-NN operator library and TT-Metalium low-level kernel programming model.
https://docs.tenstorrent.com/ttnn/latest/index.html
Apache License 2.0

Runtime Error: ConvTranspose2D fails with OOM issue #14882

Closed punithsekar closed 1 day ago

punithsekar commented 2 weeks ago

Describe the bug: One of the Vanilla UNet configurations fails with an out-of-memory (OOM) error. The failing configuration is (batch_size, input_height, input_width, input_channels, output_channels, filter_height, filter_width, stride_h, stride_w, pad_h, pad_w) = (1, 240, 320, 64, 32, 2, 2, 2, 2, 0, 0).

To Reproduce Steps to reproduce the behavior:

  1. Check out the branch punith/vanilla_unet_convtranspose_issue
  2. Run the command: pytest tests/ttnn/unit_tests/operations/test_conv_transpose2d.py::test_vanilla_unet

Expected behavior: The unit test runs without any error.

Additional context: I also tried setting different act_block_h values (for example 32, 64, 96, and 128), but it didn't help.

punithsekar commented 2 weeks ago

fyi @saichandax

sankarmanoj-tt commented 1 week ago

This op creates an output of size 480 x 640 x 32. This does not fit in L1, even with all memory conserving approaches enabled. The way forward would be to move the input and output tensors to DRAM.
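For reference, the 480 x 640 x 32 output size can be reproduced from the failing configuration with the standard transposed-convolution output formula. This is a minimal sketch, not code from tt-metal: the helper name `conv_transpose2d_output_size` is hypothetical, output padding of 0 and dilation of 1 are assumed (the issue does not list them), and the 2 bytes per element assumes a bfloat16 output, which the issue does not state.

```python
def conv_transpose2d_output_size(in_size, kernel, stride, pad,
                                 output_pad=0, dilation=1):
    """Standard transposed-convolution output length along one dimension.

    Hypothetical helper for illustration; output_pad and dilation
    defaults are assumptions, since the issue does not specify them.
    """
    return (in_size - 1) * stride - 2 * pad + dilation * (kernel - 1) + output_pad + 1


# Failing configuration from the issue description.
batch_size, input_height, input_width = 1, 240, 320
input_channels, output_channels = 64, 32
filter_height, filter_width = 2, 2
stride_h, stride_w = 2, 2
pad_h, pad_w = 0, 0

out_h = conv_transpose2d_output_size(input_height, filter_height, stride_h, pad_h)
out_w = conv_transpose2d_output_size(input_width, filter_width, stride_w, pad_w)

# Total number of output elements for the whole tensor.
num_elements = batch_size * out_h * out_w * output_channels

# Assumption: bfloat16 output, i.e. 2 bytes per element.
BYTES_PER_ELEMENT = 2
output_bytes = num_elements * BYTES_PER_ELEMENT

print(f"output shape: {out_h} x {out_w} x {output_channels}")
print(f"output elements: {num_elements}")
print(f"output size (assuming bfloat16): {output_bytes / 2**20:.2f} MiB")
```

Under these assumptions the output tensor alone holds 9,830,400 elements, four times the spatial area of the input, before counting the input activations, weights, and any intermediate buffers.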

sankarmanoj-tt commented 1 day ago

Currently it is not possible to implement this op. We would need DRAM support for Conv2d.