tenstorrent / tt-metal

TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

Functional shallow Unet PCC is low in the last conv layer #8454

Open saichandax opened 1 month ago

saichandax commented 1 month ago

Describe the bug
In the functional Unet model, a PCC drop is observed in the last conv layer. The harini/unet_conv branch shows the PCC measured up to each layer:

decoder1: pcc = 0.9808506765509414
last conv: pcc = 0.8468204189183342
overall model: pcc = 0.8066952002315315
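For readers comparing these numbers, here is a minimal sketch of how such a score can be computed, assuming PCC here means the Pearson correlation coefficient between the flattened reference output and the flattened device output. The function name `pcc` is hypothetical and is not the helper used by the test suite.

```python
import numpy as np


def pcc(expected, actual):
    """Pearson correlation coefficient between two same-shaped tensors.

    Hypothetical helper for illustration; not the tt-metal implementation.
    """
    a = np.asarray(expected, dtype=np.float64).ravel()
    b = np.asarray(actual, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("expected and actual must have the same number of elements")

    # Center both vectors, then take the normalized dot product.
    a_centered = a - a.mean()
    b_centered = b - b.mean()
    denom = np.sqrt((a_centered ** 2).sum() * (b_centered ** 2).sum())
    if denom == 0.0:
        # At least one input is constant, so correlation is undefined.
        # Treat identical inputs as a perfect match (an assumption of this sketch).
        return 1.0 if np.array_equal(a, b) else 0.0
    return float((a_centered * b_centered).sum() / denom)
```

Because the score is invariant to scale and offset, a value such as 0.85 points to a structural mismatch in the output rather than a uniform gain or bias error.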

To Reproduce
Check out the branch harini/unet_conv and run the command pytest tests/ttnn/integration_tests/unet/test_ttnn_unet.py

Testing the harini/unet_conv branch saves the decoder1 output, which can be fed to the last conv layer as input for submodule testing in this commit. To run the conv submodule, run pytest tests/ttnn/integration_tests/unet/test_ttnn_unet.py, which gives pcc = 0.9924751222494625. The submodule test with the decoder1 output as input passes, but the same conv fails when it runs as part of the whole model.
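One way to narrow down where the drop starts is to compare saved per-layer outputs against the reference in execution order. The sketch below assumes both sets of outputs are available as name-to-array mappings; the function `first_layer_below` and the layer names in the usage example are hypothetical and not part of the test file.

```python
import numpy as np


def first_layer_below(reference, measured, threshold=0.99):
    """Return (layer_name, pcc) for the first layer whose PCC is below threshold.

    `reference` and `measured` map layer names to arrays of matching shape,
    with `reference` iterated in execution order. Returns None if every
    layer meets the threshold. Hypothetical helper for illustration only.
    """
    for name, ref in reference.items():
        # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is the PCC.
        score = float(np.corrcoef(np.ravel(ref), np.ravel(measured[name]))[0, 1])
        if score < threshold:
            return name, score
    return None


# Usage with made-up data: only the final layer diverges.
ref = {"decoder1": np.arange(10.0), "last_conv": np.arange(10.0)}
out = {"decoder1": np.arange(10.0), "last_conv": np.arange(10.0)[::-1]}
print(first_layer_below(ref, out))
```

Running the same comparison once with the reference decoder1 output and once with the device decoder1 output as the conv input would show whether the conv amplifies error that is already present in its input.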

Expected behavior
The ttnn Unet model should reach PCC > 0.99.


dvartaniansTT commented 1 month ago

@HariniMohan0102 and @saichandax, do we have a unit test with a PCC check for the conv that shows the PCC drop? Do we get low PCC for the standalone op itself, or only when it is integrated with the model?

HariniMohan0102 commented 1 month ago

@dvartaniansTT We get low PCC for the conv only when it is integrated with the model. We get PCC = 0.99 for the standalone op; the unit test for the conv is here.

smehtaTT commented 1 month ago

@mbahnasTT @saichandax to help confirm using the Visualizer tool