LPanosTT opened 1 month ago
On another note: when I hardcode this interleave op, I can actually get ResNet50 to execute half of the model. There's a conv2d op halfway through that causes a circular-buffer/L1 clash. Metal issue here: https://github.com/tenstorrent/tt-metal/issues/12790
@LPanosTT, can you link the issue that this one is blocked by?
Also, I think this should be supported. Let's sync with someone on the TTNN side to figure out how to supply a bcast'd sharded input. This should be possible; we can take a look at their ResNet implementation for reference too, I think here: models/demos/ttnn_resnet/tt/ttnn_functional_resnet50_new_conv_api.py
> can you link the issue that this one is blocked by?
@nsmithtt There isn't a specific issue; it's more the ResNet50 bringup milestone in tt-forge: https://github.com/tenstorrent/tt-forge-fe/issues/137#issue-2475043862
In ResNet50, the first convolution is followed by a multiply op. The output tensor from conv2d is sharded, while the constant it is to be multiplied with is not. This causes TTNN to attempt broadcasting on the constant, I think by assuming the constant is sharded. This causes a `bad optional access` on line 73: that code attempts to get the shard spec of a tensor which does not have one.
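For context, the error itself is just C++'s `std::optional` being unwrapped while empty. A minimal standalone illustration of the failing pattern (not the actual TTNN source; `ShardSpec` here is a stand-in type):

```cpp
#include <iostream>
#include <optional>

// Stand-in for TTNN's ShardSpec; only here so the example compiles.
struct ShardSpec {};

int main() {
    // An interleaved tensor carries no shard spec, so the optional is empty.
    std::optional<ShardSpec> shard_spec = std::nullopt;
    try {
        // The broadcast path effectively does an unconditional unwrap:
        ShardSpec spec = shard_spec.value();
        (void)spec;
    } catch (const std::bad_optional_access &e) {
        std::cout << e.what() << "\n"; // typically prints "bad optional access"
    }
    return 0;
}
```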
To repro, run this test in forge:
There is a workaround that fixes this: convert the conv2d output to interleaved. I can do this by placing a conversion in the runtime execution function of conv2d, applied to the conv2d's output:
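Roughly, a sketch of that conversion, assuming ttnn's C++ `to_memory_config` API and `DRAM_MEMORY_CONFIG` constant (header paths and the exact call site in the runtime may differ):

```cpp
#include "ttnn/operations/core/core.hpp" // ttnn::to_memory_config (path approximate)
#include "ttnn/types.hpp"                // ttnn::DRAM_MEMORY_CONFIG (path approximate)

// Force a (possibly sharded) conv2d output back to DRAM-interleaved so a
// following eltwise op with an interleaved constant doesn't hit the missing
// shard spec.
static ::ttnn::Tensor toDramInterleaved(const ::ttnn::Tensor &out) {
  if (!out.memory_config().is_sharded()) {
    return out; // already interleaved; nothing to do
  }
  return ::ttnn::to_memory_config(out, ::ttnn::DRAM_MEMORY_CONFIG,
                                  /*dtype=*/std::nullopt);
}
```

This would run on the conv2d result right before it's handed to the next op.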
This isn't ideal because it's not necessary in all cases. For example, if a maxpool2d op immediately follows the conv2d, leaving the conv2d output as sharded works just fine. @nvukobratTT, when we come up with a solution for this I might want to include some of those test cases in this PR: https://github.com/tenstorrent/tt-forge-fe/pull/304
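One possible shape for a smarter rule (purely hypothetical; `OpType` and `consumerAcceptsSharded` are illustrative names, not tt-mlir APIs) would be to gate the conversion on what consumes the conv2d output:

```cpp
// Hypothetical: only convert to interleaved when the consumer can't take a
// sharded input. The set of sharded-friendly consumers would need to be
// filled in from real test coverage.
enum class OpType { Conv2d, MaxPool2d, Multiply /* , ... */ };

static bool consumerAcceptsSharded(OpType consumer) {
  switch (consumer) {
  case OpType::MaxPool2d:
    return true; // per the observation above: maxpool2d after conv2d works sharded
  default:
    return false; // e.g. eltwise multiply against an interleaved constant
  }
}
```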
@nsmithtt @nvukobratTT thoughts?