Closed shwetankTT closed 18 hours ago
@ntarafdar @sjameelTT @jaebaek and @yugi957 I have not tested this extensively. Please let me know if this PR does not make sense. There was an issue i was facing during optimization of yolov4 model which was failing due to this validation error and i decided to fix it.
Can you run models perf CI to ensure that nothing broke?
This doesn't seem to me like it should work in general with arbitrary padding for on-device concat; this op isn't padding aware yet afaik. If the inputs to concat have the same logical shapes but different padding, we would very likely mess stuff up. On the other hand, we only have padding for tiled tensors now and the padding should be the same for tensors of the same logical shape, so this should actually be okay until that changes.
I think we will eventually have arbitrary padding including for RM tensors, so we would have to revisit this then. @sjameelTT any thoughts? I know you've looked at concat before.
@shwetankTT mentioned to me that he is obtaining tiled tensors with identical logical shape but distinct padding from sharded to interleaved (assuming a tilize is thrown in as well). This is even worse than what I was worrying about above.
input-tensor --> shape=Shape([1, 1, 400[448], 256]), dtype=DataType::BFLOAT16, layout=Layout::TILE)
MemoryConfig(memory_layout=TensorMemoryLayout::INTERLEAVED,buffer_type=BufferType::L1,shard_spec=std::nullopt)
input_tensor_2 --> shape=Shape([1, 1, 400[416], 256]), dtype=DataType::BFLOAT16, layout=Layout::TILE)
MemoryConfig(memory_layout=TensorMemoryLayout::INTERLEAVED,buffer_type=BufferType::DRAM,shard_spec=std::nullopt)
I am not seeing this issue over the mainline. Seems like resolved. Closing this issue due same.
Ticket
15308
Problem description
The concat operation evaluates the shapes of all input tensors to determine whether the operation can be performed. However, the LegacyShape includes padding as part of the shape computation. Padding should not be considered in shape validation for the concat operation..
What's changed
CI Link: https://github.com/tenstorrent/tt-metal/actions/runs/11953331942
Checklist