Closed sdjordjevicTT closed 4 days ago
@tarafdarTT @sjameelTT It seems that the reduction operation only works when the dimensions of the input tensors are tile-aligned.
@tarafdarTT @sjameelTT I created the following PR to fix this: https://github.com/tenstorrent/tt-metal/pull/12274
Can you please review it? :)
Describe the bug While developing the TT-MLIR compiler, we encountered an issue with the ttnn.mean op: it fails when the input tensor dims aren't tile-aligned. The code errors out with the following error message:
To Reproduce I managed to reproduce this easily on plain TTNN:
The output of this test:
Expected behavior ttnn.mean should not produce errors when a non-tile-aligned tensor is provided as its operand.
Additional context I looked into the issue a bit. In the TTNN implementation of the reduction op, there is code that calculates the output tensor shape:
The issue is in this line:
padded_output_shape.push_back(input_shape[axis]);
We iterate through all the dims of the tensor. For each dim that is not being reduced, we want to record both its unpadded and padded sizes, but the padded size is captured incorrectly because input_shape[axis] returns the unpadded size. Instead, we should use something like this:
padded_output_shape.push_back(input_shape.value[axis]);
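To make the difference concrete, here is a minimal standalone sketch of the loop described above. It does not use the real TTNN types: ShapeSketch, make_shape, round_up_to_tile and padded_output_dims are hypothetical names, and the tile edge of 32, the padding of only the last two dims, and the handling of the reduced dim are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kTile = 32;  // assumed tile edge length

// Hypothetical stand-in for a TTNN shape: logical dims plus tile-padded dims.
struct ShapeSketch {
    std::vector<uint32_t> logical;
    std::vector<uint32_t> padded;
    // Mirrors the behaviour described in the issue: operator[] yields the unpadded size.
    uint32_t operator[](std::size_t i) const { return logical[i]; }
};

uint32_t round_up_to_tile(uint32_t n) { return (n + kTile - 1) / kTile * kTile; }

// Build a shape whose last two dims are padded up to a tile multiple (assumption).
ShapeSketch make_shape(const std::vector<uint32_t>& dims) {
    ShapeSketch s{dims, dims};
    const std::size_t rank = dims.size();
    for (std::size_t i = (rank >= 2 ? rank - 2 : 0); i < rank; ++i) {
        s.padded[i] = round_up_to_tile(dims[i]);
    }
    return s;
}

// Padded output dims of a keep-dim reduction over `reduce_axis`.
// use_padded == false mimics the reported behaviour, true mimics the proposed fix.
std::vector<uint32_t> padded_output_dims(const ShapeSketch& in,
                                         std::size_t reduce_axis,
                                         bool use_padded) {
    const std::size_t rank = in.logical.size();
    assert(reduce_axis < rank);
    std::vector<uint32_t> out;
    for (std::size_t axis = 0; axis < rank; ++axis) {
        if (axis == reduce_axis) {
            // Reduced dim: logically 1; assumed to occupy a full tile in the last two dims.
            out.push_back(axis + 2 >= rank ? kTile : 1);
        } else {
            out.push_back(use_padded ? in.padded[axis] : in[axis]);
        }
    }
    return out;
}
```

With a hypothetical input of logical dims {1, 1, 10, 50} reduced over the last axis, the unpadded lookup leaves 10 in the second-to-last padded dim, while the padded lookup yields 32 there.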
Later, as a part of the reduction op, the reshape op is executed:
This reshape fails because it expects a padded shape for a tiled layout, but the reduction operation supplies the unpadded shape instead.
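As a rough illustration of why an unpadded shape is rejected, the sketch below shows the kind of alignment check a tiled-layout reshape could perform. This is not the actual TTNN validation code: is_tile_aligned is a hypothetical helper, and the tile edge of 32 and the restriction to the last two dims are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kTileEdge = 32;  // assumed tile edge length

// Hypothetical check: a tiled layout needs its last two padded dims
// to be whole multiples of the tile edge.
bool is_tile_aligned(const std::vector<uint32_t>& padded_dims) {
    const std::size_t rank = padded_dims.size();
    if (rank < 2) {
        return false;  // this sketch only considers shapes of rank >= 2
    }
    return padded_dims[rank - 1] % kTileEdge == 0 &&
           padded_dims[rank - 2] % kTileEdge == 0;
}
```

Under these assumptions, padded dims such as {1, 1, 10, 32} would be rejected, whereas {1, 1, 32, 32} would pass.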