Closed TT-BrianLiu closed 7 months ago
@TT-BrianLiu is this still relevant?
@TT-BrianLiu is this still relevant?
I think so. Definitely something fishy going on when we switch between tilize and matmul. @tt-aho saw something similar when uplifting attention matmul to support mixed precision. He had to force the srca
reconfig when it wasn't needed / shouldn't be doing anything.
I have a fix here: #5536
The issue is the tilize_uninit
call reprograms the unpack config back to what was stored before tilize was executed. In this case, the default operand passed to tilize uninit is 0, when in reality it should have reverted back to the config of operand 1. I passed in the correct operand to the tilize uninit function so the test can pass. Ill debug the rest of these cases as well and try to make that operand not take in a default
Merged fix here: https://github.com/tenstorrent-metal/tt-metal/pull/5536
In
tt_eager/tt_dnn/op_library/conv/kernels/conv_bmm_tilize_col_major_out_blocks.cpp
, line 169 should work but hangs ifin1_cb
is BFLOAT16 and the in0 for matmul (here,tilized_in0_cb_id
) is BFLOAT8_B.in0_pretilize_cb_id
is ROW_MAJOR and always BFLOAT16, so I don't think we even need sraA reconfig in this case?To get the test to pass, you can force the reconfig to happen with:
unpack_reconfig_data_format_srca(in1_cb_id);
For reference, there are other convs where we don't have the pre-tilize block and instead tilize once we get input from reader. The dataformats in the llk APIs should be the same here, but we don't need to force the reconfig.
Branch:
bliu/issue-4280
Test: