tenstorrent / tt-metal

TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

Question on support to uint16 tile in fp32 mode #9542

Open jeongu-moreh opened 1 week ago

jeongu-moreh commented 1 week ago

The following compute kernel simply unpacks a tile from a CB into the DST register and packs it back into a CB.

namespace NAMESPACE {
void MAIN {
    constexpr uint32_t num_tiles = get_compile_time_arg_val(0);

    constexpr uint32_t cb_src = tt::CB::c_in0;
    constexpr uint32_t cb_dst = tt::CB::c_out0;

    unary_op_init_common(cb_src, cb_dst);

    for (uint32_t i = 0; i < num_tiles; i++) {
        cb_wait_front(cb_src, 1);
        cb_reserve_back(cb_dst, 1);

        tile_regs_acquire();
        unpack_reconfig_data_format_srca(cb_src);
        copy_tile_to_dst_init_short(cb_src);
        copy_tile(cb_src, 0, 0);
        tile_regs_commit();

        tile_regs_wait();
        pack_reconfig_data_format(cb_dst);
        pack_tile(0, cb_dst);
        tile_regs_release();

        cb_pop_front(cb_src, 1);
        cb_push_back(cb_dst, 1);
    }
}
}  // namespace NAMESPACE

If at least one of the CBs is of uint16 type and fp32 mode is enabled, the output differs from the input. Is the uint16 type not supported in fp32 mode, or is additional configuration logic required?
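
To narrow down where the output diverges, one option is to compare the input and output tiles element by element on the host. The following is only a minimal sketch, assuming both buffers have already been read back as plain uint16_t vectors; `Mismatch` and `find_mismatches` are hypothetical names for illustration and are not part of tt-metal.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type (not a tt-metal API): one element that differs.
struct Mismatch {
    std::size_t index;       // position within the flattened buffer
    std::uint16_t expected;  // value that was written to the source CB
    std::uint16_t actual;    // value that was read back from the output CB
};

// Compare two host-side buffers and collect up to max_report differing elements.
// Only the overlapping prefix is compared if the sizes differ.
std::vector<Mismatch> find_mismatches(const std::vector<std::uint16_t>& expected,
                                      const std::vector<std::uint16_t>& actual,
                                      std::size_t max_report = 16) {
    std::vector<Mismatch> out;
    const std::size_t n =
        expected.size() < actual.size() ? expected.size() : actual.size();
    for (std::size_t i = 0; i < n && out.size() < max_report; ++i) {
        if (expected[i] != actual[i]) {
            out.push_back({i, expected[i], actual[i]});
        }
    }
    return out;
}
```

Listing the first few mismatching indices together with their expected and actual bit patterns may help show whether the values are being reinterpreted, truncated, or reordered.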

sangwon-chae commented 1 week ago

@davorchap Could you please assign an engineer to look into this question? Thanks.