nakajee closed this 9 months ago
This doesn't apply to f16 or bf16?
vgprPackTemp is not used for f16/bf16. This extra vgpr is needed only for 8-bit data packing, so there is no need to allocate it for f16/bf16.
Previously it was also allocated in the f16 case, but that just wasted a vgpr.
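For illustration, the intended behavior can be sketched roughly as below. This is a minimal standalone sketch, not the actual code from this change: `needs_pack_temp`, `allocate_vgprs`, and the `dtype_bits` / `next_free` parameters are hypothetical names made up for the example, and it assumes the data type can be told apart by its bit width.

```python
def needs_pack_temp(dtype_bits: int) -> bool:
    """Hypothetical helper: only 8-bit data packing needs the temp vgpr."""
    return dtype_bits == 8


def allocate_vgprs(dtype_bits: int, next_free: int = 0) -> dict:
    """Hypothetical allocator: reserve vgprPackTemp only when it is used.

    Returns a dict mapping register names to indices, plus the next
    free vgpr index under the key "next_free".
    """
    regs = {}
    if needs_pack_temp(dtype_bits):
        # 8-bit case: packing needs one extra scratch register.
        regs["vgprPackTemp"] = next_free
        next_free += 1
    # f16/bf16 case: nothing is reserved, so no vgpr is wasted.
    regs["next_free"] = next_free
    return regs
```

With this sketch, `allocate_vgprs(16)` reserves nothing, while `allocate_vgprs(8)` reserves one register for `vgprPackTemp`.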