apache / tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators
https://tvm.apache.org/
Apache License 2.0

[SME] Extract gemm block correctly when fused with bias #17076

Closed lhutton1 closed 4 weeks ago

lhutton1 commented 1 month ago

Prior to this commit, the scheduling assumed the gemm block would be the second-to-last block in the function (the "unpadding" step is the final block). However, when dense is fused with a bias or activation, the gemm block is no longer the second-to-last block. This commit instead searches for a single reduction block to use as the gemm block.
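A rough sketch of the idea, in plain Python rather than the TVM schedule API. The `Block` class, the block names, and their ordering are hypothetical and only serve to illustrate why a positional lookup breaks once an extra block is fused in, while a search for the single reduction block does not:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical stand-in for a compute block in a scheduled function."""
    name: str
    iter_kinds: List[str]  # each entry is "spatial" or "reduce"


def is_reduction(block: Block) -> bool:
    # A block counts as a reduction if any of its iterators is a reduce axis.
    return any(kind == "reduce" for kind in block.iter_kinds)


def find_gemm_block(blocks: List[Block]) -> Block:
    # Look for exactly one reduction block instead of relying on position.
    reduction_blocks = [b for b in blocks if is_reduction(b)]
    if len(reduction_blocks) != 1:
        raise ValueError(
            f"expected exactly one reduction block, found {len(reduction_blocks)}"
        )
    return reduction_blocks[0]


# Assumed block layout for plain dense: gemm is second to last.
plain = [
    Block("pad_a", ["spatial", "spatial"]),
    Block("pad_b", ["spatial", "spatial"]),
    Block("gemm", ["spatial", "spatial", "reduce"]),
    Block("unpad", ["spatial", "spatial"]),
]

# Assumed layout for dense fused with bias: one more block follows.
fused = plain + [Block("bias_add", ["spatial", "spatial"])]

positional_plain = plain[-2].name   # positional lookup on the plain layout
positional_fused = fused[-2].name   # positional lookup on the fused layout
searched_plain = find_gemm_block(plain).name
searched_fused = find_gemm_block(fused).name
```

With these made-up layouts, the positional lookup picks a different block in the fused case, whereas the search picks `gemm` in both. The sketch raises an error when zero or several reduction blocks are present, which is a design choice of the sketch and not a statement about what the commit does.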

cc @ekalda @Anndrey24

lhutton1 commented 1 month ago

Yep, exactly. Functionality-wise, nothing changes. Happy to adjust the commit message to reflect what is being tested.