apache / tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators
https://tvm.apache.org/
Apache License 2.0

[SME][TOPI] Add conv2d NHWC SME fp16->fp32 schedule #17048

Closed: Anndrey24 closed this 1 month ago

Anndrey24 commented 1 month ago

This commit extends the SME conv2d NHWC schedule to support convolutions with float16 inputs (data and kernel) and a float32 output, using the tensor intrinsics added in #16981.
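
For readers unfamiliar with the mixed-precision pattern, here is a minimal NumPy sketch of the semantics described above: float16 data and kernel go in, and a float32 result comes out. It is not the TVM schedule or the SME intrinsics. The function name `conv2d_nhwc_fp16_to_fp32`, the HWIO kernel layout, and the `stride`/`padding` parameters are assumptions made for illustration.

```python
import numpy as np


def conv2d_nhwc_fp16_to_fp32(data, kernel, stride=1, padding=0):
    """Reference conv2d (hypothetical helper): NHWC data, HWIO kernel (assumed),
    float16 inputs, float32 output."""
    assert data.dtype == np.float16 and kernel.dtype == np.float16
    n, h, w, c_in = data.shape
    kh, kw, k_in, c_out = kernel.shape
    assert c_in == k_in, "channel mismatch between data and kernel"

    # Zero-pad the spatial dimensions only.
    padded = np.pad(
        data, ((0, 0), (padding, padding), (padding, padding), (0, 0))
    )
    out_h = (h + 2 * padding - kh) // stride + 1
    out_w = (w + 2 * padding - kw) // stride + 1
    out = np.zeros((n, out_h, out_w, c_out), dtype=np.float32)

    # Flatten the kernel to a (KH*KW*C_in, C_out) matrix and widen it to
    # float32 so that products and sums are computed in float32.
    kmat = kernel.reshape(kh * kw * c_in, c_out).astype(np.float32)

    for y in range(out_h):
        for x in range(out_w):
            ys, xs = y * stride, x * stride
            patch = padded[:, ys:ys + kh, xs:xs + kw, :]
            # Each patch becomes one row of an im2col-style matrix.
            pmat = patch.reshape(n, -1).astype(np.float32)
            out[:, y, x, :] = pmat @ kmat
    return out


data = np.ones((1, 4, 4, 2), dtype=np.float16)
kernel = np.ones((3, 3, 2, 3), dtype=np.float16)
out = conv2d_nhwc_fp16_to_fp32(data, kernel)
print(out.shape, out.dtype)
```

Widening to float32 before the multiply-accumulate matters because float16 has only an 11-bit significand, so long reductions lose precision quickly if the running sum is kept in float16.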

cc @ekalda @lhutton1

ekalda commented 1 month ago

Thanks @Anndrey24 and @lhutton1!