Closed daoxian closed 1 year ago
Is there any plan to support the NC4HW4 layout? This layout can noticeably improve the convolution operator's performance, so why doesn't Compute Library support it?
Hi @daoxian
ACL aligns with the data types found in major APIs such as TFLite and NNAPI.
Could you please point to a model or use case where NC4HW4 brings a considerable improvement?
https://arxiv.org/pdf/2002.12418.pdf
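For anyone following the thread who hasn't seen the layout: as I understand it, NC4HW4 groups channels into blocks of four and stores the four channels of a block contiguously for each pixel, so the packed shape is [N, ceil(C/4), H, W, 4], with zero padding when C is not a multiple of 4. Below is a minimal pure-Python sketch of the repacking, just to illustrate the indexing. The function name `nchw_to_nc4hw4` is made up for this example and is not part of Compute Library or any other API.

```python
def nchw_to_nc4hw4(data, n, c, h, w):
    """Repack a flat NCHW list into a flat NC4HW4 list (hypothetical helper).

    Channels are grouped into blocks of 4; missing channels in the last
    block are zero-padded.
    """
    c4 = (c + 3) // 4  # number of 4-channel blocks, rounded up
    out = [0] * (n * c4 * h * w * 4)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    # Offset in the source NCHW buffer
                    src = ((ni * c + ci) * h + hi) * w + wi
                    # Offset in the packed buffer: [n][c // 4][h][w][c % 4]
                    dst = (((ni * c4 + ci // 4) * h + hi) * w + wi) * 4 + ci % 4
                    out[dst] = data[src]
    return out


# 1 image, 5 channels, 1x2 pixels: channel k holds values [2k, 2k + 1]
packed = nchw_to_nc4hw4(list(range(10)), n=1, c=5, h=1, w=2)
```

The point of the packing is that the four channel values of one pixel sit next to each other, so a single 128-bit vector load can fetch them. Whether that translates into a measurable gain depends on the kernel and the model.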