oneapi-src / oneDNN

oneAPI Deep Neural Network Library (oneDNN)
https://uxlfoundation.org
Apache License 2.0

[ARM] Support fp16 data type in JIT Reorder kernel #2185

Open dmitry-gorokhov opened 1 month ago

dmitry-gorokhov commented 1 month ago

Summary

The request is to support the fp16 data type in the jit_uni_reorder kernel on aarch64 hardware.

Problem statement

Currently, only the fp32 and bf16 floating-point data types are supported in the optimized Reorder implementation on aarch64 hardware. An attempt to reorder memory with the fp16 data type falls back to the reference implementation, which can be several times slower than the jitted code. Various frameworks use fp16 as the default execution type on ARM hardware. This creates demand for a highly optimized fp16 reorder to speed up both model compilation/preparation time (mostly by optimizing the Conv/Matmul weights reorder to a blocked format) and inference time (most models are mixed precision and require multiple fp32<->fp16 and fp16<->u8/i8 conversions).

Preferred solution

Extend the jit_uni_reorder kernel with the fp16 data type to support fp32<->fp16 and fp16<->u8/i8 conversions.
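
For illustration, the element-wise conversions a jitted fp16 reorder would need to reproduce can be sketched in scalar C++. This is not oneDNN code: the function names (`float_to_half`, `half_to_float`, `half_to_u8_saturate`, `reorder_f32_to_f16`) are hypothetical, fp16 values are stored as raw `uint16_t` bit patterns, and round-to-nearest-even with saturation on the integer path is an assumption about the desired semantics.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

// fp32 -> fp16 (IEEE 754 binary16), round-to-nearest-even.
inline uint16_t float_to_half(float f) {
    uint32_t x;
    std::memcpy(&x, &f, sizeof(x));
    const uint16_t sign = static_cast<uint16_t>((x >> 16) & 0x8000u);
    const uint32_t abs = x & 0x7FFFFFFFu;

    if (abs > 0x7F800000u) return static_cast<uint16_t>(sign | 0x7E00u); // NaN
    if (abs >= 0x47800000u) return static_cast<uint16_t>(sign | 0x7C00u); // >= 2^16 or inf
    if (abs < 0x38800000u) {
        // Below 2^-14: the result is an fp16 subnormal (or zero).
        // Scale by 2^24 and round to an integer mantissa in [0, 1024];
        // 1024 is the bit pattern of the smallest fp16 normal.
        const float scaled = std::ldexp(std::fabs(f), 24);
        return static_cast<uint16_t>(sign | static_cast<uint16_t>(std::nearbyint(scaled)));
    }
    // Normal range: round the 13 dropped mantissa bits (ties to even),
    // then rebias the exponent from 127 to 15.
    uint32_t rounded = abs + 0xFFFu + ((abs >> 13) & 1u);
    rounded -= 0x38000000u;
    return static_cast<uint16_t>(sign | (rounded >> 13));
}

// fp16 -> fp32; every fp16 value is exactly representable in fp32.
inline float half_to_float(uint16_t h) {
    const int exp = (h >> 10) & 0x1F;
    const int man = h & 0x3FF;
    float v;
    if (exp == 0) {
        v = std::ldexp(static_cast<float>(man), -24); // zero or subnormal
    } else if (exp == 31) {
        v = man ? std::numeric_limits<float>::quiet_NaN()
                : std::numeric_limits<float>::infinity();
    } else {
        v = std::ldexp(static_cast<float>(man | 0x400), exp - 25);
    }
    return (h & 0x8000u) ? -v : v;
}

// fp16 -> u8 with rounding and saturation.
// Assumption of this sketch: NaN maps to 0.
inline uint8_t half_to_u8_saturate(uint16_t h) {
    float v = half_to_float(h);
    if (std::isnan(v)) return 0;
    v = std::nearbyint(v); // default rounding mode: ties to even
    if (v <= 0.0f) return 0;
    if (v >= 255.0f) return 255;
    return static_cast<uint8_t>(v);
}

// Plain (non-blocked) fp32 -> fp16 reorder over a flat buffer.
inline std::vector<uint16_t> reorder_f32_to_f16(const std::vector<float> &src) {
    std::vector<uint16_t> dst(src.size());
    for (size_t i = 0; i < src.size(); ++i) dst[i] = float_to_half(src[i]);
    return dst;
}
```

A scalar reference of this kind could serve as a correctness baseline when validating a vectorized kernel: for example, `reorder_f32_to_f16({1.0f, -2.0f})` yields the bit patterns `0x3C00` and `0xC000`. A blocked-format reorder would apply the same per-element conversion while also permuting indices.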