microsoft / antares

Antares: an automatic engine for multi-platform kernel generation and optimization. Supporting CPU, CUDA, ROCm, DirectX12, GraphCore, SYCL for CPU/GPU, OpenCL for AMD/NVIDIA, Android CPU/GPU backends.

append __habs for float16 abs in cuda backend #362

Closed LeiWang1999 closed 1 year ago

LeiWang1999 commented 1 year ago

Reference: https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH____HALF__FUNCTIONS.html

For a half-precision abs operation we should specifically use __habs. This PR has been tested and passes under NNFusion codegen.

By the way, why don't we select the generated code based on the backend rather than only on the data type?
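
To make the question concrete, here is a rough sketch of what selecting on the backend and the data type together could look like. Everything in it (the `ABS_INTRINSICS` table, the `emit_abs` helper, the backend strings, and the ternary fallback) is hypothetical and is not taken from the Antares code. Only the intrinsic names `__habs`, `fabsf`, and `fabs` come from the CUDA math API.

```python
# Hypothetical lookup table, for illustration only (not Antares code).
# Keys are (backend, dtype) pairs; values are the function name to emit.
ABS_INTRINSICS = {
    ("cuda", "float16"): "__habs",  # half-precision abs from cuda_fp16.h
    ("cuda", "float32"): "fabsf",   # single-precision abs
    ("cuda", "float64"): "fabs",    # double-precision abs
}


def emit_abs(backend: str, dtype: str, operand: str) -> str:
    """Return the source text for abs(operand) on the given backend and dtype."""
    name = ABS_INTRINSICS.get((backend, dtype))
    if name is None:
        # Assumed generic fallback for pairs without a dedicated intrinsic.
        return f"(({operand}) < 0 ? -({operand}) : ({operand}))"
    return f"{name}({operand})"


print(emit_abs("cuda", "float16", "x"))
print(emit_abs("rocm", "float16", "x"))
```

With a table like this, adding a backend-specific intrinsic would be a one-line change that leaves the other backends untouched.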

LeiWang1999 commented 1 year ago

It seems there is another solution, so I'm closing this.