Incompatibilities between bfloat16 types

Snektron commented 7 months ago

There are (for some reason) two bfloat16 types in HIP: __hip_bfloat16 and hip_bfloat16. The former is a C type, whereas the latter is a C++ type.

Judging from hipify, hip_bfloat16 is the preferred version here: https://github.com/ROCm-Developer-Tools/HIPIFY/blob/0e353a6af8b4d4b1d63aa7706b297fb3a33a7ef0/src/CUDA2HIP_Device_types.cpp#L33

While I can understand that it's now hard to change the confusing headers (CUDA uses cuda_bf16.h, while HIP uses hip_bfloat16.h for the preferred type, and hip_bf16.h is already taken by the 'bad' bfloat16), my main problem is that overloads are missing for hip_bfloat16. Specifically, hip_bfloat16 has no overloads for any of the built-ins that operate on bfloat16 in CUDA, for example the comparison functions defined here: https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH____BFLOAT16__COMPARISON.html and the conversion and data movement functions here: https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH____BFLOAT16__MISC.html

This makes it quite annoying to write code that must be ported between CUDA and HIP.
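To illustrate the kind of shim this forces on portable code, here is a minimal sketch. It assumes only that the bfloat16 type converts to and from float. The names bf16_standin and bf16_compat are hypothetical, invented for this example: the stand-in replaces hip_bfloat16 so the snippet compiles without HIP, and hgt, heq, hisnan and hmax mirror CUDA's __hgt, __heq, __hisnan and __hmax without the reserved leading underscores. Real device code would also need __host__ __device__ qualifiers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical stand-in for hip_bfloat16 (NOT the real type): it keeps the
// upper 16 bits of an IEEE-754 float. Truncates instead of rounding, for brevity.
struct bf16_standin {
    std::uint16_t data;

    explicit bf16_standin(float f) {
        std::uint32_t bits;
        std::memcpy(&bits, &f, sizeof bits);
        data = static_cast<std::uint16_t>(bits >> 16);
    }

    operator float() const {
        std::uint32_t bits = static_cast<std::uint32_t>(data) << 16;
        float f;
        std::memcpy(&f, &bits, sizeof f);
        return f;
    }
};

// Hypothetical compatibility layer: each function widens to float, operates
// there, and (where needed) returns one of the original operands.
namespace bf16_compat {

template <typename T>
bool hgt(T a, T b) { return static_cast<float>(a) > static_cast<float>(b); }

template <typename T>
bool heq(T a, T b) { return static_cast<float>(a) == static_cast<float>(b); }

template <typename T>
bool hisnan(T a) {
    const float f = static_cast<float>(a);
    return f != f;  // NaN is the only value that compares unequal to itself
}

// If exactly one input is NaN, the other input is returned.
template <typename T>
T hmax(T a, T b) {
    if (hisnan(a)) return b;
    if (hisnan(b)) return a;
    return hgt(a, b) ? a : b;
}

}  // namespace bf16_compat

// Usage: 1.5 and 2.0 are exactly representable in bfloat16.
inline void example_usage() {
    const bf16_standin lo(1.5f), hi(2.0f);
    assert(bf16_compat::hgt(hi, lo));
    assert(static_cast<float>(bf16_compat::hmax(lo, hi)) == 2.0f);
    assert(bf16_compat::hisnan(bf16_standin(std::numeric_limits<float>::quiet_NaN())));
}
```

Because the helpers are templates over any float-convertible type, the same wrapper names could in principle be instantiated for either vendor's type, but every project ends up maintaining such a layer itself.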

In CUDA, this is handled by having cuda_bf16.hpp and cuda_bf16.h operate on the same type, rather than on two different, incompatible ones.

cjatin commented 7 months ago

There is an internal PR to add the operators; hopefully it will be present in the next release.

cjatin commented 7 months ago

https://github.com/ROCm-Developer-Tools/clr/commit/86bd518981b364c138f9901b28a529899d8654f3

ROCm / clr

Incompatibilities between bfloat16 types #24