ROCm / clr

MIT License
104 stars 50 forks source link

[amd_hip_fp16.h] Remove anonymous namespace from __half functions #90

Closed danzimm closed 3 months ago

danzimm commented 3 months ago

When including both amd_hip_fp16.h and amd_hip_bf16.h in the same file and trying to use functions in the anonymous namespace inside amd_hip_fp16.h which are shadowed by functions in the global namespace of amd_hip_bf16.h, e.g. hlog(__half) comes from the former and is shadowed by hlog(const __hip_bfloat16) from the latter, we're met with a compiler error of the form (the top file is Half.h):

eigen/include/Eigen/src/Core/arch/Default/Half.h:530:22: error: no viable conversion from 'const half' to '__hip_bfloat16'
  530 |   return half(::hlog(a));
      |                      ^
rocm/6.0.1/include/hip/amd_detail/amd_hip_bf16.h:112:8: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'const half' to 'const __hip_bfloat16 &' for 1st argument
  112 | struct __hip_bfloat16 {
      |        ^~~~~~~~~~~~~~
rocm/6.0.1/include/hip/amd_detail/amd_hip_bf16.h:112:8: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'const half' to '__hip_bfloat16 &&' for 1st argument
  112 | struct __hip_bfloat16 {
      |        ^~~~~~~~~~~~~~
rocm/6.0.1/include/hip/amd_detail/amd_hip_bf16.h:790:53: note: passing argument to parameter 'h' here
  790 | __device__ __hip_bfloat16 hlog(const __hip_bfloat16 h) {
      |                                                     ^

This is due to the fact that the global definition hides the definition in the anonymous namespace. Looking at the git history of this file it doesn't appear there's any specific reason these functions need to be in an anonymous namespace so I went ahead and removed it so that e.g. hlog(__half) participates in overload resolution.


This patch works when using prebuilts from RPMs & patching the headers directly. I am unable to build from source to run tests locally. Let me what verification you want me to run: when I tried following the README or https://rocm.docs.amd.com/projects/HIP/en/latest/install/build.html to build & run tests I run into various errors that no docs refer to (it seems I need to clone a few other repos & possibly build another project first, but I can't quite figure out the right incantations-- I've gotten as far as a cmake error complaining about not finding amd_comgr)

cjatin commented 3 months ago

I think we made some changes to fix the ODR violation in bfloat16 header.

Can you please try that with latest HIP/ROCM.

 #include <Eigen/Core>
 #include <amd_detail/amd_hip_fp16.h>
 #include <amd_detail/amd_hip_bf16.h>

 __global__ void kernel(float* res1, float* res2) {
   auto hf1 = __float2half(2.718f);
   auto bf1 = __float2bfloat16(2.718f);
   *res1 = __half2float(hlog(hf1));
   *res2 = __bfloat162float(hlog(bf1));
 }

It seems to be working fine.

danzimm commented 3 months ago

I reproduced this with ROCm 6.2 RC3-- did you have a different version in mind? Also just to confirm, what were your compiler flags for the above file? It was compiled with hipcc right? I think we repro'd with just clang.

cjatin commented 3 months ago

Can you share your minimum reproducer here.

Regarding 6.2, It should have the fix, its just your original error, had the old hip_bf16 changes.

rocm/6.0.1/include/hip/amd_detail/amd_hip_bf16.h:790:53: note: passing argument to parameter 'h' here
  790 | __device__ __hip_bfloat16 hlog(const __hip_bfloat16 h) {

The definition of hlog is:

__BF16_DEVICE_STATIC__ __hip_bfloat16 hlog(const __hip_bfloat16 h)

Regarding compilation flags, I just used -Wall

danzimm commented 3 months ago

Yes! Give me a few hours-- we got RC4 today so I'll use that to repro the issue. Thanks for the quick response here.

danzimm commented 3 months ago

This is my third major edit, apologies-- I had myself confused a few times with my repro. I now have a minimal repro with eigen 3.3.90 & rocm 6.2 rc4:

#include <hip/hip_bf16.h>
#include <Eigen/Core>

and I get the following error:

In file included from repro.hip:2:
In file included from build/eigen/include/Eigen/Core:173:
build/eigen/include/Eigen/src/Core/arch/Default/Half.h:530:22: error: no viable conversion from 'const half' to '__hip_bfloat16'
  530 |   return half(::hlog(a));
      |                      ^
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:172:36: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'const half' to 'const __hip_bfloat16 &' for 1st argument
  172 | struct __attribute__((aligned(2))) __hip_bfloat16 {
      |                                    ^~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:172:36: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'const half' to '__hip_bfloat16 &&' for 1st argument
  172 | struct __attribute__((aligned(2))) __hip_bfloat16 {
      |                                    ^~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:265:24: note: candidate constructor not viable: no known conversion from 'const half' to 'unsigned int' for 1st argument
  265 |   __BF16_HOST_DEVICE__ __hip_bfloat16(unsigned int val)
      |                        ^              ~~~~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:269:24: note: candidate constructor not viable: no known conversion from 'const half' to 'int' for 1st argument
  269 |   __BF16_HOST_DEVICE__ __hip_bfloat16(int val)
      |                        ^              ~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:273:24: note: candidate constructor not viable: no known conversion from 'const half' to 'unsigned short' for 1st argument
  273 |   __BF16_HOST_DEVICE__ __hip_bfloat16(unsigned short val)
      |                        ^              ~~~~~~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:277:24: note: candidate constructor not viable: no known conversion from 'const half' to 'short' for 1st argument
  277 |   __BF16_HOST_DEVICE__ __hip_bfloat16(short val)
      |                        ^              ~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:281:24: note: candidate constructor not viable: no known conversion from 'const half' to 'const double' for 1st argument
  281 |   __BF16_HOST_DEVICE__ __hip_bfloat16(const double val) : __x(double_2_bfloatraw(val)) {}
      |                        ^              ~~~~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:284:24: note: candidate constructor not viable: no known conversion from 'const half' to 'const float' for 1st argument
  284 |   __BF16_HOST_DEVICE__ __hip_bfloat16(const float val) : __x(float_2_bfloatraw(val)) {}
      |                        ^              ~~~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:287:24: note: candidate constructor not viable: no known conversion from 'const half' to 'const __hip_bfloat16_raw &' for 1st argument
  287 |   __BF16_HOST_DEVICE__ __hip_bfloat16(const __hip_bfloat16_raw& val) : __x(val.x) {}
      |                        ^              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:1612:65: note: passing argument to parameter 'h' here
 1612 | __BF16_DEVICE_STATIC__ __hip_bfloat16 hlog(const __hip_bfloat16 h) {
      |                                                                 ^
In file included from buck-out/v2/gen/fbcode/2bd39149b8cbc674/scripts/danzimm/__repro_hipify_gen__/out/repro.hip:2:
In file included from build/eigen/include/Eigen/Core:173:
build/eigen/include/Eigen/src/Core/arch/Default/Half.h:675:29: error: no viable conversion from 'const Eigen::half' to '__hip_bfloat16'
  675 |   return Eigen::half(::hlog(a));
      |                             ^
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:172:36: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'const Eigen::half' to 'const __hip_bfloat16 &' for 1st argument
  172 | struct __attribute__((aligned(2))) __hip_bfloat16 {
      |                                    ^~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:172:36: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'const Eigen::half' to '__hip_bfloat16 &&' for 1st argument
  172 | struct __attribute__((aligned(2))) __hip_bfloat16 {
      |                                    ^~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:265:24: note: candidate constructor not viable: no known conversion from 'const Eigen::half' to 'unsigned int' for 1st argument
  265 |   __BF16_HOST_DEVICE__ __hip_bfloat16(unsigned int val)
      |                        ^              ~~~~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:269:24: note: candidate constructor not viable: no known conversion from 'const Eigen::half' to 'int' for 1st argument
  269 |   __BF16_HOST_DEVICE__ __hip_bfloat16(int val)
      |                        ^              ~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:273:24: note: candidate constructor not viable: no known conversion from 'const Eigen::half' to 'unsigned short' for 1st argument
  273 |   __BF16_HOST_DEVICE__ __hip_bfloat16(unsigned short val)
      |                        ^              ~~~~~~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:277:24: note: candidate constructor not viable: no known conversion from 'const Eigen::half' to 'short' for 1st argument
  277 |   __BF16_HOST_DEVICE__ __hip_bfloat16(short val)
      |                        ^              ~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:281:24: note: candidate constructor not viable: no known conversion from 'const Eigen::half' to 'const double' for 1st argument
  281 |   __BF16_HOST_DEVICE__ __hip_bfloat16(const double val) : __x(double_2_bfloatraw(val)) {}
      |                        ^              ~~~~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:284:24: note: candidate constructor not viable: no known conversion from 'const Eigen::half' to 'const float' for 1st argument
  284 |   __BF16_HOST_DEVICE__ __hip_bfloat16(const float val) : __x(float_2_bfloatraw(val)) {}
      |                        ^              ~~~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:287:24: note: candidate constructor not viable: no known conversion from 'const Eigen::half' to 'const __hip_bfloat16_raw &' for 1st argument
  287 |   __BF16_HOST_DEVICE__ __hip_bfloat16(const __hip_bfloat16_raw& val) : __x(val.x) {}
      |                        ^              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
build/rocm/6.2.0/include/hip/amd_detail/amd_hip_bf16.h:1612:65: note: passing argument to parameter 'h' here
 1612 | __BF16_DEVICE_STATIC__ __hip_bfloat16 hlog(const __hip_bfloat16 h) {
      |                                                                 ^
2 errors generated when compiling for gfx90a.

BTW I don't think I mentioned this from above-- I'm working on this for Meta

cjatin commented 3 months ago

Can you share your compile command as well.

The code is as follows?

#include <hip/hip_bf16.h>
#include <Eigen/Core>
int main() {}
cjatin commented 3 months ago

I think I can reproduce the issue, but having some trouble with my machine. I think I am missing some newer devicelibs.

Reinstalling latest version of rocm and trying, will update it here.

cjatin commented 3 months ago

I think a better way is to fix this in eigen lib itself.

hlog is the only function being referenced from global namespace.

other half functions like hsqrt or hexp are referenced without any ::

Raised https://gitlab.com/libeigen/eigen/-/merge_requests/1661/diffs to fix this on eigen, lets see what folks say on it.

danzimm commented 3 months ago

Sure, works for me (double checked with our repro). Thanks for putting up that patch. Going to go ahead and close this in hope that we can get that eigen patch upstreamed.