Hi @xinyi-li7 ,
hip_bfloat16 is a structure which can be constructed from a float or converted using helper functions: documentation
The other conversion functions are in the HIP Math documentation here
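For example, a minimal host-side sketch based on that header (the exact set of helpers may vary by HIP version):

```cpp
#include <hip/hip_bfloat16.h>

int main()
{
    hip_bfloat16 a(1.0f);       // explicit construction from a float
    float        f = float(a);  // conversion operator back to float

    // Static helper from the header that rounds a float to bfloat16
    hip_bfloat16 b = hip_bfloat16::round_to_bfloat16(0.1f);

    return (f == 1.0f && float(b) > 0.0f) ? 0 : 1;
}
```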
Cheers!
Hello @cgmillette,
Thank you for your previous guidance. With your assistance, I've successfully transitioned a bf16 CUDA program to HIP bf16. However, I'm facing some challenges as I attempt to convert a program from half (fp16) to float16_t.
In reference to the documentation you provided, it appears that float16_t is essentially _Float16. I managed to find a _Float16 example, and it looks like numerical values can be assigned directly to _Float16 without explicit conversion. Could you confirm whether my understanding is correct? Apologies for the trouble; the documentation on _Float16 seems a bit limited.
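The pattern in the example looked roughly like this (illustrative; assumes a compiler with native _Float16 support, such as hipcc/clang):

```cpp
int main()
{
    _Float16 x = 1.0f;   // assigned directly; float converts implicitly
    _Float16 y = x * x;  // arithmetic works on the native type
    float    f = y;      // converts back to float implicitly
    return (f == 1.0f) ? 0 : 1;
}
```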
While trying to compile a float16_t program, I encountered a compilation error:
```
rocwmma::fill_fragment(c_fragment, 0.0);
^~~~~~~~~~~~~~~~~~~~~~
mc_test_numerics-MI100-binary16.cpp:113:3: note: in instantiation of function template specialization 'wmma_ker<_Float16>' requested here
```
From the error message, it seems that the rocWMMA library does not provide a fill function for float16_t. However, the rocWMMA documentation suggests that we should be able to set the output (C/D) type to fp16.
Based on the README.md, it appears that you treat fp16 and half as distinct types. Upon reading this discussion, it appears that they are essentially the same, the only difference being that amd_hip_fp16.h needs to be included in order to use half. Could you please clarify this?
Your continued support would be greatly appreciated.
Hi @xinyi-li7
Yes, float16_t is natively _Float16. This type may be initialized through implicit conversion from float.
rocWMMA API functions all conform to type support listed in the README.
You have provided a compiler note rather than the actual error, which makes it difficult to guess the cause and offer guidance. One thing I did notice, though, is that the value passed to rocwmma::fill_fragment is a double (0.0), not a float (0.0f). I don't believe there is an implicit cast from double to float16_t.
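As an illustration, a minimal device-side sketch (the 16x16x16 block sizes and fragment setup here are placeholders, not taken from your code):

```cpp
#include <rocwmma/rocwmma.hpp>

using rocwmma::float16_t;

__global__ void fill_example()
{
    rocwmma::fragment<rocwmma::accumulator, 16, 16, 16, float16_t> c_frag;

    // A bare 0.0 literal is a double; constructing the value as
    // float16_t (from 0.0f) keeps the conversion well-defined
    rocwmma::fill_fragment(c_frag, float16_t(0.0f));
}
```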
rocWMMA provides samples, including an hgemm sample that uses float16_t here. Please have a look.
Cheers!
Hi @cgmillette,
Apologies for my delayed response, I was occupied with another project last week. I wanted to let you know that I've found a solution for the kernel filling issue.
Specifically, the fix involved using the correct data type in the template for the kernel function. There was a variable whose type was defined as `returntype`, which is determined at the point of calling the kernel. I've adjusted the `fill(0.0)` call to `fill(returntype(0.0))`, which successfully resolved the issue.
Interestingly, the initial `fill(0.0)` syntax used to work with `nvcuda::wmma`. It's something to keep in mind for future reference.
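In sketch form, the fix looked something like this (names simplified from my actual kernel; the block sizes are placeholders):

```cpp
#include <rocwmma/rocwmma.hpp>

// Before: rocwmma::fill_fragment(c_fragment, 0.0);  // 0.0 is a double
// After: construct the fill value from the template parameter, so the
// conversion is explicit for any instantiation, including _Float16
template <typename returntype>
__global__ void wmma_ker()
{
    rocwmma::fragment<rocwmma::accumulator, 16, 16, 16, returntype> c_fragment;
    rocwmma::fill_fragment(c_fragment, returntype(0.0));
}
```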
Thank you once again for your thoughtful and detailed response.
Hi, since rocWMMA provides separate data types like `rocwmma::bfloat16`, I wondered whether there are any functions that can convert a float to your rocWMMA half or bfloat16 types, like `__float2bfloat16` in NVIDIA CUDA? Thanks!