Closed: fxmarty closed this issue 6 months ago.
Same here. I'd like to know whether there is a solution.
Thanks for reporting this. Will look into it.
You are right, the issue is due to `half2` using `unsigned short x, y`; it should be `half x, y`.
At the moment HIP casts the `unsigned short` to half by value, which is causing this issue.
I will raise a PR to fix it, which might take some time to reach GitHub. Meanwhile, as a workaround you can manually cast it to `__half` before calling `__half2float`. Something like `__half2float(*reinterpret_cast<__half*>(&scale_half2.x));`
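The difference between converting the stored `unsigned short` by value and reinterpreting its bits can be modeled on the host. The following is a minimal sketch in plain C++, assuming the member holds an IEEE 754 binary16 bit pattern as described above. `RawHalf2`, `convert_by_value`, and `half_bits_to_float` are hypothetical names for illustration and are not part of the HIP or CUDA API.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a half2 whose members are raw unsigned shorts.
struct RawHalf2 {
    std::uint16_t x;
    std::uint16_t y;
};

// Models the problematic path: the integer VALUE is converted to float,
// so the bit pattern 0x3C00 becomes 15360.0f instead of 1.0f.
float convert_by_value(std::uint16_t raw) {
    return static_cast<float>(raw);
}

// Models the intended path: the 16 bits are decoded as IEEE 754 binary16
// (1 sign bit, 5 exponent bits with bias 15, 10 mantissa bits).
float half_bits_to_float(std::uint16_t bits) {
    const int sign = (bits >> 15) & 0x1;
    const int exponent = (bits >> 10) & 0x1F;
    const int mantissa = bits & 0x3FF;

    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24
        magnitude = std::ldexp(static_cast<float>(mantissa), -24);
    } else if (exponent == 0x1F) {
        // Infinity or NaN
        magnitude = (mantissa == 0) ? std::numeric_limits<float>::infinity()
                                    : std::numeric_limits<float>::quiet_NaN();
    } else {
        // Normal: (1024 + mantissa) * 2^(exponent - 25)
        magnitude = std::ldexp(static_cast<float>(mantissa + 1024), exponent - 25);
    }
    return sign ? -magnitude : magnitude;
}
```

For `RawHalf2 scale{0x3C00, 0x4000}` (the bit patterns of 1.0 and 2.0), `convert_by_value(scale.x)` yields 15360.0f, while `half_bits_to_float(scale.x)` yields 1.0f. The `reinterpret_cast<__half*>` workaround takes the second path on the device.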
Right. For now (I'm not sure this will be fixed soon, because of backward compatibility), `reinterpret_cast<__half*>` is the only solution already supported by hipify-clang (#801), but not by hipify-perl.
@fxmarty Can you please test with the latest ROCm 6.1.0 (HIP 6.1)? If resolved, please close the ticket. Thanks!
@ppanchad-amd Just tested, this is fixed. Thank you.
Hi,
I have a kernel where `__half2float` behaves differently with HIP vs CUDA.
Reproduce with
Use this CUDA kernel:
Running this kernel on an Nvidia GPU, we rightfully get:
Running then:
on an AMD machine (here MI250), we get:
which is obviously wrong.
To me, the issue stems from the definition of the `__half2` struct in ROCm's `include/hip/amd_detail/amd_hip_fp16.h`:
Compare to the definition in CUDA (though the `__CUDA_ALIGN__(4)` is obscure):
Replacing the call
by
`scale_back = (float)scale_half2.data.x;`
or by
`scale_back = __low2float(scale_half2);`
solves the issue. But I believe this is a bug, given that the CUDA kernel and HIP kernel behave differently, with no error raised. Maybe the hipifier should handle this case?
Related: https://github.com/ROCm-Developer-Tools/HIP/issues/3280
cc @ardfork @cjatin