Calling h2rcp() in ROCm 5.6 appears to reinterpret the half's underlying storage as a short, convert that integer to a float, and take the reciprocal of the result. Instead of 1/4.0 = 0.25, it produces 0.000057.
I tested this on gfx1010 in the Docker image rocm/dev-ubuntu-20.04:5.6-complete. Targeting gfx1030 gives an identical kernel disassembly, so the same error should occur there.
https://github.com/ROCm-Developer-Tools/hipamd/blob/4209792929ddf54ba9530813b7879cfdee42df14/include/hip/amd_detail/amd_hip_fp16.h#L1677-L1680
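
The observed value is consistent with that reading: 4.0 in IEEE 754 half precision is stored as 0x4400, which is 17408 as an integer, and 1/17408 is roughly 0.0000574. Below is a minimal host-side sketch of that arithmetic. It does not call HIP at all; `float_to_half_bits` and `rcp_of_storage_bits` are hypothetical helpers written for illustration, and the conversion is simplified (normal-range values only, mantissa truncated) on the assumption that this is enough to reproduce the 4.0 case.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Encode a float as IEEE 754 binary16 storage bits.
// Simplified for illustration: only values in half's normal range are
// supported, and the mantissa is truncated rather than rounded.
static std::uint16_t float_to_half_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // read the float's bit pattern
    std::uint32_t sign = (u >> 16) & 0x8000u;
    // Rebias the exponent from float (127) to half (15).
    std::int32_t exp = static_cast<std::int32_t>((u >> 23) & 0xFFu) - 127 + 15;
    assert(exp > 0 && exp < 31);  // normal-range inputs only
    std::uint32_t mant = (u >> 13) & 0x3FFu;  // keep the top 10 mantissa bits
    return static_cast<std::uint16_t>(
        sign | (static_cast<std::uint32_t>(exp) << 10) | mant);
}

// Model of the suspected behaviour (an assumption, not the actual header
// code): treat the half's storage bits as a short, convert that integer
// to float, and take the reciprocal.
static float rcp_of_storage_bits(float x) {
    std::int16_t raw = static_cast<std::int16_t>(float_to_half_bits(x));
    return 1.0f / static_cast<float>(raw);
}
```

With these helpers, `float_to_half_bits(4.0f)` yields 0x4400 and `rcp_of_storage_bits(4.0f)` yields about 5.74e-5, which matches the 0.000057 printed by the kernel rather than the expected 0.25.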