cycheng closed this pull request 5 days ago
@cycheng there is a related fix #1980 currently under review. Any chance that solves the issue for your use case too?
@svenvh, I am afraid not, because we see the mismatch in:
f = cl_half_to_float(cl_half_from_float(f, half_rounding));
The inputs are 0xD6F0 (-111.000000) and 0x3B91 (0.945801):
f = -111.0 * 0.945801
RTE hw produces: -105.000000
RTZ hw produces: -104.983887
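For readers following along, the round trip above can be reproduced in software. This is a minimal sketch, not the CTS code: the `sketch_*` names are hypothetical stand-ins for `cl_half_from_float`, `cl_half_to_float`, and the `CL_HALF_RTE` / `CL_HALF_RTZ` modes, and the conversion handles only zero and normal half values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for CL_HALF_RTE / CL_HALF_RTZ. */
typedef enum { SKETCH_RTE, SKETCH_RTZ } sketch_rounding;

/* Decode a binary16 bit pattern to float.
   Sketch only: subnormals are flushed to zero, and infinities and
   NaNs are not handled. */
float sketch_half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t man = h & 0x3FFu;
    uint32_t bits = sign;
    float f;

    if (exp != 0)
        bits |= ((exp + 112u) << 23) | (man << 13); /* rebias 15 -> 127 */
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Encode a float as binary16 with the requested rounding mode.
   Sketch only: the input must lie in the normal half range. */
uint16_t sketch_half_from_float(float f, sketch_rounding mode)
{
    uint32_t bits, exp, man, half, rest;

    memcpy(&bits, &f, sizeof bits);
    exp = (bits >> 23) & 0xFFu;
    man = bits & 0x7FFFFFu;
    assert(exp >= 113u && exp <= 142u); /* normal half range only */

    half = ((exp - 112u) << 10) | (man >> 13);
    rest = man & 0x1FFFu; /* the 13 bits that do not fit in a half */

    /* RTZ truncates the magnitude. RTE rounds up when the discarded
       bits exceed half an ulp, or equal it and the result is odd. */
    if (mode == SKETCH_RTE
        && (rest > 0x1000u || (rest == 0x1000u && (half & 1u))))
        half++;

    return (uint16_t)(((bits >> 16) & 0x8000u) | half);
}

/* Reference result for a half multiply: multiply in float, then
   round trip through half with the given mode. */
float sketch_reference_mul(uint16_t a, uint16_t b, sketch_rounding mode)
{
    float f = sketch_half_to_float(a) * sketch_half_to_float(b);
    return sketch_half_to_float(sketch_half_from_float(f, mode));
}
```

With the inputs 0xD6F0 and 0x3B91, the unrounded float product is -104.98388671875, which prints as -104.983887. In this sketch the RTE round trip gives -105.0 and the RTZ round trip gives -104.9375, so a reference computed with one mode cannot match hardware that rounds with the other.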
But I am okay to drop this pull request if https://github.com/KhronosGroup/OpenCL-CTS/pull/1980 can merge my changes.
Not sure what @hvdijk thinks about that, but we can also first land #1980 without any changes, and then you can rebase your changes on top. They seem to be separate issues anyway.
I agree, they look like separate issues to me too, but I'm happy with whatever you prefer: update my PR to include this, update this PR to include mine, or keep them separate. Just let me know if I need to do anything.
@hvdijk, @svenvh, I am happy to close this pull request and let @hvdijk merge this change into #1980. :)
The verification code assumes the hardware uses CL_HALF_RTE, which causes mismatched computation results when the hardware uses RTZ. Fix it to use the hardware's default rounding mode.
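One way the rounding mode could be chosen is sketched below. This is an illustration, not the actual diff in this pull request. It assumes the caller has already queried `CL_DEVICE_HALF_FP_CONFIG` with `clGetDeviceInfo` and passes the resulting bitfield in. The `SKETCH_*` names are hypothetical local mirrors of `CL_FP_ROUND_TO_NEAREST`, `CL_FP_ROUND_TO_ZERO`, `CL_HALF_RTE`, and `CL_HALF_RTZ`, defined here so the example builds without the OpenCL headers.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirrors of two cl_device_fp_config bits from CL/cl.h
   (assumption: real code would use the header's constants). */
#define SKETCH_FP_ROUND_TO_NEAREST (1u << 2)
#define SKETCH_FP_ROUND_TO_ZERO (1u << 3)

/* Hypothetical stand-ins for CL_HALF_RTE / CL_HALF_RTZ. */
typedef enum { SKETCH_HALF_RTE, SKETCH_HALF_RTZ } sketch_half_rounding;

/* Pick the rounding mode for the reference computation from the
   device's half-precision capabilities: prefer round-to-nearest-even
   when the device reports it, otherwise fall back to round-to-zero. */
sketch_half_rounding sketch_pick_half_rounding(uint64_t half_fp_config)
{
    /* Sketch precondition: the device reports at least one of the two
       rounding modes. */
    assert(half_fp_config
           & (SKETCH_FP_ROUND_TO_NEAREST | SKETCH_FP_ROUND_TO_ZERO));

    return (half_fp_config & SKETCH_FP_ROUND_TO_NEAREST)
        ? SKETCH_HALF_RTE
        : SKETCH_HALF_RTZ;
}
```

The returned mode would then be passed as `half_rounding` to `cl_half_from_float` in the verification code, so the reference result is rounded the same way as the device result.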