KhronosGroup / OpenCL-CTS

The OpenCL Conformance Tests
Apache License 2.0
178 stars 187 forks source link

gl: enable cl_khr_fp16 for image write tests #1974

Closed lakshmih closed 2 weeks ago

svenvh commented 1 month ago

I don't understand why these tests should require cl_khr_fp16. Yes, they have a pointer-to-half, but this pointer is only used as an input to vload_half, which is part of the core specification and not part of cl_khr_fp16. Is there some other part of the extension that these tests are using?

I was confused about this as well initially, but the specification for halfn has a footnote that states cl_khr_fp16 must be enabled; i.e., the test cannot use half4 without also enabling cl_khr_fp16. The scalar half type does not have such a footnote.

An alternative (better?) fix could be to declare source as half *, as the test is casting source to half * anyway.

bashbaug commented 1 month ago

An alternative (better?) fix could be to declare source as half *, as the test is casting source to half * anyway.

My gut feeling is that this would be slightly better, but I honestly don't know what these half tests are trying to do. There seems to be an assumption in the tests that using a CL_HALF_FLOAT channel data type requires the input data to be half also, but this isn't the case and we could use float data just as well (indeed, the tests end up calling the same write_imagef for these images).

So, I guess my preference would be to tear out all of this unnecessary half stuff and just use float instead: this would simplify the tests and it would also allow the CL_HALF_FLOAT channel data type - which is a core channel data type and not part of the cl_khr_fp16 extension! - to be tested on more devices. I can certainly appreciate that this is bigger change, though.

Maybe we should do this:

  1. Do whatever it takes to get these tests unblocked and "correct", at least for the current test assumptions. Since it looks like they're already checking for and requiring cl_khr_fp16 for these cases, and the kernels are using half4, I'd be fine with this PR as-is.
  2. File a CTS issue to eliminate the dependency on cl_khr_fp16 and ideally fix the incorrect test assumptions. Good: remove the checks for cl_khr_fp16, the changes in this PR to enable cl_khr_fp16, and use half* instead of half4*. Better: remove all of the half kernels and use of the half type, also.
  3. File a spec issue to clarify which half types require cl_khr_fp16 and which do not. I think you're right and half4 requires cl_khr_fp16, but this could be described more clearly, especially now that the extensions are described in the main spec.
bashbaug commented 1 month ago

FWIW, the latest upstream Clang isn't behaving the way I'd expect it to, especially for half4. See: https://godbolt.org/z/M67s8rfeK

svenvh commented 1 month ago

FWIW, the latest upstream Clang isn't behaving the way I'd expect it to, especially for half4. See: https://godbolt.org/z/M67s8rfeK

Nice find, from a quick look through the clang source code this must have been present for a long time. Assuming we want to diagnose this, feel free to raise an issue in https://github.com/llvm/llvm-project/issues and assign it to me.

lakshmih commented 1 month ago

Yes upstream clang seems to be compiling this successfully however the spec does seem to require cl_khr_fp16 for functions taking or returning half types as per https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html#vector-data-load-and-store-functions.

  | All functions taking or returning half types are supported only when the cl_khr_fp16 extension macro is supported.|

Can we merge this change and tackle any spec modifications separately. Thanks.

svenvh commented 1 week ago

FWIW, the latest upstream Clang isn't behaving the way I'd expect it to, especially for half4. See: https://godbolt.org/z/M67s8rfeK

I've opened https://github.com/llvm/llvm-project/pull/96640 to fix the issue of allowing halfn types while the cl_khr_fp16 extension isn't enabled.