KhronosGroup / OpenCL-Docs


Can an implementation diverge on device info query and compiler feature macro for atomic_scope_all_devices #1129

Open bcalidas opened 3 months ago

bcalidas commented 3 months ago

This issue is related to https://github.com/KhronosGroup/OpenCL-Docs/issues/1047. We have a specific question.

Can an implementation omit CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES from the clGetDeviceInfo query CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES but still define the compiler feature macro __opencl_c_atomic_scope_all_devices?

In this case the implementation also reports __opencl_c_atomic_scope_all_devices for the clGetDeviceInfo query CL_DEVICE_OPENCL_C_FEATURES.

From the spec: "When used on a fine-grained non-atomic SVM buffer, a coarse-grained SVM buffer, or a non-SVM buffer, operations parameterized with memory_scope_all_svm_devices will behave as if they were parameterized with memory_scope_device."
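The quoted fallback rule can be modeled as a small pure function. This is only an illustrative sketch: the enum and function names (`sketch_buffer_kind`, `sketch_scope`, `sketch_effective_scope`) are hypothetical and are not part of the OpenCL API or of OpenCL C.

```c
/* Hypothetical model of the fallback rule quoted from the spec.
   None of these names exist in OpenCL; they are for illustration only. */
typedef enum {
    SKETCH_BUF_NON_SVM,          /* regular (non-SVM) buffer            */
    SKETCH_BUF_SVM_COARSE,       /* coarse-grained SVM buffer           */
    SKETCH_BUF_SVM_FINE,         /* fine-grained SVM, no SVM atomics    */
    SKETCH_BUF_SVM_FINE_ATOMIC   /* fine-grained SVM with SVM atomics   */
} sketch_buffer_kind;

typedef enum {
    SKETCH_SCOPE_WORK_ITEM,
    SKETCH_SCOPE_WORK_GROUP,
    SKETCH_SCOPE_DEVICE,
    SKETCH_SCOPE_ALL_SVM_DEVICES
} sketch_scope;

/* Returns the scope an atomic operation effectively has on the given
   kind of buffer: all_svm_devices degrades to device scope unless the
   buffer is fine-grained SVM with atomics. Other scopes are unchanged. */
sketch_scope sketch_effective_scope(sketch_buffer_kind buf, sketch_scope requested)
{
    if (requested == SKETCH_SCOPE_ALL_SVM_DEVICES &&
        buf != SKETCH_BUF_SVM_FINE_ATOMIC) {
        return SKETCH_SCOPE_DEVICE;
    }
    return requested;
}
```

Under this model, a device that only supports coarse-grained SVM never observes anything stronger than device scope, which is why kernel behavior remains defined there.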

This would imply that it is OK for an implementation to report __opencl_c_atomic_scope_all_devices but not CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES, since the spec defines the behavior of kernels on such an implementation.

There are some conformance tests which check for consistency between feature macro and device info queries. These will need to be adapted pending the outcome of this discussion. It does bring up the larger question of when it is appropriate to expect these queries to match.
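For illustration, a consistency check of this kind might look roughly like the sketch below. It is not the actual conformance test code. To stay buildable without the OpenCL headers it uses stand-ins: `sketch_caps_t` stands for `cl_device_atomic_capabilities`, `SKETCH_ATOMIC_SCOPE_ALL_DEVICES` is assumed to mirror the header value of `CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES`, and the feature list is simplified to an array of strings rather than the `cl_name_version` entries that `CL_DEVICE_OPENCL_C_FEATURES` returns. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins so the sketch compiles without CL/cl.h. Real code would include
   the header and fill these from clGetDeviceInfo. The bit value is an
   assumption about the header, not taken from this thread. */
typedef unsigned long long sketch_caps_t;
#define SKETCH_ATOMIC_SCOPE_ALL_DEVICES (1ULL << 6)

/* True if `name` appears in the (simplified) feature-name list. */
static bool sketch_has_feature(const char *const *features, size_t count,
                               const char *name)
{
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(features[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* True when the runtime capability bit and the compiler feature agree,
   i.e. both are reported or neither is. */
bool sketch_all_devices_reports_agree(sketch_caps_t atomic_caps,
                                      const char *const *features,
                                      size_t count)
{
    bool runtime_bit = (atomic_caps & SKETCH_ATOMIC_SCOPE_ALL_DEVICES) != 0;
    bool compiler_feature = sketch_has_feature(
        features, count, "__opencl_c_atomic_scope_all_devices");
    return runtime_bit == compiler_feature;
}
```

The scenario in question (feature reported, capability bit absent) is exactly the case where such a strict equality check returns false.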

An additional consideration is how the compiler options -cl-std=CL3.0 and -cl-std=CL2.0 should affect compiler behavior in this case.

bashbaug commented 3 months ago

In the scenario above what are the SVM capabilities for this device? Specifically, does it support fine-grain SVM with atomics?

bcalidas commented 3 months ago

The device supports coarse grain SVM only.

bcalidas commented 3 months ago

We pass with the following configuration

For clGetDeviceInfo

1) Report CL_DEVICE_SVM_COARSE_GRAIN_BUFFER under CL_DEVICE_SVM_CAPABILITIES
2) Report CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES under CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES
3) Report __opencl_c_atomic_scope_all_devices under CL_DEVICE_OPENCL_C_FEATURES

With this combination, the meaning of __opencl_c_atomic_scope_all_devices is that the compiler supports the feature but that the behavior of the kernel could be different depending on the runtime feature support. If this is ok as per the intent of the spec, we will proceed with this solution. We could expand the underlying concept to https://github.com/KhronosGroup/OpenCL-Docs/issues/1047 as well.

It will be good to put up a spec PR against https://github.com/KhronosGroup/OpenCL-Docs/issues/1047 and confirm that the updated text reads well and aligns with existing implementations.

bashbaug commented 3 months ago

We pass with the following configuration [...]

This matches what we report for our coarse-grain SVM GPUs also, see e.g. https://opencl.gpuinfo.org/displayreport.php?id=2215.

lakshmih commented 2 months ago

The text for __opencl_c_atomic_scope_all_devices already accounts for the fallback behavior when fine-grained SVM is not supported. We should update the description of the runtime queries (CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES and CL_DEVICE_ATOMIC_FENCE_CAPABILITIES) to describe the same behavior: when used on a fine-grained non-atomic SVM buffer, a coarse-grained SVM buffer, or a non-SVM buffer, CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES would behave like CL_DEVICE_ATOMIC_SCOPE_DEVICE.