Open bcalidas opened 3 months ago
In the scenario above what are the SVM capabilities for this device? Specifically, does it support fine-grain SVM with atomics?
The device supports coarse grain SVM only.
We pass with the following configuration for clGetDeviceInfo:
1) Report CL_DEVICE_SVM_COARSE_GRAIN_BUFFER under CL_DEVICE_SVM_CAPABILITIES
2) Report CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES under CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES
3) Report __opencl_c_atomic_scope_all_devices under CL_DEVICE_OPENCL_C_FEATURES
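For reference, the configuration above can be modeled as plain bitfields. This is only a sketch: the bit values are copied locally from the Khronos cl.h header so it builds without an OpenCL SDK, and the `reported_caps` struct and both function names are hypothetical helpers, not part of the OpenCL API. Only the bits named in this thread are set; a real device reports additional order and scope bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in the Khronos cl.h header, repeated here so the
   sketch does not need an OpenCL SDK. */
#define CL_DEVICE_SVM_COARSE_GRAIN_BUFFER  (1 << 0)
#define CL_DEVICE_SVM_FINE_GRAIN_BUFFER    (1 << 1)
#define CL_DEVICE_SVM_FINE_GRAIN_SYSTEM    (1 << 2)
#define CL_DEVICE_SVM_ATOMICS              (1 << 3)
#define CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES (1 << 6)

/* Hypothetical container for the three query results discussed above. */
typedef struct {
    uint64_t svm_caps;            /* CL_DEVICE_SVM_CAPABILITIES */
    uint64_t atomic_memory_caps;  /* CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES */
    bool     all_devices_feature; /* __opencl_c_atomic_scope_all_devices
                                     listed in CL_DEVICE_OPENCL_C_FEATURES */
} reported_caps;

/* The coarse-grain-only configuration described in this comment. */
reported_caps coarse_grain_only_config(void)
{
    reported_caps c;
    c.svm_caps = CL_DEVICE_SVM_COARSE_GRAIN_BUFFER;
    c.atomic_memory_caps = CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES;
    c.all_devices_feature = true;
    return c;
}

/* True only if some fine-grain SVM type AND SVM atomics are both reported. */
bool supports_fine_grain_svm_atomics(uint64_t svm_caps)
{
    bool fine_grain = (svm_caps & (CL_DEVICE_SVM_FINE_GRAIN_BUFFER |
                                   CL_DEVICE_SVM_FINE_GRAIN_SYSTEM)) != 0;
    bool atomics = (svm_caps & CL_DEVICE_SVM_ATOMICS) != 0;
    return fine_grain && atomics;
}
```

With these helpers, `supports_fine_grain_svm_atomics(coarse_grain_only_config().svm_caps)` is false even though the all-devices scope bit and the feature are both reported, which is the combination under discussion.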
With this combination, __opencl_c_atomic_scope_all_devices means that the compiler supports the feature, but the behavior of a kernel may differ depending on runtime feature support. If this is OK per the intent of the spec, we will proceed with this solution. We could extend the same concept to https://github.com/KhronosGroup/OpenCL-Docs/issues/1047 as well.
It would be good to put up a spec PR for https://github.com/KhronosGroup/OpenCL-Docs/issues/1047 and confirm that the updated text reads well and aligns with existing implementations.
We pass with the following configuration [...]
This also matches what we report for our coarse-grain SVM GPUs; see, e.g., https://opencl.gpuinfo.org/displayreport.php?id=2215.
The text for __opencl_c_atomic_scope_all_devices already accounts for fallback behavior when fine-grained SVM is not supported. We should update the descriptions of the runtime queries (CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES and CL_DEVICE_ATOMIC_FENCE_CAPABILITIES) to specify the same behavior: when used on a fine-grained non-atomic SVM buffer, a coarse-grained SVM buffer, or a non-SVM buffer, CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES would behave like CL_DEVICE_ATOMIC_SCOPE_DEVICE.
This issue is related to https://github.com/KhronosGroup/OpenCL-Docs/issues/1047. We have a specific question.
Can an implementation not report CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES for the clGetDeviceInfo query CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES but still define the compiler feature macro __opencl_c_atomic_scope_all_devices?
In this case the implementation does report __opencl_c_atomic_scope_all_devices for the clGetDeviceInfo query CL_DEVICE_OPENCL_C_FEATURES.
From the spec: "When used on a fine-grained non-atomic SVM buffer, a coarse-grained SVM buffer, or a non-SVM buffer, operations parameterized with memory_scope_all_svm_devices will behave as if they were parameterized with memory_scope_device."
This would imply that it is ok for an implementation to report __opencl_c_atomic_scope_all_devices but not CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES since the behavior of kernels on such an implementation is defined in the spec.
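The fallback rule quoted from the spec can be written out as a small function. This is a minimal sketch of the rule as worded above, not how any implementation lowers atomics; the `buffer_kind` and `mem_scope` enums and the `effective_scope` name are hypothetical and do not exist in OpenCL.

```c
/* Hypothetical classification of the buffer an atomic operates on. */
typedef enum {
    BUFFER_NON_SVM,
    BUFFER_SVM_COARSE_GRAIN,
    BUFFER_SVM_FINE_GRAIN,         /* fine-grained, no SVM atomics */
    BUFFER_SVM_FINE_GRAIN_ATOMICS  /* fine-grained with SVM atomics */
} buffer_kind;

/* Hypothetical stand-ins for the OpenCL C memory_scope_* values. */
typedef enum {
    SCOPE_WORK_ITEM,
    SCOPE_WORK_GROUP,
    SCOPE_DEVICE,
    SCOPE_ALL_SVM_DEVICES
} mem_scope;

/* Scope an operation effectively has under the quoted rule: an
   all-SVM-devices request degrades to device scope unless the buffer is
   fine-grained SVM with atomics. Other scopes pass through unchanged. */
mem_scope effective_scope(mem_scope requested, buffer_kind kind)
{
    if (requested == SCOPE_ALL_SVM_DEVICES &&
        kind != BUFFER_SVM_FINE_GRAIN_ATOMICS) {
        return SCOPE_DEVICE;
    }
    return requested;
}
```

On a coarse-grain-only device no buffer is ever of the fine-grained atomic kind, so under this rule every all-devices request behaves as device scope, which is why kernel behavior stays defined there.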
Some conformance tests check for consistency between feature macros and device info queries. These will need to be adapted pending the outcome of this discussion. It does bring up the larger question of when it is appropriate to expect these queries to match.
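To make the two possible outcomes concrete, here is a sketch of two consistency rules a test could apply. Both functions are hypothetical and are not taken from the conformance suite. The strict rule assumes the feature and the capability bit must always agree; the relaxed rule is one possible reading of the question above, in which the feature may be reported without the bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit value as defined in the Khronos cl.h header, repeated here so the
   sketch does not need an OpenCL SDK. */
#define CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES (1 << 6)

/* Strict rule (assumption): the feature is reported if and only if the
   capability bit is reported. */
bool strict_consistent(bool feature_reported, uint64_t atomic_caps)
{
    bool bit = (atomic_caps & CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES) != 0;
    return feature_reported == bit;
}

/* Relaxed rule (hypothetical): the capability bit implies the feature,
   but the feature alone is acceptable because the spec defines the
   fallback to device scope. */
bool relaxed_consistent(bool feature_reported, uint64_t atomic_caps)
{
    bool bit = (atomic_caps & CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES) != 0;
    return !bit || feature_reported;
}
```

The implementation asked about here (feature reported, bit not reported) fails the strict rule and passes the relaxed one, so the choice between them is what this discussion needs to settle.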
An additional consideration is how the compiler options -cl-std=CL3.0 and -cl-std=CL2.0 should affect compiler behavior in this case.