Closed FranklandJack closed 1 year ago
There are actually more properties defined in PI: https://github.com/intel/llvm/blob/02bc9323e860f5672386db712071fa3277d493b0/sycl/include/sycl/detail/pi.h#L597-L606
Also, similar properties are defined in Level Zero: https://github.com/oneapi-src/level-zero/blob/4ed13f327d3389285592edcf7598ec3cb2bc712e/include/ze_api.h#LL4253C42-L4253C42
I think we should probably expose all of the following in UR:
// from L0
UR_MEM_ALLOC_FLAG_BIAS_CACHED,
UR_MEM_ALLOC_FLAG_BIAS_UNCACHED
// from PI/L0
UR_MEM_ALLOC_WRITE_COMBINED,
UR_MEM_ALLOC_INITIAL_PLACEMENT_DEVICE,
UR_MEM_ALLOC_INITIAL_PLACEMENT_HOST,
UR_MEM_ALLOC_DEVICE_READ_ONLY,
UR_MEM_USM_ALLOC_BUFFER_LOCATION // ?
L0 actually has separate flags for device and host allocations: ze_device_mem_alloc_flag_t
and ze_host_mem_alloc_flag_t
- I'm not sure if we want to do the same.
The last property I listed (UR_MEM_USM_ALLOC_BUFFER_LOCATION) seems to be something similar to device heaps we discussed in https://github.com/oneapi-src/unified-runtime/issues/176 (at least that's my understanding). Here's some description from llvm repo: https://github.com/intel/llvm/blob/3b2b9ca9aef77b27709dc03f75ba8cfd4d85ebf6/sycl/doc/extensions/supported/sycl_ext_intel_buffer_location.asciidoc
@igchor
L0 actually has separate flags for device and host allocations: ze_device_mem_alloc_flag_t and ze_host_mem_alloc_flag_t - I'm not sure if we want to do the same.
I think we could have the same set of flags, but applications to those both types above. Reason in L0 there's a ze_device_mem_alloc_flags_t and a ze_host_mem_alloc_flags_t is to be able to set in shared allocation different behavior whether the allocation is in host or the device.
The last property I listed (UR_MEM_USM_ALLOC_BUFFER_LOCATION) seems to be something similar to device heaps we discussed in https://github.com/oneapi-src/unified-runtime/issues/176 (at least that's my understanding). Here's some description from llvm repo:
The documentation for the CL counterpart is here https://registry.khronos.org/OpenCL/extensions/intel/cl_intel_mem_alloc_buffer_location.html#_overview
This is i think for FPGAs devices, not for GPUs, not sure the purpose. So maybe needed, but more for an extension? At least in L0 we dont have it.
I think it would be very wise to have an extensible mechanism to pass additional allocation properties as they are required. We shouldn't be constrained by a small bitfield.
I don't have a very strong opinion how to do this - it could be an array of property-value pairs like OpenCL, or some sort of structure chaining mechanism like Vulkan and Level Zero, or something else entirely.
@bashbaug Yes, I think a chaining mechanism would fit quite well. It's actually useful for solving https://github.com/oneapi-src/unified-runtime/issues/176 as well (as described here: https://github.com/oneapi-src/unified-runtime/issues/176#issuecomment-1405681528)
Issue
PI has the
pi_usm_mem_properties
bitfield which can take the following values:PI_MEM_ALLOC_DEVICE_READ_ONLY
PI_MEM_USM_ALLOC_BUFFER_LOCATION
used here and here respectively in the DPC++ runtime.Unified Runtime has the
ur_usm_mem_flags_t
for which there is no equivalent enumeration entries.Task
Add the missing enumeration entries to
ur_usm_mem_flags_t
to meet the requirements for implementing the DPC++ runtime on Unified Runtime.