Closed zhoub closed 1 year ago
At this point this extension is supported only on Windows. There are currently no plans to implement this extension on Linux.
Hi,
This is actually just as important as on the Windows platform. Many video applications require OpenCL/OpenGL interop, and our work is based on the Intel Up board. Please consider this. Thanks!
Number of platforms 3
Platform Name Intel(R) OpenCL HD Graphics
Platform Vendor Intel(R) Corporation
Platform Version OpenCL 1.2
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_depth_images cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_image2d_from_buffer cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_media_block_io cl_intel_driver_diagnostics cl_intel_device_side_avc_motion_estimation cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_khr_fp64 cl_intel_planar_yuv cl_intel_packed_yuv cl_intel_motion_estimation cl_intel_advanced_motion_estimation cl_intel_va_api_media_sharing
Platform Extensions function suffix INTEL
Platform Name Intel Gen OCL Driver
Platform Vendor Intel
Platform Version OpenCL 1.2 beignet 1.3 (git-5aba95a)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing
Platform Extensions function suffix Intel
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 18.0.5
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA
Platform Name Intel(R) OpenCL HD Graphics
Number of devices 1
Device Name Intel(R) Gen9 HD Graphics NEO
Device Vendor Intel(R) Corporation
Device Vendor ID 0x8086
Device Version OpenCL 1.2 NEO
Driver Version 19.14.12751
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile FULL_PROFILE
Max compute units 18
Max clock frequency 750MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple 32
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 1 / 1
half 8 / 8 (cl_khr_fp16)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (cl_khr_fp16)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Address bits 32, Little-Endian
Global memory size 3435970560 (3.2GiB)
Error Correction support No
Max memory allocation 1717985280 (1.6GiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type Read/Write
Global Memory cache size 131072
Global Memory cache line 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 107374080 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 4 bytes
Pitch alignment for 2D image buffers 4 bytes
Max 2D image size 16384x16384 pixels
Max 3D image size 16384x16384x2048 pixels
Max number of read image args 128
Max number of write image args 128
Local memory type Local
Local memory size 65536 (64KiB)
Max constant buffer size 1717985280 (1.6GiB)
Max number of constant args 8
Max size of kernel argument 1024
Queue properties
Out-of-order execution Yes
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 52ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
SPIR versions 1.2
printf() buffer size 4194304 (4MiB)
Built-in kernels block_motion_estimate_intel;block_advanced_motion_estimate_check_intel;block_advanced_motion_estimate_bidirectional_check_intel;
Motion Estimation accelerator version (Intel) 2
Device Available Yes
Compiler Available Yes
Linker Available Yes
Device Extensions cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_depth_images cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_image2d_from_buffer cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_media_block_io cl_intel_driver_diagnostics cl_intel_device_side_avc_motion_estimation cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_khr_fp64 cl_intel_planar_yuv cl_intel_packed_yuv cl_intel_motion_estimation cl_intel_advanced_motion_estimation cl_intel_va_api_media_sharing
Platform Name Intel Gen OCL Driver
Number of devices 1
Device Name Intel(R) HD Graphics Broxton 0
Device Vendor Intel
Device Vendor ID 0x8086
Device Version OpenCL 1.2 beignet 1.3 (git-5aba95a)
Driver Version 1.3
Device OpenCL C Version OpenCL C 1.2 beignet 1.3 (git-5aba95a)
Device Type GPU
Device Profile FULL_PROFILE
Max compute units 18
Max clock frequency 1000MHz
Device Partition (core)
Max number of sub-devices 1
Supported partition types None, None, None
Max work item dimensions 3
Max work item sizes 512x512x512
Max work group size 512
Preferred work group size multiple 16
Preferred / native vector sizes
char 16 / 8
short 8 / 8
int 4 / 4
long 2 / 2
half 0 / 8 (cl_khr_fp16)
float 4 / 4
double 0 / 2 (n/a)
Half-precision Floating-point support (cl_khr_fp16)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 32, Little-Endian
Global memory size 4102029312 (3.82GiB)
Error Correction support No
Max memory allocation 3076521984 (2.865GiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type Read/Write
Global Memory cache size 8192
Global Memory cache line 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 65536 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 4096 bytes
Pitch alignment for 2D image buffers 1 bytes
Max 2D image size 8192x8192 pixels
Max 3D image size 8192x8192x2048 pixels
Max number of read image args 128
Max number of write image args 8
Local memory type Local
Local memory size 65536 (64KiB)
Max constant buffer size 134217728 (128MiB)
Max number of constant args 8
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 80ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels Yes
SPIR versions 1.2
printf() buffer size 1048576 (1024KiB)
Built-in kernels __cl_copy_region_align4;__cl_copy_region_align16;__cl_cpy_region_unalign_same_offset;__cl_copy_region_unalign_dst_offset;__cl_copy_region_unalign_src_offset;__cl_copy_buffer_rect;__cl_copy_image_1d_to_1d;__cl_copy_image_2d_to_2d;__cl_copy_image_3d_to_2d;__cl_copy_image_2d_to_3d;__cl_copy_image_3d_to_3d;__cl_copy_image_2d_to_buffer;__cl_copy_image_3d_to_buffer;__cl_copy_buffer_to_image_2d;__cl_copy_buffer_to_image_3d;__cl_fill_region_unalign;__cl_fill_region_align2;__cl_fill_region_align4;__cl_fill_region_align8_2;__cl_fill_region_align8_4;__cl_fill_region_align8_8;__cl_fill_region_align8_16;__cl_fill_region_align128;__cl_fill_image_1d;__cl_fill_image_1d_array;__cl_fill_image_2d;__cl_fill_image_2d_array;__cl_fill_image_3d;
Device Available Yes
Compiler Available Yes
Linker Available Yes
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing cl_khr_fp16
Platform Name Clover
Number of devices 0
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Intel(R) OpenCL HD Graphics
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [INTEL]
clCreateContext(NULL, ...) [default] Success [INTEL]
clCreateContext(NULL, ...) [other] Success [Intel]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Intel(R) OpenCL HD Graphics
Device Name Intel(R) Gen9 HD Graphics NEO
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Intel(R) OpenCL HD Graphics
Device Name Intel(R) Gen9 HD Graphics NEO
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.8
ICD loader Profile OpenCL 1.2
NOTE: your OpenCL library declares to support OpenCL 1.2,
but it seems to support up to OpenCL 2.1 too.
The contrast is almost funny.
The device Intel(R) Gen9 HD Graphics NEO has all the video-related extensions but no OpenGL sharing, while the other device, Intel(R) HD Graphics Broxton 0, has all the graphics extensions but no VA-API sharing. So you can use either the first or the second, but it is not possible to use OpenGL to visualize results from the first, which is really sad.
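The mismatch can be made concrete by comparing the two "Platform Extensions" lists from the clinfo output above. This is only a minimal sketch: the extension strings are copied from that output, while the helper names and the "ends with `_sharing`" filter are assumptions made for illustration, not part of any driver or of clinfo.

```python
# Platform extension strings copied from the clinfo output above.
NEO_EXTENSIONS = """
cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16
cl_khr_depth_images cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_image2d_from_buffer
cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short
cl_khr_spir cl_intel_accelerator cl_intel_media_block_io
cl_intel_driver_diagnostics cl_intel_device_side_avc_motion_estimation
cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue
cl_khr_fp64 cl_intel_planar_yuv cl_intel_packed_yuv cl_intel_motion_estimation
cl_intel_advanced_motion_estimation cl_intel_va_api_media_sharing
"""

BEIGNET_EXTENSIONS = """
cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store cl_khr_3d_image_writes
cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd
cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short
cl_khr_gl_sharing
"""


def parse_extensions(text):
    """Split a whitespace-separated extension string into a set."""
    return set(text.split())


def sharing_extensions(extensions):
    """Hypothetical filter: keep only names that end in '_sharing'."""
    return sorted(name for name in extensions if name.endswith("_sharing"))


neo = parse_extensions(NEO_EXTENSIONS)
beignet = parse_extensions(BEIGNET_EXTENSIONS)

print("NEO sharing extensions:    ", sharing_extensions(neo))
print("Beignet sharing extensions:", sharing_extensions(beignet))
print("Available on both:         ", sharing_extensions(neo & beignet))
```

On these two lists, neither driver offers both kinds of sharing: the intersection of the sharing extensions is empty.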
I'd really like to see CL/GL context sharing as well :+1:.
Same here. I'm really disappointed that I can't port my application to Linux on Intel CPUs because of this. cl_khr_gl_sharing seems like such a key extension to have that I'm surprised you're not supporting it.
I also tried the Beignet implementation. While it does advertise the cl_khr_gl_sharing extension, one of the functions I required, cl_mem_new_gl_buffer, was empty except for a single line of code:
FATAL ("Not implemented")
Tears all round.
Your arguments are convincing. We will fit this effort into our development schedule for the remainder of this year.
Wow, that's great to hear! Your hard work is very much appreciated. Best of luck to your team!
Hi. Yesterday my app stopped working: it uses cl_khr_gl_sharing with the Beignet driver on Linux, and that seems to be broken since my most recent update. For my compositions it is absolutely crucial that this works, so I'm quite devastated, as it is more or less impossible to implement my system on other platforms/OSes (see: https://www.youtube.com/watch?v=2rWha1HTfFE&t=360s). I'd be extremely grateful if you implemented it, and I am absolutely willing to support funding if you let me know how!
@PiotrRozenfeld
"We will fit this effort into our development schedule for the remainder of this year."
Is there any news on whether work has been done on this and whether/when it will be implemented?
This is being actively worked on. We don't have a clear trend date, but Q1 is very likely.
"This is being actively worked on. We don't have a clear trend date, but Q1 is very likely."
@AdamCetnerowski: Thanks, very much appreciated!
Has there been any progress on this issue?
I would also very much like to see this feature implemented
Some foundational work has been done to enable the extension. At this time the Q1 trend is no longer valid. We intend to provide a partial implementation in early Q2 for evaluation. We’re looking forward to your feedback when it’s available.
"We’re looking forward to your feedback when it’s available."
Most definitely! Right now I'm holding back any upgrades on my system as I have to keep llvm 8.0.1 for the beignet drivers.
We have made progress towards this feature, but it was slower than initially expected. We’ll update this issue when we have something that is user-testable.
Hi. Any news on when this will be released? Could use it for my thesis. Otherwise I need to find a workaround ^^
While we did lay some groundwork for cl_khr_gl_sharing with MESA on Linux, we also had to increase our initial effort estimates, as additional work needs to be done in MESA and Compute before we can proceed with the implementation. This does not mean we are abandoning the effort, but we can no longer accommodate it in the near future.
We'll post updates when the work resumes.
@PiotrRozenfeld thanks a lot for the info! Although this is very bad news for me, I now know that I'll either have to get beignet compiled with llvm 10, or resort to installing a dual boot until this is resolved. I sincerely hope this is not the final blow to this in compute-runtime.
Probably not the most useful comment (I'm not a developer, just a user who likes to test new things), but I was looking at AMD's ROCm sources and found this there:
https://github.com/ROCm-Developer-Tools/ROCclr/blob/master/device/rocm/mesa_glinterop.h
/* Mesa OpenGL inter-driver interoperability interface designed for but not
* limited to OpenCL.
Of course, this is only the smallest part of the whole work.
@PiotrRozenfeld any ETA on a cl_khr_gl_sharing solution? The missing extension is a major problem for us, since most of our Linux machines use Intel integrated GPUs.
While we did lay some groundwork for cl_khr_gl_sharing with MESA on Linux, we also had to increase our initial effort estimates, as additional work needs to be done in MESA and Compute before we can proceed with the implementation. This does not mean we are abandoning the effort, but we can no longer accommodate it in the near future.
We'll post updates when the work resumes.
Any news on that? This extension is really a must-have for many applications.
Also curious how this is going? We are building and deploying Intel NUC based video acquisition/processing solutions and the lack of this extension is the only reason we're still on Beignet drivers
The work on this extension has stalled due to other work that our team is facing, unfortunately.
@smlehbleh - could you share a sample workload that you are looking to enable? What is your platform of interest? You are indicating this already works on Beignet - implying no additional work in MESA is needed.
Hi Piotr, our use case is an Intel NUC powered cinema camera/recording device. The RAW image processing (debayering, colour transformation, etc.) is performed in OpenCL kernels, and the user interface is rendered with OpenGL ES. We use the Beignet OpenCL drivers, which support the cl_khr_gl_sharing extension, to share the processed RAW image from an OpenCL buffer with OpenGL ES and draw it to the monitor screen. We are using Ubuntu 18.04 with an Intel NUC8v7PNB and (soon) a NUC11TNBv7. The cl_khr_gl_sharing implementation in Beignet has been present for quite a while (maybe a couple of years). As an experiment, we tried the NEO CL driver and replaced our cl_khr_gl_sharing GL/CL interop code with a manual additional copy from a CL buffer into a GL texture: total CPU usage appeared to go from about 10% to 15-20%, and a noticeable frame delay was introduced. The buffer was a UHD RGBA32 image (~33 MB).
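For reference, the ~33 MB figure can be reproduced with a short calculation. This is a sketch under stated assumptions: "UHD" is taken to mean 3840x2160 and "RGBA32" to mean 4 channels of 8 bits each (4 bytes per pixel). The frame rate used for the bandwidth estimate is purely hypothetical; the comment above does not state one.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size in bytes of one uncompressed frame."""
    return width * height * bytes_per_pixel


def copy_bytes_per_second(frame_size, frames_per_second):
    """Bytes moved per second if every frame is copied once."""
    return frame_size * frames_per_second


# Assumption: "UHD" means 3840x2160 pixels.
UHD_WIDTH, UHD_HEIGHT = 3840, 2160
# Assumption: "RGBA32" means 32 bits (4 bytes) per pixel.
BYTES_PER_PIXEL = 4
# Hypothetical frame rate, chosen only to illustrate the estimate.
ASSUMED_FPS = 30

size = frame_bytes(UHD_WIDTH, UHD_HEIGHT, BYTES_PER_PIXEL)
print(f"{size} bytes, about {size / 1e6:.1f} MB")

per_second = copy_bytes_per_second(size, ASSUMED_FPS)
print(f"one extra copy per frame at {ASSUMED_FPS} fps: "
      f"about {per_second / 1e9:.2f} GB/s")
```

Under these assumptions, one frame is 33,177,600 bytes, so a single additional copy per frame moves roughly 1 GB every second at the assumed frame rate. That extra traffic is what a zero-copy path through cl_khr_gl_sharing is meant to avoid.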
Hi @PiotrRozenfeld , I was just wondering if Beignet's existing cl_khr_gl_sharing support/implementation could be ported or copied over into this driver - or if it could generally reduce the amount of work as you mentioned?
As far as I can see in the Beignet documentation, cl_khr_gl_sharing is partially supported and allows creating memory objects from OpenGL buffers or 2D textures. If I understand your use case correctly, you're using CL/GL sharing differently: you share OpenCL buffers with OpenGL, so the memory for the buffers is allocated by OpenCL, not by OpenGL. Could you clarify?
Hi Jacek, We are using the Beignet implementation of cl_khr_gl_sharing in the standard intended use case - we create an OpenGL texture and then create an OpenCL buffer from it using 'clCreateFromGLTexture'. I realise in the previous comment it seems like we're directly sharing an OpenCL buffer with OpenGL, but our currently active implementation is an OpenCL buffer created from an OpenGL texture (clCreateFromGLTexture). This allows us to draw the GL texture to the screen with OpenGL after an OpenCL kernel writes to the underlying buffer without any additional 'copy'.
Hi Jacek,
On Wednesday, 1 September 2021 at 05:26:00 (-0700), Jacek Danecki wrote:
"As far as I can see in the Beignet documentation, cl_khr_gl_sharing is partially supported and allows creating memory objects from OpenGL buffers or 2D textures.
If I understand your use case correctly, you're using CL/GL sharing differently: you share OpenCL buffers with OpenGL, so the memory for the buffers is allocated by OpenCL, not by OpenGL. Could you clarify?"
It's unclear whether you were addressing me in your mail.
The way I used it with the Beignet driver was to first allocate the buffers in OpenGL (using "gen-buffer") and then "recreate" them in OpenCL (within the OpenGL context, with the allocated OpenGL buffer mapped) by calling the OpenCL routine "create-buffer" with the USE-HOST-PTR flag.
Is that what you were asking?
Best, Orm
I'm going to chime in, even though it has taken so long to get any traction that I've changed jobs and no longer care. GL and CL are supposed to be able to interop with calls like clCreateFromGLTexture and stay in the GPU address space.
This allows work to be done in CL but presented in GL in real time, which any copy-to-host technique will quash. This has been a baseline feature in NVIDIA's and AMD's implementations for years.
The Intel stack on Linux does not match what the Intel stack on Windows can do; that is the underlying problem. They have to be at parity, otherwise software is trapped on Windows because it can't be ported.
The Beignet code contains an example, https://github.com/intel/beignet/blob/master/examples/gl_buffer_sharing/gl_buffer_sharing.cpp, which looks similar to your usage. After small changes to the Beignet code, I was able to run this example on Ubuntu 18.04, and it worked correctly. Looking at the sharing implementation in Beignet, I can see it uses https://www.khronos.org/registry/EGL/extensions/MESA/EGL_MESA_image_dma_buf_export.txt, so I suppose that should be enough on the Mesa side to implement basic sharing in Neo.
Hi @JacekDanecki, that's great and sounds promising! Would you happen to have a rough estimate for when it could get into a Neo release?
I've created a new repository, cl-gl-tests, where I've pushed a modified version of the Beignet example. I'll use this repo to test the CL/GL implementation in Neo, so if you see any potential issues with this test compared to your usage scenario, please let me know. As a first step, I'll prepare some changes in Neo to run only the tests from this repo. When I have any working Neo code, I'll push it to my fork, https://github.com/JacekDanecki/compute-runtime, on the clgl branch.
With commit https://github.com/JacekDanecki/compute-runtime/commit/b95bc31fd7547cf5da7560bea822dca159c9e07d the gl_buffer_sharing test from https://github.com/JacekDanecki/cl-gl-tests is working now. This is still very experimental code under development: there are some hard-coded values, and some parts of the code are stubbed. I've only executed the gl_buffer_sharing test so far.
If you have any tests you can share, I can focus on enabling them.
@smlehbleh @ormf any feedback on the experimental code/test I've provided?
Hi @JacekDanecki, sorry was pulled onto other things, but I'm back on this now. I'm not too familiar with building the necessary .deb packages from the compute-runtime repo. Are you able to enable CI on your forked repo to make some .deb packages show up in the releases pane? Then I'll give them a go with some gl_cl interop samples I've got. Cheers, Russell
What Ubuntu version are you using?
Ubuntu 20.04.3 (kernel 5.13) on an 11th Gen Intel NUC
I'll rebase my current code to the latest Neo release and push to my fork, so you can use igc and gmmlib packages from release.
Sounds great, thanks, let me know when to grab the packages
"Right now, Intel does not support surface sharing on Linux* for example. Customer requests can drive changes to this decision." Last Updated: 12/15/2014
By Adam T Lake, Robert M Ioffe
Is this still the case today?
@Abdob, this is my understanding/experience:
It depends on the driver. Beignet OCL driver (1.3.2-6) supports the gl/cl sharing extension. The NEO Compute runtime OCL driver does not. However, @JacekDanecki has an experimental/wip branch with the feature using I think an approach similar to the Beignet driver.
Unfortunately the Beignet driver doesn't appear to support 11th Gen Intel GPUs (Xe) so for the latest Intel GPU hardware, cl/gl sharing support is not in any official driver release.
Hi @smlehbleh
My Intel GPU is:
glxinfo | grep Mesa
client glx vendor string: Mesa Project and SGI
Device: Mesa Intel(R) UHD Graphics 630 (CML GT2) (0x9bc5)
OpenGL renderer string: Mesa Intel(R) UHD Graphics 630 (CML GT2)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 21.0.3
OpenGL version string: 4.6 (Compatibility Profile) Mesa 21.0.3
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 21.0.3
I installed beignet-opencl-icd and clinfo, and I get this:
wave@wave:~$ clinfo
Number of platforms 1
Platform Name Intel Gen OCL Driver
Platform Vendor Intel
Platform Version OpenCL 2.0 beignet 1.3
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short
Platform Extensions function suffix Intel
beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware (If you have multiple ICDs installed and OpenCL works, you can ignore this message)
Platform Name Intel Gen OCL Driver
Number of devices 0
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other] No platform
beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware (If you have multiple ICDs installed and OpenCL works, you can ignore this message)
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware (If you have multiple ICDs installed and OpenCL works, you can ignore this message)
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware (If you have multiple ICDs installed and OpenCL works, you can ignore this message)
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) No devices found in platform
Is cl_khr_gl_sharing supposed to be listed as a device extension?
Thanks, Abdo
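A quick way to read output like this is to extract two things from it: how many devices were found, and whether cl_khr_gl_sharing appears in any extension list. The following is a rough sketch that scans clinfo's human-readable text. It assumes the line prefixes seen in this thread ("Number of devices", "Platform Extensions", "Device Extensions"); the function name is made up for illustration, and the text layout is not a stable interface.

```python
GL_SHARING = "cl_khr_gl_sharing"

# Condensed from the clinfo output posted above.
SAMPLE = """\
Number of platforms 1
Platform Name Intel Gen OCL Driver
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short
Platform Extensions function suffix Intel
Platform Name Intel Gen OCL Driver
Number of devices 0
"""


def summarize_clinfo(text):
    """Return (total device count, set of every advertised extension)."""
    devices = 0
    extensions = set()
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Number of devices"):
            # The count is the last token on the line.
            devices += int(line.split()[-1])
        elif line.startswith(("Platform Extensions", "Device Extensions")):
            # Keep only tokens that look like extension names; this also
            # ignores the "function suffix" line.
            extensions.update(t for t in line.split() if t.startswith("cl_"))
    return devices, extensions


devices, extensions = summarize_clinfo(SAMPLE)
print("devices found:", devices)
print("gl sharing advertised:", GL_SHARING in extensions)
```

For the output above this reports zero devices and no cl_khr_gl_sharing. In the Broxton output earlier in the thread, by contrast, the extension appears in both the platform and the device extension lists.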
@Abdob cl_khr_gl_sharing is listed on my setup with Beignet. For CML support in Beignet, see: https://bugs.launchpad.net/ubuntu/+source/beignet/+bug/1905340
glxinfo | grep Mesa
client glx vendor string: Mesa Project and SGI
Device: Mesa Intel(R) Iris(R) Pro Graphics 580 (SKL GT4) (0x193b)
OpenGL renderer string: Mesa Intel(R) Iris(R) Pro Graphics 580 (SKL GT4)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 21.0.3
OpenGL version string: 4.6 (Compatibility Profile) Mesa 21.0.3
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 21.0.3
clinfo | grep cl_khr_gl
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_intel_required_subgroup_size cl_intel_media_block_io cl_intel_planar_yuv cl_khr_gl_sharing
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_intel_required_subgroup_size cl_intel_media_block_io cl_intel_planar_yuv cl_khr_gl_sharing cl_khr_fp16 cl_intel_device_side_avc_motion_estimation
clinfo
root@gubuntuPC:/home/jdanecki# cat clinfo.txt
Number of platforms 1
Platform Name Intel Gen OCL Driver
Platform Vendor Intel
Platform Version OpenCL 2.0 beignet 1.4 (git-419c0417)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_intel_required_subgroup_size cl_intel_media_block_io cl_intel_planar_yuv cl_khr_gl_sharing
Platform Extensions function suffix Intel
Platform Name Intel Gen OCL Driver
Number of devices 1
Device Name Intel(R) HD Graphics Skylake Halo GT4
Device Vendor Intel
Device Vendor ID 0x8086
Device Version OpenCL 2.0 beignet 1.4 (git-419c0417)
Driver Version 1.4
Device OpenCL C Version OpenCL C 2.0 beignet 1.4 (git-419c0417)
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 72
Max clock frequency 1000MHz
Device Partition (core)
Max number of sub-devices 1
Supported partition types None, None, None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 512x512x512
Max work group size 256
Preferred work group size multiple 16
Sub-group sizes (Intel) 8, 16
Preferred / native vector sizes
char 16 / 8
short 8 / 8
int 4 / 4
long 2 / 2
half 0 / 8 (cl_khr_fp16)
float 4 / 4
double 0 / 2 (n/a)
Half-precision Floating-point support (cl_khr_fp16)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 32, Little-Endian
Global memory size 4294967296 (4GiB)
Error Correction support No
Max memory allocation 3221225472 (3GiB)
Unified memory for Host and Device Yes
Shared Virtual Memory (SVM) capabilities (core)
Coarse-grained buffer sharing Yes
Fine-grained buffer sharing No
Fine-grained system sharing No
Atomics No
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Preferred alignment for atomics
SVM 0 bytes
Global 0 bytes
Local 0 bytes
Max size for global variable 65536 (64KiB)
Preferred total size of global vars 65536 (64KiB)
Global Memory cache type Read/Write
Global Memory cache size 8192 (8KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 65536 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 4096 bytes
Pitch alignment for 2D image buffers 1 pixels
Max 2D image size 8192x8192 pixels
Max planar YUV image size 8192x8192 pixels
Max 3D image size 8192x8192x2048 pixels
Max number of read image args 128
Max number of write image args 8
Max number of read/write image args 8
Max number of pipe args 16
Max active pipe reservations 1
Max pipe packet size 1024
Local memory type Local
Local memory size 65536 (64KiB)
Max number of constant args 8
Max constant buffer size 134217728 (128MiB)
Max size of kernel argument 1024
Queue properties (on host)
Out-of-order execution No
Profiling Yes
Queue properties (on device)
Out-of-order execution Yes
Profiling Yes
Preferred size 16384 (16KiB)
Max size 262144 (256KiB)
Max queues on device 1
Max events on device 1024
Prefer user sync for interop Yes
Profiling timer resolution 80ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels Yes
SPIR versions 1.2
printf() buffer size 1048576 (1024KiB)
Built-in kernels __cl_copy_region_align4;__cl_copy_region_align16;__cl_copy_region_unalign_same_offset;__cl_copy_region_unalign_dst_offset;__cl_copy_region_unalign_src_offset;__cl_copy_buffer_rect;__cl_copy_buffer_rect_align4;__cl_copy_image_1d_to_1d;__cl_copy_image_2d_to_2d;__cl_copy_image_3d_to_2d;__cl_copy_image_2d_to_3d;__cl_copy_image_3d_to_3d;__cl_copy_image_2d_to_buffer;__cl_copy_image_2d_to_buffer_align4;__cl_copy_image_2d_to_buffer_align16;__cl_copy_image_3d_to_buffer;__cl_copy_image_3d_to_buffer_align4;__cl_copy_image_3d_to_buffer_align16;__cl_copy_buffer_to_image_2d;__cl_copy_buffer_to_image_2d_align4;__cl_copy_buffer_to_image_2d_align16;__cl_copy_buffer_to_image_3d;__cl_copy_buffer_to_image_3d_align4;__cl_copy_buffer_to_image_3d_align16;__cl_copy_image_1d_array_to_1d_array;__cl_copy_image_2d_array_to_2d_array;__cl_copy_image_2d_array_to_2d;__cl_copy_image_2d_array_to_3d;__cl_copy_image_2d_to_2d_array;__cl_copy_image_3d_to_2d_array;__cl_fill_region_unalign;__cl_fill_region_align2;__cl_fill_region_align4;__cl_fill_region_align8_2;__cl_fill_region_align8_4;__cl_fill_region_align8_8;__cl_fill_region_align8_16;__cl_fill_region_align128;__cl_fill_image_1d;__cl_fill_image_1d_array;__cl_fill_image_2d;__cl_fill_image_2d_array;__cl_fill_image_3d;
Device-side AVC Motion Estimation version <printDeviceInfo:165: get CL_DEVICE_AVC_ME_VERSION_INTEL : error -30>
Supports texture sampler use <printDeviceInfo:166: get CL_DEVICE_AVC_ME_SUPPORTS_TEXTURE_SAMPLER_USE_INTEL : error -30>
Supports preemption <printDeviceInfo:167: get CL_DEVICE_AVC_ME_SUPPORTS_PREEMPTION_INTEL : error -30>
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_intel_required_subgroup_size cl_intel_media_block_io cl_intel_planar_yuv cl_khr_gl_sharing cl_khr_fp16 cl_intel_device_side_avc_motion_estimation
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Intel Gen OCL Driver
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [Intel]
clCreateContext(NULL, ...) [default] Success [Intel]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Intel Gen OCL Driver
Device Name Intel(R) HD Graphics Skylake Halo GT4
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Intel Gen OCL Driver
Device Name Intel(R) HD Graphics Skylake Halo GT4
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Intel Gen OCL Driver
Device Name Intel(R) HD Graphics Skylake Halo GT4
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.11
ICD loader Profile OpenCL 2.1
Hi @JacekDanecki
Thank you for the follow up.
According to the link you provided beignet 1.3.2-8 fixed the issue. I installed 1.4 from source and it still doesn't show. We decided to move forward with another API and will no longer attempt to get the OpenCL interoperability to work.
wave@wave:~/Documents/ocl/beignet/build$ clinfo
Number of platforms 1
Platform Name Intel Gen OCL Driver
Platform Vendor Intel
Platform Version OpenCL 2.0 beignet 1.4 (git-419c0417)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_intel_required_subgroup_size cl_intel_media_block_io cl_intel_planar_yuv
Platform Extensions function suffix Intel
cl_get_gt_device(): error, unknown device: 9bc5
Platform Name Intel Gen OCL Driver
Number of devices 0
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other] No platform
cl_get_gt_device(): error, unknown device: 9bc5
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
cl_get_gt_device(): error, unknown device: 9bc5
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
cl_get_gt_device(): error, unknown device: 9bc5
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) No devices found in platform
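For anyone repeating this check, scanning a long clinfo dump by eye is error-prone. A minimal sketch of a filter that reports whether a given extension name appears in clinfo-style text; the helper name `has_extension` is made up for illustration and is not part of clinfo or beignet:

```shell
# Hypothetical helper: reads clinfo-style text on stdin and succeeds
# if the extension name in $1 appears as a whole word.
has_extension() {
    # -F: treat the name as a fixed string, -w: whole-word match, -q: no output
    grep -qwF -- "$1"
}
```

It could be used as `clinfo 2>/dev/null | has_extension cl_khr_gl_sharing && echo listed || echo "not listed"`. Note that it only says whether the name shows up somewhere in the output, not which platform or device reports it.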
According to the link you provided beignet 1.3.2-8 fixed the issue.
@Abdob That link was to the Ubuntu bug tracker, and the fix went into the Ubuntu (and Debian) 1.3.2-8 package of beignet, not into the upstream beignet project.
I installed 1.4 from source and it still doesn't show. ... Platform Version OpenCL 2.0 beignet 1.4 (git-419c0417) cl_get_gt_device(): error, unknown device: 9bc5
If you built it from upstream, it's missing CML support (see https://github.com/intel/beignet/commits/master/src/cl_device_data.h), because the PR that adds it has not been merged: https://github.com/intel/beignet/pull/20
You would need to build e.g. the Debian version (with the Debian tools, so that the CML support patch gets applied on top of the upstream sources): https://salsa.debian.org/opencl-team/beignet
Or just use the already built beignet packages from Ubuntu 21.04 or Debian testing (or newer distro version):
EDIT: added links to packages.
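Before rebuilding, it may be worth confirming that the device ID from the "unknown device" error is really absent from the source tree you are about to build. A rough sketch, assuming the header lists IDs as hex literals such as `0x5906` (an assumption about the file's layout); `device_id_known` is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if a PCI device ID appears in a header file.
# $1 = device ID in hex without the 0x prefix (e.g. 9bc5)
# $2 = path to the header (e.g. src/cl_device_data.h in a beignet checkout)
device_id_known() {
    # Case-insensitive match of 0x<id> that is not followed by another hex
    # digit, so that 0x590 does not match 0x5906.
    grep -qiE "0x$1([^0-9a-fA-F]|\$)" "$2"
}
```

For example, `device_id_known 9bc5 src/cl_device_data.h || echo "9bc5 is not in this tree"` run from the top of the checkout.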
Hi @JacekDanecki, I was wondering if you were able to build and generate the NEO compute runtime .deb files for Ubuntu 20.04 from your fork with the experimental/WIP cl/gl sharing?
@smlehbleh I've pushed the current experimental implementation, rebased on release https://github.com/intel/compute-runtime/releases/tag/22.06.22433, to the clgl-fork branch in this repository: https://github.com/JacekDanecki/compute-runtime/tree/clgl-fork. You can use the gmmlib and igc packages from the Neo release.
Neo can be compiled with the following commands:
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Debug -DDISABLE_WDDM_LINUX=1 -DBUILD_WITH_L0=1 -DNEO_DISABLE_LD_GOLD=1 ..
make igdrcl_dll
Once it is compiled, you can create a file named neo.icd in the /etc/OpenCL/vendors directory containing the full path to the libigdrcl.so library.
I'm leaving the compute team, so I will not be working on cl/gl sharing anymore.
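The ICD registration step can be sketched roughly as follows. `write_neo_icd` is a hypothetical helper, and the location of libigdrcl.so inside the build tree is an assumption, so check where your build actually puts it:

```shell
# Hypothetical helper: write an ICD file that points at a built libigdrcl.so.
# $1 = full path to libigdrcl.so
# $2 = ICD vendors directory (defaults to /etc/OpenCL/vendors)
write_neo_icd() {
    lib_path="$1"
    icd_dir="${2:-/etc/OpenCL/vendors}"
    # Refuse to register a library that does not exist.
    if [ ! -f "$lib_path" ]; then
        echo "library not found: $lib_path" >&2
        return 1
    fi
    # The ICD file is a single line holding the library's full path.
    printf '%s\n' "$lib_path" > "$icd_dir/neo.icd"
}
```

Writing to /etc/OpenCL/vendors normally needs root, so call it from a root shell, e.g. `write_neo_icd "$PWD/bin/libigdrcl.so"` from the build directory (the `bin/` subdirectory is an assumption). After that, clinfo should list the NEO platform if the library loads.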
Hi
I found that this driver doesn't support cl_khr_gl_sharing, is there any plan for this ? Thanks !