Description
I am seeing a segmentation fault when running an OpenCL workload with Intel Compute Runtime 23.35.27191.9-1 on an Intel Gen12LP GPU (i7-1165G7). The fault occurs inside clEnqueueWriteBuffer.
Steps to Reproduce
1. Compile the attached main.c, which sets up a GBM buffer and an OpenCL context.
2. Run the resulting executable on a system with the Intel Compute Runtime version specified above.
3. Observe the segmentation fault in the clEnqueueWriteBuffer call at main.c:91.
Expected Behavior
The write should complete and the program should exit without a segmentation fault.
Actual Behavior
The program crashes with a segmentation fault inside clEnqueueWriteBuffer (main.c:91). The backtrace shows the fault in NEO::Kernel::requiresWaDisableRccRhwoOptimization (kernel.cpp:2170), reached while the runtime computes the command stream size required for the enqueue.
Backtrace
#0 NEO::Kernel::requiresWaDisableRccRhwoOptimization (this=this@entry=0x5555563f1060)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/kernel/kernel.cpp:2170
#1 0x00007fffe23332b1 in NEO::GpgpuWalkerHelper<NEO::Gen12LpFamily>::getSizeForWaDisableRccRhwoOptimization (
pKernel=pKernel@entry=0x5555563f1060)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/gen12lp/gpgpu_walker_gen12lp.cpp:47
#2 0x00007fffe2335e78 in NEO::EnqueueOperation<NEO::Gen12LpFamily>::getSizeRequiredCSKernel (
reserveProfilingCmdsSpace=reserveProfilingCmdsSpace@entry=false, reservePerfCounters=reservePerfCounters@entry=false,
commandQueue=..., pKernel=0x5555563f1060, dispatchInfo=...)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/gpgpu_walker_bdw_and_later.inl:91
#3 0x00007fffe2336071 in NEO::EnqueueOperation<NEO::Gen12LpFamily>::getSizeRequiredCS (dispatchInfo=...,
pKernel=<optimized out>, commandQueue=..., reservePerfCounters=false, reserveProfilingCmdsSpace=200,
cmdType=<optimized out>)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/gpgpu_walker_base.inl:250
#4 NEO::EnqueueOperation<NEO::Gen12LpFamily>::getTotalSizeRequiredCS (eventType=eventType@entry=4596, csrDeps=...,
reserveProfilingCmdsSpace=reserveProfilingCmdsSpace@entry=false, reservePerfCounters=reservePerfCounters@entry=false,
blitEnqueue=blitEnqueue@entry=false, commandQueue=..., multiDispatchInfo=..., isMarkerWithProfiling=false,
eventsInWaitlist=false, resolveDependenciesByPipecontrol=false, outEvent=0x0)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/gpgpu_walker_base.inl:186
#5 0x00007fffe2309875 in NEO::getCommandStream<NEO::Gen12LpFamily, 4596u> (outEvent=<optimized out>,
resolveDependenciesByPipecontrol=false, eventsInWaitList=<optimized out>, isMarkerWithProfiling=false,
numSurfaces=<optimized out>, surfaces=<optimized out>, multiDispatchInfo=..., blitEnqueue=false,
reservePerfCounterCmdsSpace=false, reserveProfilingCmdsSpace=false, csrDeps=..., commandQueue=...)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/gpgpu_walker.h:105
#6 NEO::CommandQueueHw<NEO::Gen12LpFamily>::obtainCommandStream<4596u> (surfaces=0x7fffffffc1f0, numSurfaces=2,
resolveDependenciesByPipecontrol=false, isMarkerWithProfiling=false,
blockedCommandsData=std::unique_ptr<NEO::KernelOperation> = {...}, eventsRequest=..., multiDispatchInfo=...,
blockedQueue=<optimized out>, blitEnqueue=false, csrDependencies=..., this=0x5555563b6e70)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/command_queue_hw.h:484
#7 NEO::CommandQueueHw<NEO::Gen12LpFamily>::enqueueHandler<4596u> (this=this@entry=0x5555563b6e70,
surfacesForResidency=surfacesForResidency@entry=0x7fffffffc1f0, numSurfaceForResidency=numSurfaceForResidency@entry=2,
blocking=<optimized out>, blocking@entry=true, multiDispatchInfo=..., numEventsInWaitList=<optimized out>,
numEventsInWaitList@entry=0, eventWaitList=<optimized out>, event=0x0)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/enqueue_common.h:235
#8 0x00007fffe23230da in NEO::CommandQueueHw<NEO::Gen12LpFamily>::enqueueHandler<4596u, 2ul> (event=0x0,
eventWaitList=0x0, numEventsInWaitList=0, dispatchInfo=..., blocking=true, surfacesForResidency=...,
this=0x5555563b6e70)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/command_queue_hw.h:339
#9 NEO::CommandQueueHw<NEO::Gen12LpFamily>::dispatchBcsOrGpgpuEnqueue<4596u, 2ul> (this=this@entry=0x5555563b6e70,
dispatchInfo=..., surfaces=..., builtInOperation=builtInOperation@entry=1,
numEventsInWaitList=numEventsInWaitList@entry=0, eventWaitList=eventWaitList@entry=0x0, event=0x0, blocking=true,
csr=...)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/enqueue_common.h:1525
#10 0x00007fffe232e0ae in NEO::CommandQueueHw<NEO::Gen12LpFamily>::enqueueWriteBuffer (this=0x5555563b6e70,
buffer=0x5555555fd860, blockingWrite=1, offset=0, size=<optimized out>, ptr=<optimized out>,
mapAllocation=<optimized out>, numEventsInWaitList=0, eventWaitList=0x0, event=0x0)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/enqueue_write_buffer.h:105
#11 0x00007fffe20b89b2 in clEnqueueWriteBuffer (commandQueue=<optimized out>, buffer=<optimized out>,
blockingWrite=<optimized out>, offset=<optimized out>, cb=<optimized out>, ptr=<optimized out>,
numEventsInWaitList=<optimized out>, eventWaitList=<optimized out>, event=<optimized out>)
at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/api/api.cpp:2520
#12 0x00007ffff7f6254b in clEnqueueWriteBuffer () from /opt/intel/oneapi/compiler/latest/linux/lib/libOpenCL.so.1
#13 0x00005555555556e6 in main () at main.c:91
Additional Information
I've also tested this when exporting from Vulkan and encountered the same issue. The code snippet from main.c is attached below for reference.
main.c.txt
Environment
OS: Arch Linux
Hardware: Intel Gen12LP (i7-1165G7)
Driver Version: Intel Compute Runtime 23.35.27191.9-1
Kernel Version: 6.6.9-2-cachyos