intel / compute-runtime

Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
MIT License

Segmentation fault on Gen12LP when operating on DMABUF shared memory #695

Closed: Programmerino closed this issue 5 months ago

Programmerino commented 5 months ago

Environment

OS: Arch Linux
Hardware: Intel Gen12LP (i7-1165G7)
Driver Version: Intel Compute Runtime 23.35.27191.9-1
Kernel Version: 6.6.9-2-cachyos

Description

I get a segmentation fault when running an OpenCL workload with Intel Compute Runtime 23.35.27191.9-1 on Intel Gen12LP hardware. The fault occurs when the program calls clEnqueueWriteBuffer.

Steps to Reproduce

  1. Compile the main.c program provided, which sets up a GBM buffer and an OpenCL context.
  2. Run the resulting executable on a system with the Intel Compute Runtime version specified above.
  3. Observe the segmentation fault upon the enqueue write buffer operation.

Expected Behavior

The program should run without encountering a segmentation fault, allowing for successful OpenCL operations.

Actual Behavior

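The program crashes with a segmentation fault inside a blocking clEnqueueWriteBuffer call made from main.c:91. The backtrace shows the fault in NEO::Kernel::requiresWaDisableRccRhwoOptimization (kernel.cpp:2170), which the runtime reaches while computing the command stream size for the enqueue.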

Backtrace

#0  NEO::Kernel::requiresWaDisableRccRhwoOptimization (this=this@entry=0x5555563f1060)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/kernel/kernel.cpp:2170
#1  0x00007fffe23332b1 in NEO::GpgpuWalkerHelper<NEO::Gen12LpFamily>::getSizeForWaDisableRccRhwoOptimization (
    pKernel=pKernel@entry=0x5555563f1060)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/gen12lp/gpgpu_walker_gen12lp.cpp:47
#2  0x00007fffe2335e78 in NEO::EnqueueOperation<NEO::Gen12LpFamily>::getSizeRequiredCSKernel (
    reserveProfilingCmdsSpace=reserveProfilingCmdsSpace@entry=false, reservePerfCounters=reservePerfCounters@entry=false, 
    commandQueue=..., pKernel=0x5555563f1060, dispatchInfo=...)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/gpgpu_walker_bdw_and_later.inl:91
#3  0x00007fffe2336071 in NEO::EnqueueOperation<NEO::Gen12LpFamily>::getSizeRequiredCS (dispatchInfo=..., 
    pKernel=<optimized out>, commandQueue=..., reservePerfCounters=false, reserveProfilingCmdsSpace=200, 
    cmdType=<optimized out>)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/gpgpu_walker_base.inl:250
#4  NEO::EnqueueOperation<NEO::Gen12LpFamily>::getTotalSizeRequiredCS (eventType=eventType@entry=4596, csrDeps=..., 
    reserveProfilingCmdsSpace=reserveProfilingCmdsSpace@entry=false, reservePerfCounters=reservePerfCounters@entry=false, 
    blitEnqueue=blitEnqueue@entry=false, commandQueue=..., multiDispatchInfo=..., isMarkerWithProfiling=false, 
    eventsInWaitlist=false, resolveDependenciesByPipecontrol=false, outEvent=0x0)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/gpgpu_walker_base.inl:186
#5  0x00007fffe2309875 in NEO::getCommandStream<NEO::Gen12LpFamily, 4596u> (outEvent=<optimized out>, 
    resolveDependenciesByPipecontrol=false, eventsInWaitList=<optimized out>, isMarkerWithProfiling=false, 
    numSurfaces=<optimized out>, surfaces=<optimized out>, multiDispatchInfo=..., blitEnqueue=false, 
    reservePerfCounterCmdsSpace=false, reserveProfilingCmdsSpace=false, csrDeps=..., commandQueue=...)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/gpgpu_walker.h:105
#6  NEO::CommandQueueHw<NEO::Gen12LpFamily>::obtainCommandStream<4596u> (surfaces=0x7fffffffc1f0, numSurfaces=2, 
    resolveDependenciesByPipecontrol=false, isMarkerWithProfiling=false, 
    blockedCommandsData=std::unique_ptr<NEO::KernelOperation> = {...}, eventsRequest=..., multiDispatchInfo=..., 
    blockedQueue=<optimized out>, blitEnqueue=false, csrDependencies=..., this=0x5555563b6e70)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/command_queue_hw.h:484
#7  NEO::CommandQueueHw<NEO::Gen12LpFamily>::enqueueHandler<4596u> (this=this@entry=0x5555563b6e70, 
    surfacesForResidency=surfacesForResidency@entry=0x7fffffffc1f0, numSurfaceForResidency=numSurfaceForResidency@entry=2, 
    blocking=<optimized out>, blocking@entry=true, multiDispatchInfo=..., numEventsInWaitList=<optimized out>, 
    numEventsInWaitList@entry=0, eventWaitList=<optimized out>, event=0x0)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/enqueue_common.h:235
#8  0x00007fffe23230da in NEO::CommandQueueHw<NEO::Gen12LpFamily>::enqueueHandler<4596u, 2ul> (event=0x0, 
    eventWaitList=0x0, numEventsInWaitList=0, dispatchInfo=..., blocking=true, surfacesForResidency=..., 
    this=0x5555563b6e70)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/command_queue_hw.h:339
#9  NEO::CommandQueueHw<NEO::Gen12LpFamily>::dispatchBcsOrGpgpuEnqueue<4596u, 2ul> (this=this@entry=0x5555563b6e70, 
    dispatchInfo=..., surfaces=..., builtInOperation=builtInOperation@entry=1, 
    numEventsInWaitList=numEventsInWaitList@entry=0, eventWaitList=eventWaitList@entry=0x0, event=0x0, blocking=true, 
    csr=...)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/enqueue_common.h:1525
#10 0x00007fffe232e0ae in NEO::CommandQueueHw<NEO::Gen12LpFamily>::enqueueWriteBuffer (this=0x5555563b6e70, 
    buffer=0x5555555fd860, blockingWrite=1, offset=0, size=<optimized out>, ptr=<optimized out>, 
    mapAllocation=<optimized out>, numEventsInWaitList=0, eventWaitList=0x0, event=0x0)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/command_queue/enqueue_write_buffer.h:105
#11 0x00007fffe20b89b2 in clEnqueueWriteBuffer (commandQueue=<optimized out>, buffer=<optimized out>, 
    blockingWrite=<optimized out>, offset=<optimized out>, cb=<optimized out>, ptr=<optimized out>, 
    numEventsInWaitList=<optimized out>, eventWaitList=<optimized out>, event=<optimized out>)
    at /usr/src/debug/intel-compute-runtime/compute-runtime-23.35.27191.9/opencl/source/api/api.cpp:2520
#12 0x00007ffff7f6254b in clEnqueueWriteBuffer () from /opt/intel/oneapi/compiler/latest/linux/lib/libOpenCL.so.1
#13 0x00005555555556e6 in main () at main.c:91
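
A note for reading the trace: the template argument `4596u` (also shown as `eventType=4596` in frame #4) appears to be the OpenCL command type, since 4596 is 0x11F4, the value of `CL_COMMAND_WRITE_BUFFER` in the Khronos `CL/cl.h` header. Below is a minimal sketch of that decoding in C. The `DEMO_` constants and `demo_command_type_name` are hypothetical names introduced here for illustration; they are not part of the runtime or of main.c, and only a few command types are listed.

```c
#include <stdio.h>

/* Command-type values as defined in the Khronos CL/cl.h header.
   Redeclared locally with a DEMO_ prefix (an assumption of this sketch)
   so the example builds without the OpenCL headers installed. */
enum {
    DEMO_CL_COMMAND_NDRANGE_KERNEL = 0x11F0,
    DEMO_CL_COMMAND_READ_BUFFER    = 0x11F3,
    DEMO_CL_COMMAND_WRITE_BUFFER   = 0x11F4,
    DEMO_CL_COMMAND_COPY_BUFFER    = 0x11F5
};

/* Map a numeric command type, as printed by gdb, to its symbolic name.
   Values not listed above are reported as unrecognized. */
const char *demo_command_type_name(unsigned int value)
{
    switch (value) {
    case DEMO_CL_COMMAND_NDRANGE_KERNEL: return "CL_COMMAND_NDRANGE_KERNEL";
    case DEMO_CL_COMMAND_READ_BUFFER:    return "CL_COMMAND_READ_BUFFER";
    case DEMO_CL_COMMAND_WRITE_BUFFER:   return "CL_COMMAND_WRITE_BUFFER";
    case DEMO_CL_COMMAND_COPY_BUFFER:    return "CL_COMMAND_COPY_BUFFER";
    default:                             return "unrecognized command type";
    }
}

/* Print the decoded form of a value copied from a backtrace. */
void demo_print_command_type(unsigned int value)
{
    printf("%u (0x%X) = %s\n", value, value, demo_command_type_name(value));
}
```

Calling `demo_command_type_name(4596u)` returns `"CL_COMMAND_WRITE_BUFFER"`, which matches the `enqueueWriteBuffer` frame (#10) further down the stack.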

Additional Information

I've also tested this with a buffer exported from Vulkan and hit the same issue. The full main.c is attached for reference: main.c.txt

Programmerino commented 5 months ago

Never mind, I'm not using the API correctly...
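
The thread does not say what the misuse was, so the following is not a fix for it. It is only a general sketch of one habit that can help surface API misuse earlier: checking the status code of every OpenCL call and printing its symbolic name. Whether that would have caught this particular case is unknown. The numeric values are those of the error codes in the Khronos `CL/cl.h` header; `demo_cl_status_name` and `demo_cl_ok` are hypothetical helper names, written without the OpenCL headers so the sketch builds anywhere.

```c
#include <stdio.h>

/* Translate a subset of OpenCL status codes to their names.
   Numeric values follow the Khronos CL/cl.h header; the function itself
   is a hypothetical helper, not part of any OpenCL implementation. */
const char *demo_cl_status_name(int status)
{
    switch (status) {
    case 0:   return "CL_SUCCESS";
    case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -30: return "CL_INVALID_VALUE";
    case -34: return "CL_INVALID_CONTEXT";
    case -36: return "CL_INVALID_COMMAND_QUEUE";
    case -37: return "CL_INVALID_HOST_PTR";
    case -38: return "CL_INVALID_MEM_OBJECT";
    case -59: return "CL_INVALID_OPERATION";
    case -61: return "CL_INVALID_BUFFER_SIZE";
    default:  return "unlisted status code";
    }
}

/* Return 1 when the call succeeded. Otherwise report which call failed
   and with what status on stderr, and return 0 so the caller can stop
   before passing a bad handle to the next call. */
int demo_cl_ok(int status, const char *call)
{
    if (status == 0) {
        return 1;
    }
    fprintf(stderr, "%s failed: %s (%d)\n",
            call, demo_cl_status_name(status), status);
    return 0;
}
```

In a real program the `status` argument would be the `cl_int` returned by a call such as `clEnqueueWriteBuffer`, or written through the `errcode_ret` pointer of `clCreateBuffer`, for example `if (!demo_cl_ok(err, "clCreateBuffer")) return 1;`.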