Closed airMeng closed 5 months ago
Hi, stepping into zeCommandQueueExecuteCommandLists takes you into ze_loader and then into the L0 driver's implementation of that function. It will not step directly into the kernel.
What format is the module input you are using: SPIR-V or native? To debug the kernel, you need to set a breakpoint inside it before it executes.
Hello,
it is possible to debug Level Zero kernels. The environment has to be set up similarly to what is done for SYCL application debugging (https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-debugging-dpcpp-linux/top.html), in particular:
export ZET_ENABLE_PROGRAM_DEBUGGING=1
export IGC_EnableGTLocationDebugging=1
Then start the application under gdb. Set a breakpoint on the HOST before the zeModuleCreate() call ( gdb-oneapi -ex "b example.cpp:LINE" ). When the HOST hits the breakpoint, tell gdb to stop when libraries are loaded:
(gdb) set stop-on-solib-event 1
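The launch step above could be wrapped in a small helper. This is only a sketch: `l0_debug_cmd` is a hypothetical name, it prints the command instead of running it, and the line number 42 and the binary `./app` are placeholders, not values from this thread. It sets the two environment variables for the gdb process only and uses gdb's `--args` option to pass the application.

```shell
# Hypothetical dry-run helper: prints the launch command instead of running it.
# Arguments: host source file, line number, application binary (placeholders).
l0_debug_cmd() {
  printf 'ZET_ENABLE_PROGRAM_DEBUGGING=1 IGC_EnableGTLocationDebugging=1 gdb-oneapi -ex "b %s:%s" --args %s\n' \
    "$1" "$2" "$3"
}

# Example with placeholder values; pick a line just before zeModuleCreate().
l0_debug_cmd example.cpp 42 ./app
```

Review the printed command and then run it by hand. `set stop-on-solib-event 1` is still entered interactively once the host breakpoint is hit, as described above.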
Continue execution on the HOST. When the module is created, a message like this should be printed:
Stopped due to shared library event: Inferior loaded in-memory-0x5555569e35c0-0x5555569e7508
Now dump the memory range given in that message:
(gdb) dump memory module.elf 0x5555569e35c0 0x5555569e7508
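The two addresses for `dump memory` come straight from the event message, so copying them can be scripted. A minimal sketch, assuming the message keeps the `in-memory-<start>-<end>` form shown above; `dump_cmd_from_event` is a hypothetical helper name, and the output file name must not contain `/` because it is spliced into a sed expression.

```shell
# Hypothetical helper: build a gdb "dump memory" command from the
# shared-library event message printed when the module is loaded.
# $1: the event message, $2: output file name (defaults to module.elf)
dump_cmd_from_event() {
  printf '%s\n' "$1" | sed -n \
    "s/.*in-memory-\(0x[0-9a-fA-F]*\)-\(0x[0-9a-fA-F]*\).*/dump memory ${2:-module.elf} \1 \2/p"
}

# Example with the message quoted in this thread.
dump_cmd_from_event 'Stopped due to shared library event: Inferior loaded in-memory-0x5555569e35c0-0x5555569e7508'
```

The helper prints nothing if the message does not contain an `in-memory-` range, which is a cheap way to notice that the event was for an ordinary host library.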
Read the ELF file to find the entry point address:
readelf -a module.elf
The ELF file should list entries in its symbol table, similar to:
Symbol table '.symtab' contains 12 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 00008000fff40000 3760 FUNC LOCAL DEFAULT 1 mykernel
2: 00008000fff400c0 3568 FUNC LOCAL DEFAULT 1 _entry
From the table above, the address to set the breakpoint at is 0x8000fff400c0 (the _entry symbol):
(gdb) b *0x8000fff400c0
Breakpoint 6 at 0x8000fff400c0: file main.cl, line 17.
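Looking up the `_entry` address in the symbol table can also be scripted. This is a rough sketch, assuming the column layout shown in this thread (Num, Value, Size, Type, Bind, Vis, Ndx, Name); `entry_breakpoint` is a hypothetical helper name. It reads symbol-table text on stdin and prints a gdb breakpoint command for the first FUNC symbol with the requested name.

```shell
# Hypothetical helper: read readelf symbol-table text on stdin and print
# a gdb breakpoint command for the named FUNC symbol (default: _entry).
entry_breakpoint() {
  awk -v sym="${1:-_entry}" '
    $4 == "FUNC" && $NF == sym {
      v = $2
      sub(/^0+/, "", v)        # drop leading zeros from the Value column
      if (v == "") v = "0"
      print "b *0x" v
      exit
    }'
}

# Example fed with the two symbol rows quoted in this thread.
printf '%s\n' \
  '1: 00008000fff40000 3760 FUNC LOCAL DEFAULT 1 mykernel' \
  '2: 00008000fff400c0 3568 FUNC LOCAL DEFAULT 1 _entry' | entry_breakpoint
```

With a real dump, the input would come from something like `readelf -s module.elf | entry_breakpoint`, where `-s` limits the output to the symbol table.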
continue
Now the debugger should stop in the kernel:
Thread 5.1 hit Breakpoint 6, with SIMD lanes [0-15], 0x00008000fff400c0 in mykernel ( ....
When stopped on a GPU thread, it is possible to disassemble the binary and single-step.
Regards, Mateusz
It seems our address might be wrong. readelf output:
There are no relocations in this file.
The decoding of unwind sections for machine type Intel Graphics Technology is not currently supported.
Symbol table '.symtab' contains 3 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: ffff8000fffa0000 104 FUNC LOCAL DEFAULT 1 _
2: ffff8000fffa0020 72 FUNC LOCAL DEFAULT 1 _entry
Setting a BP at 0xffff8000fffa0020:
Cannot insert breakpoint 2.
Cannot access memory at address 0xffff8000fffa0020
Setting a BP at 0x8000fffa0020:
Cannot insert breakpoint 2.
Cannot access memory at address 0x8000fffa0020
@HoppeMateusz @bmyates any advice?
@airMeng - do you see a gdb event like this one:
Stopped due to shared library event: Inferior loaded in-memory-0x5555569e35c0-0x5555569e7508
It is only possible to set a BP after zeModuleCreate() has created the module and loaded its binary to the GPU. Have you tried breaking just before zeCommandQueueExecuteCommandLists() and setting the BP in the GPU module at that point?
Hi, I learned from here how to debug a SYCL application, even per assembly line, using gdb-oneapi. I wonder whether there is any way to debug Level Zero kernels similarly. The following pictures show that I tried to stop where Level Zero executes kernels, but I can't step in or get any thread information.
BTW, I found that gdb-oneapi says only the Intel® oneAPI Level Zero (Level Zero) backend is supported for debugging, so I think debugging assembly in Level Zero is possible.