IntelPython / numba-dpex

Data Parallel Extension for Numba
https://intelpython.github.io/numba-dpex/
Apache License 2.0

Investigating IGC generated kernel binary and dump info #419

Open 1e-to opened 3 years ago

1e-to commented 3 years ago

When IGC_ShaderDumpEnable is set to "1", IGC writes a number of dump files into /tmp/IntelIGC.

$ export IGC_ShaderDumpEnable=1 

To read the DWARF of a kernel, we first need a copy of the IGC generated kernel binary. To do that, run the Python script in a debugger, and set a breakpoint in the kernel:

$ gdb-oneapi -q python 
(gdb) break sum.py:13     # Assumes the kernel is in file sum.py, at line 13 
(gdb) run sum.py 

Once the breakpoint hits, the kernel has been generated and offloaded. At that point, the IGFX driver (i.e. our debugger driver) has copied the kernel into a file under /tmp. All files are saved in /tmp/IntelIGC/pytho_xxxxxx/.
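Finding the right dump directory by hand gets tedious when several runs have left directories behind. A minimal sketch of how the newest one could be located, assuming each run creates its own sub-directory under /tmp/IntelIGC; the helper names `newest_dump_dir` and `list_dumps` are hypothetical and not part of IGC or numba-dpex:

```python
from pathlib import Path
from typing import List, Optional


def newest_dump_dir(root: str = "/tmp/IntelIGC") -> Optional[Path]:
    """Return the most recently modified sub-directory of `root`, or None."""
    base = Path(root)
    if not base.is_dir():
        return None
    subdirs = [p for p in base.iterdir() if p.is_dir()]
    if not subdirs:
        return None
    return max(subdirs, key=lambda p: p.stat().st_mtime)


def list_dumps(dump_dir: Path, suffix: str = ".elf") -> List[Path]:
    """Return the files in `dump_dir` whose name ends with `suffix`, sorted."""
    return sorted(
        p for p in dump_dir.iterdir() if p.is_file() and p.name.endswith(suffix)
    )


if __name__ == "__main__":
    latest = newest_dump_dir()
    if latest is None:
        print("no dump directory found; was IGC_ShaderDumpEnable=1 exported?")
    else:
        for elf in list_dumps(latest):
            print(elf)
```

Passing `suffix=".asm"` or `suffix=".dbg"` to `list_dumps` lists the other dump types in the same directory.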

Then, to read the DWARF in that kernel binary (ELF), use the llvm-dwarfdump tool, which prints the contents of the DWARF sections. For example, to write the DWARF data to a text file named dwarf:

llvm-dwarfdump xxxx.elf > dwarf

!Nested function information is missing here!

```python
OCL_asmefeefdc15e2ad3fe_simd8_dppyPy_dppy_py_devfn__5F__5F_main_5F__5F__2E_data_5F_parallel_5F_sum_24_1_2E_array_28_float32_2C__20_1d_2C__20_C_29__2E_array_28_float32_2C__20_1d_2C__20_C_29__2E_array_28_float32_.elf: file format ELF64-unknown

.debug_info contents:
0x00000000: Compile Unit: length = 0x000002a1 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x000002a5)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer (("spirv"))
              DW_AT_language (DW_LANG_Python)
              DW_AT_name ("sum.py")
              DW_AT_low_pc (0x0000000000000000)
              DW_AT_high_pc (0x0000000000000810)
              DW_AT_stmt_list (0x00000000)
              DW_AT_unknown_2400 (0x0008)
              DW_AT_comp_dir ("/localdisk/work/etotmeni/stepping")

0x0000004d:   DW_TAG_subprogram
                DW_AT_linkage_name ("dppy_py_devfn__5F__5F_main_5F__5F__2E_data_5F_parallel_5F_sum_24_1_2E_array_28_float32_2C__20_1d_2C__20_C_29__2E_array_28_float32_2C__20_1d_2C__20_C_29__2E_array_28_float32_2C__20_1d_2C__20_C_29_")
                DW_AT_name ("data_parallel_sum")
                DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                DW_AT_decl_line (15)
                DW_AT_unknown_2400 (0x0008)
                DW_AT_external (true)
                DW_AT_low_pc (0x0000000000000000)
                DW_AT_high_pc (0x0000000000000810)

0x00000138:     DW_TAG_variable
                  DW_AT_name ("__ocl_dbg_gid0")
                  DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x00000290 "long long")
                  DW_AT_location (DW_OP_WASM_location 0x32 +37, DW_OP_plus_uconst 0x22, ed 33 1a 0a 40 00 1e 0a 40 00 ef)

0x00000160:     DW_TAG_variable
                  DW_AT_name ("__ocl_dbg_gid1")
                  DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x00000290 "long long")
                  DW_AT_location (DW_OP_WASM_location 0x32 +37, DW_OP_plus_uconst 0x25, ed 33 1a 0a 40 00 1e 0a 40 00 ef)

0x00000188:     DW_TAG_variable
                  DW_AT_name ("__ocl_dbg_gid2")
                  DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x00000290 "long long")
                  DW_AT_location (DW_OP_WASM_location 0x32 +37, DW_OP_plus_uconst 0x28, ed 33 1a 0a 40 00 1e 0a 40 00 ef)

0x000001b0:     DW_TAG_variable
                  DW_AT_name ("__ocl_dbg_lid0")
                  DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x00000290 "long long")
                  DW_AT_location (DW_OP_WASM_location 0x32 +37, DW_OP_plus_uconst 0x2b, ed 33 1a 0a 40 00 1e 0a 40 00 ef)

0x000001d8:     DW_TAG_variable
                  DW_AT_name ("__ocl_dbg_lid1")
                  DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x00000290 "long long")
                  DW_AT_location (DW_OP_WASM_location 0x32 +37, DW_OP_plus_uconst 0x2e, ed 33 1a 0a 40 00 1e 0a 40 00 ef)

0x00000200:     DW_TAG_variable
                  DW_AT_name ("__ocl_dbg_lid2")
                  DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x00000290 "long long")
                  DW_AT_location (DW_OP_WASM_location 0x32 +37, DW_OP_plus_uconst 0x31, ed 33 1a 0a 40 00 1e 0a 40 00 ef)

0x00000228:     DW_TAG_variable
                  DW_AT_name ("__ocl_dbg_grid0")
                  DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x00000290 "long long")
                  DW_AT_location (DW_OP_reg26, DW_OP_bit_piece 0x40 0x0)

0x00000244:     DW_TAG_variable
                  DW_AT_name ("__ocl_dbg_grid1")
                  DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x00000290 "long long")
                  DW_AT_location (DW_OP_regx 0x33, DW_OP_bit_piece 0x40 0x40)

0x00000261:     DW_TAG_variable
                  DW_AT_name ("__ocl_dbg_grid2")
                  DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                  DW_AT_decl_line (1)
                  DW_AT_type (0x00000290 "long long")
                  DW_AT_location (DW_OP_regx 0x34, DW_OP_bit_piece 0x40 0x0)

0x0000027e:     DW_TAG_variable
                  DW_AT_name ("i")
                  DW_AT_decl_file ("/localdisk/work/etotmeni/stepping/sum.py")
                  DW_AT_decl_line (17)
                  DW_AT_type (0x0000029d "i64")
                  DW_AT_const_value (0x0000000000000000)

0x0000028f:     NULL

0x00000290:   DW_TAG_base_type
                DW_AT_name ("long long")
                DW_AT_encoding (DW_ATE_signed)
                DW_AT_byte_size (0x08)

0x0000029d:   DW_TAG_base_type
                DW_AT_name ("i64")
                DW_AT_encoding (DW_ATE_unsigned)
                DW_AT_byte_size (0x08)

0x000002a4:   NULL
```
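To check for the missing nested function without reading the whole dump by eye, the text written by llvm-dwarfdump can be scanned for `DW_TAG_subprogram` entries. The following is a rough sketch that assumes the usual one-attribute-per-line layout of the tool's output; `subprogram_names` is a hypothetical helper name, and the parsing is deliberately simplistic:

```python
import re
from typing import List

# A DIE header line looks like "0x0000004d:   DW_TAG_subprogram" (or "... NULL").
_DIE = re.compile(r"^\s*0x[0-9a-fA-F]+:\s+(DW_TAG_\w+|NULL)")
# Matches DW_AT_name only; DW_AT_linkage_name does not start with this token.
_NAME = re.compile(r'^\s*DW_AT_name\s+\("(.*)"\)\s*$')


def subprogram_names(dump_text: str) -> List[str]:
    """Collect DW_AT_name of every DW_TAG_subprogram in llvm-dwarfdump text."""
    names: List[str] = []
    current_tag = None
    for line in dump_text.splitlines():
        die = _DIE.match(line)
        if die:
            current_tag = die.group(1)
            continue
        name = _NAME.match(line)
        if name and current_tag == "DW_TAG_subprogram":
            names.append(name.group(1))
    return names


if __name__ == "__main__":
    # "dwarf" is the text file written by: llvm-dwarfdump xxxx.elf > dwarf
    with open("dwarf") as fh:
        print(subprogram_names(fh.read()))
```

For the dump above this would report only `data_parallel_sum`; a separate entry for the device function called from the kernel is what appears to be missing.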
1e-to commented 3 years ago

Also, the .asm file in /tmp/IntelIGC/pytho_xxxxxx/ contains nothing of interest about nested functions or debug information, only the code:

```python //.kernel dppyPy_dppy_py_devfn__5F__5F_main_5F__5F__2E_data_5F_parallel_5F_sum_24_1_2E_array_28_float32_2C__20_1d_2C__20_C_29__2E_array_28_float32_2C__20_1d_2C__20_C_29__2E_array_28_float32_2C__20_1d_2C__20_C_29_ //.platform SKL //.thread_config numGRF=128, numAcc=2 //.options_string "-debug " //.full_options "-debug -emitLocation -enableStructurizer -enablePreemption -noStitchExternFunc -TotalGRFNum 128 4 -nopresched -nodpsendreorder -dontUseMultiThreadedLatencies -SBIDDepLoc -output -binary -dumpcommonisa -dumpvisa -noverifyCISA -generateDebugInfo -setstartbp -addKernelID " //.instCount 129 //.RA type TRIVIAL_RA //.declare BuiltinR0 rf=r size=32 type=ud align=16 words (r0.0) //.declare rf=r size=32 type=ud align=16 words (r11.0) //.declare BuiltinA0 rf=a size=4 type=ud align=1 words (a0.0) //.declare BuiltinA0Dot2 rf=a size=4 type=ud align=1 words (a0.2) //.declare hw_tid rf=r size=4 type=ud align=2 words (r5.0) //.declare %null rf=r size=4 type=ud align=2 words //.declare %local_id_x rf=r size=4 type=ud align=2 words (r3.3) //.declare %local_id_y rf=r size=4 type=ud align=2 words (r3.4) //.declare %local_size_x rf=r size=4 type=ud align=2 words (r2.7) //.declare %local_size_y rf=r size=4 type=ud align=2 words (r3.0) //.declare %group_id_x rf=r size=4 type=ud align=2 words (r0.1) //.declare %group_id_y rf=r size=4 type=ud align=2 words (r0.6) //.declare %group_id_z rf=r size=4 type=ud align=2 words (r0.7) //.declare %group_count_x rf=r size=4 type=ud align=2 words (r3.1) //.declare %group_count_y rf=r size=4 type=ud align=2 words (r3.2) //.declare %tsc rf=r size=20 type=ud align=2 words //.declare %arg rf=r size=0 type=ud align=16 words (r26.0) //.declare %retval rf=r size=0 type=ud align=16 words (r26.0) Output //.declare %sp rf=r size=8 type=uq align=4 words (r125.3) //.declare %fp rf=r size=8 type=uq align=4 words (r125.2) //.declare %sr0 rf=r size=16 type=ud align=2 words //.declare %cr0 rf=r size=12 type=ud align=2 words //.declare %ce0 rf=r size=4 
type=ud align=2 words //.declare %dbg0 rf=r size=8 type=ud align=2 words //.declare V0033 rf=r size=32 type=d alias=+0 align=16 words (r11.0) //.declare V0034 rf=r size=8 type=uq align=4 words (r5.2) //.declare V0035 rf=r size=8 type=q align=4 words (r7.3) //.declare V0036 rf=r size=8 type=uq align=4 words (r6.1) //.declare V0037 rf=r size=8 type=q align=4 words (r8.3) //.declare V0038 rf=r size=8 type=uq align=4 words (r7.0) //.declare V0039 rf=r size=8 type=q align=4 words (r9.3) //.declare V0041 rf=r size=32 type=d alias=+0 align=16 words (r11.0) //.declare V0042 rf=r size=32 type=d align=16 words (r4.0) //.declare V0043 rf=r size=12 type=d align=2 words (r10.4) //.declare V0044 rf=r size=32 type=w align=16 words (r1.0) //.declare V0045 rf=r size=32 type=w align=16 words (r2.0) //.declare V0046 rf=r size=32 type=w align=16 words (r3.0) //.declare V0047 rf=r size=8 type=uq align=4 words (r10.1) Output //.declare V0057 rf=r size=16 type=w align=16 words (r8.0) //.declare V0058 rf=r size=4 type=d align=2 words (r5.1) //.declare V0060 rf=r size=4 type=ud alias=hw_tid+0 align=16 words (r5.0) //.declare V0061 rf=r size=4 type=ud alias=V0058+0 align=2 words (r5.1) //.declare V0062 rf=r size=4 type=d align=2 words (r5.2) Output //.declare V0063 rf=r size=4 type=d align=2 words (r5.3) //.declare V0064 rf=r size=32 type=d align=16 words (r12.0) //.declare V0065 rf=r size=16 type=uw alias=V0057+0 align=16 words (r8.0) //.declare V0066 rf=r size=8 type=q alias=V0047+0 align=4 words (r10.1) //.declare V0067 rf=r size=64 type=q align=16 words (r13.0) //.declare V0068 rf=r size=64 type=uq alias=V0067+0 align=16 words (r13.0) //.declare V0069 rf=r size=64 type=uq align=16 words (r15.0) //.declare P01 rf=f size=2 type=uw align=1 words (f0.0) //.declare V0070 rf=r size=2 type=b align=1 words (r5.24) //.declare V0071 rf=r size=4 type=d align=2 words (r5.7) //.declare V0072 rf=r size=2 type=ub alias=V0070+0 align=1 words (r5.24) //.declare V0073 rf=r size=4 type=d align=2 words 
(r6.0) //.declare V0074 rf=r size=4 type=d align=2 words (r6.1) //.declare V0075 rf=r size=4 type=uw alias=V0073+0 align=2 words (r6.0) //.declare V0076 rf=r size=2 type=uw align=1 words (r5.13) //.declare A00 rf=a size=2 type=uw align=1 words (a0.0) //.declare V0077 rf=r size=4 type=d align=2 words (r6.4) //.declare V0078 rf=r size=32 type=d align=16 words (r17.0) //.declare V0079 rf=r size=32 type=uw alias=V0044+0 align=16 words (r1.0) //.declare V0080 rf=r size=64 type=q align=16 words (r18.0) //.declare V0081 rf=r size=32 type=ud alias=V0078+0 align=16 words (r17.0) //.declare P02 rf=f size=2 type=uw align=1 words (f0.0) //.declare V0082 rf=r size=2 type=b align=1 words (r6.20) //.declare V0083 rf=r size=4 type=d align=2 words (r6.6) //.declare V0084 rf=r size=2 type=ub alias=V0082+0 align=1 words (r6.20) //.declare V0085 rf=r size=4 type=d align=2 words (r6.7) //.declare V0086 rf=r size=4 type=d align=2 words (r7.2) //.declare V0087 rf=r size=4 type=uw alias=V0085+0 align=2 words (r6.14) //.declare V0088 rf=r size=2 type=uw align=1 words (r6.11) //.declare A01 rf=a size=2 type=uw align=1 words (a0.0) //.declare V0089 rf=r size=4 type=d align=2 words (r7.3) //.declare V0090 rf=r size=32 type=d align=16 words (r20.0) //.declare V0091 rf=r size=32 type=uw alias=V0045+0 align=16 words (r2.0) //.declare V0092 rf=r size=64 type=q align=16 words (r21.0) //.declare V0093 rf=r size=32 type=ud alias=V0090+0 align=16 words (r20.0) //.declare P03 rf=f size=2 type=uw align=1 words (f0.0) //.declare V0094 rf=r size=2 type=b align=1 words (r7.16) //.declare V0095 rf=r size=4 type=d align=2 words (r7.5) //.declare V0096 rf=r size=2 type=ub alias=V0094+0 align=1 words (r7.16) //.declare V0097 rf=r size=4 type=d align=2 words (r8.4) //.declare V0098 rf=r size=4 type=d align=2 words (r8.5) //.declare V0099 rf=r size=4 type=uw alias=V0097+0 align=2 words (r8.8) //.declare V0100 rf=r size=2 type=uw align=1 words (r7.9) //.declare A02 rf=a size=2 type=uw align=1 words (a0.0) 
//.declare V0101 rf=r size=4 type=d align=2 words (r9.0) //.declare V0102 rf=r size=32 type=d align=16 words (r23.0) //.declare V0103 rf=r size=32 type=uw alias=V0046+0 align=16 words (r3.0) //.declare V0104 rf=r size=64 type=q align=16 words (r24.0) //.declare V0105 rf=r size=32 type=ud alias=V0102+0 align=16 words (r23.0) //.declare V0106 rf=r size=32 type=d align=16 words (r26.0) //.declare V0107 rf=r size=64 type=q align=16 words (r27.0) //.declare V0108 rf=r size=32 type=ud alias=V0106+0 align=16 words (r26.0) //.declare V0109 rf=r size=32 type=d align=16 words (r29.0) //.declare V0110 rf=r size=64 type=q align=16 words (r30.0) //.declare V0111 rf=r size=32 type=ud alias=V0109+0 align=16 words (r29.0) //.declare V0112 rf=r size=32 type=d align=16 words (r32.0) //.declare V0113 rf=r size=64 type=q align=16 words (r33.0) //.declare V0114 rf=r size=32 type=ud alias=V0112+0 align=16 words (r32.0) //.declare P04 rf=f size=2 type=uw align=1 words (f0.0) //.declare V0115 rf=r size=2 type=b align=1 words (r9.4) //.declare V0116 rf=r size=4 type=d align=2 words (r9.2) //.declare V0117 rf=r size=2 type=ub alias=V0115+0 align=1 words (r9.4) //.declare V0118 rf=r size=4 type=d align=2 words (r9.3) //.declare V0119 rf=r size=4 type=d align=2 words (r9.4) //.declare V0120 rf=r size=4 type=uw alias=V0118+0 align=2 words (r9.6) //.declare V0121 rf=r size=2 type=uw align=1 words (r9.3) //.declare A03 rf=a size=2 type=uw align=1 words (a0.0) //.declare V0122 rf=r size=8 type=q align=4 words (r10.0) //.declare V0123 rf=r size=4 type=ud alias=V0119+0 align=2 words (r9.4) //.declare P05 rf=f size=2 type=uw align=1 words (f0.0) //.declare V0124 rf=r size=2 type=b align=1 words (r9.20) //.declare V0125 rf=r size=4 type=d align=2 words (r10.7) //.declare V0126 rf=r size=2 type=ub alias=V0124+0 align=1 words (r9.20) //.declare V0127 rf=r size=4 type=d align=2 words (r35.0) //.declare V0128 rf=r size=4 type=d align=2 words (r35.1) //.declare V0129 rf=r size=4 type=uw alias=V0127+0 
align=2 words (r35.0) //.declare V0130 rf=r size=2 type=uw align=1 words (r9.11) //.declare A04 rf=a size=2 type=uw align=1 words (a0.0) //.declare V0131 rf=r size=8 type=q align=4 words (r35.1) //.declare V0132 rf=r size=4 type=ud alias=V0128+0 align=2 words (r35.1) //.declare P06 rf=f size=2 type=uw align=1 words (f0.0) //.declare V0133 rf=r size=2 type=b align=1 words (r35.16) //.declare V0134 rf=r size=4 type=d align=2 words (r35.5) //.declare V0135 rf=r size=2 type=ub alias=V0133+0 align=1 words (r35.16) //.declare V0136 rf=r size=4 type=d align=2 words (r35.6) //.declare V0137 rf=r size=4 type=d align=2 words (r35.7) //.declare V0138 rf=r size=4 type=uw alias=V0136+0 align=2 words (r35.12) //.declare V0139 rf=r size=2 type=uw align=1 words (r35.9) //.declare A05 rf=a size=2 type=uw align=1 words (a0.0) //.declare V0140 rf=r size=8 type=q align=4 words (r36.0) //.declare V0141 rf=r size=4 type=ud alias=V0137+0 align=2 words (r35.7) //.declare P07 rf=f size=2 type=uw align=1 words (f0.0) //.declare V0142 rf=r size=2 type=b align=1 words (r36.8) //.declare V0143 rf=r size=4 type=d align=2 words (r36.3) //.declare V0144 rf=r size=2 type=ub alias=V0142+0 align=1 words (r36.8) //.declare V0145 rf=r size=4 type=d align=2 words (r36.4) //.declare V0146 rf=r size=4 type=d align=2 words (r36.5) //.declare V0147 rf=r size=4 type=uw alias=V0145+0 align=2 words (r36.8) //.declare V0148 rf=r size=2 type=uw align=1 words (r36.5) //.declare A06 rf=a size=2 type=uw align=1 words (a0.0) //.declare V0149 rf=r size=8 type=q align=4 words (r36.3) //.declare V0150 rf=r size=12 type=ud alias=V0043+0 align=2 words (r10.4) //.declare V0151 rf=r size=4 type=ud alias=V0146+0 align=2 words (r36.5) //.declare V0152 rf=r size=32 type=d align=16 words (r37.0) //.declare V0153 rf=r size=64 type=q align=16 words (r38.0) //.declare V0154 rf=r size=32 type=ud alias=V0152+0 align=16 words (r37.0) //.declare V0155 rf=r size=8 type=q align=4 words (r40.0) //.declare V0156 rf=r size=32 type=ud 
alias=V0042+0 align=16 words (r4.0) //.declare V0157 rf=r size=64 type=q align=16 words (r41.0) //.declare P08 rf=f size=2 type=uw align=1 words (f0.0) //.declare V0158 rf=r size=64 type=q align=16 words (r43.0) //.declare V0159 rf=r size=8 type=q alias=V0034+0 align=4 words (r5.2) //.declare V0160 rf=r size=64 type=uq alias=V0158+0 align=16 words (r43.0) //.declare V0161 rf=r size=32 type=f align=16 words (r45.0) //.declare V0162 rf=r size=64 type=q align=16 words (r46.0) //.declare V0163 rf=r size=8 type=q alias=V0036+0 align=4 words (r6.1) //.declare V0164 rf=r size=64 type=uq alias=V0162+0 align=16 words (r46.0) //.declare V0165 rf=r size=32 type=f align=16 words (r48.0) //.declare V0166 rf=r size=64 type=q align=16 words (r49.0) //.declare V0167 rf=r size=8 type=q alias=V0038+0 align=4 words (r7.0) //.declare V0168 rf=r size=64 type=uq alias=V0166+0 align=16 words (r49.0) //.declare V0169 rf=r size=64 type=uq align=16 words (r51.0) //.declare V0170 rf=r size=8 type=uq align=4 words (r5.0) //.declare V0171 rf=r size=8 type=uq align=4 words (r5.1) //.declare V0172 rf=r size=8 type=uq align=4 words (r5.3) //.declare V0173 rf=r size=8 type=uq align=4 words (r6.0) //.declare V0174 rf=r size=8 type=uq align=4 words (r6.2) //.declare V0175 rf=r size=8 type=uq align=4 words (r6.3) //.declare V0176 rf=r size=8 type=q align=4 words (r7.1) //.declare V0177 rf=r size=8 type=q align=4 words (r7.2) //.declare V0178 rf=r size=8 type=q align=4 words (r8.0) //.declare V0179 rf=r size=8 type=q align=4 words (r8.1) //.declare V0180 rf=r size=8 type=q align=4 words (r8.2) //.declare V0181 rf=r size=8 type=q align=4 words (r9.0) //.declare V0182 rf=r size=8 type=q align=4 words (r9.1) //.declare V0183 rf=r size=8 type=q align=4 words (r9.2) //.declare V0184 rf=r size=8 type=q align=4 words (r10.0) //.declare rf=r size=32 type=ud align=16 words (r127.0) //.declare rf=r size=2 type=w align=1 words (r40.4) //.declare rf=r size=2 type=w align=1 words (r40.5) //.declare rf=r size=2 
type=w align=1 words (r40.6) //.declare rf=r size=2 type=w align=1 words (r40.7) //.declare rf=r size=2 type=w align=1 words (r40.8) //.declare rf=r size=2 type=w align=1 words (r40.9) //.declare rf=r size=2 type=w align=1 words (r40.10) //.declare rf=r size=2 type=w align=1 words (r40.11) //.declare rf=r size=2 type=w align=1 words (r40.12) //.declare rf=r size=2 type=w align=1 words (r40.13) //.declare rf=r size=2 type=w align=1 words (r40.14) //.declare rf=r size=2 type=w align=1 words (r53.0) //.declare rf=r size=2 type=w align=1 words (r53.1) //.declare rf=r size=2 type=w align=1 words (r53.2) // .inputs // +----------+----------+--------+----------+------------+ // | id | type | bytes | at | class | // +----------+----------+--------+----------+------------+ // | V0044 | :w x 16 | 32 | r1 | general | // | V0045 | :w x 16 | 32 | r2 | general | // | V0046 | :w x 16 | 32 | r3 | general | // | V0042 | :d x 8 | 32 | r4 | general | // | V0170 | :uq | 8 | r5 | general | // | V0171 | :uq | 8 | r5+8 | general | // | V0034 | :uq | 8 | r5+16 | general | // | V0172 | :uq | 8 | r5+24 | general | // | V0173 | :uq | 8 | r6 | general | // | V0036 | :uq | 8 | r6+8 | general | // | V0174 | :uq | 8 | r6+16 | general | // | V0175 | :uq | 8 | r6+24 | general | // | V0038 | :uq | 8 | r7 | general | // | V0176 | :q | 8 | r7+8 | general | // | V0177 | :q | 8 | r7+16 | general | // | V0035 | :q | 8 | r7+24 | general | // | V0178 | :q | 8 | r8 | general | // | V0179 | :q | 8 | r8+8 | general | // | V0180 | :q | 8 | r8+16 | general | // | V0037 | :q | 8 | r8+24 | general | // | V0181 | :q | 8 | r9 | general | // | V0182 | :q | 8 | r9+8 | general | // | V0183 | :q | 8 | r9+16 | general | // | V0039 | :q | 8 | r9+24 | general | // | V0184 | :q | 8 | r10 | general | // | V0047 | :uq | 8 | r10+8 | general | // | V0043 | :d x 3 | 12 | r10+16 | general | // +----------+----------+--------+----------+------------+ //_main: L0: mov (1|M0) null<1>:ud 0x76948161:ud {Breakpoint} // [00000] (W) 
mov (8|M0) r11.0<1>:ud r0.0<1;1,0>:ud // [00010] (W) and (1|M0) r5.0<1>:ud r0.5<0;1,0>:ud 0x1FF:ud // &1:[00020] (W) or (1|M0) cr0.0<1>:ud cr0.0<0;1,0>:ud 0x4C0:uw {Switch} // $1:&2:[00030] (W) mov (8|M0) r8.0<1>:w 0x76543210:v // $2:&3:[00040] (W) mov (1|M0) r5.1<1>:ud r5.0<0;1,0>:ud // $3:&4:[00050] (W) mul (1|M0) r5.2<1>:d r5.1<0;1,0>:d 64:w // $4:&5:[00060] (W) add (1|M0) r5.3<1>:d r5.2<0;1,0>:d 0:w // $5:&6:[00070] mov (8|M0) r12.0<1>:d r8.0<8;8,1>:uw // $6:&7:[00080] mul (8|M0) r12.0<1>:d r12.0<8;8,1>:d 8:w // $7:&8:[00090] add (8|M0) r12.0<1>:d r5.3<0;1,0>:d r12.0<8;8,1>:d // $8:&9:[000a0] mov (8|M0) r13.0<1>:q r12.0<8;8,1>:d // $9:&10:[000b0] add (8|M0) r13.0<1>:q r10.1<0;1,0>:q r13.0<4;4,1>:q // $10:&11:[000c0] mov (8|M0) r15.0<1>:uq 0x0:uw // $11:&12:[000d0] sends (8|M0) null:ud r13 r15 0x8C 0x040682FF // wr:2+2, rd:0; hdc.dc1; a64 qword scattering write // $12:&13:[000e0] (W) mov (1|M0) r40.4<1>:w 0:w // $13:&14:[000f0] (W) cmp (1|M0) (eq)f0.0 null<1>:w r40.4<0;1,0>:w 0:w // $13:&15:[00100] (W) mov (1|M0) r40.5<1>:w 1:w // $14:&16:[00110] (W&f0.0) sel (1|M0) r5.24<2>:b r40.5<0;1,0>:w 5:w // $14:&17:[00120] (W) mov (1|M0) r5.7<1>:d r5.24<0;1,0>:ub // $15:&18:[00130] (W) add (1|M0) r6.0<1>:d r5.7<0;1,0>:d 0:w // $16:&19:[00140] (W) mul (1|M0) r5.13<1>:uw r6.0<0;1,0>:uw 0x4:uw // $17:&20:[00150] (W) add (1|M0) a0.0<1>:uw r5.13<0;1,0>:uw 0x160:uw // $18:&21:[00160] (W) mov (1|M0) r6.1<1>:d r[a0.0]<0;1,0>:d // $19:&22:[00170] (W) mul (1|M0) r6.4<1>:d r10.4<0;1,0>:d r6.1<0;1,0>:d // $20:&23:[00180] mov (8|M0) r17.0<1>:d r1.0<8;8,1>:uw // $21:&24:[00190] add (8|M0) r17.0<1>:d r17.0<8;8,1>:d r6.4<0;1,0>:d // $22:&25:[001a0] add (8|M0) r17.0<1>:d r17.0<8;8,1>:d r4.0<0;1,0>:d // $23:&26:[001b0] mov (8|M0) r18.0<1>:q r17.0<8;8,1>:ud // $24:&27:[001c0] (W) mov (1|M0) r40.6<1>:w 1:w // $25:&28:[001d0] (W) cmp (1|M0) (eq)f0.0 null<1>:w r40.6<0;1,0>:w 0:w // $25:&29:[001e0] (W) mov (1|M0) r40.7<1>:w 1:w // $26:&30:[001f0] (W&f0.0) sel (1|M0) r6.20<2>:b r40.7<0;1,0>:w 
5:w // $26:&31:[00200] (W) mov (1|M0) r6.6<1>:d r6.20<0;1,0>:ub // $27:&32:[00210] (W) add (1|M0) r6.7<1>:d r6.6<0;1,0>:d 1:w // $28:&33:[00220] (W) mul (1|M0) r6.11<1>:uw r6.14<0;1,0>:uw 0x4:uw // $29:&34:[00230] (W) add (1|M0) a0.0<1>:uw r6.11<0;1,0>:uw 0x160:uw // $30:&35:[00240] (W) mov (1|M0) r7.2<1>:d r[a0.0]<0;1,0>:d // $31:&36:[00250] (W) mul (1|M0) r7.3<1>:d r10.5<0;1,0>:d r7.2<0;1,0>:d // $32:&37:[00260] mov (8|M0) r20.0<1>:d r2.0<8;8,1>:uw // $33:&38:[00270] add (8|M0) r20.0<1>:d r20.0<8;8,1>:d r7.3<0;1,0>:d // $34:&39:[00280] add (8|M0) r20.0<1>:d r20.0<8;8,1>:d r4.1<0;1,0>:d // $35:&40:[00290] mov (8|M0) r21.0<1>:q r20.0<8;8,1>:ud // $36:&41:[002a0] (W) mov (1|M0) r40.8<1>:w 2:w // $37:&42:[002b0] (W) cmp (1|M0) (eq)f0.0 null<1>:w r40.8<0;1,0>:w 0:w // $37:&43:[002c0] (W) mov (1|M0) r40.9<1>:w 1:w // $38:&44:[002d0] (W&f0.0) sel (1|M0) r7.16<2>:b r40.9<0;1,0>:w 5:w // $38:&45:[002e0] (W) mov (1|M0) r7.5<1>:d r7.16<0;1,0>:ub // $39:&46:[002f0] (W) add (1|M0) r8.4<1>:d r7.5<0;1,0>:d 2:w // $40:&47:[00300] (W) mul (1|M0) r7.9<1>:uw r8.8<0;1,0>:uw 0x4:uw // $41:&48:[00310] (W) add (1|M0) a0.0<1>:uw r7.9<0;1,0>:uw 0x160:uw // $42:&49:[00320] (W) mov (1|M0) r8.5<1>:d r[a0.0]<0;1,0>:d // $43:&50:[00330] (W) mul (1|M0) r9.0<1>:d r10.6<0;1,0>:d r8.5<0;1,0>:d // $44:&51:[00340] mov (8|M0) r23.0<1>:d r3.0<8;8,1>:uw // $45:&52:[00350] add (8|M0) r23.0<1>:d r23.0<8;8,1>:d r9.0<0;1,0>:d // $46:&53:[00360] add (8|M0) r23.0<1>:d r23.0<8;8,1>:d r4.2<0;1,0>:d // $47:&54:[00370] mov (8|M0) r24.0<1>:q r23.0<8;8,1>:ud // $48:&55:[00380] // File: /localdisk/work/etotmeni/stepping/sum.py // Line 15: @dppy.kernel mov (8|M0) r26.0<1>:d r1.0<8;8,1>:uw // #15:$51:&56:[00390] mov (8|M0) r27.0<1>:q r26.0<8;8,1>:ud // #15:$52:&57:[003a0] mov (8|M0) r29.0<1>:d r2.0<8;8,1>:uw // #15:$53:&58:[003b0] mov (8|M0) r30.0<1>:q r29.0<8;8,1>:ud // #15:$54:&59:[003c0] mov (8|M0) r32.0<1>:d r3.0<8;8,1>:uw // #15:$55:&60:[003d0] mov (8|M0) r33.0<1>:q r32.0<8;8,1>:ud // #15:$56:&61:[003e0] (W) 
mov (1|M0) r40.10<1>:w 0:w // #15:$57:&62:[003f0] (W) cmp (1|M0) (eq)f0.0 null<1>:w r40.10<0;1,0>:w 0:w // #15:$57:&63:[00400] (W) mov (1|M0) r40.11<1>:w 1:w // #15:$58:&64:[00410] (W&f0.0) sel (1|M0) r9.4<2>:b r40.11<0;1,0>:w 5:w // #15:$58:&65:[00420] (W) mov (1|M0) r9.2<1>:d r9.4<0;1,0>:ub // #15:$59:&66:[00430] (W) add (1|M0) r9.3<1>:d r9.2<0;1,0>:d 0:w // #15:$60:&67:[00440] (W) mul (1|M0) r9.3<1>:uw r9.6<0;1,0>:uw 0x4:uw // #15:$61:&68:[00450] (W) add (1|M0) a0.0<1>:uw r9.3<0;1,0>:uw 0x160:uw // #15:$62:&69:[00460] (W) mov (1|M0) r9.4<1>:d r[a0.0]<0;1,0>:d // #15:$63:&70:[00470] (W) mov (1|M0) r10.0<1>:q r9.4<0;1,0>:ud // #15:$64:&71:[00480] (W) mov (1|M0) r40.12<1>:w 1:w // #15:$65:&72:[00490] (W) cmp (1|M0) (eq)f0.0 null<1>:w r40.12<0;1,0>:w 0:w // #15:$65:&73:[004a0] (W) mov (1|M0) r40.13<1>:w 1:w // #15:$66:&74:[004b0] (W&f0.0) sel (1|M0) r9.20<2>:b r40.13<0;1,0>:w 5:w // #15:$66:&75:[004c0] (W) mov (1|M0) r10.7<1>:d r9.20<0;1,0>:ub // #15:$67:&76:[004d0] (W) add (1|M0) r35.0<1>:d r10.7<0;1,0>:d 1:w // #15:$68:&77:[004e0] (W) mul (1|M0) r9.11<1>:uw r35.0<0;1,0>:uw 0x4:uw // #15:$69:&78:[004f0] (W) add (1|M0) a0.0<1>:uw r9.11<0;1,0>:uw 0x160:uw // #15:$70:&79:[00500] (W) mov (1|M0) r35.1<1>:d r[a0.0]<0;1,0>:d // #15:$71:&80:[00510] (W) mov (1|M0) r35.1<1>:q r35.1<0;1,0>:ud // #15:$72:&81:[00520] (W) mov (1|M0) r40.14<1>:w 2:w // #15:$73:&82:[00530] (W) cmp (1|M0) (eq)f0.0 null<1>:w r40.14<0;1,0>:w 0:w // #15:$73:&83:[00540] (W) mov (1|M0) r53.0<1>:w 1:w // #15:$74:&84:[00550] (W&f0.0) sel (1|M0) r35.16<2>:b r53.0<0;1,0>:w 5:w // #15:$74:&85:[00560] (W) mov (1|M0) r35.5<1>:d r35.16<0;1,0>:ub // #15:$75:&86:[00570] (W) add (1|M0) r35.6<1>:d r35.5<0;1,0>:d 2:w // #15:$76:&87:[00580] (W) mul (1|M0) r35.9<1>:uw r35.12<0;1,0>:uw 0x4:uw // #15:$77:&88:[00590] (W) add (1|M0) a0.0<1>:uw r35.9<0;1,0>:uw 0x160:uw // #15:$78:&89:[005a0] (W) mov (1|M0) r35.7<1>:d r[a0.0]<0;1,0>:d // #15:$79:&90:[005b0] (W) mov (1|M0) r36.0<1>:q r35.7<0;1,0>:ud // #15:$80:&91:[005c0] // 
Line 17: i = dppy.get_global_id(0) (W) mov (1|M0) r53.1<1>:w 0:w // #17:$83:&92:[005d0] (W) cmp (1|M0) (eq)f0.0 null<1>:w r53.1<0;1,0>:w 0:w // #17:$83:&93:[005e0] (W) mov (1|M0) r53.2<1>:w 1:w // #17:$84:&94:[005f0] (W&f0.0) sel (1|M0) r36.8<2>:b r53.2<0;1,0>:w 5:w // #17:$84:&95:[00600] (W) mov (1|M0) r36.3<1>:d r36.8<0;1,0>:ub // #17:$85:&96:[00610] (W) add (1|M0) r36.4<1>:d r36.3<0;1,0>:d 0:w // #17:$86:&97:[00620] (W) mul (1|M0) r36.5<1>:uw r36.8<0;1,0>:uw 0x4:uw // #17:$87:&98:[00630] (W) add (1|M0) a0.0<1>:uw r36.5<0;1,0>:uw 0x160:uw // #17:$88:&99:[00640] (W) mov (1|M0) r36.5<1>:d r[a0.0]<0;1,0>:d // #17:$89:&100:[00650] (W) mul (1|M0) r36.3<1>:q r10.4<0;1,0>:ud r36.5<0;1,0>:ud // #17:$90:&101:[00660] mov (8|M0) r37.0<1>:d r1.0<8;8,1>:uw // #17:$91:&102:[00670] mov (8|M0) r38.0<1>:q r37.0<8;8,1>:ud // #17:$92:&103:[00680] add (8|M0) r38.0<1>:q r38.0<4;4,1>:q r36.3<0;1,0>:q // #17:$93:&104:[00690] (W) mov (1|M0) r40.0<1>:q r4.0<0;1,0>:ud // #17:$94:&105:[006a0] add (8|M0) r41.0<1>:q r38.0<4;4,1>:q r40.0<0;1,0>:q // #17:$95:&106:[006b0] // Line 18: c[i] = kernel_sum(a[i], b[i]) cmp (8|M0) (lt)f0.0 null<1>:q r41.0<4;4,1>:q 0:w // #18:$98:&107:[006c0] (f0.0) sel (8|M0) r43.0<1>:q r7.3<0;1,0>:q 0:w // #18:$99:&108:[006d0] add (8|M0) r43.0<1>:q r43.0<4;4,1>:q r41.0<4;4,1>:q // #18:$100:&109:[006e0] shl (8|M0) r43.0<1>:q r43.0<4;4,1>:q 2:w // #18:$101:&110:[006f0] add (8|M0) r43.0<1>:q r5.2<0;1,0>:q r43.0<4;4,1>:q // #18:$102:&111:[00700] send (8|M0) r45:f r43:uq 0xC 0x041401FF // wr:2+0, rd:1; hdc.dc1; a64 dword gathering read // #18:$103:&112:[00710] (f0.0) sel (8|M0) r46.0<1>:q r8.3<0;1,0>:q 0:w // #18:$104:&113:[00720] add (8|M0) r46.0<1>:q r46.0<4;4,1>:q r41.0<4;4,1>:q // #18:$105:&114:[00730] shl (8|M0) r46.0<1>:q r46.0<4;4,1>:q 2:w // #18:$106:&115:[00740] add (8|M0) r46.0<1>:q r6.1<0;1,0>:q r46.0<4;4,1>:q // #18:$107:&116:[00750] send (8|M0) r48:f r46:uq 0xC 0x041401FF // wr:2+0, rd:1; hdc.dc1; a64 dword gathering read // #18:$108:&117:[00760] add (8|M0) 
r45.0<1>:f r45.0<8;8,1>:f r48.0<8;8,1>:f // #18:$109:&118:[00770] (f0.0) sel (8|M0) r49.0<1>:q r9.3<0;1,0>:q 0:w // #18:$110:&119:[00780] add (8|M0) r49.0<1>:q r49.0<4;4,1>:q r41.0<4;4,1>:q // #18:$111:&120:[00790] shl (8|M0) r49.0<1>:q r49.0<4;4,1>:q 2:w // #18:$112:&121:[007a0] add (8|M0) r49.0<1>:q r7.0<0;1,0>:q r49.0<4;4,1>:q // #18:$113:&122:[007b0] sends (8|M0) null:ud r49 r45 0x4C 0x040681FF // wr:2+1, rd:0; hdc.dc1; a64 dword scattering write // #18:$114:&123:[007c0] mov (8|M0) r51.0<1>:uq 0x0:uw // #18:$117:&124:[007d0] sends (8|M0) null:ud r13 r51 0x8C 0x040682FF // wr:2+2, rd:0; hdc.dc1; a64 qword scattering write // #18:$118:&125:[007e0] (W) mov (8|M0) r127.0<1>:ud r11.0<8;8,1>:ud // #18:$119:&126:[007f0] (W) send (8|M0) null r127 0x27 0x02000010 {EOT} // wr:1+0, rd:0; spawner; end of thread // #18:$119:&127:[00800]// Bank Conflict Statistics: // -- GOOD: 0 // -- BAD: 0 // -- OK: 0 ```
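The listing does carry source annotations: `// Line N: ...` markers and a `#N:` tag in each instruction's trailing comment. A small sketch that summarises which source lines the instructions are attributed to, assuming the .asm file on disk has one instruction per line (the paste above is flattened) and that the `#N:$...` comment format is as shown; `source_line_coverage` is a hypothetical helper name:

```python
import re
from collections import Counter
from typing import Dict, Tuple

# Source marker, e.g. "// Line 17: i = dppy.get_global_id(0)"
_SRC = re.compile(r"^\s*//\s*Line\s+(\d+):\s*(.*?)\s*$")
# Instruction tag, e.g. "// #17:$83:&92:[005d0]"
_INSTR = re.compile(r"//\s*#(\d+):\$\d+")


def source_line_coverage(asm_text: str) -> Dict[int, Tuple[str, int]]:
    """Map source line number -> (source text, number of tagged instructions)."""
    sources: Dict[int, str] = {}
    counts: Counter = Counter()
    for line in asm_text.splitlines():
        src = _SRC.match(line)
        if src:
            sources[int(src.group(1))] = src.group(2)
            continue
        instr = _INSTR.search(line)
        if instr:
            counts[int(instr.group(1))] += 1
    lines = sorted(set(sources) | set(counts))
    return {n: (sources.get(n, ""), counts[n]) for n in lines}


if __name__ == "__main__":
    # "kernel.asm" is a placeholder for the .asm file in the dump directory.
    with open("kernel.asm") as fh:
        for number, (text, count) in source_line_coverage(fh.read()).items():
            print(f"line {number}: {count} instructions  {text}")
```

In the listing above only lines 15, 17 and 18 of sum.py are tagged, so a summary like this would show no instructions attributed to the body of the nested function.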
1e-to commented 3 years ago

I also checked the .dbg file (type: data), but there is nothing useful there either:

hexdump -C xxxx.dbg > dump.txt

Output

```text
00000000 10 d0 ad de 01 00 ca 00 64 70 70 79 50 79 5f 64 |........dppyPy_d|
00000010 70 70 79 5f 70 79 5f 64 65 76 66 6e 5f 5f 35 46 |ppy_py_devfn__5F|
00000020 5f 5f 35 46 5f 6d 61 69 6e 5f 35 46 5f 5f 35 46 |__5F_main_5F__5F|
00000030 5f 5f 32 45 5f 64 61 74 61 5f 35 46 5f 70 61 72 |__2E_data_5F_par|
00000040 61 6c 6c 65 6c 5f 35 46 5f 73 75 6d 5f 32 34 5f |allel_5F_sum_24_|
00000050 31 5f 32 45 5f 61 72 72 61 79 5f 32 38 5f 66 6c |1_2E_array_28_fl|
00000060 6f 61 74 33 32 5f 32 43 5f 5f 32 30 5f 31 64 5f |oat32_2C__20_1d_|
00000070 32 43 5f 5f 32 30 5f 43 5f 32 39 5f 5f 32 45 5f |2C__20_C_29__2E_|
00000080 61 72 72 61 79 5f 32 38 5f 66 6c 6f 61 74 33 32 |array_28_float32|
00000090 5f 32 43 5f 5f 32 30 5f 31 64 5f 32 43 5f 5f 32 |_2C__20_1d_2C__2|
000000a0 30 5f 43 5f 32 39 5f 5f 32 45 5f 61 72 72 61 79 |0_C_29__2E_array|
000000b0 5f 32 38 5f 66 6c 6f 61 74 33 32 5f 32 43 5f 5f |_28_float32_2C__|
000000c0 32 30 5f 31 64 5f 32 43 5f 5f 32 30 5f 43 5f 32 |20_1d_2C__20_C_2|
000000d0 39 5f 00 00 00 00 00 00 00 00 70 00 00 00 01 00 |9_........p.....|
000000e0 00 00 30 00 00 00 02 00 00 00 40 00 00 00 03 00 |..0.......@.....|
000000f0 00 00 50 00 00 00 04 00 00 00 60 00 00 00 05 00 |..P.......`.....|
00000100 00 00 70 00 00 00 06 00 00 00 80 00 00 00 07 00 |..p.............|
00000110 00 00 90 00 00 00 08 00 00 00 a0 00 00 00 09 00 |................|
00000120 00 00 b0 00 00 00 0a 00 00 00 c0 00 00 00 0b 00 |................|
00000130 00 00 d0 00 00 00 0c 00 00 00 e0 00 00 00 0d 00 |................|
00000140 00 00 f0 00 00 00 0e 00 00 00 10 01 00 00 0f 00 |................|
00000150 00 00 30 01 00 00 10 00 00 00 40 01 00 00 11 00 |..0.......@.....|
00000160 00 00 50 01 00 00 12 00 00 00 60 01 00 00 13 00 |..P.......`.....|
00000170 00 00 70 01 00 00 14 00 00 00 80 01 00 00 15 00 |..p.............|
00000180 00 00 90 01 00 00 16 00 00 00 a0 01 00 00 17 00 |................|
00000190 00 00 b0 01 00 00 18 00 00 00 c0 01 00 00 19 00 |................|
000001a0 00 00 d0 01 00 00 1a 00 00 00 f0 01 00 00 1b 00 |................|
000001b0 00 00 10 02 00 00 1c 00 00 00 20 02 00 00 1d 00 |.......... .....|
000001c0 00 00 30 02 00 00 1e 00 00 00 40 02 00 00 1f 00 |..0.......@.....|
000001d0 00 00 50 02 00 00 20 00 00 00 60 02 00 00 21 00 |..P... ...`...!.|
000001e0 00 00 70 02 00 00 22 00 00 00 80 02 00 00 23 00 |..p...".......#.|
000001f0 00 00 90 02 00 00 24 00 00 00 a0 02 00 00 25 00 |......$.......%.|
00000200 00 00 b0 02 00 00 26 00 00 00 d0 02 00 00 27 00 |......&.......'.|
00000210 00 00 f0 02 00 00 28 00 00 00 00 03 00 00 29 00 |......(.......).|
00000220 00 00 10 03 00 00 2a 00 00 00 20 03 00 00 2b 00 |......*... ...+.|
00000230 00 00 30 03 00 00 2c 00 00 00 40 03 00 00 2d 00 |..0...,...@...-.|
00000240 00 00 50 03 00 00 2e 00 00 00 60 03 00 00 2f 00 |..P.......`.../.|
00000250 00 00 70 03 00 00 30 00 00 00 80 03 00 00 33 00 |..p...0.......3.|
00000260 00 00 90 03 00 00 34 00 00 00 a0 03 00 00 35 00 |......4.......5.|
00000270 00 00 b0 03 00 00 36 00 00 00 c0 03 00 00 37 00 |......6.......7.|
00000280 00 00 d0 03 00 00 38 00 00 00 e0 03 00 00 39 00 |......8.......9.|
00000290 00 00 f0 03 00 00 3a 00 00 00 10 04 00 00 3b 00 |......:.......;.|
000002a0 00 00 30 04 00 00 3c 00 00 00 40 04 00 00 3d 00 |..0...<...@...=.|
000002b0 00 00 50 04 00 00 3e 00 00 00 60 04 00 00 3f 00 |..P...>...`...?.|
000002c0 00 00 70 04 00 00 40 00 00 00 80 04 00 00 41 00 |..p...@.......A.|
000002d0 00 00 90 04 00 00 42 00 00 00 b0 04 00 00 43 00 |......B.......C.|
000002e0 00 00 d0 04 00 00 44 00 00 00 e0 04 00 00 45 00 |......D.......E.|
000002f0 00 00 f0 04 00 00 46 00 00 00 00 05 00 00 47 00 |......F.......G.|
00000300 00 00 10 05 00 00 48 00 00 00 20 05 00 00 49 00 |......H... ...I.|
00000310 00 00 30 05 00 00 4a 00 00 00 50 05 00 00 4b 00 |..0...J...P...K.|
00000320 00 00 70 05 00 00 4c 00 00 00 80 05 00 00 4d 00 |..p...L.......M.|
00000330 00 00 90 05 00 00 4e 00 00 00 a0 05 00 00 4f 00 |......N.......O.|
00000340 00 00 b0 05 00 00 50 00 00 00 c0 05 00 00 53 00 |......P.......S.|
00000350 00 00 d0 05 00 00 54 00 00 00 f0 05 00 00 55 00 |......T.......U.|
00000360 00 00 10 06 00 00 56 00 00 00 20 06 00 00 57 00 |......V... ...W.|
00000370 00 00 30 06 00 00 58 00 00 00 40 06 00 00 59 00 |..0...X...@...Y.|
00000380 00 00 50 06 00 00 5a 00 00 00 60 06 00 00 5b 00 |..P...Z...`...[.|
00000390 00 00 70 06 00 00 5c 00 00 00 80 06 00 00 5d 00 |..p...\.......].|
000003a0 00 00 90 06 00 00 5e 00 00 00 a0 06 00 00 5f 00 |......^......._.|
000003b0 00 00 b0 06 00 00 62 00 00 00 c0 06 00 00 63 00 |......b.......c.|
000003c0 00 00 d0 06 00 00 64 00 00 00 e0 06 00 00 65 00 |......d.......e.|
000003d0 00 00 f0 06 00 00 66 00 00 00 00 07 00 00 67 00 |......f.......g.|
000003e0 00 00 10 07 00 00 68 00 00 00 20 07 00 00 69 00 |......h... ...i.|
000003f0 00 00 30 07 00 00 6a 00 00 00 40 07 00 00 6b 00 |..0...j...@...k.|
00000400 00 00 50 07 00 00 6c 00 00 00 60 07 00 00 6d 00 |..P...l...`...m.|
00000410 00 00 70 07 00 00 6e 00 00 00 80 07 00 00 6f 00 |..p...n.......o.|
00000420 00 00 90 07 00 00 70 00 00 00 a0 07 00 00 71 00 |......p.......q.|
00000430 00 00 b0 07 00 00 72 00 00 00 c0 07 00 00 75 00 |......r.......u.|
00000440 00 00 d0 07 00 00 76 00 00 00 e0 07 00 00 77 00 |......v.......w.|
00000450 00 00 f0 07 00 00 78 00 00 00 10 08 00 00 9c 00 |......x.........|
00000460 00 00 03 00 56 33 33 01 00 00 00 77 00 02 02 0b |....V33....w....|
00000470 00 00 00 03 00 56 33 34 01 00 00 00 77 00 02 02 |.....V34....w...|
00000480 05 00 10 00 03 00 56 33 35 01 00 00 00 77 00 02 |......V35....w..|
00000490 02 07 00 18 00 03 00 56 33 36 01 00 00 00 77 00 |.......V36....w.|
000004a0 02 02 06 00 08 00 03 00 56 33 37 01 00 00 00 77 |........V37....w|
000004b0 00 02 02 08 00 18 00 03 00 56 33 38 01 00 00 00 |.........V38....|
000004c0 77 00 02 02 07 00 00 00 03 00 56 33 39 01 00 00 |w.........V39...|
000004d0 00 77 00 02 02 09 00 18 00 03 00 56 34 31 01 00 |.w.........V41..|
000004e0 00 00 77 00 02 02 0b 00 00 00 03 00 56 34 32 01 |..w.........V42.|
000004f0 00 00 00 77 00 02 02 04 00 00 00 03 00 56 34 33 |...w.........V43|
00000500 01 00 00 00 77 00 02 02 0a 00 10 00 03 00 56 34 |....w.........V4|
00000510 34 01 00 00 00 77 00 02 02 01 00 00 00 03 00 56 |4....w.........V|
00000520 34 35 01 00 00 00 77 00 02 02 02 00 00 00 03 00 |45....w.........|
00000530 56 34 36 01 00 00 00 77 00 02 02 03 00 00 00 03 |V46....w........|
00000540 00 56 34 37 01 00 00 00 77 00 02 02 0a 00 08 00 |.V47....w.......|
00000550 03 00 56 35 37 01 00 00 00 77 00 02 02 08 00 00 |..V57....w......|
00000560 00 03 00 56 35 38 01 00 00 00 77 00 02 02 05 00 |...V58....w.....|
00000570 04 00 03 00 56 36 30 01 00 00 00 77 00 02 02 05 |....V60....w....|
00000580 00 00 00 03 00 56 36 31 01 00 00 00 77 00 02 02 |.....V61....w...|
00000590 05 00 04 00 03 00 56 36 32 01 00 00 00 77 00 02 |......V62....w..|
000005a0 02 05 00 08 00 03 00 56 36 33 01 00 00 00 77 00 |.......V63....w.|
000005b0 02 02 05 00 0c 00 03 00 56 36 34 01 00 00 00 77 |........V64....w|
000005c0 00 02 02 0c 00 00 00 03 00 56 36 35 01 00 00 00 |.........V65....|
000005d0 77 00 02 02 08 00 00 00 03 00 56 36 36 01 00 00 |w.........V66...|
000005e0 00 77 00 02 02 0a 00 08 00 03 00 56 36 37 01 00 |.w.........V67..|
000005f0 00 00 77 00 02 02 0d 00 00 00 03 00 56 36 38 01 |..w.........V68.|
00000600 00 00 00 77 00 02 02 0d 00 00 00 03 00 56 36 39 |...w.........V69|
00000610 01 00 00 00 77 00 02 02 0f 00 00 00 02 00 50 31 |....w.........P1|
00000620 01 00 0e 00 0e 00 01 01 00 00 00 00 03 00 56 37 |..............V7|
00000630 30 01 00 00 00 77 00 02 02 05 00 18 00 03 00 56 |0....w.........V|
00000640 37 31 01 00 00 00 77 00 02 02 05 00 1c 00 03 00 |71....w.........|
00000650 56 37 32 01 00 00 00 77 00 02 02 05 00 18 00 03 |V72....w........|
00000660 00 56 37 33 01 00 00 00 77 00 02 02 06 00 00 00 |.V73....w.......|
00000670 03 00 56 37 34 01 00 00 00 77 00 02 02 06 00 04 |..V74....w......|
00000680 00 03 00 56 37 35 01 00 00 00 77 00 02 02 06 00 |...V75....w.....|
00000690 00 00 03 00 56 37 36 01 00 00 00 77 00 02 02 05 |....V76....w....|
000006a0 00 1a 00 02 00 41 30 01 00 13 00 13 00 00 00 00 |.....A0.........|
000006b0 00 00 00 03 00 56 37 37 01 00 00 00 77 00 02 02 |.....V77....w...|
000006c0 06 00 10 00 03 00 56 37 38 01 00 00 00 77 00 02 |......V78....w..|
000006d0 02 11 00 00 00 03 00 56 37 39 01 00 00 00 77 00 |.......V79....w.|
000006e0 02 02 01 00 00 00 03 00 56 38 30 01 00 00 00 77 |........V80....w|
000006f0 00 02 02 12 00 00 00 03 00 56 38 31 01 00 00 00 |.........V81....|
00000700 77 00 02 02 11 00 00 00 02 00 50 32 01 00 1a 00 |w.........P2....|
00000710 1a 00 01 01 00 00 00 00 03 00 56 38 32 01 00 00 |..........V82...|
00000720 00 77 00 02 02 06 00 14 00 03 00 56 38 33 01 00 |.w.........V83..|
00000730 00 00 77 00 02 02 06 00 18 00 03 00 56 38 34 01 |..w.........V84.|
00000740 00 00 00 77 00 02 02 06 00 14 00 03 00 56 38 35 |...w.........V85|
00000750 01 00 00 00 77 00 02 02 06 00 1c 00 03 00 56 38 |....w.........V8|
00000760 36 01 00 00 00 77 00 02 02 07 00 08 00 03 00 56 |6....w.........V|
00000770 38 37 01 00 00 00 77 00 02 02 06 00 1c 00 03 00 |87....w.........|
00000780 56 38 38 01 00 00 00 77 00 02 02 06 00 16 00 02 |V88....w........|
00000790 00 41 31 01 00 1f 00 1f 00 00 00 00 00 00 00 03 |.A1.............|
000007a0 00 56 38 39 01 00 00 00 77 00 02 02 07 00 0c 00 |.V89....w.......|
000007b0 03 00 56 39 30 01 00 00 00 77 00 02 02 14 00 00 |..V90....w......|
000007c0 00 03 00 56 39 31 01 00 00 00 77 00 02 02 02 00 |...V91....w.....|
000007d0 00 00 03 00 56 39 32 01 00 00 00 77 00 02 02 15 |....V92....w....|
000007e0 00 00 00 03 00 56 39 33 01 00 00 00 77 00 02 02 |.....V93....w...|
000007f0 14 00 00 00 02 00 50 33 01 00 26 00 26 00 01 01 |......P3..&.&...|
00000800 00 00 00 00 03 00 56 39 34 01 00 00 00 77 00 02 |......V94....w..|
00000810 02 07 00 10 00 03 00 56 39 35 01 00 00 00 77 00 |.......V95....w.|
00000820 02 02 07 00 14 00 03 00 56 39 36 01 00 00 00 77 |........V96....w|
00000830 00 02 02 07 00 10 00 03 00 56 39 37 01 00 00 00 |.........V97....|
00000840 77 00 02 02 08 00 10 00 03 00 56 39 38 01 00 00 |w.........V98...|
00000850 00 77 00 02 02 08 00 14 00 03 00 56 39 39 01 00 |.w.........V99..|
00000860 00 00 77 00 02 02 08 00 10 00 04 00 56 31 30 30 |..w.........V100|
00000870 01 00 00 00 77 00 02 02 07 00 12 00 02 00 41 32 |....w.........A2|
00000880 01 00 2b 00 2b 00 00 00 00 00 00 00 04 00 56 31 |..+.+.........V1|
00000890 30 31 01 00 00 00 77 00 02 02 09 00 00 00 04 00 |01....w.........|
000008a0 56 31 30 32 01 00 00 00 77 00 02 02 17 00 00 00 |V102....w.......|
000008b0 04 00 56 31 30 33 01 00 00 00 77 00 02 02 03 00 |..V103....w.....|
000008c0 00 00 04 00 56 31 30 34 01 00 00 00 77 00 02 02 |....V104....w...|
000008d0 18 00 00 00 04 00 56 31 30 35 01 00 00 00 77 00 |......V105....w.|
000008e0 02 02 17 00 00 00 04 00 56 31 30 36 01 00 00 00 |........V106....|
000008f0 77 00 02 02 1a 00 00 00 04 00 56 31 30 37 01 00 |w.........V107..|
00000900 00 00 77 00 02 02 1b 00 00 00 04 00 56 31 30 38 |..w.........V108|
00000910 01 00 00 00 77 00 02 02 1a 00 00 00 04 00 56 31 |....w.........V1|
00000920 30 39 01 00 00 00 77 00 02 02 1d 00 00 00 04 00 |09....w.........|
00000930 56 31 31 30 01 00 00 00 77 00 02 02 1e 00 00 00 |V110....w.......|
00000940 04 00 56 31 31 31 01 00 00 00 77 00 02 02 1d 00 |..V111....w.....|
00000950 00 00 04 00 56 31 31 32 01 00 00 00 77 00 02 02 |....V112....w...|
00000960 20 00 00 00 04 00 56 31 31 33 01 00 00 00 77 00 | .....V113....w.|
00000970 02 02 21 00 00 00 04 00 56 31 31 34 01 00 00 00 |..!.....V114....|
00000980 77 00 02 02 20 00 00 00 02 00 50 34 01 00 3a 00 |w... .....P4..:.|
00000990 3a 00 01 01 00 00 00 00 04 00 56 31 31 35 01 00 |:.........V115..|
000009a0 00 00 77 00 02 02 09 00 04 00 04 00 56 31 31 36 |..w.........V116|
000009b0 01 00 00 00 77 00 02 02 09 00 08 00 04 00 56 31 |....w.........V1|
000009c0 31 37 01 00 00 00 77 00 02 02 09 00 04 00 04 00 |17....w.........|
000009d0 56 31 31 38 01 00 00 00 77 00 02 02 09 00 0c 00 |V118....w.......|
000009e0 04 00 56 31 31 39 01 00 00 00 77 00 02 02 09 00 |..V119....w.....|
000009f0 10 00 04 00 56 31 32 30 01 00 00 00 77 00 02 02 |....V120....w...|
00000a00 09 00 0c 00 04 00 56 31 32 31 01 00 00 00 77 00 |......V121....w.|
00000a10 02 02 09 00 06 00 02 00 41 33 01 00 3f 00 3f 00 |........A3..?.?.|
00000a20 00 00 00 00 00 00 04 00 56 31 32 32 01 00 00 00 |........V122....|
00000a30 77 00 02 02 0a 00 00 00 04 00 56 31 32 33 01 00 |w.........V123..|
00000a40 00 00 77 00 02 02 09 00 10 00 02 00 50 35 01 00 |..w.........P5..|
00000a50 42 00 42 00 01 01 00 00 00 00 04 00 56 31 32 34 |B.B.........V124|
00000a60 01 00 00 00 77 00 02 02 09 00 14 00 04 00 56 31 |....w.........V1|
00000a70 32 35 01 00 00 00 77 00 02 02 0a 00 1c 00 04 00 |25....w.........|
00000a80 56 31 32 36 01 00 00 00 77 00 02 02 09 00 14 00 |V126....w.......|
00000a90 04 00 56 31 32 37 01 00 00 00 77 00 02 02 23 00 |..V127....w...#.|
00000aa0 00 00 04 00 56 31 32 38 01 00 00 00 77 00 02 02 |....V128....w...|
00000ab0 23 00 04 00 04 00 56 31 32 39 01 00 00 00 77 00 |#.....V129....w.|
00000ac0 02 02 23 00 00 00 04 00 56 31 33 30 01 00 00 00 |..#.....V130....|
00000ad0 77 00 02 02 09 00 16 00 02 00 41 34 01 00 47 00 |w.........A4..G.|
00000ae0 47 00 00 00 00 00 00 00 04 00 56 31 33 31 01 00 |G.........V131..|
00000af0 00 00 77 00 02 02 23 00 08 00 04 00 56 31 33 32 |..w...#.....V132|
00000b00 01 00 00 00 77 00 02 02 23 00 04 00 02 00 50 36 |....w...#.....P6|
00000b10 01 00 4a 00 4a 00 01 01 00 00 00 00 04 00 56 31 |..J.J.........V1|
00000b20 33 33 01 00 00 00 77 00 02 02 23 00 10 00 04 00 |33....w...#.....|
00000b30 56 31 33 34 01 00 00 00 77 00 02 02 23 00 14 00 |V134....w...#...|
00000b40 04 00 56 31 33 35 01 00 00 00 77 00 02 02 23 00 |..V135....w...#.|
00000b50 10 00 04 00 56 31 33 36 01 00 00 00 77 00 02 02 |....V136....w...|
00000b60 23 00 18 00 04 00 56 31 33 37 01 00 00 00 77 00 |#.....V137....w.|
00000b70 02 02 23 00 1c 00 04 00 56 31 33 38 01 00 00 00 |..#.....V138....|
00000b80 77 00 02 02 23 00 18 00 04 00 56 31 33 39 01 00 |w...#.....V139..|
00000b90 00 00 77 00 02 02 23 00 12 00 02 00 41 35 01 00 |..w...#.....A5..|
00000ba0 4f 00 4f 00 00 00 00 00 00 00 04 00 56 31 34 30 |O.O.........V140|
00000bb0 01 00 00 00 77 00 02 02 24 00 00 00 04 00 56 31 |....w...$.....V1|
00000bc0 34 31 01 00 00 00 77 00 02 02 23 00 1c 00 02 00 |41....w...#.....|
00000bd0 50 37 01 00 54 00 54 00 01 01 00 00 00 00 04 00 |P7..T.T.........|
00000be0 56 31 34 32 01 00 00 00 77 00 02 02 24 00 08 00 |V142....w...$...|
00000bf0 04 00 56 31 34 33 01 00 00 00 77 00 02 02 24 00 |..V143....w...$.|
00000c00 0c 00 04 00 56 31 34 34 01 00 00 00 77 00 02 02 |....V144....w...|
00000c10 24 00 08 00 04 00 56 31 34 35 01 00 00 00 77 00 |$.....V145....w.|
00000c20 02 02 24 00 10 00 04 00 56 31 34 36 01 00 00 00 |..$.....V146....|
00000c30 77 00 02 02 24 00 14 00 04 00 56 31 34 37 01 00 |w...$.....V147..|
00000c40 00 00 77 00 02 02 24 00 10 00 04 00 56 31 34 38 |..w...$.....V148|
00000c50 01 00 00 00 77 00 02 02 24 00 0a 00 02 00 41 36 |....w...$.....A6|
00000c60 01 00 59 00 59 00 00 00 00 00 00 00 04 00 56 31 |..Y.Y.........V1|
00000c70 34 39 01 00 00 00 77 00 02 02 24 00 18 00 04 00 |49....w...$.....|
00000c80 56 31 35 30 01 00 00 00 77 00 02 02 0a 00 10 00 |V150....w.......|
00000c90 04 00 56 31 35 31 01 00 00 00 77 00 02 02 24 00 |..V151....w...$.|
00000ca0 14 00 04 00 56 31 35 32 01 00 00 00 77 00 02 02 |....V152....w...|
00000cb0 25 00 00 00 04 00 56 31 35 33 01 00 00 00 77 00 |%.....V153....w.|
00000cc0 02 02 26 00 00 00 04 00 56 31 35 34 01 00 00 00 |..&.....V154....|
00000cd0 77 00 02 02 25 00 00 00 04 00 56 31 35 35 01 00 |w...%.....V155..|
00000ce0 00 00 77 00 02 02 28 00 00 00 04 00 56 31 35 36 |..w...(.....V156|
00000cf0 01 00 00 00 77 00 02 02 04 00 00 00 04 00 56 31 |....w.........V1|
00000d00 35 37 01 00 00 00 77 00 02 02 29 00 00 00 02 00 |57....w...).....|
00000d10 50 38 01 00 63 00 6e 00 01 01 00 00 00 00 04 00 |P8..c.n.........|
00000d20 56 31 35 38 01 00 00 00 77 00 02 02 2b 00 00 00 |V158....w...+...|
00000d30 04 00 56 31 35 39 01 00 00 00 77 00 02 02 05 00 |..V159....w.....|
00000d40 10 00 04 00 56 31 36 30 01 00 00 00 77 00 02 02 |....V160....w...|
00000d50 2b 00 00 00 04 00 56 31 36 31 01 00 00 00 77 00 |+.....V161....w.|
00000d60 02 02 2d 00 00 00 04 00 56 31 36 32 01 00 00 00 |..-.....V162....|
00000d70 77 00 02 02 2e 00 00 00 04 00 56 31 36 33 01 00 |w.........V163..|
00000d80 00 00 77 00 02 02 06 00 08 00 04 00 56 31 36 34 |..w.........V164|
00000d90 01 00 00 00 77 00 02 02 2e 00 00 00 04 00 56 31 |....w.........V1|
00000da0 36 35 01 00 00 00 77 00 02 02 30 00 00 00 04 00 |65....w...0.....|
00000db0 56 31 36 36 01 00 00 00 77 00 02 02 31 00 00 00 |V166....w...1...|
00000dc0 04 00 56 31 36 37 01 00 00 00 77 00 02 02 07 00 |..V167....w.....|
00000dd0 00 00 04 00 56 31 36 38 01 00 00 00 77 00 02 02 |....V168....w...|
00000de0 31 00 00 00 04 00 56 31 36 39 01 00 00 00 77 00 |1.....V169....w.|
00000df0 02 02 33 00 00 00 04 00 56 31 37 30 01 00 00 00 |..3.....V170....|
00000e00 77 00 02 02 05 00 00 00 04 00 56 31 37 31 01 00 |w.........V171..|
00000e10 00 00 77 00 02 02 05 00 08 00 04 00 56 31 37 32 |..w.........V172|
00000e20 01 00 00 00 77 00 02 02 05 00 18 00 04 00 56 31 |....w.........V1|
00000e30 37 33 01 00 00 00 77 00 02 02 06 00 00 00 04 00 |73....w.........|
00000e40 56 31 37 34 01 00 00 00 77 00 02 02 06 00 10 00 |V174....w.......|
00000e50 04 00 56 31 37 35 01 00 00 00 77 00 02 02 06 00 |..V175....w.....|
00000e60 18 00 04 00 56 31 37 36 01 00 00 00 77 00 02 02 |....V176....w...|
00000e70 07 00 08 00 04 00 56 31 37 37 01 00 00 00 77 00 |......V177....w.|
00000e80 02 02 07 00 10 00 04 00 56 31 37 38 01 00 00 00 |........V178....|
00000e90 77 00 02 02 08 00 00 00 04 00 56 31 37 39 01 00 |w.........V179..|
00000ea0 00 00 77 00 02 02 08 00 08 00 04 00 56 31 38 30 |..w.........V180|
00000eb0 01 00 00 00 77 00 02 02 08 00 10 00 04 00 56 31 |....w.........V1|
00000ec0 38 31 01 00 00 00 77 00 02 02 09 00 00 00 04 00 |81....w.........|
00000ed0 56 31 38 32 01 00 00 00 77 00 02 02 09 00 08 00 |V182....w.......|
00000ee0 04 00 56 31 38 33 01 00 00 00 77 00 02 02 09 00 |..V183....w.....|
00000ef0 10 00 04 00 56 31 38 34 01 00 00 00 77 00 02 02 |....V184....w...|
00000f00 0a 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |...............|
00000f0f
```
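The dump above is easier to inspect once it is back in binary form. The following is a minimal sketch, assuming the text follows the usual `hexdump -C` row layout (offset, hex bytes, then an `|ascii|` column); the helper names `hexdump_to_bytes` and `ascii_runs` are hypothetical and not part of numba-dpex or IGC. It rebuilds the raw bytes and lists the printable names embedded in them (for example `V33`, `P1`, `A0`), much like `strings(1)` would.

```python
import re


def hexdump_to_bytes(text: str) -> bytes:
    """Rebuild raw bytes from hexdump rows of the form: offset, hex bytes, |ascii|."""
    out = bytearray()
    for line in text.splitlines():
        # Keep only what precedes the |ascii| column, then split into fields.
        fields = line.split("|", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line or the final offset-only row
        # The first field is the offset; the rest are two-digit hex bytes.
        out.extend(int(byte, 16) for byte in fields[1:])
    return bytes(out)


def ascii_runs(data: bytes, min_len: int = 3) -> list:
    """Return printable-ASCII runs of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Two rows copied from the dump above.
sample = """\
00000460 00 00 03 00 56 33 33 01 00 00 00 77 00 02 02 0b |....V33....w....|
00000470 00 00 00 03 00 56 33 34 01 00 00 00 77 00 02 02 |.....V34....w...|
"""

data = hexdump_to_bytes(sample)
print(len(data), ascii_runs(data))
```

Running the same two helpers over the full dump text would list the mangled kernel name followed by the `V…`, `P…` and `A…` entries. What those entries mean is not established here; the sketch only extracts them.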
diptorupd commented 1 year ago

@mingjie-intel the information in this ticket is useful if we want to go one level lower than SPIR-V and look at the IGC-compiled binaries.