Closed by dvrogozh 3 weeks ago
Initially reported at https://github.com/intel/llvm/issues/15082 and debugged offline by @paigeale. Below is the deduced reproducer. The .spv file was generated by compiling ReduceArgMaxKernel_preproc.ii from issue 15082 with the following environment variables set:
export IGC_ShaderDumpEnableAll=1 && export IGC_DumpToCurrentDir=1
Link to input .spv file: https://github.com/dvrogozh/pytorch/tree/intel-llvm/repro (see .spv and options .txt files in this folder)
$ ocloc -spirv_input -file OCL_asm43414108f97d5215.spv -device xe-lpg
Building with options: -ze-fp64-gen-conv-emu -cl-poison-unsupported-fp64-kernels -cl-intel-enable-auto-large-GRF-mode -cl-fp32-correctly-rounded-divide-sqrt
Building with internal options: -ocl-version=300 -cl-ext=-all,+cl_khr_byte_addressable_store,+cl_khr_device_uuid,+cl_khr_fp16,+cl_khr_global_int32_base_atomics,+cl_khr_global_int32_extended_atomics,+cl_khr_icd,+cl_khr_local_int32_base_atomics,+cl_khr_local_int32_extended_atomics,+cl_intel_command_queue_families,+cl_intel_subgroups,+cl_intel_required_subgroup_size,+cl_intel_subgroups_short,+cl_khr_spir,+cl_intel_accelerator,+cl_intel_driver_diagnostics,+cl_khr_priority_hints,+cl_khr_throttle_hints,+cl_khr_create_command_queue,+cl_intel_subgroups_char,+cl_intel_subgroups_long,+cl_khr_il_program,+cl_intel_mem_force_host_memory,+cl_khr_subgroup_extended_types,+cl_khr_subgroup_non_uniform_vote,+cl_khr_subgroup_ballot,+cl_khr_subgroup_non_uniform_arithmetic,+cl_khr_subgroup_shuffle,+cl_khr_subgroup_shuffle_relative,+cl_khr_subgroup_clustered_reduce,+cl_intel_device_attribute_query,+cl_khr_suggested_local_work_size,+cl_intel_split_work_group_barrier,+cl_khr_fp64,+cl_intel_spirv_media_block_io,+cl_intel_spirv_subgroups,+cl_khr_spirv_linkonce_odr,+cl_khr_spirv_no_integer_wrap_decoration,+cl_intel_unified_shared_memory,+cl_khr_mipmap_image,+cl_khr_mipmap_image_writes,+cl_ext_float_atomics,+cl_khr_external_memory,+cl_intel_planar_yuv,+cl_intel_packed_yuv,+cl_khr_int64_base_atomics,+cl_khr_int64_extended_atomics,+cl_khr_image2d_from_buffer,+cl_khr_depth_images,+cl_khr_3d_image_writes,+cl_intel_media_block_io,+cl_intel_create_buffer_with_properties,+cl_intel_subgroup_local_block_io,+cl_khr_integer_dot_product -ze-exclude-ir-from-zebin -D__IMAGE_SUPPORT__=1 -cl-store-cache-default=2 -cl-load-cache-default=4 -cl-intel-has-buffer-offset-arg
Compilation from IR - skipping loading of FCL
Binary Instruction seen with illegal int type. Legalization support missing. Inst opcode:25
[0]: /lib/x86_64-linux-gnu/libocloc.so(+0xc1a64) [0x7fcf6464fa64]
[1]: /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7fcf643a7520]
[2]: /lib/x86_64-linux-gnu/libigc.so.1(+0x96912f) [0x7fcf5f83912f]
[3]: /lib/x86_64-linux-gnu/libigc.so.1(+0xd014c9) [0x7fcf5fbd14c9]
[4]: /lib/x86_64-linux-gnu/libigc.so.1(+0xd08bad) [0x7fcf5fbd8bad]
[5]: /lib/x86_64-linux-gnu/libigc.so.1(_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE+0x2be) [0x7fcf606771ae]
[6]: /lib/x86_64-linux-gnu/libigc.so.1(_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE+0x34) [0x7fcf606774d4]
[7]: /lib/x86_64-linux-gnu/libigc.so.1(_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE+0x32c) [0x7fcf6067826c]
[8]: /lib/x86_64-linux-gnu/libigc.so.1(+0xc861b2) [0x7fcf5fb561b2]
[9]: /lib/x86_64-linux-gnu/libigc.so.1(+0x90a55e) [0x7fcf5f7da55e]
[10]: /lib/x86_64-linux-gnu/libigc.so.1(+0xb6b61b) [0x7fcf5fa3b61b]
[11]: /lib/x86_64-linux-gnu/libigc.so.1(+0x90cf27) [0x7fcf5f7dcf27]
[12]: /lib/x86_64-linux-gnu/libigc.so.1(+0x984ccd) [0x7fcf5f854ccd]
[13]: /lib/x86_64-linux-gnu/libigc.so.1(+0x9861de) [0x7fcf5f8561de]
[14]: /lib/x86_64-linux-gnu/libocloc.so(+0x9a386) [0x7fcf64628386]
[15]: /lib/x86_64-linux-gnu/libocloc.so(+0xc3acf) [0x7fcf64651acf]
[16]: /lib/x86_64-linux-gnu/libocloc.so(+0xc1cc8) [0x7fcf6464fcc8]
[17]: /lib/x86_64-linux-gnu/libocloc.so(+0x89ca9) [0x7fcf64617ca9]
[18]: /lib/x86_64-linux-gnu/libocloc.so(oclocInvoke+0x8ee) [0x7fcf6461981e]
[19]: ocloc(+0x637) [0x564e4aeca637]
[20]: /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fcf6438ed90]
[21]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7fcf6438ee40]
[22]: ocloc(+0x665) [0x564e4aeca665]
Segmentation fault (core dumped)
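For anyone re-checking this against a newer IGC build, the ocloc invocation above could be wrapped in a small helper along these lines. This is only a sketch: the function name `repro` and the `SPV` and `DEVICE` variables are made up for illustration, and it assumes the .spv file (and its options .txt file) from the repro folder have been downloaded into the current directory. The ocloc flags are the ones from the command shown above.

```shell
#!/bin/sh
# Hypothetical helper, not part of ocloc or IGC.
# SPV and DEVICE default to the values used in this report.
SPV="${SPV:-OCL_asm43414108f97d5215.spv}"
DEVICE="${DEVICE:-xe-lpg}"

repro() {
    # Bail out early if the offline compiler is not installed.
    if ! command -v ocloc >/dev/null 2>&1; then
        echo "ocloc not found in PATH" >&2
        return 127
    fi
    # Bail out if the input SPIR-V file has not been downloaded.
    if [ ! -f "$SPV" ]; then
        echo "missing input file: $SPV" >&2
        return 2
    fi
    # Same command line as in the report.
    ocloc -spirv_input -file "$SPV" -device "$DEVICE"
    status=$?
    echo "ocloc exit status: $status"
    return "$status"
}
```

Source the file and call `repro`. With an affected IGC the call should end in the segmentation fault shown above; with the fix applied it is expected to complete without the crash.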
Fixed by https://github.com/intel/intel-graphics-compiler/commit/66d001e52c8e496f51c2572acc2377ca8f4e9e50