intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs
MIT License

[torchbench] basic_gnn_gcn failed with spirv error on llvm-target branch #568

Closed weishi-deng closed 5 months ago

weishi-deng commented 6 months ago


data type | model | error
-- | -- | --
fp16_inference | basic_gnn_gcn | Invalid SPIR-V module: input SPIR-V module uses unknown extension 'SPV_EXT_shader_atomic_float16_add'
fp16_training | basic_gnn_gcn | Invalid SPIR-V module: input SPIR-V module uses unknown extension 'SPV_EXT_shader_atomic_float16_add'
amp fp16 training | dlrm | Invalid SPIR-V module: input SPIR-V module uses unknown extension 'SPV_EXT_shader_atomic_float16_add'
fp16_training | dlrm | Invalid SPIR-V module: input SPIR-V module uses unknown extension 'SPV_EXT_shader_atomic_float16_add'
amp fp16 training | hf_Longformer | Invalid SPIR-V module: input SPIR-V module uses unknown extension 'SPV_EXT_shader_atomic_float16_add'
fp16_training | hf_Longformer | Invalid SPIR-V module: input SPIR-V module uses unknown extension 'SPV_EXT_shader_atomic_float16_add'
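For triaging failures like these, it can help to pull the extension name out of each error string and group the affected models by it. The following is a minimal sketch, assuming the error messages keep the exact wording quoted in this issue; the helper names `missing_extension` and `group_by_extension` and the sample `rows` list are hypothetical and not part of the benchmark scripts.

```python
import re
from collections import defaultdict

# Assumption: the driver reports the extension in single quotes after
# the words "unknown extension", as in the messages quoted in this issue.
_EXT_RE = re.compile(r"unknown extension '([A-Za-z0-9_]+)'")


def missing_extension(error):
    """Return the SPIR-V extension named in the error message, or None."""
    match = _EXT_RE.search(error)
    return match.group(1) if match else None


def group_by_extension(rows):
    """Map each missing extension to the (data type, model) pairs that hit it."""
    groups = defaultdict(list)
    for dtype, model, error in rows:
        ext = missing_extension(error)
        if ext is not None:
            groups[ext].append((dtype, model))
    return dict(groups)


ERR = (
    "Invalid SPIR-V module: input SPIR-V module uses unknown extension "
    "'SPV_EXT_shader_atomic_float16_add'"
)

# Illustrative subset of the failures reported in this issue.
rows = [
    ("fp16_inference", "basic_gnn_gcn", ERR),
    ("fp16_training", "dlrm", ERR),
]

if __name__ == "__main__":
    for ext, hits in group_by_extension(rows).items():
        print(ext, hits)
```

Grouping by extension rather than by model makes it easier to see that all of the failures here share one root cause.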

env:
GPU: Max 1100
Driver version: 803.29
oneAPI: 2024.1
Triton: 890b7402b6a4cbada9cd94484237b0416686cec4 (Tue Feb 20)
PyTorch: 0f6d72ce16bd4b30402dcad97144d17cd7bc53ed (Based on PyTorch 2.1)
IPEX: f496a32aed6d9ad4c70fcbbf2d3321a385affccb

HUGGINGFACE_PIN_COMMIT: 4.27.4

reproducer: inductor_xpu_test.sh torchbench float16 training performance xpu 0 static 1 0 dlrm

tdeng5 commented 6 months ago

@etiotto, there is a workaround in the SPIR-V path that works functionally. Do we need to port it to the LLVM target?

etiotto commented 5 months ago

@ienkovich another duplicate of the atomic fp16 issues.

vlad-penkin commented 5 months ago

Duplicates