CHIP-SPV / chipStar

chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.

PoCL - Device library link step failed #784

Open pvelesko opened 4 months ago

pvelesko commented 4 months ago

The run: https://github.com/CHIP-SPV/chipStar/actions/runs/8030196087

These tests don't fail for me locally:

list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rd_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rn_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_ru_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rz_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "fp16_math") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "fp16_half2_math") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_hipGraphAddMemcpyNodeToSymbol_MemcpyToSymbolNodeWithKernel") # Failed

CHIP error [TID 36443] [1708774709.417000043] : hipErrorNotInitialized (Device library link step failed.) in /home/runner/work/chipStar/chipStar/src/backend/OpenCL/CHIPBackendOpenCL.cc:840:compile

CHIP error [TID 36443] [1708774709.417225324] : Caught Error: hipErrorNotInitialized

pjaaskel commented 4 months ago

Do you mean there could be a bug in PoCL or the CI?

pvelesko commented 4 months ago

From UnitTests.cmake:

# The following tests fail for LLVM 15 Debug & Release : Cannot find symbol _Z4sqrtDh in kernel library
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rd_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rn_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_ru_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rz_double") # Failed

# Fails for LLVM 15 Debug: SPIR-V Parser: Failed to find size for type id 83
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest_rnorm_double") # Failed

franz commented 1 month ago

PoCL CPU device has FP16 support only when compiled with LLVM 16 and higher (and that support is quite incomplete).
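
One way the LLVM version dependence could be reflected in the test exclusions is sketched below. This is only an illustration: `POCL_LLVM_VERSION_MAJOR` is a hypothetical variable standing for the major version of the LLVM that PoCL was built with, not an existing chipStar or PoCL option. Because FP16 support is described as incomplete even with LLVM 16 and higher, the FP16 tests may still need to stay excluded there.

```cmake
# Sketch only. POCL_LLVM_VERSION_MAJOR is a hypothetical variable (an assumption,
# not an existing chipStar/PoCL option) holding PoCL's LLVM major version.
if(NOT DEFINED POCL_LLVM_VERSION_MAJOR)
  set(POCL_LLVM_VERSION_MAJOR 15) # placeholder default for illustration
endif()

# The PoCL CPU device has FP16 support only with LLVM 16 and higher,
# so exclude the FP16 tests when PoCL was built with an older LLVM.
if(POCL_LLVM_VERSION_MAJOR VERSION_LESS 16)
  list(APPEND CPU_POCL_FAILED_TESTS "fp16_math")
  list(APPEND CPU_POCL_FAILED_TESTS "fp16_half2_math")
endif()
```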