CHIP-SPV / chipStar

chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.

1.1-RC3: Intel iGPU failures on level zero #733

Closed pjaaskel closed 7 months ago

pjaaskel commented 8 months ago
...
98% tests passed, 20 tests failed out of 999
...

The following tests FAILED:
    359 - Unit_hipGraphAddEventRecordNode_MultipleRun (Failed)
    404 - Unit_hipStreamBeginCapture_BasicFunctional (Failed)
    519 - Unit_hipHostRegister_Memcpy - int (Failed)
    520 - Unit_hipHostRegister_Memcpy - float (Failed)
    532 - Unit_hipMallocPitch_ValidatePitch (Failed)
    564 - Unit_hipMemset2D_BasicFunctional (Failed)
    565 - Unit_hipMemset2DAsync_BasicFunctional (Failed)
    566 - Unit_hipMemset2D_UniqueWidthHeight (Failed)
    568 - Unit_hipMemset3D_MemsetWithExtent (Failed)
    569 - Unit_hipMemset3DAsync_MemsetWithExtent (Failed)
    572 - Unit_hipMemset3D_SeekSetSlice (Failed)
    574 - Unit_hipMemset3D_SeekSetArrayPortion (Failed)
    575 - Unit_hipMemset3DAsync_SeekSetArrayPortion (Failed)
    578 - Unit_hipMemset3DAsync_ConcurrencyMthread (Subprocess aborted)
    609 - Unit_hipMemset2DAsync_WithKernel (Failed)
    668 - Unit_hipMemsetFunctional_PartialSet_2D (Failed)
    672 - Unit_hipMemsetFunctional_PartialSet_3D (Failed)
    837 - Unit_hipMultiThreadStreams2 (Subprocess aborted)
    970 - sycl_chip_interop (Failed)
    971 - sycl_chip_interop_usm (Failed)
Errors while running CTest
pvelesko commented 8 months ago

Please attach your neo and igc versions

pjaaskel commented 8 months ago

igc 1.0.13822.8-647. I'll check if a driver update helps.

pvelesko commented 8 months ago

Also, which runtime version? @pjaaskel

pjaaskel commented 8 months ago

With regular command lists, only the SYCL interop tests fail.

pjaaskel commented 8 months ago

Sycl interop:


968/997 Test #968: sycl_chip_interop .........................................................***Failed  Required regular expression not found. Regex=[PASSED
]  1.11 sec
CHIP error [TID 313225] [1703091543.212875408] : hipErrorTbd (Can't compile module. Level zero does not support multi-input compilation.) in /home/pjaaskel/Downloads/chipStar-1.1/src/backend/Level0/CHIPBackendLevel0.cc:2520:compile

CHIP error [TID 313225] [1703091543.212935820] : Caught Error: hipErrorTbd
HIP API error
FAIL: 773876736 failures 
pjaaskel commented 8 months ago

I've installed these packages from the compute-runtime repo and there's actually one extra fail.

intel-igc-core_1.0.15136.4_amd64.deb        intel-opencl-icd_23.35.27191.9_amd64.deb
intel-igc-opencl_1.0.15136.4_amd64.deb      libigdgmm12_22.3.11.ci17747749_amd64.deb
intel-level-zero-gpu_1.3.27191.9_amd64.deb
98% tests passed, 21 tests failed out of 999
...
The following tests FAILED:
    359 - Unit_hipGraphAddEventRecordNode_MultipleRun (Failed)
    404 - Unit_hipStreamBeginCapture_BasicFunctional (Failed)
    519 - Unit_hipHostRegister_Memcpy - int (Failed)
    520 - Unit_hipHostRegister_Memcpy - float (Failed)
    532 - Unit_hipMallocPitch_ValidatePitch (Failed)
    564 - Unit_hipMemset2D_BasicFunctional (Failed)
    565 - Unit_hipMemset2DAsync_BasicFunctional (Failed)
    566 - Unit_hipMemset2D_UniqueWidthHeight (Failed)
    568 - Unit_hipMemset3D_MemsetWithExtent (Failed)
    569 - Unit_hipMemset3DAsync_MemsetWithExtent (Failed)
    573 - Unit_hipMemset3DAsync_SeekSetSlice (Failed)
    574 - Unit_hipMemset3D_SeekSetArrayPortion (Failed)
    575 - Unit_hipMemset3DAsync_SeekSetArrayPortion (Failed)
    578 - Unit_hipMemset3DAsync_ConcurrencyMthread (Subprocess aborted)
    609 - Unit_hipMemset2DAsync_WithKernel (Failed)
    664 - Unit_hipMemsetFunctional_PartialSet_1D (Failed)
    668 - Unit_hipMemsetFunctional_PartialSet_2D (Failed)
    672 - Unit_hipMemsetFunctional_PartialSet_3D (Failed)
    837 - Unit_hipMultiThreadStreams2 (Subprocess aborted)
    970 - sycl_chip_interop (Failed)
    971 - sycl_chip_interop_usm (Failed)
Errors while running CTest
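
For comparing the two failure lists without eyeballing them, here is a minimal sketch in Python. The helper names `parse_failed` and `diff_failures` are hypothetical and not part of chipStar or CTest; the script only assumes the `NNN - name (status)` layout of the "tests FAILED" summary shown above. Tests are keyed by name rather than number, since the numbering can shift between runs.

```python
import re

# Matches CTest summary lines such as
#   "    359 - Unit_hipGraphAddEventRecordNode_MultipleRun (Failed)"
# The status is the final parenthesised group; the name is matched greedily
# because it may itself contain " - " (e.g. "Unit_hipHostRegister_Memcpy - int").
FAILED_LINE = re.compile(r"^\s*(\d+) - (.+) \(([^()]+)\)\s*$")


def parse_failed(log: str) -> dict[str, str]:
    """Map test name -> status for every entry of a 'tests FAILED' list."""
    failed = {}
    for line in log.splitlines():
        match = FAILED_LINE.match(line)
        if match:
            _number, name, status = match.groups()
            failed[name] = status
    return failed


def diff_failures(before: str, after: str) -> tuple[list[str], list[str]]:
    """Return (tests that stopped failing, tests that started failing)."""
    old, new = parse_failed(before), parse_failed(after)
    return sorted(old.keys() - new.keys()), sorted(new.keys() - old.keys())


# Excerpts of the two lists in this thread (only lines relevant to the diff).
run_old = """
    572 - Unit_hipMemset3D_SeekSetSlice (Failed)
    668 - Unit_hipMemsetFunctional_PartialSet_2D (Failed)
"""
run_new = """
    573 - Unit_hipMemset3DAsync_SeekSetSlice (Failed)
    664 - Unit_hipMemsetFunctional_PartialSet_1D (Failed)
    668 - Unit_hipMemsetFunctional_PartialSet_2D (Failed)
"""

fixed, newly_failing = diff_failures(run_old, run_new)
print("no longer failing:", fixed)
print("newly failing:", newly_failing)
```

Feeding it the complete CTest output of each run works the same way, because lines that do not match the summary layout are skipped.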
linehill commented 8 months ago

CHIP error [TID 313225] [1703091543.212875408] : hipErrorTbd (Can't compile module. Level zero does not support multi-input compilation.) in /home/pjaaskel/Downloads/chipStar-1.1/src/backend/Level0/CHIPBackendLevel0.cc:2520:compile

This is an odd sight. If this error shows up, every test with kernel launches should fail on Level0 too. Are the interop tests using a different Level0 driver than the rest of the tests?

pvelesko commented 8 months ago
The following tests FAILED:
    359 - Unit_hipGraphAddEventRecordNode_MultipleRun (Failed)
    404 - Unit_hipStreamBeginCapture_BasicFunctional (Failed)
    519 - Unit_hipHostRegister_Memcpy - int (Failed)
    520 - Unit_hipHostRegister_Memcpy - float (Failed)
    521 - Unit_hipHostRegister_Memcpy - double (Failed)
    532 - Unit_hipMallocPitch_ValidatePitch (Failed)
    564 - Unit_hipMemset2D_BasicFunctional (Failed)
    565 - Unit_hipMemset2DAsync_BasicFunctional (Failed)
    566 - Unit_hipMemset2D_UniqueWidthHeight (Failed)
    568 - Unit_hipMemset3D_MemsetWithExtent (Failed)
    569 - Unit_hipMemset3DAsync_MemsetWithExtent (Failed)
    572 - Unit_hipMemset3D_SeekSetSlice (Failed)
    574 - Unit_hipMemset3D_SeekSetArrayPortion (Failed)
    575 - Unit_hipMemset3DAsync_SeekSetArrayPortion (Failed)
    578 - Unit_hipMemset3DAsync_ConcurrencyMthread (Subprocess aborted)
    609 - Unit_hipMemset2DAsync_WithKernel (Failed)
    664 - Unit_hipMemsetFunctional_PartialSet_1D (Failed)
    668 - Unit_hipMemsetFunctional_PartialSet_2D (Failed)
    672 - Unit_hipMemsetFunctional_PartialSet_3D (Failed)
    833 - Unit_hipMultiThreadDevice_NearZero (Subprocess aborted)
    837 - Unit_hipMultiThreadStreams2 (Subprocess aborted)

The tests above failed with ICL (immediate command lists) on iGPU, most of them with ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY. But we don't support ICL on iGPU. Running the RCL (regular command lists) tests now.

pvelesko commented 8 months ago
100% tests passed, 0 tests failed out of 993

Level Zero iGPU RCL

pjaaskel commented 8 months ago

Yes, RCL works (except #735). I thought ICL on iGPU worked earlier. If not, then we can ignore that.