CHIP-SPV / chipStar

chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.

ARM iGPU Mali-G52 Failures #736

Open pvelesko opened 8 months ago

pvelesko commented 8 months ago
84% tests passed, 154 tests failed out of 991

Label Time Summary:
cuda        = 120.09 sec*proc (26 tests)
internal    = 245.41 sec*proc (75 tests)

Total Test time (real) = 2918.52 sec

The following tests did not run:
    544 - Unit_hipMallocManaged_HostDeviceConcurrent (Skipped)
    545 - Unit_hipMallocManaged_MultiChunkSingleDevice (Skipped)
    546 - Unit_hipMallocManaged_MultiChunkMultiDevice (Skipped)
    547 - Unit_hipMallocManaged_OverSubscription (Skipped)
    548 - Unit_hipMallocManaged_TwoPointers - int (Skipped)
    549 - Unit_hipMallocManaged_TwoPointers - float (Skipped)
    550 - Unit_hipMallocManaged_TwoPointers - double (Skipped)
    551 - Unit_hipMallocManaged_DeviceContextChange - unsigned char (Skipped)
    552 - Unit_hipMallocManaged_DeviceContextChange - int (Skipped)
    553 - Unit_hipMallocManaged_DeviceContextChange - float (Skipped)
    554 - Unit_hipMallocManaged_DeviceContextChange - double (Skipped)
    579 - Unit_hipMallocManaged_FlgParam (Skipped)
    580 - Unit_hipMallocManaged_AccessMultiStream (Skipped)
    611 - Unit_hipMallocManaged_Advanced (Skipped)
    685 - Unit_hipMalloc3DArray_MaxTexture - int (Skipped)
    686 - Unit_hipMalloc3DArray_MaxTexture - uint4 (Skipped)
    687 - Unit_hipMalloc3DArray_MaxTexture - short (Skipped)
    688 - Unit_hipMalloc3DArray_MaxTexture - ushort2 (Skipped)
    689 - Unit_hipMalloc3DArray_MaxTexture - unsigned char (Skipped)
    690 - Unit_hipMalloc3DArray_MaxTexture - float (Skipped)
    691 - Unit_hipMalloc3DArray_MaxTexture - float4 (Skipped)
    702 - Unit_hipMemsetSync (Skipped)
    703 - Unit_hipMemsetDSync - int8_t (Skipped)
    704 - Unit_hipMemsetDSync - int16_t (Skipped)
    705 - Unit_hipMemsetDSync - uint32_t (Skipped)
    706 - Unit_hipMemset2DSync (Skipped)
    707 - Unit_hipMemset3DSync (Skipped)
    777 - Unit_hipDeviceTotalMem_NonSelectedDevice (Skipped)
    782 - Unit_hipGetDeviceCount_HideDevices (Skipped)
    789 - Unit_hipSetGetDevice_Positive_Threaded_Basic (Skipped)
    791 - Unit_hipDeviceGetP2PAttribute_Basic (Skipped)
    792 - Unit_hipDeviceGetP2PAttribute_Negative (Skipped)
    793 - Unit_hipDeviceCanAccessPeer_positive (Skipped)
    794 - Unit_hipDeviceCanAccessPeer_negative (Skipped)
    795 - Unit_hipDeviceEnableDisablePeerAccess_positive (Skipped)
    796 - Unit_hipDeviceEnablePeerAccess_negative (Skipped)
    797 - Unit_hipDeviceDisablePeerAccess_negative (Skipped)

The following tests FAILED:
     70 - Unit_deviceFunctions_CompileTest___fma_rd_double (Failed)
     71 - Unit_deviceFunctions_CompileTest___fma_rn_double (Failed)
     72 - Unit_deviceFunctions_CompileTest___fma_ru_double (Failed)
     73 - Unit_deviceFunctions_CompileTest___fma_rz_double (Failed)
    175 - Unit_deviceFunctions_CompileTest_signbit_float (Failed)
    189 - Unit_deviceFunctions_CompileTest_acos_double (Failed)
    190 - Unit_deviceFunctions_CompileTest_acosh_double (Failed)
    191 - Unit_deviceFunctions_CompileTest_asin_double (Failed)
    192 - Unit_deviceFunctions_CompileTest_asinh_double (Failed)
    193 - Unit_deviceFunctions_CompileTest_atan_double (Failed)
    194 - Unit_deviceFunctions_CompileTest_atan2_double (Failed)
    195 - Unit_deviceFunctions_CompileTest_atanh_double (Failed)
    196 - Unit_deviceFunctions_CompileTest_cbrt_double (Failed)
    197 - Unit_deviceFunctions_CompileTest_ceil_double (Failed)
    198 - Unit_deviceFunctions_CompileTest_copysign_double (Failed)
    199 - Unit_deviceFunctions_CompileTest_cos_double (Failed)
    200 - Unit_deviceFunctions_CompileTest_cosh_double (Failed)
    201 - Unit_deviceFunctions_CompileTest_cospi_double (Failed)
    204 - Unit_deviceFunctions_CompileTest_erfc_double (Failed)
    205 - Unit_deviceFunctions_CompileTest_erf_double (Failed)
    209 - Unit_deviceFunctions_CompileTest_exp_double (Failed)
    210 - Unit_deviceFunctions_CompileTest_exp10_double (Failed)
    211 - Unit_deviceFunctions_CompileTest_exp2_double (Failed)
    212 - Unit_deviceFunctions_CompileTest_expm1_double (Failed)
    213 - Unit_deviceFunctions_CompileTest_fabs_double (Failed)
    214 - Unit_deviceFunctions_CompileTest_fdim_double (Failed)
    215 - Unit_deviceFunctions_CompileTest_floor_double (Failed)
    216 - Unit_deviceFunctions_CompileTest_fma_double (Failed)
    217 - Unit_deviceFunctions_CompileTest_fmax_double (Failed)
    218 - Unit_deviceFunctions_CompileTest_fmin_double (Failed)
    219 - Unit_deviceFunctions_CompileTest_fmod_double (Failed)
    220 - Unit_deviceFunctions_CompileTest_frexp_double (Failed)
    221 - Unit_deviceFunctions_CompileTest_hypot_double (Failed)
    222 - Unit_deviceFunctions_CompileTest_ilogb_double (Failed)
    223 - Unit_deviceFunctions_CompileTest_isfinite_double (Failed)
    224 - Unit_deviceFunctions_CompileTest_isinf_double (Failed)
    225 - Unit_deviceFunctions_CompileTest_isnan_double (Failed)
    229 - Unit_deviceFunctions_CompileTest_ldexp_double (Failed)
    230 - Unit_deviceFunctions_CompileTest_lgamma_double (Failed)
    233 - Unit_deviceFunctions_CompileTest_log_double (Failed)
    234 - Unit_deviceFunctions_CompileTest_log10_double (Failed)
    235 - Unit_deviceFunctions_CompileTest_log1p_double (Failed)
    236 - Unit_deviceFunctions_CompileTest_log2_double (Failed)
    237 - Unit_deviceFunctions_CompileTest_logb_double (Failed)
    240 - Unit_deviceFunctions_CompileTest_max_double (Failed)
    241 - Unit_deviceFunctions_CompileTest_min_double (Failed)
    242 - Unit_deviceFunctions_CompileTest_modf_double (Failed)
    243 - Unit_deviceFunctions_CompileTest_nan_double (Failed)
    245 - Unit_deviceFunctions_CompileTest_nextafter_double (Failed)
    251 - Unit_deviceFunctions_CompileTest_pow_double (Failed)
    253 - Unit_deviceFunctions_CompileTest_remainder_double (Failed)
    254 - Unit_deviceFunctions_CompileTest_remquo_double (Failed)
    256 - Unit_deviceFunctions_CompileTest_rint_double (Failed)
    260 - Unit_deviceFunctions_CompileTest_round_double (Failed)
    261 - Unit_deviceFunctions_CompileTest_rsqrt_double (Failed)
    264 - Unit_deviceFunctions_CompileTest_signbit_double (Failed)
    265 - Unit_deviceFunctions_CompileTest_sin_double (Failed)
    266 - Unit_deviceFunctions_CompileTest_sincos_double (Failed)
    267 - Unit_deviceFunctions_CompileTest_sincospi_double (Failed)
    268 - Unit_deviceFunctions_CompileTest_sinh_double (Failed)
    269 - Unit_deviceFunctions_CompileTest_sinpi_double (Failed)
    270 - Unit_deviceFunctions_CompileTest_sqrt_double (Failed)
    271 - Unit_deviceFunctions_CompileTest_tan_double (Failed)
    272 - Unit_deviceFunctions_CompileTest_tanh_double (Failed)
    273 - Unit_deviceFunctions_CompileTest_tgamma_double (Failed)
    274 - Unit_deviceFunctions_CompileTest_trunc_double (Failed)
    284 - Unit_deviceFunctions_CompileTest_max_int (Failed)
    285 - Unit_deviceFunctions_CompileTest_min_int (Failed)
    292 - Unit_deviceFunctions_CompileTest_atomicAdd_int (Failed)
    293 - Unit_deviceFunctions_CompileTest_atomicAdd_usigned_int (Failed)
    294 - Unit_deviceFunctions_CompileTest_atomicAdd_unsigned_long_long (Failed)
    295 - Unit_deviceFunctions_CompileTest_atomicAdd_float (Failed)
    296 - Unit_deviceFunctions_CompileTest_atomicAdd_double (Failed)
    297 - Unit_deviceFunctions_CompileTest_atomicAdd_system_int (Failed)
    298 - Unit_deviceFunctions_CompileTest_atomicAdd_system_usigned_int (Failed)
    299 - Unit_deviceFunctions_CompileTest_atomicAdd_system_unsigned_long_long (Failed)
    300 - Unit_deviceFunctions_CompileTest_atomicAdd_system_float (Failed)
    301 - Unit_deviceFunctions_CompileTest_atomicAdd_system_double (Failed)
    365 - Unit_hipGraph_SimpleGraphWithKernel (Failed)
    399 - Unit_hipGraphMemsetNodeSetParams_Functional (Failed)
    411 - Unit_hipGraphLaunch_Functional_hipStreamPerThread (Failed)
    412 - Unit_hipGraphLaunch_Functional_multidevice_test (Failed)
    516 - Unit_hipMemcpy_HalfMemCopy (Timeout)
    524 - Unit_hipHostRegister_Negative - int (Subprocess killed)
    525 - Unit_hipHostRegister_Negative - float (Subprocess killed)
    526 - Unit_hipHostRegister_Negative - double (Subprocess killed)
    531 - Unit_hipMallocPitch_ValidatePitch (Failed)
    559 - Unit_hipMemsetAsync_SetMemoryWithOffset (Failed)
    562 - Unit_hipMemsetAsync_QueueJobsMultithreaded (SEGFAULT)
    564 - Unit_hipMemset3DAsync_BasicFunctional (Failed)
    565 - Unit_hipMemset2D_BasicFunctional (Failed)
    566 - Unit_hipMemset2DAsync_BasicFunctional (Failed)
    567 - Unit_hipMemset2D_UniqueWidthHeight (Failed)
    569 - Unit_hipMemset3D_MemsetWithExtent (Failed)
    571 - Unit_hipMemset3DAsync_MemsetMaxValue (Failed)
    572 - Unit_hipMemset3D_SeekSetSlice (Failed)
    573 - Unit_hipMemset3DAsync_SeekSetSlice (Failed)
    574 - Unit_hipMemset3D_SeekSetArrayPortion (Failed)
    575 - Unit_hipMemset3DAsync_SeekSetArrayPortion (Failed)
    578 - Unit_hipMemset3DAsync_ConcurrencyMthread (Subprocess aborted)
    599 - Unit_hipMemcpyWithStream_TestwithTwoStream (Timeout)
    606 - Unit_hipMemcpyWithStream_TestDtoDonSameDevice (Timeout)
    607 - Unit_hipMemcpyWithStream_MultiThread (Timeout)
    608 - Unit_hipMemsetAsync_VerifyExecutionWithKernel (Failed)
    609 - Unit_hipMemset2DAsync_WithKernel (Failed)
    622 - Unit_hipHostMalloc_Basic (Failed)
    627 - Unit_hipMemcpy_KernelLaunch - int (Failed)
    628 - Unit_hipMemcpy_KernelLaunch - float (Failed)
    629 - Unit_hipMemcpy_KernelLaunch - double (Failed)
    633 - Unit_hipMemcpy_MultiThreadWithSerialization (Subprocess aborted)
    637 - Unit_hipMemcpyAsync_KernelLaunch - int (Failed)
    638 - Unit_hipMemcpyAsync_KernelLaunch - float (Failed)
    639 - Unit_hipMemcpyAsync_KernelLaunch - double (Failed)
    644 - Unit_hipMemcpyAsync_hipMultiMemcpyMultiThread - int (Subprocess aborted)
    645 - Unit_hipMemcpyAsync_hipMultiMemcpyMultiThread - float (Subprocess aborted)
    646 - Unit_hipMemcpyAsync_hipMultiMemcpyMultiThread - double (Subprocess aborted)
    647 - Unit_hipMemcpyAsync_hipMultiMemcpyMultiThreadMultiStream - int (Subprocess aborted)
    648 - Unit_hipMemcpyAsync_hipMultiMemcpyMultiThreadMultiStream - float (Subprocess aborted)
    649 - Unit_hipMemcpyAsync_hipMultiMemcpyMultiThreadMultiStream - double (Subprocess aborted)
    657 - Unit_hipMemsetFunctional_SmallSize_hipMemset (Failed)
    658 - Unit_hipMemsetFunctional_SmallSize_hipMemsetD32 (Failed)
    659 - Unit_hipMemsetFunctional_SmallSize_hipMemsetD16 (Failed)
    660 - Unit_hipMemsetFunctional_SmallSize_hipMemsetD8 (Failed)
    665 - Unit_hipMemsetFunctional_PartialSet_1D (Failed)
    666 - Unit_hipMemsetFunctional_ZeroValue_2D (Failed)
    667 - Unit_hipMemsetFunctional_SmallSize_2D (Failed)
    668 - Unit_hipMemsetFunctional_ZeroSize_2D (Failed)
    669 - Unit_hipMemsetFunctional_PartialSet_2D (Failed)
    670 - Unit_hipMemsetFunctional_ZeroValue_3D (Failed)
    671 - Unit_hipMemsetFunctional_SmallSize_3D (Failed)
    672 - Unit_hipMemsetFunctional_ZeroSize_3D (Failed)
    737 - Unit_hipStreamDestroy_WithFinishedWork (Timeout)
    738 - Unit_hipStreamCreate_MultistreamBasicFunctionalities (Timeout)
    834 - Unit_hipMultiThreadStreams2 (Subprocess aborted)
    841 - Unit_hipClassKernel_Value (Subprocess aborted)
    846 - ABM_AddKernel_MultiTypeMultiSize - int (Failed)
    847 - ABM_AddKernel_MultiTypeMultiSize - long (Failed)
    848 - ABM_AddKernel_MultiTypeMultiSize - float (Failed)
    849 - ABM_AddKernel_MultiTypeMultiSize - long long (Failed)
    850 - ABM_AddKernel_MultiTypeMultiSize - double (Failed)
    894 - TestKernelArgs (Subprocess aborted)
    896 - RegressionTest302 (Failed)
    904 - TestAtomics (Subprocess aborted)
    926 - shuffle (Not Run)
    927 - broadcast (Not Run)
    928 - broadcast2 (Not Run)
    929 - 2d_shuffle (Not Run)
    934 - unroll (Not Run)
    955 - PrintfSimple (Failed)
    956 - PrintfNOP (Failed)
    957 - PrintfDynamic (Failed)
    958 - shuffles (Failed)
    966 - cuda-asyncAPI (Failed)
    990 - cuda-reduction (Failed)
[ERROR_MESSAGE]
Errors while running CTest
pvelesko commented 8 months ago

checkpy_igpu_opencl.txt

pvelesko commented 3 months ago

@franz Is there a single reason why most of these fail?

franz commented 3 months ago

IIRC there were multiple reasons. The major one was that the Mali does not support FP64 and there is no emulation (AFAIK). Some of the tests have "int" or "unsigned" in the name, but they have shared kernel code for all variants including "double", and because of that, the driver failed to compile the SPIR-V. Another reason was that some tests required more memory than the driver could allocate; this is perhaps fixable with some kernel parameter tuning, I haven't tried. Then there are a bunch of memset/memfill tests that fail. I have no idea why, possibly bugs in Mali drivers.
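Whether a device advertises FP64 can be checked from the extension string its OpenCL driver reports. The following is a minimal Python sketch, assuming that string has already been obtained (for example from clinfo output). The function name and the sample strings are hypothetical; only `cl_khr_fp64` is the real extension name.

```python
def supports_fp64(extensions: str) -> bool:
    """Return True if a space-separated OpenCL extension string lists cl_khr_fp64."""
    # Split on whitespace so that a longer name containing the substring does not match.
    return "cl_khr_fp64" in extensions.split()


# Hypothetical extension strings, for illustration only.
with_fp64 = "cl_khr_fp16 cl_khr_fp64 cl_khr_subgroups"
without_fp64 = "cl_khr_fp16 cl_khr_subgroups"

print(supports_fp64(with_fp64))     # True
print(supports_fp64(without_fp64))  # False
```

A check like this only says whether double kernels can be expected to compile; it does not explain the memset or memory-limit failures mentioned above.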

pvelesko commented 3 months ago

The major one was that the Mali does not support FP64 and there is no emulation (AFAIK). Some of the tests have "int" or "unsigned" in the name, but they have shared kernel code for all variants including "double",

This we can filter out now with -DCHIP_SKIP_DOUBLE_TESTS=ON

IIRC only a couple of tests failed due to this.

pvelesko commented 2 weeks ago

Updated list of failing tests:

    Unit_hipGraphAddEventRecordNode_Functional_Simple: '' 
    Unit_hipGraphAddEventRecordNode_Functional_WithoutFlags: '' 
    Unit_hipGraphAddEventRecordNode_Functional_WithFlags: '' 
    Unit_hipGraphAddEventRecordNode_Functional_TimingDisabled: '' 
    Unit_hipGraphAddEventWaitNode_Functional_Simple: '' 
    Unit_hipGraphAddEventWaitNode_MultGraphMultStrmDependency: '' 
    Unit_hipGraphAddEventWaitNode_MultGraphOneStrmDependency: '' 
    Unit_hipGraphAddEventWaitNode_differentFlags: '' 
    Unit_hipGraphNodeGetType_NodeType: '' 
    Unit_hipGraphEventWaitNodeSetEvent_SetProp: '' 
    Unit_hipStreamBeginCapture_ColligatedStrmCapture_defaultflag: '' 
    Unit_hipStreamBeginCapture_ColligatedStrmCapture_blockingflag: '' 
    Unit_hipStreamBeginCapture_ColligatedStrmCapture_diffflags: '' 
    Unit_hipStreamBeginCapture_ColligatedStrmCapture_diffprio: '' 
    Unit_hipMemcpy2DToArrayAsync_PinnedHostMemSameGpu: '' 
    Unit_hipMemcpy_HalfMemCopy: '' 
    Unit_hipMemcpy_MultiThread-AllAPIs: '' 
    Unit_hipMemcpyWithStream_TestwithTwoStream: '' 
    Unit_hipMemcpyWithStream_TestDtoDonSameDevice: '' 
    Unit_hipMemcpyWithStream_MultiThread: '' 
    Unit_hipHostMalloc_NonCoherent: '' 
    Unit_hipHostMalloc_Default: '' 
    Unit_hipStreamDestroy_WithFinishedWork: '' 
    Unit_hipStreamCreate_MultistreamBasicFunctionalities: '' 
    Unit_hipEvent: '' 
    Unit_hipEventElapsedTime: '' 
    Unit_hipEventMGpuMThreads_1: '' 
    Unit_hipDeviceSynchronize_Functional: '' 
    Unit_hipStreamPerThread_EventSynchronize: '' 
    Unit_hipMultiThreadStreams1_AsyncSync: '' 
    Unit_hipMultiThreadStreams1_AsyncAsync: '' 
    hipMemset_Unit_hipMemsetAsync_SetMemoryWithOffset_Helgrind: '' 
    TestRecordEventBlocking: '' 
    MatrixMultiply: '' 
    hipEvent: '' 
    shuffle: '' 
    broadcast: '' 
    broadcast2: '' 
    2d_shuffle: '' 
    unroll: '' 
    hip_async_binomial: '' 
    BinomialOption: '' 
    BitonicSort: '' 
    DCT: '' 
    dwtHaar1D: '' 
    FastWalshTransform: '' 
    FloydWarshall: '' 
    Histogram: '' 
    RecursiveGaussian: '' 
    PrintfSimple: '' 
    PrintfNOP: '' 
    PrintfDynamic: '' 
    shuffles: '' 
    graphMatrixMultiply: '' 
    cuda-asyncAPI: '' 
    cuda-matrixMul: '' 
    cuda-bandwidthTest: '' 

Will work on root-causing these for 1.3.

pvelesko commented 2 weeks ago
The following tests FAILED:
    422 - Unit_hipGraphAddEventRecordNode_Functional_Simple (Timeout)
    423 - Unit_hipGraphAddEventRecordNode_Functional_WithoutFlags (Timeout)
    425 - Unit_hipGraphAddEventRecordNode_Functional_WithFlags (Timeout)
    427 - Unit_hipGraphAddEventRecordNode_Functional_TimingDisabled (Timeout)
    429 - Unit_hipGraphAddEventWaitNode_Functional_Simple (Timeout)
    430 - Unit_hipGraphAddEventWaitNode_MultGraphMultStrmDependency (Timeout)
    432 - Unit_hipGraphAddEventWaitNode_MultGraphOneStrmDependency (Timeout)
    433 - Unit_hipGraphAddEventWaitNode_differentFlags (Timeout)
    492 - Unit_hipGraphNodeGetType_NodeType (Timeout)
    510 - Unit_hipGraphEventWaitNodeSetEvent_SetProp (Timeout)
    536 - Unit_hipStreamBeginCapture_ColligatedStrmCapture_defaultflag (Timeout)
    537 - Unit_hipStreamBeginCapture_ColligatedStrmCapture_blockingflag (Timeout)
    538 - Unit_hipStreamBeginCapture_ColligatedStrmCapture_diffflags (Timeout)
    539 - Unit_hipStreamBeginCapture_ColligatedStrmCapture_diffprio (Timeout)
    630 - Unit_hipMemcpy2DToArrayAsync_PinnedHostMemSameGpu (Timeout)
    717 - Unit_hipMemcpy_HalfMemCopy (Timeout)
    718 - Unit_hipMemcpy_MultiThread-AllAPIs (Timeout)
    831 - Unit_hipMemcpyWithStream_TestwithTwoStream (Timeout)
    838 - Unit_hipMemcpyWithStream_TestDtoDonSameDevice (Timeout)
    839 - Unit_hipMemcpyWithStream_MultiThread (Timeout)
    858 - Unit_hipHostMalloc_NonCoherent (Timeout)
    860 - Unit_hipHostMalloc_Default (Timeout)
    1043 - Unit_hipStreamDestroy_WithFinishedWork (Timeout)
    1044 - Unit_hipStreamCreate_MultistreamBasicFunctionalities (Timeout)
    1055 - Unit_hipEvent (Timeout)
    1059 - Unit_hipEventElapsedTime (Timeout)
    1067 - Unit_hipEventMGpuMThreads_1 (Timeout)
    1094 - Unit_hipDeviceSynchronize_Functional (Timeout)
    1189 - Unit_hipStreamPerThread_EventSynchronize (Timeout)
    1200 - Unit_hipMultiThreadStreams1_AsyncSync (Timeout)
    1201 - Unit_hipMultiThreadStreams1_AsyncAsync (Timeout)
    1224 - hipMemset_Unit_hipMemsetAsync_SetMemoryWithOffset_Helgrind (Failed)
    1284 - TestRecordEventBlocking (Timeout)
    1329 - MatrixMultiply (Timeout)
    1330 - hipEvent (Timeout)
    1333 - shuffle (Failed)
    1334 - broadcast (Failed)
    1335 - broadcast2 (Failed)
    1336 - 2d_shuffle (Failed)
    1341 - unroll (Failed)
    1354 - hip_async_binomial (Timeout)
    1355 - BinomialOption (Timeout)
    1356 - BitonicSort (Timeout)
    1357 - DCT (Timeout)
    1358 - dwtHaar1D (Timeout)
    1359 - FastWalshTransform (Timeout)
    1360 - FloydWarshall (Timeout)
    1361 - Histogram (Timeout)
    1362 - RecursiveGaussian (Timeout)
    1364 - PrintfSimple (Failed)
    1365 - PrintfNOP (Failed)
    1366 - PrintfDynamic (Failed)
    1367 - shuffles (Failed)
    1369 - graphMatrixMultiply (Timeout)
    1374 - cuda-asyncAPI (Timeout)
    1376 - cuda-matrixMul (Timeout)
    1385 - cuda-bandwidthTest (Timeout)

For non-timeout failures:

pvelesko@salami:~/actions-runner/_work/chipStar/chipStar/build$ ctest -R "shuffle$|broadcast$|broadcast2$|2d_shuffle$|unroll$|PrintfSimple$|PrintfNOP$|PrintfDynamic$|shuffles$" -V
UpdateCTestConfiguration  from :/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/DartConfiguration.tcl
Test project /home/pvelesko/actions-runner/_work/chipStar/chipStar/build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 1333
    Start 1333: shuffle

1333: Test command: /home/pvelesko/actions-runner/_work/chipStar/chipStar/build/bin/spirv-extractor "--check-for-doubles" "/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/samples/4_shfl/shfl"
1333: Test timeout computed to be: 10000000
1333: CHIP warning [TID 158391] [1723457021.633496613] : The device might not support subgroup size 32, warp-size sensitive kernels might not work correctly.
1333: Device name Mali-G52 r0p0
1333: CHIP error [TID 158391] [1723457021.634605877] : hipErrorNotInitialized (CL_INVALID_VALUE ) in /home/pvelesko/actions-runner/_work/chipStar/chipStar/src/backend/OpenCL/CHIPBackendOpenCL.cc:817:compileIL
1333:
1333: CHIP error [TID 158391] [1723457021.634775635] : Caught Error: hipErrorNotInitialized
1333: ERROR on line 101: 3
1/9 Test #1333: shuffle ..........................***Failed  Required regular expression not found. Regex=[PASSED
]  0.58 sec
test 1334
    Start 1334: broadcast

1334: Test command: /home/pvelesko/actions-runner/_work/chipStar/chipStar/build/bin/spirv-extractor "--check-for-doubles" "/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/samples/4_shfl/broadcast"
1334: Test timeout computed to be: 10000000
1334: CHIP warning [TID 158403] [1723457022.211940958] : The device might not support subgroup size 32, warp-size sensitive kernels might not work correctly.
1334: CHIP error [TID 158403] [1723457022.212910965] : hipErrorNotInitialized (CL_INVALID_VALUE ) in /home/pvelesko/actions-runner/_work/chipStar/chipStar/src/backend/OpenCL/CHIPBackendOpenCL.cc:817:compileIL
1334:
1334: CHIP error [TID 158403] [1723457022.213082016] : Caught Error: hipErrorNotInitialized
1334: ERROR on line 62: 3
2/9 Test #1334: broadcast ........................***Failed  Required regular expression not found. Regex=[PASSED
]  0.58 sec
test 1335
    Start 1335: broadcast2

1335: Test command: /home/pvelesko/actions-runner/_work/chipStar/chipStar/build/bin/spirv-extractor "--check-for-doubles" "/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/samples/4_shfl/broadcast2"
1335: Test timeout computed to be: 10000000
1335: CHIP warning [TID 158415] [1723457022.787183852] : The device might not support subgroup size 32, warp-size sensitive kernels might not work correctly.
1335: CHIP error [TID 158415] [1723457022.788151484] : hipErrorNotInitialized (CL_INVALID_VALUE ) in /home/pvelesko/actions-runner/_work/chipStar/chipStar/src/backend/OpenCL/CHIPBackendOpenCL.cc:817:compileIL
1335:
1335: CHIP error [TID 158415] [1723457022.788319450] : Caught Error: hipErrorNotInitialized
1335: ERROR on line 62: 3
3/9 Test #1335: broadcast2 .......................***Failed  Required regular expression not found. Regex=[PASSED
]  0.58 sec
test 1336
    Start 1336: 2d_shuffle

1336: Test command: /home/pvelesko/actions-runner/_work/chipStar/chipStar/build/bin/spirv-extractor "--check-for-doubles" "/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/samples/5_2dshfl/2dshfl"
1336: Test timeout computed to be: 10000000
1336: CHIP warning [TID 158427] [1723457023.382947028] : The device might not support subgroup size 32, warp-size sensitive kernels might not work correctly.
1336: Device name Mali-G52 r0p0
1336: CHIP error [TID 158427] [1723457023.384035833] : hipErrorNotInitialized (CL_INVALID_VALUE ) in /home/pvelesko/actions-runner/_work/chipStar/chipStar/src/backend/OpenCL/CHIPBackendOpenCL.cc:817:compileIL
1336:
1336: CHIP error [TID 158427] [1723457023.384205800] : Caught Error: hipErrorNotInitialized
1336: ITEM 0 OK
1336: ITEM: 1 cpu: 40 gpu: 0
1336: ITEM: 2 cpu: 80 gpu: 2.8026e-45
1336: ITEM: 3 cpu: 120 gpu: 0
1336: ITEM: 4 cpu: 10 gpu: 2.8026e-45
1336: ITEM: 5 cpu: 50 gpu: 0
1336: ITEM: 6 cpu: 90 gpu: 2.8026e-45
1336: ITEM: 7 cpu: 130 gpu: 0
1336: ITEM: 8 cpu: 20 gpu: 2.8026e-45
1336: ITEM: 9 cpu: 60 gpu: 0
1336: ITEM: 10 cpu: 100 gpu: 2.8026e-45
1336: ITEM: 11 cpu: 140 gpu: 0
1336: ITEM: 12 cpu: 30 gpu: 2.8026e-45
1336: ITEM: 13 cpu: 70 gpu: 0
1336: ITEM: 14 cpu: 110 gpu: 2.8026e-45
1336: ITEM: 15 cpu: 150 gpu: 0
1336: FAIL: 15 errors
4/9 Test #1336: 2d_shuffle .......................***Failed  Required regular expression not found. Regex=[PASSED
]  0.64 sec
test 1341
    Start 1341: unroll

1341: Test command: /home/pvelesko/actions-runner/_work/chipStar/chipStar/build/bin/spirv-extractor "--check-for-doubles" "/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/samples/9_unroll/unroll"
1341: Test timeout computed to be: 10000000
1341: CHIP warning [TID 158439] [1723457024.016589235] : The device might not support subgroup size 32, warp-size sensitive kernels might not work correctly.
1341: Device name Mali-G52 r0p0
1341: CHIP error [TID 158439] [1723457024.017691374] : hipErrorNotInitialized (CL_INVALID_VALUE ) in /home/pvelesko/actions-runner/_work/chipStar/chipStar/src/backend/OpenCL/CHIPBackendOpenCL.cc:817:compileIL
1341:
1341: CHIP error [TID 158439] [1723457024.017855174] : Caught Error: hipErrorNotInitialized
1341: 1 cpu: 40.000000 gpu  0.000000
1341: 2 cpu: 80.000000 gpu  0.000000
1341: 3 cpu: 120.000000 gpu  0.000000
1341: 4 cpu: 10.000000 gpu  0.000000
1341: 5 cpu: 50.000000 gpu  0.000000
1341: 6 cpu: 90.000000 gpu  0.000000
1341: 7 cpu: 130.000000 gpu  0.000000
1341: 8 cpu: 20.000000 gpu  0.000000
1341: 9 cpu: 60.000000 gpu  0.000000
1341: 10 cpu: 100.000000 gpu  0.000000
1341: 11 cpu: 140.000000 gpu  0.000000
1341: 12 cpu: 30.000000 gpu  0.000000
1341: 13 cpu: 70.000000 gpu  0.000000
1341: 14 cpu: 110.000000 gpu  0.000000
1341: 15 cpu: 150.000000 gpu  0.000000
1341: FAILED: 15 errors
5/9 Test #1341: unroll ...........................***Failed  Required regular expression not found. Regex=[PASSED
]  0.59 sec
test 1364
    Start 1364: PrintfSimple

1364: Test command: /home/pvelesko/actions-runner/_work/chipStar/chipStar/build/bin/spirv-extractor "--check-for-doubles" "/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/samples/printf/strings"
1364: Test timeout computed to be: 10000000
1364: CHIP warning [TID 158451] [1723457024.603341126] : The device might not support subgroup size 32, warp-size sensitive kernels might not work correctly.
1364: PASSED!
6/9 Test #1364: PrintfSimple .....................***Failed  Required regular expression not found. Regex=[## no_args.*Hello.*## literal_str_arg.*Hello strings.*## global_str_arg.*HELLO.* STRiNGS.*PASSED!

]  2.47 sec
test 1365
    Start 1365: PrintfNOP

1365: Test command: /home/pvelesko/actions-runner/_work/chipStar/chipStar/build/bin/spirv-extractor "--check-for-doubles" "/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/samples/printf/nop_printfs"
1365: Test timeout computed to be: 10000000
1365: CHIP warning [TID 158463] [1723457027.072899529] : The device might not support subgroup size 32, warp-size sensitive kernels might not work correctly.
1365: PASSED!
7/9 Test #1365: PrintfNOP ........................***Failed  Required regular expression not found. Regex=[Howdy HIPpies!.*I'm correct since I'm not an empty string!.*PASSED!

]  1.26 sec
test 1366
    Start 1366: PrintfDynamic

1366: Test command: /home/pvelesko/actions-runner/_work/chipStar/chipStar/build/bin/spirv-extractor "--check-for-doubles" "/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/samples/printf/dynamic_str_args"
1366: Test timeout computed to be: 10000000
1366: CHIP warning [TID 158475] [1723457028.347460986] : The device might not support subgroup size 32, warp-size sensitive kernels might not work correctly.
1366: PASSED!
8/9 Test #1366: PrintfDynamic ....................***Failed  Required regular expression not found. Regex=[## conditional string.*inout\[0\] was 1 1.*## array of strings, selecting index 1.*I am a dynamic str arg B.*## host defined string: Extremely dynamic printf str!.*## host defined string with skip: mely dynamic printf str!.*PASSED!

]  2.53 sec
test 1367
    Start 1367: shuffles

1367: Test command: /usr/bin/cmake "-P" "/home/pvelesko/actions-runner/_work/chipStar/chipStar/build/samples/shuffles/run-shuffles.cmake"
1367: Test timeout computed to be: 10000000
1367: CMake Error at run-shuffles.cmake:19 (message):
1367:   FAIL: Standard output does not match 'shuffles.xstdout'
1367:
1367:
9/9 Test #1367: shuffles .........................***Failed    0.68 sec

0% tests passed, 9 tests failed out of 9

Label Time Summary:
internal    =   9.23 sec*proc (8 tests)

Total Test time (real) =  10.08 sec

The following tests FAILED:
    1333 - shuffle (Failed)
    1334 - broadcast (Failed)
    1335 - broadcast2 (Failed)
    1336 - 2d_shuffle (Failed)
    1341 - unroll (Failed)
    1364 - PrintfSimple (Failed)
    1365 - PrintfNOP (Failed)
    1366 - PrintfDynamic (Failed)
    1367 - shuffles (Failed)
Errors while running CTest
Output from these tests are in: /home/pvelesko/actions-runner/_work/chipStar/chipStar/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
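The split into timeout and non-timeout failures, and the `ctest -R` pattern used above, can be derived mechanically from a "The following tests FAILED" list. This is a minimal Python sketch, assuming the list format shown in this thread; the function names are hypothetical and the test names are assumed to contain no regex metacharacters.

```python
import re
from collections import defaultdict

# Matches lines such as "    1333 - shuffle (Failed)".
LINE_RE = re.compile(r"^\s*(\d+) - (.+) \(([^()]+)\)\s*$")


def group_by_status(ctest_output: str) -> dict:
    """Map each CTest status (Failed, Timeout, ...) to the list of test names."""
    groups = defaultdict(list)
    for line in ctest_output.splitlines():
        match = LINE_RE.match(line)
        if match:
            _number, name, status = match.groups()
            groups[status].append(name)
    return dict(groups)


def ctest_filter(names) -> str:
    """Join names into an end-anchored alternation, as in the command above."""
    # Assumption: names are plain identifiers, so no escaping is applied.
    return "|".join(f"{name}$" for name in names)


sample = """\
    1330 - hipEvent (Timeout)
    1333 - shuffle (Failed)
    1334 - broadcast (Failed)
    1367 - shuffles (Failed)
"""
groups = group_by_status(sample)
print(ctest_filter(groups["Failed"]))  # shuffle$|broadcast$|shuffles$
```

Because the pattern is anchored only at the end, `shuffle$` also matches `2d_shuffle`, just as it does in the original command.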
pvelesko commented 2 weeks ago

Conclusions:

pjaaskel commented 2 weeks ago

Does it support the KHR subgroup shuffles? Did you notice in devicelib.cl:

// Use the Intel versions for now by default, since the Intel OpenCL CPU
// driver still implements only them, not the KHR versions.
#define sub_group_shuffle intel_sub_group_shuffle
#define sub_group_shuffle_xor intel_sub_group_shuffle_xor

If it supports the KHR one, we could detect at build time that we are building on non-Intel and use that by default. Ideally, this would be selected at .spv linkage time in chipStar.
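The selection described here can be sketched as a lookup on the device's extension list. This is a minimal Python illustration, not chipStar code: the function name and the preference order (KHR first, Intel as fallback) are assumptions. The extension names `cl_khr_subgroup_shuffle` and `cl_intel_subgroups` and the builtin names are the real ones.

```python
# Real extension names; the selection policy below is an illustrative assumption.
KHR_SHUFFLE_EXT = "cl_khr_subgroup_shuffle"
INTEL_SUBGROUPS_EXT = "cl_intel_subgroups"


def pick_shuffle_builtins(extensions):
    """Return the (shuffle, shuffle_xor) builtin names to target, or None."""
    available = set(extensions)
    if KHR_SHUFFLE_EXT in available:
        return ("sub_group_shuffle", "sub_group_shuffle_xor")
    if INTEL_SUBGROUPS_EXT in available:
        return ("intel_sub_group_shuffle", "intel_sub_group_shuffle_xor")
    # Neither extension is reported, so no shuffle builtin can be assumed.
    return None


print(pick_shuffle_builtins(["cl_khr_subgroups", "cl_khr_subgroup_shuffle"]))
# ('sub_group_shuffle', 'sub_group_shuffle_xor')
```

Doing this at build time fixes the choice for one machine; doing it at .spv linkage time, as suggested, would let the same build serve devices that report different extensions.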

pvelesko commented 2 weeks ago
    cl_khr_subgroups                          0x400000 (1.0.0)
    cl_khr_subgroup_extended_types            0x400000 (1.0.0)
    cl_khr_subgroup_non_uniform_vote          0x400000 (1.0.0)
    cl_khr_subgroup_ballot                    0x400000 (1.0.0)
    cl_khr_subgroup_non_uniform_arithmetic    0x400000 (1.0.0)
    cl_khr_subgroup_shuffle                   0x400000 (1.0.0)
    cl_khr_subgroup_shuffle_relative          0x400000 (1.0.0)
    cl_khr_subgroup_clustered_reduce          0x400000 (1.0.0)

Seems like it is supported.
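For reference, the hex value in each line of the listing is the packed OpenCL version (10 bits major, 10 bits minor, 12 bits patch), which is why `0x400000` reads as 1.0.0. The following is a minimal Python sketch that decodes one such line; the function names are hypothetical and the line format is assumed to be the one pasted above.

```python
def decode_cl_version(value: int) -> str:
    """Decode a packed OpenCL version: 10-bit major, 10-bit minor, 12-bit patch."""
    major = value >> 22
    minor = (value >> 12) & 0x3FF
    patch = value & 0xFFF
    return f"{major}.{minor}.{patch}"


def parse_extension_line(line: str):
    """Split a line like 'cl_khr_subgroup_shuffle  0x400000 (1.0.0)' into (name, version)."""
    name, packed = line.split()[:2]
    return name, decode_cl_version(int(packed, 16))


print(parse_extension_line("cl_khr_subgroup_shuffle    0x400000 (1.0.0)"))
# ('cl_khr_subgroup_shuffle', '1.0.0')
```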

pvelesko commented 2 weeks ago

So it seems like this is one of the issues, but there are more. I haven't investigated every single SPIR-V.