ROCm / ROCm-Device-Libs

ROCm Device Libraries

5.6.0: test suite is failing in two units #95

Closed kloczek closed 8 months ago

kloczek commented 1 year ago

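The test suite for 5.6.0 fails in two of its ten units, `compile_atan2__gfx700` and `compile_atan2pi__gfx700`. In both, clang 16.0.6 aborts in the AMDGPU backend with `Cannot select: f16 = fma` while compiling `__ocml_atan2_f16` and `__ocml_atan2pi_f16` for gfx700.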
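To pull the failing unit names out of a saved `ctest` log and rerun only those, a small helper along these lines might be useful. This is a sketch, not part of the ROCm-Device-Libs build: `failed_tests` and `ctest.log` are hypothetical names, and the pattern assumes the usual layout of one test result per line.

```shell
# Hypothetical helper: print the names of failed tests from ctest output on stdin.
# Assumes one result per line, e.g.
#   9/10 Test  #4: compile_atan2__gfx700 .............***Failed    0.27 sec
failed_tests() {
  sed -n 's/^.*Test *#[0-9][0-9]*: *\([^ ]*\) .*\*\*\*Failed.*$/\1/p'
}

# Possible usage (left commented out because it needs the build tree):
#   ctest --test-dir x86_64-redhat-linux-gnu --output-on-failure 2>&1 | tee ctest.log
#   ctest --test-dir x86_64-redhat-linux-gnu --output-on-failure \
#     -R "^($(failed_tests < ctest.log | paste -sd'|' -))\$"
```

The `-R` option restricts `ctest` to tests whose names match the given regular expression. The full output of the failing run follows.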

```console
+ cd ROCm-Device-Libs-rocm-5.6.0
+ /usr/bin/ctest --test-dir x86_64-redhat-linux-gnu --output-on-failure --force-new-ctest-process -j48
Internal ctest changing into directory: /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/x86_64-redhat-linux-gnu
Test project /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/x86_64-redhat-linux-gnu
Start 1: constant_fold_lgamma_r__gfx900
Start 2: constant_fold_lgamma_r__gfx1030
Start 3: compile_asin__gfx700
Start 4: compile_atan2__gfx700
Start 5: compile_atan2pi__gfx700
Start 6: compile_asin__gfx803
Start 7: compile_atan2__gfx803
Start 8: compile_atan2pi__gfx803
Start 9: compile_frexp__gfx600
Start 10: compile_frexp__gfx700
1/10 Test #3: compile_asin__gfx700 .............. Passed 0.14 sec
2/10 Test #10: compile_frexp__gfx700 ............. Passed 0.13 sec
3/10 Test #8: compile_atan2pi__gfx803 ........... Passed 0.14 sec
4/10 Test #6: compile_asin__gfx803 .............. Passed 0.15 sec
5/10 Test #7: compile_atan2__gfx803 ............. Passed 0.15 sec
6/10 Test #9: compile_frexp__gfx600 ............. Passed 0.15 sec
7/10 Test #1: constant_fold_lgamma_r__gfx900 .... Passed 0.17 sec
8/10 Test #2: constant_fold_lgamma_r__gfx1030 ...
Passed 0.17 sec 9/10 Test #4: compile_atan2__gfx700 .............***Failed 0.27 sec CMake Error at /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/RunCompileTest.cmake:24 (message): Error compiling test: fatal error: error in backend: Cannot select: 0x5626e75ceee0: f16 = fma # D:1 0x5626e70ab360, 0x5626e75cee70, 0x5626e70ab360 0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440 0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0> 0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90 0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80 0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e70ab280: i32 = TargetConstant<0> 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 
0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e75cee70: f16 = fmul # D:1 0x5626e75cebd0, 0x5626e75cee00 0x5626e75cebd0: f16 = fmul # D:1 0x5626e70ab360, 0x5626e70ab360 0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440 0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0> 0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90 0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80 0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e70ab280: i32 = TargetConstant<0> 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 
= Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440 0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0> 0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90 0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80 0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e70ab280: i32 = TargetConstant<0> 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 
0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e75cee00: f16 = fma # D:1 0x5626e75cebd0, 0x5626e75ced20, ConstantFP:f16 0x5626e75cebd0: f16 = fmul # D:1 0x5626e70ab360, 0x5626e70ab360 0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440 0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0> 0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90 0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80 0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e70ab280: i32 = TargetConstant<0> 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 
0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440 0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0> 0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90 0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80 0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e70ab280: i32 = TargetConstant<0> 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 
0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e75ced20: f16 = fma # D:1 0x5626e75cebd0, ConstantFP:f16, ConstantFP:f16 0x5626e75cebd0: f16 = fmul # D:1 0x5626e70ab360, 0x5626e70ab360 0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440 0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0> 0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90 0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80 0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80 0x5626e70ab280: i32 = TargetConstant<0> 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440 0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0> 0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90 0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e70aae90: f32 = RCP # 
D:1 0x5626e70aab80 0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80 0x5626e70ab280: i32 = TargetConstant<0> 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75cecb0: f16 = ConstantFP 0x5626e75cec40: f16 = ConstantFP 0x5626e75ced90: f16 = ConstantFP 0x5626e70ab360: f16 = DIV_FIXUP # D:1 0x5626e70ab2f0, 0x5626e75cea80, 0x5626e70ab440 0x5626e70ab2f0: f16 = fp_round # D:1 0x5626e70ab210, TargetConstant:i32<0> 0x5626e70ab210: f32 = fmul # D:1 0x5626e70aab10, 0x5626e70aae90 0x5626e70aab10: f32 = fp_extend # D:1 0x5626e70ab440 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e70aae90: f32 = RCP # D:1 0x5626e70aab80 0x5626e70aab80: f32 = fp_extend # D:1 0x5626e75cea80 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 
0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e70ab280: i32 = TargetConstant<0> 0x5626e75cea80: f16 = fmaxnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 0x5626e70ab440: f16 = fminnum_ieee # D:1 0x5626e75ceb60, 0x5626e70ab3d0 0x5626e75ceb60: f16 = fcanonicalize # D:1 0x5626e75ce9a0 0x5626e75ce9a0: f16 = fabs # D:1 0x5626e75ce930 0x5626e75ce930: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %1 0x5626e75ce8c0: f16 = Register %1 0x5626e70ab3d0: f16 = fcanonicalize # D:1 0x5626e75cea10 0x5626e75cea10: f16 = fabs # D:1 0x5626e75ce850 0x5626e75ce850: f16,ch = CopyFromReg # D:1 0x5626e76b2500, Register:f16 %0 0x5626e75ce7e0: f16 = Register %0 In function: __ocml_atan2_f16 PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /usr/bin/clang-16 -O3 -S -cl-std=CL2.0 -target amdgcn-amd-amdhsa -mcpu=gfx700 -Xclang -finclude-default-header --rocm-path=/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/x86_64-redhat-linux-gnu -mllvm -amdgpu-simplify-libcall=0 -o output.atan2.gfx700.s /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/atan2.cl 1. 
parser at end of file 2. Code generation 3. Running pass 'CallGraph Pass Manager' on module '/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/atan2.cl'. 4. Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@__ocml_atan2_f16' #0 0x00007ff6399e3911 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/lib64/libLLVM-16.so.0+0xde3911) #1 0x00007ff6399e161a llvm::sys::RunSignalHandlers() (/lib64/libLLVM-16.so.0+0xde161a) #2 0x00007ff6398ff72a llvm::CrashRecoveryContext::HandleExit(int) (/lib64/libLLVM-16.so.0+0xcff72a) #3 0x00007ff6399db6c4 llvm::sys::Process::Exit(int, bool) (/lib64/libLLVM-16.so.0+0xddb6c4) #4 0x00005626e5890fa6 (/usr/bin/clang-16+0x11fa6) #5 0x00007ff6399116c5 llvm::report_fatal_error(llvm::Twine const&, bool) (/lib64/libLLVM-16.so.0+0xd116c5) #6 0x00007ff63a325a42 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/lib64/libLLVM-16.so.0+0x1725a42) #7 0x00007ff63a32a893 llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int) (/lib64/libLLVM-16.so.0+0x172a893) #8 0x00007ff63a322ddc llvm::SelectionDAGISel::DoInstructionSelection() (/lib64/libLLVM-16.so.0+0x1722ddc) #9 0x00007ff63a32d179 llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/lib64/libLLVM-16.so.0+0x172d179) #10 0x00007ff63a330461 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/lib64/libLLVM-16.so.0+0x1730461) #11 0x00007ff63a3321bc (/lib64/libLLVM-16.so.0+0x17321bc) #12 0x00007ff639df9897 (/lib64/libLLVM-16.so.0+0x11f9897) #13 0x00007ff639b4466b llvm::FPPassManager::runOnFunction(llvm::Function&) (/lib64/libLLVM-16.so.0+0xf4466b) #14 0x00007ff63afffd73 (/lib64/libLLVM-16.so.0+0x23ffd73) #15 0x00007ff639b44ecc llvm::legacy::PassManagerImpl::run(llvm::Module&) (/lib64/libLLVM-16.so.0+0xf44ecc) #16 0x00007ff641ac1ead clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, 
llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr>) (/lib64/libclang-cpp.so.16.0+0x14c1ead) #17 0x00007ff641ac47fb (/lib64/libclang-cpp.so.16.0+0x14c47fb) #18 0x00007ff642d60ba2 clang::ParseAST(clang::Sema&, bool, bool) (/lib64/libclang-cpp.so.16.0+0x2760ba2) #19 0x00007ff6424acf79 clang::FrontendAction::Execute() (/lib64/libclang-cpp.so.16.0+0x1eacf79) #20 0x00007ff642d6acfb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/lib64/libclang-cpp.so.16.0+0x276acfb) #21 0x00007ff642d6cbea clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/lib64/libclang-cpp.so.16.0+0x276cbea) #22 0x00005626e5897e68 cc1_main(llvm::ArrayRef, char const*, void*) (/usr/bin/clang-16+0x18e68) #23 0x00005626e589ddda (/usr/bin/clang-16+0x1edda) #24 0x00007ff640de31f1 (/lib64/libclang-cpp.so.16.0+0x7e31f1) #25 0x00007ff6398ff60a llvm::CrashRecoveryContext::RunSafely(llvm::function_ref) (/lib64/libLLVM-16.so.0+0xcff60a) #26 0x00007ff640f7440f (/lib64/libclang-cpp.so.16.0+0x97440f) #27 0x00007ff641a1b826 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/lib64/libclang-cpp.so.16.0+0x141b826) #28 0x00007ff641a1bb6c clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl>&, bool) const (/lib64/libclang-cpp.so.16.0+0x141bb6c) #29 0x00007ff641a20aec clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl>&) (/lib64/libclang-cpp.so.16.0+0x1420aec) #30 0x00005626e58a0518 clang_main(int, char**) (/usr/bin/clang-16+0x21518) #31 0x00007ff6384280ca __libc_start_call_main (/lib64/libc.so.6+0x280ca) #32 0x00007ff63842818b __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2818b) #33 0x00005626e588f175 _start (/usr/bin/clang-16+0x10175) clang-16: error: clang frontend command failed with exit code 70 (use -v to see invocation) clang version 16.0.6 (G2V 16.0.6-2.fc35) Target: amdgcn-amd-amdhsa Thread model: posix InstalledDir: 
/usr/bin clang-16: note: diagnostic msg: ******************** PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT: Preprocessed source(s) and associated run script(s) are located at: clang-16: note: diagnostic msg: /tmp/atan2-84026d.cl clang-16: note: diagnostic msg: /tmp/atan2-84026d.sh clang-16: note: diagnostic msg: ******************** 10/10 Test #5: compile_atan2pi__gfx700 ...........***Failed 0.27 sec CMake Error at /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/RunCompileTest.cmake:24 (message): Error compiling test: fatal error: error in backend: Cannot select: 0x5584c1b533e0: f16 = fma # D:1 0x5584c1b530d0, 0x5584c1b53300, ConstantFP:f16 0x5584c1b530d0: f16 = fmul # D:1 0x5584c162ef30, 0x5584c162ef30 0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010 0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0> 0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60 0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750 0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c162ee50: i32 = TargetConstant<0> 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 
0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010 0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0> 0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60 0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750 0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c162ee50: i32 = 
TargetConstant<0> 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c1b53300: f16 = fma # D:1 0x5584c1b530d0, 0x5584c1b53220, ConstantFP:f16 0x5584c1b530d0: f16 = fmul # D:1 0x5584c162ef30, 0x5584c162ef30 0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010 0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0> 0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60 0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750 0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c162efa0: f16 = fcanonicalize # D:1 
0x5584c1b52f10 0x5584c162ee50: i32 = TargetConstant<0> 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010 0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0> 0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60 0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750 0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c162ee50: i32 = TargetConstant<0> 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 
0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c1b53220: f16 = fma # D:1 0x5584c1b530d0, ConstantFP:f16, ConstantFP:f16 0x5584c1b530d0: f16 = fmul # D:1 0x5584c162ef30, 0x5584c162ef30 0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010 0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0> 0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60 0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750 0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c162ee50: i32 = TargetConstant<0> 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, 
Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c162ef30: f16 = DIV_FIXUP # D:1 0x5584c162eec0, 0x5584c1b52f80, 0x5584c162f010 0x5584c162eec0: f16 = fp_round # D:1 0x5584c162ede0, TargetConstant:i32<0> 0x5584c162ede0: f32 = fmul # D:1 0x5584c162e6e0, 0x5584c162ea60 0x5584c162e6e0: f32 = fp_extend # D:1 0x5584c162f010 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c162ea60: f32 = RCP # D:1 0x5584c162e750 0x5584c162e750: f32 = fp_extend # D:1 0x5584c1b52f80 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c162ee50: i32 = TargetConstant<0> 0x5584c1b52f80: f16 = fmaxnum_ieee # D:1 0x5584c1b53060, 0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c162f010: f16 = fminnum_ieee # D:1 0x5584c1b53060, 
0x5584c162efa0 0x5584c1b53060: f16 = fcanonicalize # D:1 0x5584c1b52ea0 0x5584c1b52ea0: f16 = fabs # D:1 0x5584c1b52e30 0x5584c1b52e30: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %1 0x5584c1b52dc0: f16 = Register %1 0x5584c162efa0: f16 = fcanonicalize # D:1 0x5584c1b52f10 0x5584c1b52f10: f16 = fabs # D:1 0x5584c1b52d50 0x5584c1b52d50: f16,ch = CopyFromReg # D:1 0x5584c1615ac0, Register:f16 %0 0x5584c1b52ce0: f16 = Register %0 0x5584c1b531b0: f16 = ConstantFP 0x5584c1b53140: f16 = ConstantFP 0x5584c1b53290: f16 = ConstantFP 0x5584c1b53370: f16 = ConstantFP In function: __ocml_atan2pi_f16 PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /usr/bin/clang-16 -O3 -S -cl-std=CL2.0 -target amdgcn-amd-amdhsa -mcpu=gfx700 -Xclang -finclude-default-header --rocm-path=/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/x86_64-redhat-linux-gnu -mllvm -amdgpu-simplify-libcall=0 -o output.atan2pi.gfx700.s /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/atan2pi.cl 1. parser at end of file 2. Code generation 3. Running pass 'CallGraph Pass Manager' on module '/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.6.0/test/compile/atan2pi.cl'. 4. 
Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@__ocml_atan2pi_f16' #0 0x00007fb1937e3911 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/lib64/libLLVM-16.so.0+0xde3911) #1 0x00007fb1937e161a llvm::sys::RunSignalHandlers() (/lib64/libLLVM-16.so.0+0xde161a) #2 0x00007fb1936ff72a llvm::CrashRecoveryContext::HandleExit(int) (/lib64/libLLVM-16.so.0+0xcff72a) #3 0x00007fb1937db6c4 llvm::sys::Process::Exit(int, bool) (/lib64/libLLVM-16.so.0+0xddb6c4) #4 0x00005584bf51bfa6 (/usr/bin/clang-16+0x11fa6) #5 0x00007fb1937116c5 llvm::report_fatal_error(llvm::Twine const&, bool) (/lib64/libLLVM-16.so.0+0xd116c5) #6 0x00007fb194125a42 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/lib64/libLLVM-16.so.0+0x1725a42) #7 0x00007fb19412a893 llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int) (/lib64/libLLVM-16.so.0+0x172a893) #8 0x00007fb194122ddc llvm::SelectionDAGISel::DoInstructionSelection() (/lib64/libLLVM-16.so.0+0x1722ddc) #9 0x00007fb19412d179 llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/lib64/libLLVM-16.so.0+0x172d179) #10 0x00007fb194130461 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/lib64/libLLVM-16.so.0+0x1730461) #11 0x00007fb1941321bc (/lib64/libLLVM-16.so.0+0x17321bc) #12 0x00007fb193bf9897 (/lib64/libLLVM-16.so.0+0x11f9897) #13 0x00007fb19394466b llvm::FPPassManager::runOnFunction(llvm::Function&) (/lib64/libLLVM-16.so.0+0xf4466b) #14 0x00007fb194dffd73 (/lib64/libLLVM-16.so.0+0x23ffd73) #15 0x00007fb193944ecc llvm::legacy::PassManagerImpl::run(llvm::Module&) (/lib64/libLLVM-16.so.0+0xf44ecc) #16 0x00007fb19b8c1ead clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr>) (/lib64/libclang-cpp.so.16.0+0x14c1ead) #17 0x00007fb19b8c47fb (/lib64/libclang-cpp.so.16.0+0x14c47fb) #18 
0x00007fb19cb60ba2 clang::ParseAST(clang::Sema&, bool, bool) (/lib64/libclang-cpp.so.16.0+0x2760ba2) #19 0x00007fb19c2acf79 clang::FrontendAction::Execute() (/lib64/libclang-cpp.so.16.0+0x1eacf79) #20 0x00007fb19cb6acfb clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/lib64/libclang-cpp.so.16.0+0x276acfb) #21 0x00007fb19cb6cbea clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/lib64/libclang-cpp.so.16.0+0x276cbea) #22 0x00005584bf522e68 cc1_main(llvm::ArrayRef, char const*, void*) (/usr/bin/clang-16+0x18e68) #23 0x00005584bf528dda (/usr/bin/clang-16+0x1edda) #24 0x00007fb19abe31f1 (/lib64/libclang-cpp.so.16.0+0x7e31f1) #25 0x00007fb1936ff60a llvm::CrashRecoveryContext::RunSafely(llvm::function_ref) (/lib64/libLLVM-16.so.0+0xcff60a) #26 0x00007fb19ad7440f (/lib64/libclang-cpp.so.16.0+0x97440f) #27 0x00007fb19b81b826 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/lib64/libclang-cpp.so.16.0+0x141b826) #28 0x00007fb19b81bb6c clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl>&, bool) const (/lib64/libclang-cpp.so.16.0+0x141bb6c) #29 0x00007fb19b820aec clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl>&) (/lib64/libclang-cpp.so.16.0+0x1420aec) #30 0x00005584bf52b518 clang_main(int, char**) (/usr/bin/clang-16+0x21518) #31 0x00007fb1922280ca __libc_start_call_main (/lib64/libc.so.6+0x280ca) #32 0x00007fb19222818b __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2818b) #33 0x00005584bf51a175 _start (/usr/bin/clang-16+0x10175) clang-16: error: clang frontend command failed with exit code 70 (use -v to see invocation) clang version 16.0.6 (G2V 16.0.6-2.fc35) Target: amdgcn-amd-amdhsa Thread model: posix InstalledDir: /usr/bin clang-16: note: diagnostic msg: ******************** PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT: Preprocessed source(s) and associated run script(s) are located 
at: clang-16: note: diagnostic msg: /tmp/atan2pi-4fa3b0.cl clang-16: note: diagnostic msg: /tmp/atan2pi-4fa3b0.sh clang-16: note: diagnostic msg: ******************** 80% tests passed, 2 tests failed out of 10 Total Test time (real) = 0.28 sec The following tests FAILED: 4 - compile_atan2__gfx700 (Failed) 5 - compile_atan2pi__gfx700 (Failed) Errors while running CTest ```

I'm using LLVM 16.0.6. I'm not sure what other details about my build environment I should provide.

kloczek commented 1 year ago

gentle ping .. 🤔

b-sumner commented 1 year ago

Sorry I missed this earlier. We'll take a look.

arsenm commented 1 year ago

This can happen when the compiler and device libs versions are mismatched. The 16-bit target feature was applied to all the half functions, which breaks any target where half is not a legal type. The current device libs no longer use the 16-bit features and should work on all subtargets, so this should work if you use the latest device libs.

kloczek commented 1 year ago

OK, so what exactly does that mean?
Did cmake mess something up? 🤔

arsenm commented 1 year ago

> OK, so what exactly does that mean? Did cmake mess something up? 🤔

This was fixed by commit 439adec40dbddbe8b0c13c022ffc4eeb12e81b54, which depended on quite a lot of patches in the library and compiler to remove the dependence on the 16-bit feature.

kloczek commented 1 year ago

Do you have any plans to make a new release, since it looks like the last one has some issues?

BTW, is it possible to change the tagging convention from rocm-<version> to just <version>? 🤔 The base directory of the tarball automatically generated from a git tag is assembled from -`.

arsenm commented 1 year ago

> Do you have any plans to make a new release, since it looks like the last one has some issues?

Eventually there will be a new release. You can safely ignore these tests; they're incomplete and currently not run as part of any CI. I've only recently been trying to get the infrastructure for this type of testing going. The point of this test was to show that the failure you see does occur, so really it should have been marked XFAIL to begin with.

> BTW, is it possible to change the tagging convention from rocm-<version> to just <version>? 🤔 The base directory of the tarball automatically generated from a git tag is assembled from -`.

Not sure, I would hope this would be consistent across all rocm projects
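For packagers who want to follow the advice above and skip the two known-failing tests, a minimal sketch using ctest's `-E` (exclude by regex) option could look like the following. The `exclude_regex` helper is hypothetical and not part of ROCm-Device-Libs; the test names and the build directory are the ones from the log in this issue.

```shell
#!/bin/sh
# exclude_regex is a hypothetical helper (not part of ROCm-Device-Libs):
# it joins test names into one anchored regex suitable for `ctest -E`.
exclude_regex() {
    regex=""
    for name in "$@"; do
        if [ -z "$regex" ]; then
            regex="$name"
        else
            regex="$regex|$name"
        fi
    done
    printf '^(%s)$\n' "$regex"
}

# The two tests reported as failing with 5.6.0 in this issue.
KNOWN_FAILING=$(exclude_regex compile_atan2__gfx700 compile_atan2pi__gfx700)

# Print the command instead of running it, so the sketch has no side effects.
echo "ctest --test-dir x86_64-redhat-linux-gnu --output-on-failure -E '$KNOWN_FAILING'"
```

Anchoring the regex with `^` and `$` keeps the exclusion from also matching the `gfx803` variants or any test added later with a similar name.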

kloczek commented 1 year ago

> Eventually there will be a new release. You can safely ignore these tests; they're incomplete and currently not run as part of any CI. I've only recently been trying to get the infrastructure for this type of testing going. The point of this test was to show that the failure you see does occur, so really it should have been marked XFAIL to begin with.

So this issue affects only the test suite and not the actual generated bytecode? 🤔

> Not sure, I would hope this would be consistent across all rocm projects

I understand.

arsenm commented 1 year ago

> Eventually there will be a new release. You can safely ignore these tests; they're incomplete and currently not run as part of any CI. I've only recently been trying to get the infrastructure for this type of testing going. The point of this test was to show that the failure you see does occur, so really it should have been marked XFAIL to begin with.

> So this issue affects only the test suite and not the actual generated bytecode? 🤔

It shows half was broken on antique targets which don't have native half support. For OpenCL, clang wouldn't report the half extension as available, so ordinary use wouldn't run into this.

b-sumner commented 1 year ago

FWIW, every compiled device code uses at least some of the library byte code. The release can't happen without it working properly.

kloczek commented 1 year ago

Just tested 5.7.0, and it looks like the test suite is now failing in 3 units:

```console + cd ROCm-Device-Libs-rocm-5.7.0 + /usr/bin/ctest --test-dir x86_64-redhat-linux-gnu --output-on-failure --force-new-ctest-process -j48 ' ' Internal ctest changing into directory: /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/x86_64-redhat-linux-gnu Test project /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/x86_64-redhat-linux-gnu Start 1: constant_fold_lgamma_r__gfx900 Start 2: constant_fold_lgamma_r__gfx1030 Start 3: compile_asin__gfx700 Start 4: compile_atan2__gfx700 Start 5: compile_atan2pi__gfx700 Start 6: compile_asin__gfx803 Start 7: compile_atan2__gfx803 Start 8: compile_atan2pi__gfx803 Start 9: compile_frexp__gfx600 Start 10: compile_frexp__gfx700 1/10 Test #3: compile_asin__gfx700 .............. Passed 0.27 sec 2/10 Test #1: constant_fold_lgamma_r__gfx900 .... Passed 0.28 sec 3/10 Test #2: constant_fold_lgamma_r__gfx1030 ... Passed 0.28 sec 4/10 Test #4: compile_atan2__gfx700 .............***Failed 0.28 sec CMake Error at /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/RunCompileTest.cmake:36 (message): Error in test output: /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/atan2.cl:5:12: error: GFX700: expected string not found in input // GFX700: v_cvt_f32_f16 ^ output.atan2.gfx700.s:7:16: note: scanning from here test_atan2_f16: ; @test_atan2_f16 ^ output.atan2.gfx700.s:15:2: note: possible intended match here v_cvt_f16_f32_e32 v2, s0 ^ Input file: output.atan2.gfx700.s Check file: /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/atan2.cl -dump-input=help explains the following input dump. 
Input was: <<<<<< 1: .text 2: .amdgcn_target "amdgcn-amd-amdhsa--gfx700" 3: .protected test_atan2_f16 ; -- Begin function test_atan2_f16 4: .globl test_atan2_f16 5: .p2align 8 6: .type test_atan2_f16,@function 7: test_atan2_f16: ; @test_atan2_f16 check:4 ^~~~~~~~~~~~~~~ check:5'0 X~~~~~~~~~~~~~~~~~~ error: no match found 8: ; %bb.0: check:5'0 ~~~~~~~~~ 9: s_load_dwordx2 s[0:1], s[4:5], 0x0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 10: v_lshlrev_b32_e32 v0, 1, v0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 11: s_add_i32 s6, s6, s9 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 12: s_mov_b32 flat_scratch_lo, s7 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 13: s_lshr_b32 flat_scratch_hi, s6, 8 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 14: s_waitcnt lgkmcnt(0) check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 15: v_cvt_f16_f32_e32 v2, s0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ check:5'1 ? possible intended match 16: v_mov_b32_e32 v1, s1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 17: v_add_i32_e32 v0, vcc, s0, v0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 18: v_addc_u32_e32 v1, vcc, 0, v1, vcc check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 19: flat_store_short v[0:1], v2 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 20: s_endpgm check:5'0 ~~~~~~~~~~ 21: .section .rodata,#alloc check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 22: .p2align 6, 0x0 check:5'0 ~~~~~~~~~~~~~~~~~ 23: .amdhsa_kernel test_atan2_f16 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 24: .amdhsa_group_segment_fixed_size 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 25: .amdhsa_private_segment_fixed_size 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 26: .amdhsa_kernarg_size 24 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 27: .amdhsa_user_sgpr_count 8 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 28: .amdhsa_user_sgpr_private_segment_buffer 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 29: .amdhsa_user_sgpr_dispatch_ptr 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 30: .amdhsa_user_sgpr_queue_ptr 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 31: 
.amdhsa_user_sgpr_kernarg_segment_ptr 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 32: .amdhsa_user_sgpr_dispatch_id 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 33: .amdhsa_user_sgpr_flat_scratch_init 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 34: .amdhsa_user_sgpr_private_segment_size 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 35: .amdhsa_system_sgpr_private_segment_wavefront_offset 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 36: .amdhsa_system_sgpr_workgroup_id_x 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 37: .amdhsa_system_sgpr_workgroup_id_y 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 38: .amdhsa_system_sgpr_workgroup_id_z 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 39: .amdhsa_system_sgpr_workgroup_info 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 40: .amdhsa_system_vgpr_workitem_id 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 41: .amdhsa_next_free_vgpr 3 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 42: .amdhsa_next_free_sgpr 10 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 43: .amdhsa_float_round_mode_32 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 44: .amdhsa_float_round_mode_16_64 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 45: .amdhsa_float_denorm_mode_32 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 46: .amdhsa_float_denorm_mode_16_64 3 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 47: .amdhsa_dx10_clamp 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 48: .amdhsa_ieee_mode 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 49: .amdhsa_exception_fp_ieee_invalid_op 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 50: .amdhsa_exception_fp_denorm_src 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 51: .amdhsa_exception_fp_ieee_div_zero 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 52: .amdhsa_exception_fp_ieee_overflow 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 53: .amdhsa_exception_fp_ieee_underflow 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 54: .amdhsa_exception_fp_ieee_inexact 0 check:5'0 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55: .amdhsa_exception_int_div_zero 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56: .end_amdhsa_kernel check:5'0 ~~~~~~~~~~~~~~~~~~~~ 57: .text check:5'0 ~~~~~~~ 58: .Lfunc_end0: check:5'0 ~~~~~~~~~~~~~ 59: .size test_atan2_f16, .Lfunc_end0-test_atan2_f16 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 60: ; -- End function check:5'0 ~~~~~~~~~~~~~~~~~~~ 61: .section .AMDGPU.csdata check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 62: ; Kernel info: check:5'0 ~~~~~~~~~~~~~~~ 63: ; codeLenInByte = 52 check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 64: ; NumSgprs: 14 check:5'0 ~~~~~~~~~~~~~~~ 65: ; NumVgprs: 3 check:5'0 ~~~~~~~~~~~~~~ 66: ; ScratchSize: 0 check:5'0 ~~~~~~~~~~~~~~~~~ 67: ; MemoryBound: 0 check:5'0 ~~~~~~~~~~~~~~~~~ 68: ; FloatMode: 192 check:5'0 ~~~~~~~~~~~~~~~~~ 69: ; IeeeMode: 1 check:5'0 ~~~~~~~~~~~~~~ 70: ; LDSByteSize: 0 bytes/workgroup (compile time only) check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 71: ; SGPRBlocks: 1 check:5'0 ~~~~~~~~~~~~~~~~ 72: ; VGPRBlocks: 0 check:5'0 ~~~~~~~~~~~~~~~~ 73: ; NumSGPRsForWavesPerEU: 14 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 74: ; NumVGPRsForWavesPerEU: 3 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 75: ; Occupancy: 10 check:5'0 ~~~~~~~~~~~~~~~~ 76: ; WaveLimiterHint : 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 77: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 78: ; COMPUTE_PGM_RSRC2:USER_SGPR: 8 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 79: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 80: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 81: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 82: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 83: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 84: .ident "clang version 17.0.2 (G2V 17.0.2-2.fc35)" check:5'0 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 85: .section ".note.GNU-stack" check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 86: .addrsig check:5'0 ~~~~~~~~~~ 87: .amdgpu_metadata check:5'0 ~~~~~~~~~~~~~~~~~~ 88: --- check:5'0 ~~~~ 89: amdhsa.kernels: check:5'0 ~~~~~~~~~~~~~~~~ 90: - .args: check:5'0 ~~~~~~~~~~ 91: - .address_space: global check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 92: .is_restrict: true check:5'0 ~~~~~~~~~~~~~~~~~~~~ 93: .offset: 0 check:5'0 ~~~~~~~~~~~~ 94: .size: 8 check:5'0 ~~~~~~~~~~ 95: .type_name: 'half*' check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 96: .value_kind: global_buffer check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 97: - .access: read_only check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 98: .address_space: global check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~ 99: .is_restrict: true check:5'0 ~~~~~~~~~~~~~~~~~~~~ 100: .offset: 8 check:5'0 ~~~~~~~~~~~~ 101: .size: 8 check:5'0 ~~~~~~~~~~ 102: .type_name: 'half*' check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 103: .value_kind: global_buffer check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 104: - .access: read_only check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 105: .address_space: global check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~ 106: .is_restrict: true check:5'0 ~~~~~~~~~~~~~~~~~~~~ 107: .offset: 16 check:5'0 ~~~~~~~~~~~~~ 108: .size: 8 check:5'0 ~~~~~~~~~~ 109: .type_name: 'half*' check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 110: .value_kind: global_buffer check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 111: .group_segment_fixed_size: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 112: .kernarg_segment_align: 8 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 113: .kernarg_segment_size: 24 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 114: .language: OpenCL C check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 115: .language_version: check:5'0 ~~~~~~~~~~~~~~~~~~~~ 116: - 2 check:5'0 ~~~~~ 117: - 0 check:5'0 ~~~~~ 118: .max_flat_workgroup_size: 256 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 119: .name: test_atan2_f16 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~ 120: .private_segment_fixed_size: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 121: .sgpr_count: 14 
check:5'0 ~~~~~~~~~~~~~~~~~ 122: .sgpr_spill_count: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 123: .symbol: test_atan2_f16.kd check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 124: .vgpr_count: 3 check:5'0 ~~~~~~~~~~~~~~~~ 125: .vgpr_spill_count: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 126: .wavefront_size: 64 check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 127: amdhsa.target: amdgcn-amd-amdhsa--gfx700 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 128: amdhsa.version: check:5'0 ~~~~~~~~~~~~~~~~ 129: - 1 check:5'0 ~~~~~ 130: - 1 check:5'0 ~~~~~ 131: ... check:5'0 ~~~~ 132: check:5'0 ~ 133: .end_amdgpu_metadata check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ >>>>>> 5/10 Test #10: compile_frexp__gfx700 ............. Passed 0.27 sec 6/10 Test #6: compile_asin__gfx803 .............. Passed 0.29 sec 7/10 Test #7: compile_atan2__gfx803 ............. Passed 0.30 sec 8/10 Test #8: compile_atan2pi__gfx803 ........... Passed 0.31 sec 9/10 Test #5: compile_atan2pi__gfx700 ...........***Failed 0.32 sec CMake Error at /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/RunCompileTest.cmake:36 (message): Error in test output: /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/atan2pi.cl:5:12: error: GFX700: expected string not found in input // GFX700: v_cvt_f32_f16 ^ output.atan2pi.gfx700.s:7:18: note: scanning from here test_atan2pi_f16: ; @test_atan2pi_f16 ^ output.atan2pi.gfx700.s:15:2: note: possible intended match here v_cvt_f16_f32_e32 v2, s0 ^ Input file: output.atan2pi.gfx700.s Check file: /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/atan2pi.cl -dump-input=help explains the following input dump. 
Input was: <<<<<< 1: .text 2: .amdgcn_target "amdgcn-amd-amdhsa--gfx700" 3: .protected test_atan2pi_f16 ; -- Begin function test_atan2pi_f16 4: .globl test_atan2pi_f16 5: .p2align 8 6: .type test_atan2pi_f16,@function 7: test_atan2pi_f16: ; @test_atan2pi_f16 check:4 ^~~~~~~~~~~~~~~~~ check:5'0 X~~~~~~~~~~~~~~~~~~~~ error: no match found 8: ; %bb.0: check:5'0 ~~~~~~~~~ 9: s_load_dwordx2 s[0:1], s[4:5], 0x0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 10: v_lshlrev_b32_e32 v0, 1, v0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 11: s_add_i32 s6, s6, s9 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 12: s_mov_b32 flat_scratch_lo, s7 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 13: s_lshr_b32 flat_scratch_hi, s6, 8 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 14: s_waitcnt lgkmcnt(0) check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 15: v_cvt_f16_f32_e32 v2, s0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ check:5'1 ? possible intended match 16: v_mov_b32_e32 v1, s1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 17: v_add_i32_e32 v0, vcc, s0, v0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 18: v_addc_u32_e32 v1, vcc, 0, v1, vcc check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 19: flat_store_short v[0:1], v2 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 20: s_endpgm check:5'0 ~~~~~~~~~~ 21: .section .rodata,#alloc check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 22: .p2align 6, 0x0 check:5'0 ~~~~~~~~~~~~~~~~~ 23: .amdhsa_kernel test_atan2pi_f16 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 24: .amdhsa_group_segment_fixed_size 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 25: .amdhsa_private_segment_fixed_size 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 26: .amdhsa_kernarg_size 24 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 27: .amdhsa_user_sgpr_count 8 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 28: .amdhsa_user_sgpr_private_segment_buffer 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 29: .amdhsa_user_sgpr_dispatch_ptr 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 30: .amdhsa_user_sgpr_queue_ptr 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 31: 
.amdhsa_user_sgpr_kernarg_segment_ptr 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 32: .amdhsa_user_sgpr_dispatch_id 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 33: .amdhsa_user_sgpr_flat_scratch_init 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 34: .amdhsa_user_sgpr_private_segment_size 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 35: .amdhsa_system_sgpr_private_segment_wavefront_offset 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 36: .amdhsa_system_sgpr_workgroup_id_x 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 37: .amdhsa_system_sgpr_workgroup_id_y 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 38: .amdhsa_system_sgpr_workgroup_id_z 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 39: .amdhsa_system_sgpr_workgroup_info 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 40: .amdhsa_system_vgpr_workitem_id 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 41: .amdhsa_next_free_vgpr 3 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 42: .amdhsa_next_free_sgpr 10 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 43: .amdhsa_float_round_mode_32 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 44: .amdhsa_float_round_mode_16_64 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 45: .amdhsa_float_denorm_mode_32 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 46: .amdhsa_float_denorm_mode_16_64 3 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 47: .amdhsa_dx10_clamp 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 48: .amdhsa_ieee_mode 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 49: .amdhsa_exception_fp_ieee_invalid_op 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 50: .amdhsa_exception_fp_denorm_src 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 51: .amdhsa_exception_fp_ieee_div_zero 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 52: .amdhsa_exception_fp_ieee_overflow 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 53: .amdhsa_exception_fp_ieee_underflow 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 54: .amdhsa_exception_fp_ieee_inexact 0 check:5'0 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55: .amdhsa_exception_int_div_zero 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56: .end_amdhsa_kernel check:5'0 ~~~~~~~~~~~~~~~~~~~~ 57: .text check:5'0 ~~~~~~~ 58: .Lfunc_end0: check:5'0 ~~~~~~~~~~~~~ 59: .size test_atan2pi_f16, .Lfunc_end0-test_atan2pi_f16 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 60: ; -- End function check:5'0 ~~~~~~~~~~~~~~~~~~~ 61: .section .AMDGPU.csdata check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 62: ; Kernel info: check:5'0 ~~~~~~~~~~~~~~~ 63: ; codeLenInByte = 52 check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 64: ; NumSgprs: 14 check:5'0 ~~~~~~~~~~~~~~~ 65: ; NumVgprs: 3 check:5'0 ~~~~~~~~~~~~~~ 66: ; ScratchSize: 0 check:5'0 ~~~~~~~~~~~~~~~~~ 67: ; MemoryBound: 0 check:5'0 ~~~~~~~~~~~~~~~~~ 68: ; FloatMode: 192 check:5'0 ~~~~~~~~~~~~~~~~~ 69: ; IeeeMode: 1 check:5'0 ~~~~~~~~~~~~~~ 70: ; LDSByteSize: 0 bytes/workgroup (compile time only) check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 71: ; SGPRBlocks: 1 check:5'0 ~~~~~~~~~~~~~~~~ 72: ; VGPRBlocks: 0 check:5'0 ~~~~~~~~~~~~~~~~ 73: ; NumSGPRsForWavesPerEU: 14 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 74: ; NumVGPRsForWavesPerEU: 3 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 75: ; Occupancy: 10 check:5'0 ~~~~~~~~~~~~~~~~ 76: ; WaveLimiterHint : 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 77: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 78: ; COMPUTE_PGM_RSRC2:USER_SGPR: 8 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 79: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 80: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 81: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 82: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 83: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 84: .ident "clang version 17.0.2 (G2V 17.0.2-2.fc35)" check:5'0 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 85: .section ".note.GNU-stack" check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 86: .addrsig check:5'0 ~~~~~~~~~~ 87: .amdgpu_metadata check:5'0 ~~~~~~~~~~~~~~~~~~ 88: --- check:5'0 ~~~~ 89: amdhsa.kernels: check:5'0 ~~~~~~~~~~~~~~~~ 90: - .args: check:5'0 ~~~~~~~~~~ 91: - .address_space: global check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 92: .is_restrict: true check:5'0 ~~~~~~~~~~~~~~~~~~~~ 93: .offset: 0 check:5'0 ~~~~~~~~~~~~ 94: .size: 8 check:5'0 ~~~~~~~~~~ 95: .type_name: 'half*' check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 96: .value_kind: global_buffer check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 97: - .access: read_only check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 98: .address_space: global check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~ 99: .is_restrict: true check:5'0 ~~~~~~~~~~~~~~~~~~~~ 100: .offset: 8 check:5'0 ~~~~~~~~~~~~ 101: .size: 8 check:5'0 ~~~~~~~~~~ 102: .type_name: 'half*' check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 103: .value_kind: global_buffer check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 104: - .access: read_only check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 105: .address_space: global check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~ 106: .is_restrict: true check:5'0 ~~~~~~~~~~~~~~~~~~~~ 107: .offset: 16 check:5'0 ~~~~~~~~~~~~~ 108: .size: 8 check:5'0 ~~~~~~~~~~ 109: .type_name: 'half*' check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 110: .value_kind: global_buffer check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 111: .group_segment_fixed_size: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 112: .kernarg_segment_align: 8 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 113: .kernarg_segment_size: 24 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 114: .language: OpenCL C check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 115: .language_version: check:5'0 ~~~~~~~~~~~~~~~~~~~~ 116: - 2 check:5'0 ~~~~~ 117: - 0 check:5'0 ~~~~~ 118: .max_flat_workgroup_size: 256 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 119: .name: test_atan2pi_f16 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 120: .private_segment_fixed_size: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 121: .sgpr_count: 
14 check:5'0 ~~~~~~~~~~~~~~~~~ 122: .sgpr_spill_count: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 123: .symbol: test_atan2pi_f16.kd check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 124: .vgpr_count: 3 check:5'0 ~~~~~~~~~~~~~~~~ 125: .vgpr_spill_count: 0 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ 126: .wavefront_size: 64 check:5'0 ~~~~~~~~~~~~~~~~~~~~~ 127: amdhsa.target: amdgcn-amd-amdhsa--gfx700 check:5'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 128: amdhsa.version: check:5'0 ~~~~~~~~~~~~~~~~ 129: - 1 check:5'0 ~~~~~ 130: - 1 check:5'0 ~~~~~ 131: ... check:5'0 ~~~~ 132: check:5'0 ~ 133: .end_amdgpu_metadata check:5'0 ~~~~~~~~~~~~~~~~~~~~~~ >>>>>> 10/10 Test #9: compile_frexp__gfx600 .............***Failed 0.33 sec CMake Error at /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/RunCompileTest.cmake:36 (message): Error in test output: /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/frexp.cl:8:16: error: GCN-DAG: expected string not found in input // GFX600-DAG: s_movk_i32 [[CLASS_MASK:s[0-9]+]], 0x1f8 ^ output.frexp.gfx600.s:7:16: note: scanning from here test_frexp_f32: ; @test_frexp_f32 ^ output.frexp.gfx600.s:17:2: note: possible intended match here v_mov_b32_e32 v3, 0x1f8 ^ /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/frexp.cl:34:16: error: CHECK-DAG: expected string not found in input // GFX600-DAG: s_movk_i32 [[CLASS_MASK:s[0-9]+]], 0x1f8 ^ output.frexp.gfx600.s:98:16: note: scanning from here test_frexp_f64: ; @test_frexp_f64 ^ output.frexp.gfx600.s:108:2: note: possible intended match here v_mov_b32_e32 v7, 0x1f8 ^ Input file: output.frexp.gfx600.s Check file: /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-5.7.0/test/compile/frexp.cl -dump-input=help explains the following input dump. 
Input was: <<<<<< 1: .text 2: .amdgcn_target "amdgcn-amd-amdhsa--gfx600" 3: .protected test_frexp_f32 ; -- Begin function test_frexp_f32 4: .globl test_frexp_f32 5: .p2align 8 6: .type test_frexp_f32,@function 7: test_frexp_f32: ; @test_frexp_f32 label:7'0 ^~~~~~~~~~~~~~~ label:7'1 ^~~~~~~~~~~~~~~ dag:8'0 X~~~~~~~~~~~~~~~~~~ error: no match found 8: ; %bb.0: dag:8'0 ~~~~~~~~~ 9: s_load_dwordx2 s[0:1], s[4:5], 0x4 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 10: s_mov_b32 s3, 0x100f000 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 11: s_mov_b32 s2, 0 dag:8'0 ~~~~~~~~~~~~~~~~~ 12: v_lshlrev_b32_e32 v0, 2, v0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 13: v_mov_b32_e32 v1, 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~ 14: s_waitcnt lgkmcnt(0) dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~ 15: buffer_load_dword v2, v[0:1], s[0:3], 0 addr64 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 16: s_load_dwordx4 s[4:7], s[4:5], 0x0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 17: v_mov_b32_e32 v3, 0x1f8 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ dag:8'1 ? 
possible intended match 18: s_waitcnt lgkmcnt(0) dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~ 19: s_mov_b64 s[0:1], s[4:5] dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 20: s_waitcnt vmcnt(0) dag:8'0 ~~~~~~~~~~~~~~~~~~~~ 21: v_frexp_mant_f32_e32 v5, v2 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 22: v_cmp_class_f32_e32 vcc, v2, v3 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 23: v_frexp_exp_i32_f32_e32 v4, v2 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 24: v_cndmask_b32_e32 v2, v2, v5, vcc dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 25: v_cndmask_b32_e32 v3, 0, v4, vcc dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 26: buffer_store_dword v2, v[0:1], s[0:3], 0 addr64 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 27: s_mov_b64 s[0:1], s[6:7] dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 28: buffer_store_dword v3, v[0:1], s[0:3], 0 addr64 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 29: s_endpgm dag:8'0 ~~~~~~~~~~ 30: .section .rodata,#alloc dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 31: .p2align 6, 0x0 dag:8'0 ~~~~~~~~~~~~~~~~~ 32: .amdhsa_kernel test_frexp_f32 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 33: .amdhsa_group_segment_fixed_size 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 34: .amdhsa_private_segment_fixed_size 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 35: .amdhsa_kernarg_size 24 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 36: .amdhsa_user_sgpr_count 6 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 37: .amdhsa_user_sgpr_private_segment_buffer 1 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 38: .amdhsa_user_sgpr_dispatch_ptr 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 39: .amdhsa_user_sgpr_queue_ptr 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 40: .amdhsa_user_sgpr_kernarg_segment_ptr 1 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 41: .amdhsa_user_sgpr_dispatch_id 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 42: .amdhsa_user_sgpr_flat_scratch_init 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 43: .amdhsa_user_sgpr_private_segment_size 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 44: 
.amdhsa_system_sgpr_private_segment_wavefront_offset 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 45: .amdhsa_system_sgpr_workgroup_id_x 1 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 46: .amdhsa_system_sgpr_workgroup_id_y 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 47: .amdhsa_system_sgpr_workgroup_id_z 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 48: .amdhsa_system_sgpr_workgroup_info 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 49: .amdhsa_system_vgpr_workitem_id 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 50: .amdhsa_next_free_vgpr 6 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 51: .amdhsa_next_free_sgpr 8 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 52: .amdhsa_float_round_mode_32 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 53: .amdhsa_float_round_mode_16_64 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 54: .amdhsa_float_denorm_mode_32 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 55: .amdhsa_float_denorm_mode_16_64 3 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56: .amdhsa_dx10_clamp 1 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~ 57: .amdhsa_ieee_mode 1 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~ 58: .amdhsa_exception_fp_ieee_invalid_op 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 59: .amdhsa_exception_fp_denorm_src 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 60: .amdhsa_exception_fp_ieee_div_zero 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 61: .amdhsa_exception_fp_ieee_overflow 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 62: .amdhsa_exception_fp_ieee_underflow 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 63: .amdhsa_exception_fp_ieee_inexact 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 64: .amdhsa_exception_int_div_zero 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 65: .end_amdhsa_kernel dag:8'0 ~~~~~~~~~~~~~~~~~~~~ 66: .text dag:8'0 ~~~~~~~ 67: .Lfunc_end0: dag:8'0 ~~~~~~~~~~~~~ 68: .size test_frexp_f32, .Lfunc_end0-test_frexp_f32 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69: ; -- End function dag:8'0 ~~~~~~~~~~~~~~~~~~~ 70: 
.section .AMDGPU.csdata dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 71: ; Kernel info: dag:8'0 ~~~~~~~~~~~~~~~ 72: ; codeLenInByte = 104 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~ 73: ; NumSgprs: 10 dag:8'0 ~~~~~~~~~~~~~~~ 74: ; NumVgprs: 6 dag:8'0 ~~~~~~~~~~~~~~ 75: ; ScratchSize: 0 dag:8'0 ~~~~~~~~~~~~~~~~~ 76: ; MemoryBound: 0 dag:8'0 ~~~~~~~~~~~~~~~~~ 77: ; FloatMode: 192 dag:8'0 ~~~~~~~~~~~~~~~~~ 78: ; IeeeMode: 1 dag:8'0 ~~~~~~~~~~~~~~ 79: ; LDSByteSize: 0 bytes/workgroup (compile time only) dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 80: ; SGPRBlocks: 1 dag:8'0 ~~~~~~~~~~~~~~~~ 81: ; VGPRBlocks: 1 dag:8'0 ~~~~~~~~~~~~~~~~ 82: ; NumSGPRsForWavesPerEU: 10 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 83: ; NumVGPRsForWavesPerEU: 6 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 84: ; Occupancy: 10 dag:8'0 ~~~~~~~~~~~~~~~~ 85: ; WaveLimiterHint : 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~ 86: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 87: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 88: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 89: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 90: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 91: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 92: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 93: .text dag:8'0 ~~~~~~~ 94: .protected test_frexp_f64 ; -- Begin function test_frexp_f64 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 95: .globl test_frexp_f64 dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~ 96: .p2align 8 dag:8'0 ~~~~~~~~~~~~ 97: .type test_frexp_f64,@function dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 98: test_frexp_f64: ; @test_frexp_f64 label:32 ^~~~~~~~~~~~~~~ dag:8'0 ~~~~~~~~~~~~~~~ dag:34'0 X~~~~~~~~~~~~~~~~~~ error: no match found 99: ; %bb.0: dag:34'0 ~~~~~~~~~ 100: s_load_dwordx2 s[0:1], s[4:5], 0x4 dag:34'0 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 101: s_mov_b32 s3, 0x100f000 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 102: s_mov_b32 s2, 0 dag:34'0 ~~~~~~~~~~~~~~~~~ 103: v_lshlrev_b32_e32 v1, 3, v0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 104: v_mov_b32_e32 v2, 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~ 105: s_waitcnt lgkmcnt(0) dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 106: buffer_load_dwordx2 v[3:4], v[1:2], s[0:3], 0 addr64 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 107: s_load_dwordx4 s[4:7], s[4:5], 0x0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 108: v_mov_b32_e32 v7, 0x1f8 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ dag:34'1 ? possible intended match 109: s_waitcnt lgkmcnt(0) dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 110: s_mov_b64 s[0:1], s[4:5] dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 111: s_waitcnt vmcnt(0) dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 112: v_frexp_mant_f64_e32 v[5:6], v[3:4] dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 113: v_cmp_class_f64_e32 vcc, v[3:4], v7 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 114: v_frexp_exp_i32_f64_e32 v8, v[3:4] dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 115: v_cndmask_b32_e32 v4, v4, v6, vcc dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 116: v_cndmask_b32_e32 v3, v3, v5, vcc dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 117: v_cndmask_b32_e32 v7, 0, v8, vcc dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 118: buffer_store_dwordx2 v[3:4], v[1:2], s[0:3], 0 addr64 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 119: s_mov_b64 s[0:1], s[6:7] dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 120: v_lshlrev_b32_e32 v1, 2, v0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 121: buffer_store_dword v7, v[1:2], s[0:3], 0 addr64 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 122: s_endpgm dag:34'0 ~~~~~~~~~~ 123: .section .rodata,#alloc dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 124: .p2align 6, 0x0 dag:34'0 ~~~~~~~~~~~~~~~~~ 125: .amdhsa_kernel test_frexp_f64 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 126: .amdhsa_group_segment_fixed_size 0 dag:34'0 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 127: .amdhsa_private_segment_fixed_size 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 128: .amdhsa_kernarg_size 24 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 129: .amdhsa_user_sgpr_count 6 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 130: .amdhsa_user_sgpr_private_segment_buffer 1 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 131: .amdhsa_user_sgpr_dispatch_ptr 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 132: .amdhsa_user_sgpr_queue_ptr 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 133: .amdhsa_user_sgpr_kernarg_segment_ptr 1 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 134: .amdhsa_user_sgpr_dispatch_id 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 135: .amdhsa_user_sgpr_flat_scratch_init 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 136: .amdhsa_user_sgpr_private_segment_size 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 137: .amdhsa_system_sgpr_private_segment_wavefront_offset 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 138: .amdhsa_system_sgpr_workgroup_id_x 1 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 139: .amdhsa_system_sgpr_workgroup_id_y 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 140: .amdhsa_system_sgpr_workgroup_id_z 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 141: .amdhsa_system_sgpr_workgroup_info 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 142: .amdhsa_system_vgpr_workitem_id 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 143: .amdhsa_next_free_vgpr 9 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 144: .amdhsa_next_free_sgpr 8 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 145: .amdhsa_float_round_mode_32 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 146: .amdhsa_float_round_mode_16_64 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 147: .amdhsa_float_denorm_mode_32 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 148: .amdhsa_float_denorm_mode_16_64 3 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 149: .amdhsa_dx10_clamp 1 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 150: .amdhsa_ieee_mode 1 dag:34'0 
~~~~~~~~~~~~~~~~~~~~~ 151: .amdhsa_exception_fp_ieee_invalid_op 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 152: .amdhsa_exception_fp_denorm_src 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 153: .amdhsa_exception_fp_ieee_div_zero 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 154: .amdhsa_exception_fp_ieee_overflow 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 155: .amdhsa_exception_fp_ieee_underflow 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 156: .amdhsa_exception_fp_ieee_inexact 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 157: .amdhsa_exception_int_div_zero 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 158: .end_amdhsa_kernel dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 159: .text dag:34'0 ~~~~~~~ 160: .Lfunc_end1: dag:34'0 ~~~~~~~~~~~~~ 161: .size test_frexp_f64, .Lfunc_end1-test_frexp_f64 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 162: ; -- End function dag:34'0 ~~~~~~~~~~~~~~~~~~~ 163: .section .AMDGPU.csdata dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~ 164: ; Kernel info: dag:34'0 ~~~~~~~~~~~~~~~ 165: ; codeLenInByte = 112 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 166: ; NumSgprs: 10 dag:34'0 ~~~~~~~~~~~~~~~ 167: ; NumVgprs: 9 dag:34'0 ~~~~~~~~~~~~~~ 168: ; ScratchSize: 0 dag:34'0 ~~~~~~~~~~~~~~~~~ 169: ; MemoryBound: 0 dag:34'0 ~~~~~~~~~~~~~~~~~ 170: ; FloatMode: 192 dag:34'0 ~~~~~~~~~~~~~~~~~ 171: ; IeeeMode: 1 dag:34'0 ~~~~~~~~~~~~~~ 172: ; LDSByteSize: 0 bytes/workgroup (compile time only) dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 173: ; SGPRBlocks: 1 dag:34'0 ~~~~~~~~~~~~~~~~ 174: ; VGPRBlocks: 2 dag:34'0 ~~~~~~~~~~~~~~~~ 175: ; NumSGPRsForWavesPerEU: 10 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 176: ; NumVGPRsForWavesPerEU: 9 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 177: ; Occupancy: 10 dag:34'0 ~~~~~~~~~~~~~~~~ 178: ; WaveLimiterHint : 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 179: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 180: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6 dag:34'0 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 181: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 182: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 183: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 184: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 185: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 186: .ident "clang version 17.0.2 (G2V 17.0.2-2.fc35)" dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 187: .section ".note.GNU-stack" dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 188: .addrsig dag:34'0 ~~~~~~~~~~ 189: .amdgpu_metadata dag:34'0 ~~~~~~~~~~~~~~~~~~ 190: --- dag:34'0 ~~~~ 191: amdhsa.kernels: dag:34'0 ~~~~~~~~~~~~~~~~ 192: - .args: dag:34'0 ~~~~~~~~~~ 193: - .address_space: global dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 194: .is_restrict: true dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 195: .offset: 0 dag:34'0 ~~~~~~~~~~~~ 196: .size: 8 dag:34'0 ~~~~~~~~~~ 197: .type_name: 'float*' dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 198: .value_kind: global_buffer dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 199: - .address_space: global dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 200: .is_restrict: true dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 201: .offset: 8 dag:34'0 ~~~~~~~~~~~~ 202: .size: 8 dag:34'0 ~~~~~~~~~~ 203: .type_name: 'int*' dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 204: .value_kind: global_buffer dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 205: - .access: read_only dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 206: .address_space: global dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~ 207: .is_restrict: true dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 208: .offset: 16 dag:34'0 ~~~~~~~~~~~~~ 209: .size: 8 dag:34'0 ~~~~~~~~~~ 210: .type_name: 'float*' dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 211: .value_kind: global_buffer dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 212: .group_segment_fixed_size: 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 213: .kernarg_segment_align: 8 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 214: 
.kernarg_segment_size: 24 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 215: .language: OpenCL C dag:34'0 ~~~~~~~~~~~~~~~~~~~~~ 216: .language_version: dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 217: - 2 dag:34'0 ~~~~~ 218: - 0 dag:34'0 ~~~~~ 219: .max_flat_workgroup_size: 256 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 220: .name: test_frexp_f32 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~ 221: .private_segment_fixed_size: 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 222: .sgpr_count: 10 dag:34'0 ~~~~~~~~~~~~~~~~~ 223: .sgpr_spill_count: 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 224: .symbol: test_frexp_f32.kd dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 225: .vgpr_count: 6 dag:34'0 ~~~~~~~~~~~~~~~~ 226: .vgpr_spill_count: 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 227: .wavefront_size: 64 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~ 228: - .args: dag:34'0 ~~~~~~~~~~ 229: - .address_space: global dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 230: .is_restrict: true dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 231: .offset: 0 dag:34'0 ~~~~~~~~~~~~ 232: .size: 8 dag:34'0 ~~~~~~~~~~ 233: .type_name: 'double*' dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~ 234: .value_kind: global_buffer dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 235: - .address_space: global dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 236: .is_restrict: true dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 237: .offset: 8 dag:34'0 ~~~~~~~~~~~~ 238: .size: 8 dag:34'0 ~~~~~~~~~~ 239: .type_name: 'int*' dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 240: .value_kind: global_buffer dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 241: - .access: read_only dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~ 242: .address_space: global dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~ 243: .is_restrict: true dag:34'0 ~~~~~~~~~~~~~~~~~~~~ 244: .offset: 16 dag:34'0 ~~~~~~~~~~~~~ 245: .size: 8 dag:34'0 ~~~~~~~~~~ 246: .type_name: 'double*' dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~ 247: .value_kind: global_buffer dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 248: .group_segment_fixed_size: 0 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 249: .kernarg_segment_align: 8 dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 250: .kernarg_segment_size: 24 
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
251: .language: OpenCL C
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
252: .language_version:
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
253: - 2
dag:34'0 ~~~~~
254: - 0
dag:34'0 ~~~~~
255: .max_flat_workgroup_size: 256
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
256: .name: test_frexp_f64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~
257: .private_segment_fixed_size: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
258: .sgpr_count: 10
dag:34'0 ~~~~~~~~~~~~~~~~~
259: .sgpr_spill_count: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
260: .symbol: test_frexp_f64.kd
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
261: .vgpr_count: 9
dag:34'0 ~~~~~~~~~~~~~~~~
262: .vgpr_spill_count: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
263: .wavefront_size: 64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
264: amdhsa.target: amdgcn-amd-amdhsa--gfx600
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
265: amdhsa.version:
dag:34'0 ~~~~~~~~~~~~~~~~
266: - 1
dag:34'0 ~~~~~
267: - 1
dag:34'0 ~~~~~
268: ...
dag:34'0 ~~~~
269:
dag:34'0 ~
270: .end_amdgpu_metadata
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

70% tests passed, 3 tests failed out of 10

Total Test time (real) = 0.35 sec

The following tests FAILED:
  4 - compile_atan2__gfx700 (Failed)
  5 - compile_atan2pi__gfx700 (Failed)
  9 - compile_frexp__gfx600 (Failed)
Errors while running CTest
```
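
For triage, the failing test names can be pulled out of a CTest summary like the one at the end of the log above. The following is only a minimal sketch: the function name `failed_tests` is hypothetical, and the regular expression assumes the `N - name (Failed)` layout seen in this report rather than any documented CTest output format.

```python
import re

# Matches summary entries such as "4 - compile_atan2__gfx700 (Failed)".
# Assumption: the layout follows the CTest summary quoted in this report.
_FAILED_ENTRY = re.compile(r"(\d+) - (\S+) \(Failed\)")


def failed_tests(ctest_output: str) -> list[tuple[int, str]]:
    """Return (test number, test name) pairs listed after the FAILED marker."""
    marker = "The following tests FAILED:"
    _, found, summary = ctest_output.partition(marker)
    if not found:
        # No failure summary present, e.g. when every test passed.
        return []
    return [(int(num), name) for num, name in _FAILED_ENTRY.findall(summary)]


# Summary text copied from the log in this report.
summary = """70% tests passed, 3 tests failed out of 10
The following tests FAILED:
  4 - compile_atan2__gfx700 (Failed)
  5 - compile_atan2pi__gfx700 (Failed)
  9 - compile_frexp__gfx600 (Failed)
Errors while running CTest"""

for number, name in failed_tests(summary):
    print(number, name)
```

Each extracted name can then be re-run in isolation, for example with `ctest -R <name> --output-on-failure`.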
kloczek commented 1 year ago

BTW: why are the .bc files installed in the $(prefix)/amdgcn/bitcode/ directory? Are you sure that this is the correct location? Shouldn't it be somewhere under $(libdir)? 🤔

kloczek commented 10 months ago

Just tested 5.7.1 and the test suite still fails in the same 3 units. Any update about the install path of the .bc files? 🤔

kzhuravl commented 10 months ago

.bc files are installed into clang resource dir (/lib/clang//lib/amdgcn/bitcode) in 6.0.

I'd suggest checking device-libs tests in 6.0, when it becomes available publicly.

kloczek commented 10 months ago

Just checked 6.0.0 and all .bc files are still installed in the same path. Test suite fails in the same 3 units as well.

arsenm commented 8 months ago

> Just tested 5.7.1 and the test suite still fails in the same 3 units. Any update about the install path of the .bc files? 🤔

All tests should now pass as of https://github.com/ROCm/llvm-project/commit/794ebeffcafbf6f4d86cb1bfd7a5a0d1d30f1fc7

arsenm commented 8 months ago

> BTW: why are the .bc files installed in the $(prefix)/amdgcn/bitcode/ directory? Are you sure that this is the correct location? Shouldn't it be somewhere under $(libdir)? 🤔

This has been an endless source of debate; they should be moving to the clang resource directory

arsenm commented 8 months ago

Closing since the tests should now be passing, and this repository is closed and should be archived. The new location is under ROCm/llvm-project