Closed pvelesko closed 1 year ago
With both backends?
OpenCL CPU
The following tests FAILED:
810 - abort (Failed)
859 - hip_sycl_interop (Subprocess aborted)
860 - hip_sycl_interop_no_buffers (Subprocess aborted)
What is the special difference in this system? Some old LLVM version or similar?
I tried to reproduce and saw different failures for the LZ backend.
The following tests FAILED:
603 - Unit_hipFreeDoubleArray (Failed)
779 - TestIndirectMappedHostAlloc (Failed)
These failed at runtime; at least there were no compile failures. But maybe I built slightly differently from you?
I used:
- git clone git@github.com:CHIP-SPV/chip-spv.git, commit f30cff151a6256cbfa00c0a2a5fd2170b7181cad
- git clone git@github.com:CHIP-SPV/SPIRV-LLVM-Translator.git -b chipspv-llvm-16-patches , commit 0d66986d102f782f5d91c22a1f8e50e0bf2cfbe8
- git clone git@github.com:CHIP-SPV/llvm-project.git -b chipspv-llvm-16-patches , commit 6161a08fd1688e57b3bd1c09f50b664673e9e50a
along with gcc 11.2.0 and intel_compute_runtime/release/agama-devel-627, if it matters.
Can you also try with the CL backend?
I tried with HIP_BE=opencl make check, but it looks like it's hanging for me at Start 634: Unit_hipStreamAddCallback_ParamTst_Positive. I'm not sure how to enable a timeout for the unit tests so we can get it to finish.
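One way to keep a hanging test from blocking the whole run is to launch each test command with a hard timeout. The sketch below is only an illustration: `run_with_timeout` is a hypothetical helper, not part of chipStar or scripts/check.py, and the sleeping child process stands in for a hung unit test. If the check target goes through CTest, its `--timeout <seconds>` option may be the simpler route.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome as 'passed', 'failed', or 'timeout'.

    Hypothetical helper for illustration; not part of the project.
    """
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left hanging.
        return "timeout"
    return "passed" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in for a hung test: a child process that sleeps far past the limit.
    hung_test = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hung_test, timeout_s=1))
```

A real run would replace the stand-in command with the test binary or the ctest invocation for a single test.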
I've tested on Australis, with concurrency 8 (scripts/check.py $PWD dgpu level0 8 2).
With LZ backend:
The following tests FAILED:
528 - Unit_hipMemset3DAsync_ConcurrencyMthread (Timeout)
560 - Unit_hipMemset2DAsync_MultiThread (Timeout)
564 - Unit_hipMalloc_LoopRegressionAllocFreeCycles (Child aborted)
565 - Unit_hipMalloc_AllocateAndPoolBuffers (Child aborted)
646 - Unit_hipFreeDoubleArray (Failed)
750 - Unit_hipTextureFetch_vector (Failed)
751 - Unit_hipTextureObj2D_Check (Failed)
753 - Unit_hipCreateTextureObject_tex1DfetchVerification (Failed)
754 - Unit_hipTextureObj1DCheckModes (Failed)
755 - Unit_hipTextureObj2DCheckModes (Failed)
832 - TestIndirectMappedHostAlloc (Failed)
869 - hip_async_binomial (Failed)
When run separately (not in parallel):
- Unit_hipMemset3DAsync_ConcurrencyMthread, Unit_hipMalloc_LoopRegressionAllocFreeCycles and Unit_hipMalloc_AllocateAndPoolBuffers pass
- Unit_hipFreeDoubleArray fails with:
997: Expected Error: hipErrorUnknown
997: Expected Code: 709
997: Actual Error: hipErrorInvalidValue
- TestIndirectMappedHostAlloc fails with *OutH = 0
- hip_async_binomial fails with:
1329: No. Output Output(hex) Refoutput Refoutput(hex)
1329: [1] 663.692810 0x4425ec57 0.920039 0x3f6b87b0,
1329: [2] inf 0x7f800000 5.398513 0x40acc09e,
1329: [3] inf 0x7f800000 5.701616 0x40b673a3,
1329: [4] inf 0x7f800000 8.260965 0x41042cea,
1329: [5] inf 0x7f800000 0.291796 0x3e956646,
1329: [6] inf 0x7f800000 0.649763 0x3f2656d6,
1329: [7] inf 0x7f800000 5.124279 0x40a3fa18,
1329: [8] inf 0x7f800000 0.456920 0x3ee9f176,
1329: [9] inf 0x7f800000 2.129339 0x40084719,
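The hex columns in the hip_async_binomial output are the raw IEEE 754 single-precision bit patterns of the printed values, so they can be decoded independently to confirm that the device really produced infinities. A minimal sketch follows; `bits_to_float32` is a hypothetical helper name, and the bit patterns are copied from the log above.

```python
import math
import struct


def bits_to_float32(bits):
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Bit patterns taken from rows [1] and [2] of the log.
samples = [
    ("row 1 output", 0x4425EC57),
    ("row 1 reference", 0x3F6B87B0),
    ("row 2 output", 0x7F800000),
    ("row 2 reference", 0x40ACC09E),
]

for label, bits in samples:
    value = bits_to_float32(bits)
    kind = "inf" if math.isinf(value) else "finite"
    print(f"{label}: 0x{bits:08x} -> {value:.6f} ({kind})")
```

0x7f800000 has an all-ones exponent and a zero mantissa, which is the encoding of positive infinity, so every row from [2] onward overflowed rather than being merely misprinted.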
Level Zero runtime is known to fail when unit tests are run in parallel. For this reason we always run unit tests sequentially.
The compiler crash backtrace has the texture pass in it. Could this also be fixed by https://github.com/CHIP-SPV/chipStar/pull/531?
@pjaaskel @linehill Still unable to compile textures on Sunspot.
[ 98%] Building CXX object catch/catch_tests/unit/texture/CMakeFiles/hipTextureObj1DCheckModes.dir/hipTextureObj1DCheckModes.cc.o
opt: /home/pvelesko/CHIP-SPV/main/llvm_passes/HipTextureLowering.cpp:459: bool (anonymous namespace)::lowerTextureFunctions(Module &): Assertion `false && "Unsupported texture function use."' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/pvelesko/install/llvm/16.0/bin/opt /tmp/hipTextureObj1DCheckModes-generic-link-387417.bc -load-pass-plugin /home/pvelesko/CHIP-SPV/main/build_llvm16/lib/libLLVMHipSpvPasses.so -passes=hip-post-link-passes -o /tmp/hipTextureObj1DCheckModes-generic-lower-f168a5.bc
#0 0x0000000001d803b1 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/pvelesko/install/llvm/16.0/bin/opt+0x1d803b1)
#1 0x0000000001d7de44 SignalHandler(int) Signals.cpp:0:0
#2 0x00007f8e45c658c0 __restore_rt (/lib64/libpthread.so.0+0x168c0)
#3 0x00007f8e44c20c6b raise (/lib64/libc.so.6+0x4ac6b)
#4 0x00007f8e44c22305 abort (/lib64/libc.so.6+0x4c305)
#5 0x00007f8e44c18c6a __assert_fail_base (/lib64/libc.so.6+0x42c6a)
#6 0x00007f8e44c18cf2 (/lib64/libc.so.6+0x42cf2)
#7 0x00007f8e44b5d0bd (anonymous namespace)::lowerTextureFunctions(llvm::Module&) /home/pvelesko/CHIP-SPV/main/llvm_passes/HipTextureLowering.cpp:457:7
#8 0x00007f8e44b5cecc HipTextureLoweringPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/pvelesko/CHIP-SPV/main/llvm_passes/HipTextureLowering.cpp:492:10
#9 0x00007f8e44b22f64 llvm::detail::PassModel<llvm::Module, HipTextureLoweringPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/pvelesko/install/llvm/16.0/include/llvm/IR/PassManagerInternal.h:89:17
#10 0x00000000015ac346 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/pvelesko/install/llvm/16.0/bin/opt+0x15ac346)
#11 0x0000000000790cb7 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool) (/home/pvelesko/install/llvm/16.0/bin/opt+0x790cb7)
#12 0x00000000006f00e6 main (/home/pvelesko/install/llvm/16.0/bin/opt+0x6f00e6)
#13 0x00007f8e44c0b24d __libc_start_main (/lib64/libc.so.6+0x3524d)
#14 0x000000000078414a _start /home/abuild/rpmbuild/BUILD/glibc-2.31/csu/../sysdeps/x86_64/start.S:122:0
clang-16: error: unable to execute command: Aborted
clang-16: error: hipspv-link command failed due to signal (use -v to see invocation)
clang version 16.0.6 (https://github.com/llvm/llvm-project.git 7cbf1a2591520c2491aa35339f227775f4d3adf6)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/pvelesko/install/llvm/16.0/bin
clang-16: note: diagnostic msg: Error generating preprocessed source(s).
Failing tests with #535
LLVM-15 Debug Runtime 551
The following tests FAILED:
779 - Unit_hipTextureFetch_vector (Failed)
780 - Unit_hipTextureObj2D_Check (Failed)
782 - Unit_hipTexObjPitch_texture2D - float (Failed)
783 - Unit_hipTexObjPitch_texture2D - int (Failed)
784 - Unit_hipTexObjPitch_texture2D - unsigned char (Failed)
785 - Unit_hipTexObjPitch_texture2D - int16_t (Failed)
786 - Unit_hipTexObjPitch_texture2D - char (Failed)
787 - Unit_hipTexObjPitch_texture2D - unsigned int (Failed)
788 - Unit_hipCreateTextureObject_tex1DfetchVerification (Failed)
789 - Unit_tex1Dfetch_CheckModes (Failed)
790 - Unit_hipTextureObj1DCheckModes (Failed)
791 - Unit_hipTextureObj2DCheckModes (Failed)
811 - Unit_hipMultiThreadStreams2 (Subprocess aborted)
912 - hip_async_binomial (Failed)
Things to note:
811/959 Test #811: Unit_hipMultiThreadStreams2 ...............................................Subprocess aborted***Exception: 0.39 sec
CHIP error [TID 151308] [1689223435.489258105] : hipErrorOutOfMemory (ZE_RESULT_ERROR_OUT_OF_HOST_MEMORY ) in /home/pvelesko/CHIP-SPV/CmdList/src/backend/Level0/CHIPBackendLevel0.cc:1762:allocateImpl
CHIP error [TID 151308] [1689223435.489470021] : Caught Error: hipErrorOutOfMemory
Filters: Unit_hipMultiThreadStreams2
error: 'hipErrorOutOfMemory'(2) from hipHostMalloc((void**)&Ehh, size, hipHostMallocDefault) at /home/pvelesko/CHIP-SPV/CmdList/HIP/tests/catch/unit/multiThread/hipMultiThreadStreams2.cc:94
Textures fail because they are not supported and we don't check for that.
Only 2 issues remain:
811 - Unit_hipMultiThreadStreams2 (Subprocess aborted)
912 - hip_async_binomial (Failed)
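The reduction from the full failure list to these two tests can be reproduced by parsing the ctest summary and dropping the texture tests. This is a sketch under assumptions: `parse_failed` and `non_texture` are hypothetical helpers, and matching on the substring "tex" is a heuristic that happens to fit the test names in this thread, not a rule the project defines.

```python
import re

# Matches ctest summary lines such as "811 - Unit_hipMultiThreadStreams2 (Subprocess aborted)".
FAILED_LINE = re.compile(r"^\s*(\d+) - (.+?) \(([^)]*)\)\s*$")

SUMMARY = """\
The following tests FAILED:
779 - Unit_hipTextureFetch_vector (Failed)
780 - Unit_hipTextureObj2D_Check (Failed)
782 - Unit_hipTexObjPitch_texture2D - float (Failed)
788 - Unit_hipCreateTextureObject_tex1DfetchVerification (Failed)
789 - Unit_tex1Dfetch_CheckModes (Failed)
790 - Unit_hipTextureObj1DCheckModes (Failed)
791 - Unit_hipTextureObj2DCheckModes (Failed)
811 - Unit_hipMultiThreadStreams2 (Subprocess aborted)
912 - hip_async_binomial (Failed)
"""


def parse_failed(summary):
    """Return (test number, test name, reason) for every failed-test line."""
    failures = []
    for line in summary.splitlines():
        match = FAILED_LINE.match(line)
        if match:
            failures.append((int(match.group(1)), match.group(2), match.group(3)))
    return failures


def non_texture(failures):
    """Drop tests whose name mentions textures (heuristic: contains 'tex')."""
    return [f for f in failures if "tex" not in f[1].lower()]


for number, name, reason in non_texture(parse_failed(SUMMARY)):
    print(f"{number} - {name} ({reason})")
```

The same parser can be pointed at two summaries (for example one per backend) and the resulting name sets compared to see which failures are specific to one configuration.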
Closed since Sunspot no longer maintains drivers.
Level Zero:
OpenCL:
The following tests crash the compiler and fail to build:
Also, while building on Sunspot I see this:
Furthermore, building sometimes fails. In that case, I just re-run it a couple of times and it passes.
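The re-run workaround could be scripted as a bounded retry loop. The sketch below is illustrative only: `run_with_retries` is a hypothetical helper, the thread does not say which build command was re-run, and the `runner` parameter exists so the loop can be exercised without a real build.

```python
import subprocess


def run_with_retries(cmd, attempts=3, runner=subprocess.run):
    """Run cmd until it exits with status 0.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed. Hypothetical helper for illustration.
    """
    for attempt in range(1, attempts + 1):
        if runner(cmd).returncode == 0:
            return attempt
    return None
```

A call such as `run_with_retries(["make"], attempts=3)` would stop at the first successful build. Retrying only masks the intermittent failure, so logging which attempt succeeded helps keep the underlying problem visible.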