Open etiotto opened 1 year ago
Some of these benchmarks (below) may be updated with minor formatting and optimization.
https://github.com/zjin-lcf/HeCBench/tree/master/backprop-sycl https://github.com/zjin-lcf/HeCBench/tree/master/gaussian-sycl https://github.com/zjin-lcf/HeCBench/tree/master/particlefilter-sycl https://github.com/zjin-lcf/HeCBench/tree/master/streamcluster-sycl https://github.com/zjin-lcf/HeCBench/tree/master/lud-sycl
Reduced test case for the problem illustrated in the description. Indexing array "A" works fine, indexing "shadow" is problematic. The difference is that "A" is an accessor for a global buffer while "shadow" is an accessor to local memory.
#include <iostream>
#include <sycl/sycl.hpp>
using namespace sycl;
#define BLOCK_SIZE 16
int test(std::array<float, BLOCK_SIZE> &A) {
auto q = queue{};
auto range = sycl::range<1>{BLOCK_SIZE};
auto bufA = buffer<float, 1>{A.data(), range};
q.submit([&](handler &cgh) {
accessor<float, 1, access::mode::read_write, access::target::local> shadow(
BLOCK_SIZE, cgh);
accessor<float, 1, access::mode::read_write, access::target::global_buffer>
A(bufA, cgh);
cgh.parallel_for<class diagonal>(range,
[=](id<1> id) { A[0] = shadow[0] = 0; });
});
}
PRs https://github.com/intel/llvm/pull/7663 and https://github.com/intel/llvm/pull/7684 resolve the problem illustrated at https://github.com/intel/llvm/issues/7641#issuecomment-1337894883
Next problem affecting streamcluster
:
######## Start streamcluster #########
rm -f streamcluster streamcluster.linkinfo result*
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir -std=c++17 -Wall -O3 streamcluster.cpp -o streamcluster -lm
cgeist: /nfs/site/home/etiotto/projects/intel-llvm/polygeist/tools/cgeist/Lib/ValueCategory.cc:187: void ValueCategory::store(mlir::OpBuilder &, mlir::Value) const: Assertion `toStore.getType() == val.getType().cast<MemRefType>().getElementType() && "expect same type"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /localdisk2/etiotto/intel-llvm/build/bin/cgeist -emit-llvm -w streamcluster.cpp -o /tmp/streamcluster-c49eb6.bc --args -cc1 -triple spir64-unknown-unknown-syclmlir -aux-triple x86_64-unknown-linux-gnu -fsycl-is-device -fdeclare-spirv-builtins -mllvm -sycl-opt -fenable-sycl-dae -fsycl-instrument-device-code -Wno-sycl-strict -fsycl-int-header=/tmp/streamcluster-header-e64163.h -fsycl-int-footer=/tmp/streamcluster-footer-1a9696.h -sycl-std=2020 -fsycl-unique-prefix=27d026f783b3cce6 -Wspir-compat -emit-llvm-bc -emit-llvm-uselists -disable-free -clear-ast-before-backend -main-file-name streamcluster.cpp -fsycl-use-main-file-name -full-main-file-name streamcluster.cpp -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -fno-verbose-asm -mconstructor-aliases -aux-target-cpu x86-64 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -resource-dir /localdisk2/etiotto/intel-llvm/build/lib/clang/16 -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include/sycl -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -Wall -w -std=c++17 -fdeprecated-macro -fdebug-compilation-dir=/nfs/site/home/etiotto/projects/Rodinia_SYCL/sycl/streamcluster -ferror-limit 19 -fgnuc-version=4.2.1 -no-opaque-pointers -fcxx-exceptions -fexceptions -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -D __GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/streamcluster-c49eb6.bc -x c++ streamcluster.cpp
1. <eof> parser at end of file
#0 0x00007f91d63fec53 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d4c53)
#1 0x00007f91d63fcec0 llvm::sys::RunSignalHandlers() (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d2ec0)
#2 0x00007f91d63ff13f SignalHandler(int) Signals.cpp:0:0
#3 0x00007f91ecfe0b20 __restore_rt sigaction.c:0:0
#4 0x00007f91d556d37f raise (/lib64/libc.so.6+0x3737f)
#5 0x00007f91d5557db5 abort (/lib64/libc.so.6+0x21db5)
#6 0x00007f91d5557c89 _nl_load_domain.cold.0 loadmsgcat.c:0:0
#7 0x00007f91d5565a76 .annobin___GI___assert_fail.end assert.c:0:0
#8 0x00000000004d4e9c ValueCategory::store(mlir::OpBuilder&, mlir::Value) const (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4d4e9c)
#9 0x00000000004a85fd MLIRScanner::VisitBinAssign(clang::BinaryOperator*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4a85fd)
#10 0x000000000043524c clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x43524c)
#11 0x00000000004bd9c9 MLIRScanner::VisitCompoundStmt(clang::CompoundStmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4bd9c9)
#12 0x0000000000435517 clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x435517)
#13 0x0000000000431547 MLIRScanner::init(mlir::FunctionOpInterface, FunctionToEmit const&) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x431547)
#14 0x000000000044aa55 MLIRASTConsumer::run() (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x44aa55)
#15 0x00007f91d016adb0 clang::ParseAST(clang::Sema&, bool, bool) (/localdisk2/etiotto/intel-llvm/build/bin/../lib/../lib/libclangParse.so.16git+0x38db0)
#16 0x00007f91dc3d8f61 clang::FrontendAction::Execute() (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libclangFrontend.so.16git+0x12bf61)
#17 0x0000000000455248 processInputFiles(llvm::cl::list<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, llvm::cl::parser<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&, llvm::cl::list<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, llvm::cl::parser<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>> const&, mlir::MLIRContext&, mlir::OwningOpRef<mlir::ModuleOp>&, llvm::DataLayout&, llvm::Triple&, char const*, bool) driver.cc:0:0
#18 0x00000000004503ce main (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4503ce)
#19 0x00007f91d5559493 __libc_start_main (/lib64/libc.so.6+0x23493)
#20 0x000000000042da8e _start (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x42da8e)
(Fixed)
Next problem affecting backprop
:
######## Start backprop #########
rm -f *.o *~ backprop *.linkinfo
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir -std=c++17 -Wall -O3 backprop_sycl.cpp -c
cgeist: /nfs/site/home/etiotto/projects/intel-llvm/polygeist/tools/cgeist/Lib/CGCall.cc:89: void castCallerArgs(func::FuncOp, llvm::SmallVectorImpl<Value> &, mlir::OpBuilder &): Assertion `CalleeArgType == Args[I].getType() && "Callsite argument mismatch"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /localdisk2/etiotto/intel-llvm/build/bin/cgeist -emit-llvm -w backprop_sycl.cpp -o /tmp/backprop_sycl-d104c1.bc --args -cc1 -triple spir64-unknown-unknown-syclmlir -aux-triple x86_64-unknown-linux-gnu -fsycl-is-device -fdeclare-spirv-builtins -mllvm -sycl-opt -fenable-sycl-dae -fsycl-instrument-device-code -Wno-sycl-strict -fsycl-int-header=/tmp/backprop_sycl-header-6a398c.h -fsycl-int-footer=/tmp/backprop_sycl-footer-aeaaf3.h -sycl-std=2020 -fsycl-unique-prefix=7b14c76bdff44000 -Wspir-compat -emit-llvm-bc -emit-llvm-uselists -disable-free -clear-ast-before-backend -main-file-name backprop_sycl.cpp -fsycl-use-main-file-name -full-main-file-name backprop_sycl.cpp -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -fno-verbose-asm -mconstructor-aliases -aux-target-cpu x86-64 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -resource-dir /localdisk2/etiotto/intel-llvm/build/lib/clang/16 -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include/sycl -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-redhat-
linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -Wall -w -std=c++17 -fdeprecated-macro -fdebug-compilation-dir=/nfs/site/home/etiotto/projects/Rodinia_SYCL/sycl/backprop -ferror-limit 19 -fgnuc-version=4.2.1 -no-opaque-pointers -fcxx-exceptions -fexceptions -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -D __GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/backprop_sycl-d104c1.bc -x c++ backprop_sycl.cpp
1. <eof> parser at end of file
#0 0x00007f4e83bbec53 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d4c53)
#1 0x00007f4e83bbcec0 llvm::sys::RunSignalHandlers() (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d2ec0)
#2 0x00007f4e83bbf13f SignalHandler(int) Signals.cpp:0:0
#3 0x00007f4e9a7a0b20 __restore_rt sigaction.c:0:0
#4 0x00007f4e82d2d37f raise (/lib64/libc.so.6+0x3737f)
#5 0x00007f4e82d17db5 abort (/lib64/libc.so.6+0x21db5)
#6 0x00007f4e82d17c89 _nl_load_domain.cold.0 loadmsgcat.c:0:0
#7 0x00007f4e82d25a76 .annobin___GI___assert_fail.end assert.c:0:0
#8 0x000000000047ce67 MLIRScanner::callHelper(mlir::func::FuncOp, clang::QualType, llvm::ArrayRef<std::pair<ValueCategory, clang::Expr*>>, clang::QualType, bool, clang::Expr*, clang::FunctionDecl const&) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x47ce67)
#9 0x0000000000483a6a MLIRScanner::VisitCallExpr(clang::CallExpr*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x483a6a)
#10 0x00000000004352c1 clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4352c1)
#11 0x0000000000439507 MLIRScanner::VisitVarDecl(clang::VarDecl*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x439507)
#12 0x00000000004bd68e MLIRScanner::VisitDeclStmt(clang::DeclStmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4bd68e)
#13 0x0000000000435531 clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x435531)
#14 0x00000000004bd9c9 MLIRScanner::VisitCompoundStmt(clang::CompoundStmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4bd9c9)
#15 0x0000000000435517 clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x435517)
#16 0x0000000000431547 MLIRScanner::init(mlir::FunctionOpInterface, FunctionToEmit const&) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x431547)
#17 0x000000000044aa55 MLIRASTConsumer::run() (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x44aa55)
Possible duplicate of https://github.com/intel/llvm/issues/7662
EDIT: fixed by https://github.com/intel/llvm/pull/7752
Next problem affecting particlefilter
:
######## Start particlefilter #########
rm -f *.o *~ SYCL_particlefilter_single
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir -std=c++17 -Wall -DBLOCK_SIZE=256 -O3 ex_particle_SYCL_single_seq.cpp -c
cgeist: /nfs/site/home/etiotto/projects/intel-llvm/polygeist/tools/cgeist/Lib/ValueCategory.cc:585: ValueCategory FPBinOp(mlir::OpBuilder &, mlir::Location, mlir::Value, mlir::Value) [OpTy = mlir::arith::AddFOp]: Assertion `LHS.getType() == RHS.getType() && "Cannot operate on values of different types"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /localdisk2/etiotto/intel-llvm/build/bin/cgeist -emit-llvm -w ex_particle_SYCL_single_seq.cpp -o /tmp/ex_particle_SYCL_single_seq-e9151e.bc --args -cc1 -triple spir64-unknown-unknown-syclmlir -aux-triple x86_64-unknown-linux-gnu -fsycl-is-device -fdeclare-spirv-builtins -mllvm -sycl-opt -fenable-sycl-dae -fsycl-instrument-device-code -Wno-sycl-strict -fsycl-int-header=/tmp/ex_particle_SYCL_single_seq-header-4a99c8.h -fsycl-int-footer=/tmp/ex_particle_SYCL_single_seq-footer-7c9355.h -sycl-std=2020 -fsycl-unique-prefix=78d00ad7068752e7 -Wspir-compat -emit-llvm-bc -emit-llvm-uselists -disable-free -clear-ast-before-backend -main-file-name ex_particle_SYCL_single_seq.cpp -fsycl-use-main-file-name -full-main-file-name ex_particle_SYCL_single_seq.cpp -mrelocation-model static -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -fno-verbose-asm -mconstructor-aliases -aux-target-cpu x86-64 -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -resource-dir /localdisk2/etiotto/intel-llvm/build/lib/clang/16 -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include/sycl -internal-isystem /iusers/etiotto/intel-llvm/build/bin/../include -D BLOCK_SIZE=256 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /localdisk2/etiotto/intel-llvm/build/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -Wall -w -std=c++17 -fdeprecated-macro -fdebug-compilation-dir=/nfs/site/home/etiotto/projects/Rodinia_SYCL/sycl/particlefilter -ferror-limit 19 -fgnuc-version=4.2.1 -no-opaque-pointers -fcxx-exceptions -fexceptions -fcolor-diagnostics -vectorize-loops -vectorize-slp -faddrsig -D __GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/ex_particle_SYCL_single_seq-e9151e.bc -x c++ ex_particle_SYCL_single_seq.cpp
1. <eof> parser at end of file
#0 0x00007fac3bec1c53 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d4c53)
#1 0x00007fac3bebfec0 llvm::sys::RunSignalHandlers() (/localdisk2/etiotto/intel-llvm/build/bin/../lib/libLLVMSupport.so.16git+0x1d2ec0)
#2 0x00007fac3bec213f SignalHandler(int) Signals.cpp:0:0
#3 0x00007fac52aa3b20 __restore_rt sigaction.c:0:0
#4 0x00007fac3b03037f raise (/lib64/libc.so.6+0x3737f)
#5 0x00007fac3b01adb5 abort (/lib64/libc.so.6+0x21db5)
#6 0x00007fac3b01ac89 _nl_load_domain.cold.0 loadmsgcat.c:0:0
#7 0x00007fac3b028a76 .annobin___GI___assert_fail.end assert.c:0:0
#8 0x00000000004dbc03 ValueCategory::FAdd(mlir::OpBuilder&, mlir::Location, mlir::Value) const (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4dbc03)
#9 0x00000000004a8bad MLIRScanner::EmitBinAdd(BinOpInfo const&) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4a8bad)
#10 0x00000000004abcab MLIRScanner::EmitCompoundAssignLValue(clang::CompoundAssignOperator*, ValueCategory (MLIRScanner::*)(BinOpInfo const&)) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4abcab)
#11 0x00000000004acfd4 MLIRScanner::VisitBinAddAssign(clang::BinaryOperator*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4acfd4)
#12 0x00000000004352a7 clang::StmtVisitorBase<std::add_pointer, MLIRScanner, ValueCategory>::Visit(clang::Stmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4352a7)
#13 0x00000000004bd9c9 MLIRScanner::VisitCompoundStmt(clang::CompoundStmt*) (/localdisk2/etiotto/intel-llvm/build/bin/cgeist+0x4bd9c9)
In order to reproduce clone https://github.com/etiotto/Rodinia_SYCL/tree/sycl_mlir ( branch sycl_mlir). Edit the script sycl/run_all.sh
to replace the clang compiler you want to use, then use it to run the bmks.
Using the latest SYCL-MLIR compiler backprop
now compiles and run:
./run_all.sh ✔ 15s etiotto@hds-clx-7
######## Start backprop #########
rm -f *.o *~ backprop *.linkinfo
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir -std=c++17 -Wall -O3 backprop_sycl.cpp -c
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir -std=c++17 -Wall -O3 backprop.c -c
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir -std=c++17 -Wall -O3 imagenet.c -c
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir -std=c++17 -Wall -O3 facetrain.c -c
/iusers/etiotto/intel-llvm/build/bin/clang++ -w -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir -std=c++17 -Wall -O3 backprop_sycl.o backprop.o imagenet.o facetrain.o -o backprop -lm
./backprop 65536
Random number generator seed: 7
Input layer size : 65536
Starting training kernel
Performing GPU computation
Device offloading time = 0.638426(s)
...
I will start to look at the streamcluster
issue reported here: https://github.com/intel/llvm/issues/7641#issuecomment-1341172207.
I'm looking at the other bmk that fails with an assert (particlefilter). Reduced test case:
#include <CL/sycl.hpp>
using namespace cl::sycl;
constexpr access::mode sycl_read_write = access::mode::read_write;
void likelyhood(int Nparticles) {
cpu_selector dev_sel;
queue q(dev_sel);
const property_list props = property::buffer::use_host_ptr();
float * arrayX = (float *) calloc(Nparticles, sizeof (float));
buffer<float, 1>arrayX_GPU(arrayX, Nparticles, props);
/****************** L I K E L I H O O D ************************************/
q.submit([&](handler& cgh) {
auto arrayX_acc = arrayX_GPU.get_access<sycl_read_write>(cgh);
cgh.parallel_for<class likelihood>(
// nd_range<1>(range<1>(global_work_size), range<1>(local_work_size)), [=] (nd_item<1> item) {
nd_range<1>(range<1>(10), range<1>(20)), [=] (nd_item<1> item) {
int i = item.get_global_linear_id();
#ifdef BAD
arrayX_acc[i] += 1.0;
#else
arrayX_acc[i] = arrayX_acc[i] + 1.0;
#endif
});
});
}
Noting that if arrayX_acc[i] += 1.0;
is changed to arrayX_acc[i] = arrayX_acc[i] + 1.0;
the assertion disappears.
I will take on lud
I'm looking at the other bmk that fails with an assert (particlefilter). Reduced test case:
#include <CL/sycl.hpp> using namespace cl::sycl; constexpr access::mode sycl_read_write = access::mode::read_write; void likelyhood(int Nparticles) { cpu_selector dev_sel; queue q(dev_sel); const property_list props = property::buffer::use_host_ptr(); float * arrayX = (float *) calloc(Nparticles, sizeof (float)); buffer<float, 1>arrayX_GPU(arrayX, Nparticles, props); /****************** L I K E L I H O O D ************************************/ q.submit([&](handler& cgh) { auto arrayX_acc = arrayX_GPU.get_access<sycl_read_write>(cgh); cgh.parallel_for<class likelihood>( // nd_range<1>(range<1>(global_work_size), range<1>(local_work_size)), [=] (nd_item<1> item) { nd_range<1>(range<1>(10), range<1>(20)), [=] (nd_item<1> item) { int i = item.get_global_linear_id(); #ifdef BAD arrayX_acc[i] += 1.0; #else arrayX_acc[i] = arrayX_acc[i] + 1.0; #endif }); }); }
Noting that if
arrayX_acc[i] += 1.0;
is changed toarrayX_acc[i] = arrayX_acc[i] + 1.0;
the assertion disappears.
https://github.com/intel/llvm/pull/7760 fixes this
lud
fixed by #7760 also -> compiles, but fails.
lud
fixed by #7760 also -> compiles, but fails.
AFAIK lud was compiling without #7760 and was failing to verify at runtime.
Status:
backprop
, gaussian
particlefilter
, streamcluster
lud
New reduced test case for particlefilter
:
#include <sycl/sycl.hpp>
using namespace sycl;
constexpr access::mode sycl_write = access::mode::write;
constexpr access::mode sycl_read = access::mode::read;
#ifndef TY
#define TY unsigned char
#endif
void likelyhood(int Nparticles, int IszX, int IszY, int Nfr, int countOnes) {
queue q;
buffer<float, 1> A(Nparticles + 1);
buffer<TY, 1> B(Nparticles + 1);
q.submit([&](handler &cgh) {
auto A_acc = A.get_access<sycl_write>(cgh);
auto B_acc = B.get_access<sycl_read>(cgh);
cgh.parallel_for<class likelihood>(range<1>(10), [=](item<1> Item) {
id<1> Id = Item.get_id();
A_acc[Id] = B_acc[Id];
});
});
}
Fails with the following error (noting also that when TY is not char
or unsigned char
but, for example, is a short
the error disappears):
clang++ -fsycl -fsycl-targets=spir64-unknown-unknown-syclmlir -std=c++17 -Wall -DBLOCK_SIZE=256 -O3 ~/projects/tmp/ex_particle_reduced.cpp -w -c 2>&1 | grep error
error: 'sycl.accessor.subscript' op Expecting memref return type. Got '<<NULL TYPE>>'
Fails with the following error (noting also that when TY is not
char
orunsigned char
but, for example, is ashort
the error disappears):
Apparently, compiling pointers to i8
types yields an llvm pointer instead of a memref. We might be having issues with that. IMO, we should generate memref<i8>
instead. Related issue: https://github.com/intel/llvm/pull/7767
With fix in PR https://github.com/intel/llvm/pull/7784 particlefilter
compiles to the end, and the bmk runs.
Using draft PR https://github.com/intel/llvm/pull/7783 streamcluster
no longer asserts in the store operation, the assertion moved to SYCLAccessorSubscriptOp::verify()
(because that operation receives a !llvm.ptr<?xstruct....>
and it expects a memref). Given that cgeist
always represents pointers to struct by using !llvm.ptr
, we might want to relax that verification code. @whitneywhtsang will work on that.
I have quickly verified that this is the last compile time issue affecting streamcluster
. Commenting out the verification code for SYCLAccessorSubscriptOp yields successful compilation and the bmk runs clean.
Created https://github.com/intel/llvm/pull/7802, for the verification issue mentioned in https://github.com/intel/llvm/issues/7641#issuecomment-1352344685.
Status:
Performance measurements on Intel(R) Iris(R) Xe Graphics:
Great to see that at -O3 we are mostly ahead of the default clang (without MLIR).
The Rodinia benchmarks have been ported to SYCL and are available publicly at: https://github.com/zjin-lcf/Rodinia_SYCL. I have forked the repository and modified the Makefiles to be able to compile the benchmarks with the Intel
clang++
SYCL compiler and with theSYCL-MLIR
compiler. The fork with the required changes (use branch sycl_mlir) is at: https://github.com/etiotto/Rodinia_SYCL/tree/sycl_mlir.We can initially focus on 5 bmks:
backprop
,gaussian
,particlefilter
,streamcluster
andlud
. These 5 bmks compile and run cleanly using the clang++ SYCL Intel compiler, andgaussian
also compiles and runs when the SYCL-MLIR/cgeist compiler is used. The remaining 4 bmks fail to compile, with the same symptom: