wangzy0327 closed this issue 4 months ago
llvm-foreach: Segmentation fault (core dumped) clang-15: error: ptxas command failed with exit code 254 (use -v to see invocation)
@wangzy0327 The issue is more likely in the compiler: the log shows that the core dump happens in llvm-foreach.
You may enable the debug capabilities when building llvm and use gdb to check which compiler pass is responsible for this.
The issue https://github.com/intel/llvm/issues/5980 in intel/llvm repo is very similar to this one and there are already many investigations there. Could you please check if it is helpful?
@wangzy0327,
Intel C++/DPC++ Compiler follows a semantic versioning scheme and guarantees backward compatibility within the same major version. You can find the version that oneDNN was tested with in the README.md of the corresponding release. On the source code level oneDNN may also be compatible with earlier compiler releases.
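The "same major version" rule can be illustrated with a minimal sketch. The helper names and the version strings below are hypothetical and not part of any Intel tool; the sketch assumes plain MAJOR.MINOR.PATCH strings and treats a compiler as acceptable when it shares the tested major version and is not older than the tested release.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string into integers (missing parts become 0)."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return parts[0], parts[1], parts[2]


def is_backward_compatible(tested: str, installed: str) -> bool:
    """True if `installed` has the same major version as `tested` and is not older."""
    t, i = parse_version(tested), parse_version(installed)
    return i[0] == t[0] and i >= t


if __name__ == "__main__":
    # Hypothetical version strings, used only to exercise the helper.
    print(is_backward_compatible("2023.1.0", "2023.2.0"))
    print(is_backward_compatible("2023.1.0", "2024.0.0"))
```

This only encodes the guarantee stated above. Compatibility with earlier compiler releases is described as possible but not guaranteed, so the sketch reports those cases as not covered.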
@vpirogov I cannot find the relevant content about what oneDNN was tested with in the README.md. Which README.md lists the tested oneDNN version and the SYCL version it depends on? Can you provide a screenshot or a link to the relevant tested version?
I'm referring to the Validated Configurations section of the README.md.
@shu1chen I referred to issue 5980 and modified the two files llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.cpp and llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h with the modifications you listed above. I then recompiled SYCL and compiled oneDNN.
The output of make -j is as follows:
[100%] Linking CXX shared library libdnnl.so
llvm-foreach: Segmentation fault (core dumped)
clang-15: error: ptxas command failed with exit code 254 (use -v to see invocation)
clang version 15.0.0 (ssh://git@gitlab.gxnzx12729.ict:2222/wangziyang/intel-llvm-new.git 7ecb566e497fa926844521e8df2a2405c7e92e63)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/wzy/sycl_workspace/build-cuda-2022-06/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s).
src/CMakeFiles/dnnl.dir/build.make:776: recipe for target 'src/libdnnl.so.3.2' failed
make[2]: *** [src/libdnnl.so.3.2] Error 1
CMakeFiles/Makefile2:355: recipe for target 'src/CMakeFiles/dnnl.dir/all' failed
make[1]: *** [src/CMakeFiles/dnnl.dir/all] Error 2
Makefile:159: recipe for target 'all' failed
make: *** [all] Error 2
It's still the same error as before.
@wangzy0327 I meant that the core dump happens in the compiler, for the CUDA backend, not in oneDNN. From the log, the compilation of oneDNN has completed, and the compiler triggers this error during the linking phase. The issue at https://github.com/intel/llvm/issues/5980 describes a similar problem for another shared library in debug mode and includes some tracing info. Perhaps the solution there doesn't work for your case. I am personally not an expert in the CUDA backend compiler. Could you please raise a ticket in the https://github.com/intel/llvm/issues repo to see if that is more helpful?
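To make the point about where the failure originates concrete, here is a small sketch that scans a build log for lines reporting a crash or a failed command and lists the toolchain tools they name. The function name `failing_tools` and the list of tool names are assumptions chosen for this thread's log, not part of oneDNN or LLVM.

```python
import re

# Tool names to look for; this list is an assumption based on the log in this thread.
TOOLCHAIN_TOOLS = ("llvm-foreach", "ptxas")

# Phrases that mark a crash or a failed sub-command in the log lines quoted above.
FAILURE_PATTERN = re.compile(r"Segmentation fault|core dumped|command failed")


def failing_tools(log: str) -> list[str]:
    """Return toolchain tools named on lines that report a crash or failed command."""
    found = []
    for line in log.splitlines():
        if not FAILURE_PATTERN.search(line):
            continue
        for tool in TOOLCHAIN_TOOLS:
            if re.search(rf"\b{re.escape(tool)}\b", line) and tool not in found:
                found.append(tool)
    return found


if __name__ == "__main__":
    sample_log = (
        "[100%] Linking CXX shared library libdnnl.so\n"
        "llvm-foreach: Segmentation fault (core dumped)\n"
        "clang-15: error: ptxas command failed with exit code 254 "
        "(use -v to see invocation)\n"
        "make[2]: *** [src/libdnnl.so.3.2] Error 1\n"
    )
    print(failing_tools(sample_log))
```

On the log from this thread, the reported tools are compiler-side components invoked while linking libdnnl.so, which is consistent with the failure being in the toolchain rather than in oneDNN sources. The make errors that follow are only a consequence and are deliberately ignored.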
@shu1chen I have raised this in intel/llvm#5980, but there has been no reply yet.
Summary
Cannot correctly compile oneDNN v3.2 with dpcpp (2022-06).
cmake command line
Version
oneDNN version is v3.2.
Environment
How to solve the problem?