oneapi-src / oneDNN

oneAPI Deep Neural Network Library (oneDNN)
https://uxlfoundation.org
Apache License 2.0

How to check whether the SYCL version oneDNN depends on is backward compatible? #1869

Closed wangzy0327 closed 4 months ago

wangzy0327 commented 6 months ago

Summary

oneDNN v3.2 fails to build with DPC++ (2022-06).

cmake command line

cmake .. \
  -DCMAKE_C_COMPILER=/home/wzy/sycl_workspace/build-cuda-2022-06/bin/clang \
  -DCMAKE_CXX_COMPILER=/home/wzy/sycl_workspace/build-cuda-2022-06/bin/clang++ \
  -DONEDNN_CPU_RUNTIME=NONE \
  -DONEDNN_GPU_RUNTIME=SYCL \
  -DDNNL_BUILD_EXAMPLES=OFF \
  -DDNNL_BUILD_TESTS=OFF \
  -DONEDNN_BUILD_GRAPH=OFF \
  -DDNNL_GPU_VENDOR=NVIDIA \
  -DCMAKE_BUILD_TYPE=Debug \
  -DCMAKE_INSTALL_PREFIX=/home/wzy/sycl_workspace/oneDNN-cuda-v34

Version

oneDNN v3.2.

Environment

How to solve the problem?

shu1chen commented 6 months ago

llvm-foreach: Segmentation fault (core dumped)
clang-15: error: ptxas command failed with exit code 254 (use -v to see invocation)

@wangzy0327 The issue is more likely in the compiler; the log shows that the core dump happens in llvm-foreach.

vpirogov commented 6 months ago

@wangzy0327,

The Intel C++/DPC++ Compiler follows a semantic versioning scheme and guarantees backward compatibility within the same major version. You can find the compiler version that oneDNN was tested with in the README.md of the corresponding release. At the source code level, oneDNN may also be compatible with earlier compiler releases.
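The compatibility rule described above can be sketched in a few lines of Python. This is only an illustration, assuming plain `major.minor.patch` version strings; the function names and the version numbers in the usage note are hypothetical placeholders, not values taken from any oneDNN README.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from a string such as '2024.1.0'.

    A missing patch component is treated as 0.
    """
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_semver_compatible(installed, validated):
    """Return True if `installed` has the same major version as `validated`
    and is not older than it (the semantic-versioning guarantee)."""
    inst = parse_version(installed)
    val = parse_version(validated)
    return inst[0] == val[0] and inst >= val
```

For example, `is_semver_compatible("2024.2.1", "2024.1.0")` is true, while a different major version or an older release within the same major version yields false. A false result only means that compatibility is not guaranteed; as noted above, earlier releases may still work at the source code level.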

wangzy0327 commented 6 months ago

@vpirogov I cannot find anything in the README.md about which compiler oneDNN was tested with. Which README.md lists the tested oneDNN version and the SYCL version it depends on? Can you provide a screenshot or a link?

vpirogov commented 6 months ago

I'm referring to the Validated Configurations section of the README.md.

wangzy0327 commented 6 months ago

@shu1chen I referred to issue 5980 and applied the modifications you listed above to the two files llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.cpp and llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h. I then recompiled SYCL and compiled oneDNN.

The output of make -j is as follows:

[100%] Linking CXX shared library libdnnl.so
llvm-foreach: Segmentation fault (core dumped)
clang-15: error: ptxas command failed with exit code 254 (use -v to see invocation)
clang version 15.0.0 (ssh://git@gitlab.gxnzx12729.ict:2222/wangziyang/intel-llvm-new.git 7ecb566e497fa926844521e8df2a2405c7e92e63)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/wzy/sycl_workspace/build-cuda-2022-06/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s).
src/CMakeFiles/dnnl.dir/build.make:776: recipe for target 'src/libdnnl.so.3.2' failed
make[2]: *** [src/libdnnl.so.3.2] Error 1
CMakeFiles/Makefile2:355: recipe for target 'src/CMakeFiles/dnnl.dir/all' failed
make[1]: *** [src/CMakeFiles/dnnl.dir/all] Error 2
Makefile:159: recipe for target 'all' failed
make: *** [all] Error 2

It's still the same error as before.

shu1chen commented 6 months ago

@wangzy0327 I meant that the core dump happens in the compiler, in the CUDA backend, not in oneDNN. According to the log, compilation of oneDNN completed and the compiler triggered this error during the linking phase. https://github.com/intel/llvm/issues/5980 describes a similar problem for another shared library in debug mode and includes some tracing info. Perhaps the solution there doesn't work for your case. I am personally not an expert in the CUDA backend compiler. Could you please raise a ticket in the https://github.com/intel/llvm/issues repo, where you are more likely to get help?
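As a rough illustration of how the log points to the linking phase, the following Python sketch scans Makefile-style CMake output and reports which stage was announced last before the first error. The function name is hypothetical, and the approach is only a heuristic: with a parallel build such as make -j, output from different jobs can interleave.

```python
import re

# Progress messages printed by CMake-generated Makefiles.
LINK_RE = re.compile(r"Linking \w+ (?:shared library|static library|executable)")
COMPILE_RE = re.compile(r"Building \w+ object")


def first_failure_stage(log_text):
    """Return 'link', 'compile', or 'unknown' for the first error in the log,
    or None if no error line is found."""
    stage = None
    for line in log_text.splitlines():
        if LINK_RE.search(line):
            stage = "link"
        elif COMPILE_RE.search(line):
            stage = "compile"
        elif "error:" in line or "Segmentation fault" in line:
            # Report the most recently announced stage.
            return stage or "unknown"
    return None
```

Applied to the output posted above, the last announced step before the segmentation fault is "Linking CXX shared library libdnnl.so", so the sketch reports the link stage.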

wangzy0327 commented 6 months ago

@shu1chen I have raised a ticket in intel/llvm#5980, but there has been no reply yet.