intel / llvm

Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

Test suite fails after brand new install using documentation directions #11488

Closed lopippo closed 1 month ago

lopippo commented 1 year ago

Greetings, and thank you for this amazing project!

I am on a Debian box (testing) and I successfully run my desktop with the NVidia driver (not the nouveau driver). I successfully installed OpenCL using the Debian repos packages and could successfully run OpenCL code on the NVidia GPU (the Intel GPU was not visible at the time).

Then I wanted to install DPC++ according to this repo documentation (https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md).

Running python $DPCPP_HOME/llvm/buildbot/configure.py --native_cpu I discovered that I needed spirv-tools, so I apt-install'ed it. Version: spirv-tools 2023.3-1 from the Debian repos. I also discovered that the "Configuring SYCL End-to-End Tests" configure.py section mentioned this:

-- SPIR-V Headers location is not specified. Will try to download
          spirv.hpp from https://github.com/KhronosGroup/SPIRV-Headers into
          ${DPCPP_HOME}/llvm/build/tools/llvm-spirv/SPIRV-Headers

Indeed, that directory is not empty.

Then I had to install the ocl-icd-libopencl1 package from the Debian repos. Version: ocl-icd-libopencl1:amd64 2.3.2-1.

Then I installed Debian packages from the intel/llvm repositories and other archives as detailed in the documentation, which I followed to the letter.

I happily check the platforms using a program that I initially used to check the OpenCL-only install with Debian packages and see that I now have available the

----------------------------------------------
Platform 2: Intel(R) OpenCL Graphics
----------------------------------------------
Vendor    : Intel(R) Corporation
Version   : OpenCL 3.0 
----------------------------------------------

   Device 1: Intel(R) Graphics [0x46a6]
         Device Version     : OpenCL 3.0 NEO 
         OpenCL C Version   : OpenCL C 1.2 
         Compute Units      : 96
         Max Work Group Size: 512
         Clock Frequency    : 1450
         Local Memory Size  : 65536
         Global Memory Size : 53662576640
         Double Precision   : no

----------------------------------------------

which I did not see before installing DPC++.

Then I run the test suite:

python $DPCPP_HOME/llvm/buildbot/check.py

and there, I see failed tests:

Failed Tests (17):
  Clang :: CodeGen/address-safety-attr.cpp
  Clang :: CodeGen/asan-globals.cpp
  Clang :: CodeGen/profile-filter-new.c
  Clang :: CodeGen/profile-filter.c
  Clang :: CodeGen/sanitize-init-order.cpp
  Clang :: CodeGen/sanitize-thread-attr.cpp
  Clang :: CodeGen/ubsan-ignorelist.c
  Clang :: Driver/clang-offload-wrapper-exe.cpp
  Clang :: Driver/sycl-offload-amdgcn.cpp
  Clang :: Driver/sycl-oneapi-gpu.cpp
  Clang :: Driver/sycl-target-mismatch.cpp
  Clang :: Driver/sycl-triple-dae-flags.cpp
  LLVM_SPIRV :: OpFNegate.spvasm
  SYCL :: abi/layout_handler.cpp
  SYCL :: native_cpu/multi-devices-swap.cpp
  SYCL :: native_cpu/multi-devices.cpp
  SYCL :: tools/sycl-ls.test

Testing Time: 359.27s
  Skipped          :   327
  Unsupported      : 28861
  Passed           : 63809
  Expectedly Failed:    95
  Failed           :    17
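For triage, it can help to group the failures by test suite before looking at individual tests. The following is a minimal sketch, assuming the summary is available as plain text in the layout shown above; the function name `failed_tests_by_suite` is hypothetical and not part of lit or the DPC++ build scripts.

```python
import re
from collections import defaultdict

# Summary text copied from the check.py output reported above.
LIT_SUMMARY = """\
Failed Tests (17):
  Clang :: CodeGen/address-safety-attr.cpp
  Clang :: CodeGen/asan-globals.cpp
  Clang :: CodeGen/profile-filter-new.c
  Clang :: CodeGen/profile-filter.c
  Clang :: CodeGen/sanitize-init-order.cpp
  Clang :: CodeGen/sanitize-thread-attr.cpp
  Clang :: CodeGen/ubsan-ignorelist.c
  Clang :: Driver/clang-offload-wrapper-exe.cpp
  Clang :: Driver/sycl-offload-amdgcn.cpp
  Clang :: Driver/sycl-oneapi-gpu.cpp
  Clang :: Driver/sycl-target-mismatch.cpp
  Clang :: Driver/sycl-triple-dae-flags.cpp
  LLVM_SPIRV :: OpFNegate.spvasm
  SYCL :: abi/layout_handler.cpp
  SYCL :: native_cpu/multi-devices-swap.cpp
  SYCL :: native_cpu/multi-devices.cpp
  SYCL :: tools/sycl-ls.test

Testing Time: 359.27s
"""


def failed_tests_by_suite(summary):
    """Group the 'Suite :: path' lines that follow 'Failed Tests (N):'."""
    groups = defaultdict(list)
    in_failed = False
    for line in summary.splitlines():
        if re.match(r"^Failed Tests \(\d+\):", line.strip()):
            in_failed = True
            continue
        if in_failed:
            entry = re.match(r"^\s+(\S+) :: (\S+)$", line)
            if entry:
                groups[entry.group(1)].append(entry.group(2))
            else:
                # A blank line or a new section ends the list of failures.
                in_failed = False
    return dict(groups)


groups = failed_tests_by_suite(LIT_SUMMARY)
for suite, tests in groups.items():
    print(f"{suite}: {len(tests)} failed")
```

Grouping this way separates the Clang and LLVM_SPIRV failures, which exercise the compiler itself, from the SYCL failures, some of which (for example `tools/sycl-ls.test`) may depend on the installed runtimes.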

I also ran another OpenCL test program that used to succeed on my OpenCL-only Debian setup but now fails with the following error:

error: undefined reference to `__builtin_spirv_OpEnqueueKernel_i64_i32_p0ndrange_t_i32_p4i64_p4i64_p0func_p4i8_i32_i32'
in function: '__builtin_spirv_OpEnqueueKernel_i64_i32_p0ndrange_t_i32_p4i64_p4i64_p0func_p4i8_i32_i32' called by kernel: 'vectorAdd'

error: backend compiler failed build.

I am a seasoned C++ developer but totally new to OpenCL and SYCL, so I would be happy to receive directions for solving this problem (I am aware that this venue deals with SYCL, not OpenCL).

I would like to provide the clinfo output: see it at https://paste.debian.net/1294596/.

Most sincerely, Filippo

bader commented 1 year ago

I am on a Debian box (testing) and I successfully run my desktop with the NVidia driver (not the nouveau driver). I successfully installed OpenCL using the Debian repos packages and could successfully run OpenCL code on the NVidia GPU (the Intel GPU was not visible at the time).

NOTE: If your intention is to run SYCL code on an NVIDIA GPU, you will need to install CUDA. AFAIK, running SYCL code through the OpenCL driver doesn't work on NVIDIA. At least that was the case in the past, as DPC++ sets a high bar for the OpenCL and SPIR-V features the driver must support in order to run SYCL applications successfully.

I discovered that I needed spirv-tools, so I apt-install'ed it. Version: spirv-tools 2023.3-1 from the Debian repos.

I don't think the DPC++ project has a dependency on spirv-tools. How did you discover the need for spirv-tools? Is it required for running the LLVM-SPIRV-Translator tests?

Then I installed Debian packages from the intel/llvm repositories and other archives as detailed in the documentation, which I followed to the letter.

Are you referring to the https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md#install-low-level-runtime section? It describes the steps to install "device drivers". DPC++ uses "device drivers" to offload execution to accelerators (e.g. GPUs).

Then I run the test suite:

python $DPCPP_HOME/llvm/buildbot/check.py

and there, I see failed tests:

Failed Tests (17):
  Clang :: CodeGen/address-safety-attr.cpp
  Clang :: CodeGen/asan-globals.cpp
  Clang :: CodeGen/profile-filter-new.c
  Clang :: CodeGen/profile-filter.c
  Clang :: CodeGen/sanitize-init-order.cpp
  Clang :: CodeGen/sanitize-thread-attr.cpp
  Clang :: CodeGen/ubsan-ignorelist.c
  Clang :: Driver/clang-offload-wrapper-exe.cpp
  Clang :: Driver/sycl-offload-amdgcn.cpp
  Clang :: Driver/sycl-oneapi-gpu.cpp
  Clang :: Driver/sycl-target-mismatch.cpp
  Clang :: Driver/sycl-triple-dae-flags.cpp
  LLVM_SPIRV :: OpFNegate.spvasm
  SYCL :: abi/layout_handler.cpp
  SYCL :: native_cpu/multi-devices-swap.cpp
  SYCL :: native_cpu/multi-devices.cpp
  SYCL :: tools/sycl-ls.test

Testing Time: 359.27s
  Skipped          :   327
  Unsupported      : 28861
  Passed           : 63809
  Expectedly Failed:    95
  Failed           :    17

Could you please attach the full output of the python $DPCPP_HOME/llvm/buildbot/check.py command?

Please provide information about the environment on your system:

- OS: [e.g. Windows/Linux]
- Target device and vendor: [e.g. Intel GPU]
- DPC++ version: [e.g. commit hash or output of `clang++ --version`]
- Dependencies version: [e.g. low-level runtime versions (like NEO 20.04)]

I also ran another OpenCL test program that used to succeed on my OpenCL-only Debian setup but now fails with the following error:

error: undefined reference to `__builtin_spirv_OpEnqueueKernel_i64_i32_p0ndrange_t_i32_p4i64_p4i64_p0func_p4i8_i32_i32'
in function: '__builtin_spirv_OpEnqueueKernel_i64_i32_p0ndrange_t_i32_p4i64_p4i64_p0func_p4i8_i32_i32' called by kernel: 'vectorAdd'

error: backend compiler failed build.

How does your program select the device? My wild guess is that before you installed the OpenCL drivers for GPUs, you had an OpenCL driver for a CPU device, and your program uses the first device in the list returned by the OpenCL ICD. After the new OpenCL drivers were installed, your program might be selecting a GPU device that doesn't support the device enqueue feature.
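The guess above can be illustrated without any OpenCL calls. The following is a minimal sketch in plain Python, assuming hypothetical device records: the field names, the `device_enqueue` feature label, the CPU entry, and the claim that the GPU lacks device enqueue are all illustrative assumptions that mirror the guess, not facts established in this thread.

```python
# Hypothetical device records; none of this is a real OpenCL API.
CPU_PLATFORM = {
    "name": "Hypothetical CPU platform",
    "devices": [
        {"name": "Hypothetical CPU", "type": "CPU", "features": {"device_enqueue"}},
    ],
}
GPU_PLATFORM = {
    "name": "Intel(R) OpenCL Graphics",
    "devices": [
        # Assumption for illustration: no device-side enqueue on this device.
        {"name": "Intel(R) Graphics [0x46a6]", "type": "GPU", "features": set()},
    ],
}


def first_device(platforms):
    """Fragile: take whatever device happens to be listed first."""
    return platforms[0]["devices"][0]


def pick_device(platforms, wanted_type, required_features=()):
    """Select by explicit device type and check required features up front."""
    for platform in platforms:
        for dev in platform["devices"]:
            if dev["type"] == wanted_type and set(required_features) <= dev["features"]:
                return dev
    return None


before = [CPU_PLATFORM]               # platform list before the new drivers
after = [GPU_PLATFORM, CPU_PLATFORM]  # a new platform may be enumerated first

print(first_device(before)["name"])   # the CPU device
print(first_device(after)["name"])    # now the GPU device
print(pick_device(after, "CPU", ["device_enqueue"])["name"])
```

With first-device selection, installing a new driver can silently change which device the program builds its kernels for. Selecting by type and checking required features makes the choice explicit and lets the program report a clear error when no suitable device exists.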

lopippo commented 1 year ago

Greetings,

First off, some background information:

I have now re-run the build, using this configuration line:

python $DPCPP_HOME/llvm/buildbot/configure.py --shared-libs --enable-all-llvm-targets --native_cpu

then

python $DPCPP_HOME/llvm/buildbot/compile.py

At this point, in the instructions at https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md#deployment

it occurs to me that there are no instructions on how to deploy the built code since that section reads "TODO: add instructions how to deploy built DPC++ toolchain."

So I wonder whether the following section, "Use DPC++ toolchain" (https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md#use-dpc-toolchain), is what installs the binaries that I built above... Anyway, I carry on with the guide.

Note one documentation glitch:

==> Note the discrepancy between the doc and reality:
==> doc says oclcpuexp_.tar.gz to be extracted at /opt/intel/oclcpuexp_
==> but then it says tar -zxvf oclcpurt.tar.gz, while the file that was downloaded
==> was indeed oclcpuexp-2023.16.6.0.28_rel.tar.gz.

I skip the "Obtain prerequisites for ahead of time (AOT) compilation" section (is it required?) and run the check right away.

Here is the summary:



Failed Tests (14):
  Clang :: CodeGen/address-safety-attr.cpp
  Clang :: CodeGen/asan-globals.cpp
  Clang :: CodeGen/profile-filter-new.c
  Clang :: CodeGen/profile-filter.c
  Clang :: CodeGen/sanitize-init-order.cpp
  Clang :: CodeGen/sanitize-thread-attr.cpp
  Clang :: CodeGen/ubsan-ignorelist.c
  Clang :: Driver/clang-offload-wrapper-exe.cpp
  Clang :: Driver/sycl-offload-amdgcn.cpp
  Clang :: Driver/sycl-oneapi-gpu.cpp
  Clang :: Driver/sycl-target-mismatch.cpp
  Clang :: Driver/sycl-triple-dae-flags.cpp
  LLVM_SPIRV :: OpFNegate.spvasm
  SYCL :: abi/layout_handler.cpp

Testing Time: 601.86s
  Skipped          :   316
  Unsupported      : 24123
  Passed           : 68770
  Expectedly Failed:   115
  Failed           :    14

with the whole output at

https://paste.debian.net/1295341/

I hope we will be able to sort this out!

Sincerely Filippo

maarquitos14 commented 10 months ago

@lopippo when I run configure.py + compile.py + check.py, I get the following output:

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
********************
Failed Tests (6):
  Clang :: Driver/sycl-offload-amdgcn.cpp
  Clang :: Driver/sycl-oneapi-gpu.cpp
  Clang :: Driver/sycl-target-mismatch-amdgpu.cpp
  Clang :: Driver/sycl-triple-dae-flags.cpp
  Clang :: Driver/sycl-unsupported-arch.cpp
  SYCL :: abi/layout_handler.cpp

Testing Time: 126.75s

Total Discovered Tests: 94828
  Skipped          :   318 (0.34%)
  Unsupported      : 24875 (26.23%)
  Passed           : 69513 (73.30%)
  Expectedly Failed:   116 (0.12%)
  Failed           :     6 (0.01%)

Since it has been a while since you opened the issue, could you please update the code to the latest version and rerun to see which tests are failing now? Thanks in advance!
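The two most recent failure lists in this thread (14 tests from the reporter's second run, 6 from the run above) can be compared with plain set operations. This is a minimal sketch that only restates the lists already posted. The two runs were made months apart, presumably on different revisions, so the comparison is indicative rather than conclusive.

```python
# Failures from the reporter's second run (14 tests).
reporter = {
    "Clang :: CodeGen/address-safety-attr.cpp",
    "Clang :: CodeGen/asan-globals.cpp",
    "Clang :: CodeGen/profile-filter-new.c",
    "Clang :: CodeGen/profile-filter.c",
    "Clang :: CodeGen/sanitize-init-order.cpp",
    "Clang :: CodeGen/sanitize-thread-attr.cpp",
    "Clang :: CodeGen/ubsan-ignorelist.c",
    "Clang :: Driver/clang-offload-wrapper-exe.cpp",
    "Clang :: Driver/sycl-offload-amdgcn.cpp",
    "Clang :: Driver/sycl-oneapi-gpu.cpp",
    "Clang :: Driver/sycl-target-mismatch.cpp",
    "Clang :: Driver/sycl-triple-dae-flags.cpp",
    "LLVM_SPIRV :: OpFNegate.spvasm",
    "SYCL :: abi/layout_handler.cpp",
}

# Failures from the maintainer's run (6 tests).
maintainer = {
    "Clang :: Driver/sycl-offload-amdgcn.cpp",
    "Clang :: Driver/sycl-oneapi-gpu.cpp",
    "Clang :: Driver/sycl-target-mismatch-amdgpu.cpp",
    "Clang :: Driver/sycl-triple-dae-flags.cpp",
    "Clang :: Driver/sycl-unsupported-arch.cpp",
    "SYCL :: abi/layout_handler.cpp",
}

common = sorted(reporter & maintainer)           # fail in both runs
only_reporter = sorted(reporter - maintainer)    # fail only for the reporter
only_maintainer = sorted(maintainer - reporter)  # fail only for the maintainer

for title, tests in [
    ("Both runs", common),
    ("Reporter only", only_reporter),
    ("Maintainer only", only_maintainer),
]:
    print(f"{title} ({len(tests)}):")
    for test in tests:
        print(f"  {test}")
```

Tests that fail in both runs are less likely to be caused by the reporter's Debian setup. Tests that fail only for the reporter, such as the CodeGen sanitizer and profile-filter tests, are the more plausible candidates for an environment-specific cause.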

github-actions[bot] commented 1 month ago

This issue was closed because it has been stalled for 30 days with no activity. Please, re-open if the issue still exists.