artyom-beilis / pytorch_dlprim

DLPrimitives/OpenCL out of tree backend for pytorch
http://blog.dlprimitives.org/
MIT License

Rusticl & `OCL_PATH-NOTFOUND` #37

Closed: VirxEC closed this issue 1 year ago

VirxEC commented 1 year ago

My machine

I have mesa git and Rusticl support installed and working in theory along with spirv-tools on Ubuntu 22.10 - but I can't seem to build dlprim!

clinfo -l outputs this:

Platform #0: rusticl
 `-- Device #0: AMD Radeon RX 6700 XT (navi22, LLVM 15.0.7, DRM 3.49, 6.1.23-060123-generic)

What I think the problem is

The command I'm running is cmake -DCMAKE_PREFIX_PATH=$VIRTUAL_ENV/lib/python3.10/site-packages/torch/share/cmake/Torch ..

The problem is one of two things:

static library kineto_LIBRARY-NOTFOUND not found

OR

  OpenCL: include OCL_PATH-NOTFOUND
          lib     OCL_LIB-NOTFOUND

Full command log

-- The C compiler identification is GNU 12.2.0
-- The CXX compiler identification is GNU 12.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Warning at ~/Documents/ai_test/venv/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
  static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
  ~/Documents/ai_test/venv/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
  CMakeLists.txt:4 (find_package)

=== Status ===
  OpenCL: include OCL_PATH-NOTFOUND
          lib     OCL_LIB-NOTFOUND
  Python: ~/Documents/ai_test/venv/bin/python3
  BLAS: None
  HDF5: None
  Sqlite3: include /usr/include
           lib /usr/lib/x86_64-linux-gnu/libsqlite3.so
  Protobuf (onnx): disabled
  Python dlprim: disabled
-- Configuring done
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
OCL_LIB
    linked by target "pt_ocl" in directory ~/Documents/pytorch_dlprim
    linked by target "dlprim_core" in directory ~/Documents/pytorch_dlprim/dlprimitives
OCL_PATH
   used as include directory in directory ~/Documents/pytorch_dlprim
   used as include directory in directory ~/Documents/pytorch_dlprim
   used as include directory in directory ~/Documents/pytorch_dlprim
   used as include directory in directory ~/Documents/pytorch_dlprim
   used as include directory in directory ~/Documents/pytorch_dlprim
   used as include directory in directory ~/Documents/pytorch_dlprim
   used as include directory in directory ~/Documents/pytorch_dlprim
   used as include directory in directory ~/Documents/pytorch_dlprim/dlprimitives
   used as include directory in directory ~/Documents/pytorch_dlprim/dlprimitives
   used as include directory in directory ~/Documents/pytorch_dlprim/dlprimitives
   used as include directory in directory ~/Documents/pytorch_dlprim/dlprimitives
   used as include directory in directory ~/Documents/pytorch_dlprim/dlprimitives
   used as include directory in directory ~/Documents/pytorch_dlprim/dlprimitives
   used as include directory in directory ~/Documents/pytorch_dlprim/dlprimitives

CMake Error in CMakeLists.txt:
  Found relative path while evaluating include directories of "pt_ocl":

    "OCL_PATH-NOTFOUND"

CMake Error in dlprimitives/CMakeLists.txt:
  Found relative path while evaluating include directories of "dlprim_core":

    "OCL_PATH-NOTFOUND"

-- Generating done
CMake Generate step failed.  Build files cannot be regenerated correctly.

artyom-beilis commented 1 year ago

Actually, the error is caused by the missing OpenCL headers and runtime library. Without them you can't build.

Also, make sure you build against the CPU version of PyTorch.

VirxEC commented 1 year ago

Oh cool. I already have the CPU version of PyTorch 1.13.0 from their own index, and the repo now builds even though the "static library kineto_LIBRARY-NOTFOUND not found" warning is still emitted.

I just had to install ocl-icd-opencl-dev and opencl-headers.
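
For readers hitting the same error on Ubuntu, the fix above can be sketched as a small shell check. The package names come from the comment above; the helper names `has_cl_header` and `find_cl_include` and the searched directories are illustrative assumptions, not part of pytorch_dlprim. The install step is left as a comment so the snippet itself changes nothing on the system.

```shell
# Install the OpenCL ICD loader development files and the OpenCL headers:
#   sudo apt install ocl-icd-opencl-dev opencl-headers

# Hypothetical helper: succeeds if directory $1 contains CL/cl.h.
has_cl_header() {
    [ -f "$1/CL/cl.h" ]
}

# Hypothetical helper: prints the first directory among its arguments
# that contains CL/cl.h, or returns non-zero if none does.
find_cl_include() {
    for dir in "$@"; do
        if has_cl_header "$dir"; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Assumed common header locations; adjust for your system.
if inc=$(find_cl_include /usr/include /usr/local/include); then
    echo "OpenCL headers found in $inc"
else
    echo "OpenCL headers not found; install opencl-headers first"
fi
```

If the headers are reported as found, re-running the cmake command from a clean build directory should no longer print OCL_PATH-NOTFOUND.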

JasonS05 commented 9 months ago

I'm having the same problem, but on macOS 11.7.10 (Big Sur). I've tried installing opencl-headers, opencl-clhpp-headers, and opencl-icd through homebrew but I'm still facing the same problem. My error output looks just like what's posted above, except only OCL_PATH is missing and not OCL_LIB.
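
One thing that may be worth trying when only OCL_PATH is reported as NOTFOUND: CMake's own error message asks for the variable to be set, and cache variables can be passed with -D on the command line. The sketch below only prints such an invocation. It assumes the project locates the headers in a way that honors a pre-set OCL_PATH (CMake's find_path does not search again when its variable already holds a valid path); the helper name `ocl_cmake_cmd` and the header path are placeholders, and whether the macOS header layout matches what the build expects is not confirmed here.

```shell
# Hypothetical helper: print a cmake invocation that sets OCL_PATH explicitly.
#   $1 = directory that contains the OpenCL headers
#   $2 = Torch cmake directory (as in the original command)
ocl_cmake_cmd() {
    printf 'cmake -DOCL_PATH=%s -DCMAKE_PREFIX_PATH=%s ..\n' "$1" "$2"
}

# Placeholder header path; substitute the real location on your machine.
ocl_cmake_cmd /path/to/opencl/include \
    '$VIRTUAL_ENV/lib/python3.10/site-packages/torch/share/cmake/Torch'
```

Run the printed command from an empty build directory so that a stale CMakeCache.txt does not keep the old NOTFOUND value.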

tangjinchuan commented 1 month ago

> I'm having the same problem, but on macOS 11.7.10 (Big Sur). I've tried installing opencl-headers, opencl-clhpp-headers, and opencl-icd through homebrew but I'm still facing the same problem. My error output looks just like what's posted above, except only OCL_PATH is missing and not OCL_LIB.

A workaround is described here: https://github.com/artyom-beilis/pytorch_dlprim/issues/25#issuecomment-1403873957

However, with Apple Clang on Apple silicon the build is not easy. I ran into many more errors with Apple Clang than the ones covered here: https://github.com/artyom-beilis/dlprimitives/pull/16