abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Could not find nvcc, please set CUDAToolkit_ROOT #409

Open EugeoSynthesisThirtyTwo opened 1 year ago

EugeoSynthesisThirtyTwo commented 1 year ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

Expected Behavior

With --n-gpu-layers 36, the model is supposed to fill my VRAM and run on my GPU. It should also print llama_model_load_internal: [cublas] offloading 36 layers to GPU in the console, and I assume it should print BLAS = 1.
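
Equivalently, going through the Python bindings directly (a minimal sketch, not what I actually run; I use the webui as described below), a cuBLAS-enabled build should emit those lines on load:

```python
# Minimal sketch: with a cuBLAS-enabled build, this load is expected to log
# "llama_model_load_internal: [cublas] offloading 36 layers to GPU" and a
# capability line containing "BLAS = 1" to stderr.
from llama_cpp import Llama

llm = Llama(
    model_path="models/WizardLM-30B-Uncensored.ggmlv3.q4_0.bin",  # same model as below
    n_gpu_layers=36,
)
```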

Current Behavior

Nothing about offloading appears in the console, my GPU stays idle, and my VRAM stays empty. It prints BLAS = 0.

Environment and Context

Windows 10
CPU with 20 threads
64 GB RAM
RTX 3080 Ti Laptop GPU, 16 GB VRAM

Python 3.10.9
fastapi 0.97.0
numpy 1.25.0
starlette 0.27.0
uvicorn 0.22.0

I can't verify the make and g++ versions because Windows doesn't find those commands.

Steps to Reproduce

conda create -n textgen python=3.10.9
conda activate textgen
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt

pip uninstall -y llama-cpp-python
set CMAKE_ARGS="-DLLAMA_CUBLAS=on"
set FORCE_CMAKE=1
pip install llama-cpp-python --no-cache-dir

- download a ggml model (I used [WizardLM-30B-Uncensored.ggmlv3.q4_0.bin](https://huggingface.co/TheBloke/WizardLM-30B-Uncensored-GGML/blob/main/WizardLM-30B-Uncensored.ggmlv3.q4_0.bin)) and put it in the "models" folder
- start the webui

python server.py --model WizardLM-30B-Uncensored.ggmlv3.q4_0.bin --n-gpu-layers 36 --auto-devices

- open the gradio webui
- generate something in the chat-box to load the model
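
To verify that the rebuild step actually produced a cuBLAS wheel before launching the webui, the bindings wrap llama.cpp's system-info call (a quick sketch; the assert is just for illustration):

```python
# Quick sanity check: llama_print_system_info() returns the same
# "AVX = 1 | ... | BLAS = 0 | ..." string that shows up in the server logs,
# so a cuBLAS wheel should report "BLAS = 1" here without loading a model.
import llama_cpp

info = llama_cpp.llama_print_system_info().decode("utf-8")
print(info)
assert "BLAS = 1" in info, "wheel was built without cuBLAS support"
```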

Failure Logs

bin C:\Users\Armaguedin\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.dll
C:\Users\Armaguedin\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
function 'cadam32bit_grad_fp32' not found
2023-06-20 23:40:24 INFO:Loading WizardLM-30B-Uncensored.ggmlv3.q4_0.bin...
2023-06-20 23:40:24 INFO:llama.cpp weights detected: models\WizardLM-30B-Uncensored.ggmlv3.q4_0.bin

2023-06-20 23:40:24 INFO:Cache capacity is 0 bytes
llama.cpp: loading model from models\WizardLM-30B-Uncensored.ggmlv3.q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32001
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 6656
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 52
llama_model_load_internal: n_layer = 60
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 17920
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 30B
llama_model_load_internal: ggml ctx size = 0.13 MB
llama_model_load_internal: mem required = 19756.67 MB (+ 3124.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size = 3120.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
2023-06-20 23:40:25 INFO:Loaded the model in 0.97 seconds.

2023-06-20 23:40:25 INFO:Loading the extension "gallery"...
Running on local URL: http://0.0.0.0:7861

To create a public link, set share=True in launch().

llama_print_timings: load time = 60209.82 ms
llama_print_timings: sample time = 1.47 ms / 10 runs (0.15 ms per token, 6807.35 tokens per second)
llama_print_timings: prompt eval time = 60209.69 ms / 8 tokens (7526.21 ms per token, 0.13 tokens per second)
llama_print_timings: eval time = 8880.62 ms / 9 runs (986.74 ms per token, 1.01 tokens per second)
llama_print_timings: total time = 69109.63 ms
Output generated in 69.34 seconds (0.13 tokens/s, 9 tokens, context 8, seed 1976462288)

abetlen commented 1 year ago

@EugeoSynthesisThirtyTwo that is odd, but I don't think it's the parameter; even without any GPU layers set, it should still print the card that's detected. Can you reinstall with pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose and copy the log here if that doesn't fix the issue?

EugeoSynthesisThirtyTwo commented 1 year ago

I reinstalled llama-cpp-python using your command, but the problem remains. Here is the verbose log (it says cuBLAS was not found):

Using pip 23.1.2 from C:\Users\Armaguedin\.conda\envs\textgen\lib\site-packages\pip (python 3.10)
Collecting llama-cpp-python
  Downloading llama_cpp_python-0.1.65.tar.gz (1.5 MB)
     ---------------------------------------- 1.5/1.5 MB 2.7 MB/s eta 0:00:00
  Running command pip subprocess to install build dependencies
  Collecting setuptools>=42
    Using cached setuptools-68.0.0-py3-none-any.whl (804 kB)
  Collecting scikit-build>=0.13
    Using cached scikit_build-0.17.6-py3-none-any.whl (84 kB)
  Collecting cmake>=3.18
    Using cached cmake-3.26.4-py2.py3-none-win_amd64.whl (33.0 MB)
  Collecting ninja
    Using cached ninja-1.11.1-py2.py3-none-win_amd64.whl (313 kB)
  Collecting distro (from scikit-build>=0.13)
    Using cached distro-1.8.0-py3-none-any.whl (20 kB)
  Collecting packaging (from scikit-build>=0.13)
    Using cached packaging-23.1-py3-none-any.whl (48 kB)
  Collecting tomli (from scikit-build>=0.13)
    Using cached tomli-2.0.1-py3-none-any.whl (12 kB)
  Collecting wheel>=0.32.0 (from scikit-build>=0.13)
    Using cached wheel-0.40.0-py3-none-any.whl (64 kB)
  Installing collected packages: ninja, cmake, wheel, tomli, setuptools, packaging, distro, scikit-build
  Successfully installed cmake-3.26.4 distro-1.8.0 ninja-1.11.1 packaging-23.1 scikit-build-0.17.6 setuptools-68.0.0 tomli-2.0.1 wheel-0.40.0
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  running egg_info
  writing llama_cpp_python.egg-info\PKG-INFO
  writing dependency_links to llama_cpp_python.egg-info\dependency_links.txt
  writing requirements to llama_cpp_python.egg-info\requires.txt
  writing top-level names to llama_cpp_python.egg-info\top_level.txt
  reading manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
  Getting requirements to build wheel ... done
  Running command Preparing metadata (pyproject.toml)
  running dist_info
  creating C:\Users\Armaguedin\AppData\Local\Temp\pip-modern-metadata-qtxqbhtx\llama_cpp_python.egg-info
  writing C:\Users\Armaguedin\AppData\Local\Temp\pip-modern-metadata-qtxqbhtx\llama_cpp_python.egg-info\PKG-INFO
  writing dependency_links to C:\Users\Armaguedin\AppData\Local\Temp\pip-modern-metadata-qtxqbhtx\llama_cpp_python.egg-info\dependency_links.txt
  writing requirements to C:\Users\Armaguedin\AppData\Local\Temp\pip-modern-metadata-qtxqbhtx\llama_cpp_python.egg-info\requires.txt
  writing top-level names to C:\Users\Armaguedin\AppData\Local\Temp\pip-modern-metadata-qtxqbhtx\llama_cpp_python.egg-info\top_level.txt
  writing manifest file 'C:\Users\Armaguedin\AppData\Local\Temp\pip-modern-metadata-qtxqbhtx\llama_cpp_python.egg-info\SOURCES.txt'
  reading manifest file 'C:\Users\Armaguedin\AppData\Local\Temp\pip-modern-metadata-qtxqbhtx\llama_cpp_python.egg-info\SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file 'C:\Users\Armaguedin\AppData\Local\Temp\pip-modern-metadata-qtxqbhtx\llama_cpp_python.egg-info\SOURCES.txt'
  creating 'C:\Users\Armaguedin\AppData\Local\Temp\pip-modern-metadata-qtxqbhtx\llama_cpp_python-0.1.65.dist-info'
  Preparing metadata (pyproject.toml) ... done
Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
  Downloading typing_extensions-4.6.3-py3-none-any.whl (31 kB)
Collecting numpy>=1.20.0 (from llama-cpp-python)
  Downloading numpy-1.25.0-cp310-cp310-win_amd64.whl (15.0 MB)
     ---------------------------------------- 15.0/15.0 MB 2.9 MB/s eta 0:00:00
Collecting diskcache>=5.6.1 (from llama-cpp-python)
  Downloading diskcache-5.6.1-py3-none-any.whl (45 kB)
     ---------------------------------------- 45.6/45.6 kB 2.2 MB/s eta 0:00:00
Building wheels for collected packages: llama-cpp-python
  Running command Building wheel for llama-cpp-python (pyproject.toml)

  --------------------------------------------------------------------------------
  -- Trying 'Ninja (Visual Studio 17 2022 x64 v143)' generator
  --------------------------------
  ---------------------------
  ----------------------
  -----------------
  ------------
  -------
  --
  Not searching for unused variables given on the command line.
  -- The C compiler identification is MSVC 19.35.32215.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx86/x64/cl.exe - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- The CXX compiler identification is MSVC 19.35.32215.0
  CMake Warning (dev) at C:/Users/Armaguedin/AppData/Local/Temp/pip-build-env-eymc2760/overlay/Lib/site-packages/cmake/data/share/cmake-3.26/Modules/CMakeDetermineCXXCompiler.cmake:168 (if):
    Policy CMP0054 is not set: Only interpret if() arguments as variables or
    keywords when unquoted.  Run "cmake --help-policy CMP0054" for policy
    details.  Use the cmake_policy command to set the policy and suppress this
    warning.

    Quoted variables like "MSVC" will no longer be dereferenced when the policy
    is set to NEW.  Since the policy is not set the OLD behavior will be used.
  Call Stack (most recent call first):
    CMakeLists.txt:4 (ENABLE_LANGUAGE)
  This warning is for project developers.  Use -Wno-dev to suppress it.

  CMake Warning (dev) at C:/Users/Armaguedin/AppData/Local/Temp/pip-build-env-eymc2760/overlay/Lib/site-packages/cmake/data/share/cmake-3.26/Modules/CMakeDetermineCXXCompiler.cmake:189 (elseif):
    Policy CMP0054 is not set: Only interpret if() arguments as variables or
    keywords when unquoted.  Run "cmake --help-policy CMP0054" for policy
    details.  Use the cmake_policy command to set the policy and suppress this
    warning.

    Quoted variables like "MSVC" will no longer be dereferenced when the policy
    is set to NEW.  Since the policy is not set the OLD behavior will be used.
  Call Stack (most recent call first):
    CMakeLists.txt:4 (ENABLE_LANGUAGE)
  This warning is for project developers.  Use -Wno-dev to suppress it.

  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx86/x64/cl.exe - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Configuring done (2.1s)
  -- Generating done (0.0s)
  -- Build files have been written to: C:/Users/Armaguedin/AppData/Local/Temp/pip-install-elvqhuwr/llama-cpp-python_e6bb6b2469de4f43b17a2fdff5431546/_cmake_test_compile/build
  --
  -------
  ------------
  -----------------
  ----------------------
  ---------------------------
  --------------------------------
  -- Trying 'Ninja (Visual Studio 17 2022 x64 v143)' generator - success
  --------------------------------------------------------------------------------

  Configuring Project
    Working directory:
      C:\Users\Armaguedin\AppData\Local\Temp\pip-install-elvqhuwr\llama-cpp-python_e6bb6b2469de4f43b17a2fdff5431546\_skbuild\win-amd64-3.10\cmake-build
    Command:
      'C:\Users\Armaguedin\AppData\Local\Temp\pip-build-env-eymc2760\overlay\Lib\site-packages\cmake\data\bin/cmake.exe' 'C:\Users\Armaguedin\AppData\Local\Temp\pip-install-elvqhuwr\llama-cpp-python_e6bb6b2469de4f43b17a2fdff5431546' -G Ninja '-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Users\Armaguedin\AppData\Local\Temp\pip-build-env-eymc2760\overlay\Lib\site-packages\ninja\data\bin\ninja' -D_SKBUILD_FORCE_MSVC=1930 --no-warn-unused-cli '-DCMAKE_INSTALL_PREFIX:PATH=C:\Users\Armaguedin\AppData\Local\Temp\pip-install-elvqhuwr\llama-cpp-python_e6bb6b2469de4f43b17a2fdff5431546\_skbuild\win-amd64-3.10\cmake-install' -DPYTHON_VERSION_STRING:STRING=3.10.9 -DSKBUILD:INTERNAL=TRUE '-DCMAKE_MODULE_PATH:PATH=C:\Users\Armaguedin\AppData\Local\Temp\pip-build-env-eymc2760\overlay\Lib\site-packages\skbuild\resources\cmake' '-DPYTHON_EXECUTABLE:PATH=C:\Users\Armaguedin\.conda\envs\textgen\python.exe' '-DPYTHON_INCLUDE_DIR:PATH=C:\Users\Armaguedin\.conda\envs\textgen\Include' '-DPYTHON_LIBRARY:PATH=C:\Users\Armaguedin\.conda\envs\textgen\libs\python310.lib' '-DPython_EXECUTABLE:PATH=C:\Users\Armaguedin\.conda\envs\textgen\python.exe' '-DPython_ROOT_DIR:PATH=C:\Users\Armaguedin\.conda\envs\textgen' -DPython_FIND_REGISTRY:STRING=NEVER '-DPython_INCLUDE_DIR:PATH=C:\Users\Armaguedin\.conda\envs\textgen\Include' '-DPython_LIBRARY:PATH=C:\Users\Armaguedin\.conda\envs\textgen\libs\python310.lib' '-DPython3_EXECUTABLE:PATH=C:\Users\Armaguedin\.conda\envs\textgen\python.exe' '-DPython3_ROOT_DIR:PATH=C:\Users\Armaguedin\.conda\envs\textgen' -DPython3_FIND_REGISTRY:STRING=NEVER '-DPython3_INCLUDE_DIR:PATH=C:\Users\Armaguedin\.conda\envs\textgen\Include' '-DPython3_LIBRARY:PATH=C:\Users\Armaguedin\.conda\envs\textgen\libs\python310.lib' '-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Users\Armaguedin\AppData\Local\Temp\pip-build-env-eymc2760\overlay\Lib\site-packages\ninja\data\bin\ninja' '"-DLLAMA_CUBLAS=on"' -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_CUBLAS=on

  Not searching for unused variables given on the command line.
  CMake Warning:
    Ignoring extra path from command line:

     ""-DLLAMA_CUBLAS=on""

  -- The C compiler identification is MSVC 19.35.32215.0
  -- The CXX compiler identification is MSVC 19.35.32215.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx86/x64/cl.exe - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx86/x64/cl.exe - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: C:/Program Files/Git/cmd/git.exe (found version "2.38.0.windows.1")
  fatal: not a git repository (or any of the parent directories): .git
  fatal: not a git repository (or any of the parent directories): .git
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:113 (message):
    Git repository not found; to enable automatic generation of build info,
    make sure Git is installed and the project is a Git repository.

  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
  -- Looking for pthread_create in pthreads
  -- Looking for pthread_create in pthreads - not found
  -- Looking for pthread_create in pthread
  -- Looking for pthread_create in pthread - not found
  -- Found Threads: TRUE
  -- Could not find nvcc, please set CUDAToolkit_ROOT.
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:254 (message):
    cuBLAS not found

  -- CMAKE_SYSTEM_PROCESSOR: AMD64
  -- x86 detected
  -- Configuring done (2.2s)
  -- Generating done (0.0s)
  -- Build files have been written to: C:/Users/Armaguedin/AppData/Local/Temp/pip-install-elvqhuwr/llama-cpp-python_e6bb6b2469de4f43b17a2fdff5431546/_skbuild/win-amd64-3.10/cmake-build
  [1/7] Building C object vendor\llama.cpp\CMakeFiles\ggml.dir\k_quants.c.obj
  [2/7] Building C object vendor\llama.cpp\CMakeFiles\ggml.dir\ggml.c.obj
  C:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\um\winbase.h(9531): warning C5105: macro expansion producing 'defined' has undefined behavior
  [3/7] Linking C static library vendor\llama.cpp\ggml_static.lib
  [4/7] Linking C shared library bin\ggml_shared.dll
  [5/7] Building CXX object vendor\llama.cpp\CMakeFiles\llama.dir\llama.cpp.obj
  [6/7] Linking CXX shared library bin\llama.dll
  [7/7] Install the project...
  -- Install configuration: "Release"
  -- Installing: C:/Users/Armaguedin/AppData/Local/Temp/pip-install-elvqhuwr/llama-cpp-python_e6bb6b2469de4f43b17a2fdff5431546/_skbuild/win-amd64-3.10/cmake-install/llama_cpp/llama.lib
  -- Installing: C:/Users/Armaguedin/AppData/Local/Temp/pip-install-elvqhuwr/llama-cpp-python_e6bb6b2469de4f43b17a2fdff5431546/_skbuild/win-amd64-3.10/cmake-install/llama_cpp/llama.dll

  copying llama_cpp\llama.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama.py
  copying llama_cpp\llama_cpp.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_cpp.py
  copying llama_cpp\llama_types.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_types.py
  copying llama_cpp\__init__.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\__init__.py
  creating directory _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server
  copying llama_cpp/server\app.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server\app.py
  copying llama_cpp/server\__init__.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server\__init__.py
  copying llama_cpp/server\__main__.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server\__main__.py

  running bdist_wheel
  running build
  running build_py
  creating _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310
  creating _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_cpp.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_types.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\__init__.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  creating _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\server\app.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\server\__init__.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\server\__main__.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama.lib -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama.dll -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copied 7 files
  running build_ext
  installing to _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel
  running install
  running install_lib
  creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64
  creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel
  creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\llama_cpp
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama.dll -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama.lib -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama_cpp.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama_types.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\llama_cpp\server
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server\app.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp\server
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server\__init__.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp\server
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server\__main__.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp\server
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\__init__.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  copied 9 files
  running install_egg_info
  running egg_info
  writing llama_cpp_python.egg-info\PKG-INFO
  writing dependency_links to llama_cpp_python.egg-info\dependency_links.txt
  writing requirements to llama_cpp_python.egg-info\requires.txt
  writing top-level names to llama_cpp_python.egg-info\top_level.txt
  reading manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
  Copying llama_cpp_python.egg-info to _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp_python-0.1.65-py3.10.egg-info
  running install_scripts
  copied 0 files
  C:\Users\Armaguedin\AppData\Local\Temp\pip-build-env-eymc2760\overlay\Lib\site-packages\wheel\bdist_wheel.py:100: RuntimeWarning: Config variable 'Py_DEBUG' is unset, Python ABI tag may be incorrect
    if get_flag("Py_DEBUG", hasattr(sys, "gettotalrefcount"), warn=(impl == "cp")):
  creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\llama_cpp_python-0.1.65.dist-info\WHEEL
  creating 'C:\Users\Armaguedin\AppData\Local\Temp\pip-wheel-yljuija4\.tmp-pwg3ajr5\llama_cpp_python-0.1.65-cp310-cp310-win_amd64.whl' and adding '_skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel' to it
  adding 'llama_cpp/__init__.py'
  adding 'llama_cpp/llama.dll'
  adding 'llama_cpp/llama.lib'
  adding 'llama_cpp/llama.py'
  adding 'llama_cpp/llama_cpp.py'
  adding 'llama_cpp/llama_types.py'
  adding 'llama_cpp/server/__init__.py'
  adding 'llama_cpp/server/__main__.py'
  adding 'llama_cpp/server/app.py'
  adding 'llama_cpp_python-0.1.65.dist-info/LICENSE.md'
  adding 'llama_cpp_python-0.1.65.dist-info/METADATA'
  adding 'llama_cpp_python-0.1.65.dist-info/WHEEL'
  adding 'llama_cpp_python-0.1.65.dist-info/top_level.txt'
  adding 'llama_cpp_python-0.1.65.dist-info/RECORD'
  removing _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel
  Building wheel for llama-cpp-python (pyproject.toml) ... done
  Created wheel for llama-cpp-python: filename=llama_cpp_python-0.1.65-cp310-cp310-win_amd64.whl size=474294 sha256=10945df15b5436b63093699d4114dbdf775e8afdf29901df2be8fd06b0b56fde
  Stored in directory: C:\Users\Armaguedin\AppData\Local\Temp\pip-ephem-wheel-cache-n6by2_1w\wheels\e0\62\84\21f820209ad725e813c2dd41eeda1a0dcb9184af7022d55c0d
Successfully built llama-cpp-python
Installing collected packages: typing-extensions, numpy, diskcache, llama-cpp-python
  Attempting uninstall: typing-extensions
    Found existing installation: typing_extensions 4.6.3
    Uninstalling typing_extensions-4.6.3:
      Removing file or directory c:\users\armaguedin\.conda\envs\textgen\lib\site-packages\__pycache__\typing_extensions.cpython-310.pyc
      Removing file or directory c:\users\armaguedin\.conda\envs\textgen\lib\site-packages\typing_extensions-4.6.3.dist-info\
      Removing file or directory c:\users\armaguedin\.conda\envs\textgen\lib\site-packages\typing_extensions.py
      Successfully uninstalled typing_extensions-4.6.3
  Attempting uninstall: numpy
    Found existing installation: numpy 1.25.0
    Uninstalling numpy-1.25.0:
      Removing file or directory c:\users\armaguedin\.conda\envs\textgen\lib\site-packages\numpy-1.25.0.dist-info\
      Removing file or directory c:\users\armaguedin\.conda\envs\textgen\lib\site-packages\numpy\
      Removing file or directory c:\users\armaguedin\.conda\envs\textgen\scripts\f2py.exe
      Successfully uninstalled numpy-1.25.0
  Attempting uninstall: diskcache
    Found existing installation: diskcache 5.6.1
    Uninstalling diskcache-5.6.1:
      Removing file or directory c:\users\armaguedin\.conda\envs\textgen\lib\site-packages\diskcache-5.6.1.dist-info\
      Removing file or directory c:\users\armaguedin\.conda\envs\textgen\lib\site-packages\diskcache\
      Successfully uninstalled diskcache-5.6.1
  Attempting uninstall: llama-cpp-python
    Found existing installation: llama-cpp-python 0.1.65
    Uninstalling llama-cpp-python-0.1.65:
      Removing file or directory c:\users\armaguedin\.conda\envs\textgen\lib\site-packages\llama_cpp\
      Removing file or directory c:\users\armaguedin\.conda\envs\textgen\lib\site-packages\llama_cpp_python-0.1.65.dist-info\
      Successfully uninstalled llama-cpp-python-0.1.65
Successfully installed diskcache-5.6.1 llama-cpp-python-0.1.65 numpy-1.25.0 typing-extensions-4.6.3
gjmulder commented 1 year ago

There's an error prior to that warning:

Could not find nvcc, please set CUDAToolkit_ROOT

EugeoSynthesisThirtyTwo commented 1 year ago

I guess I have to add the path to CUDA to %PATH% on Windows. Do you know how I can locate CUDA? Or maybe I'm wrong.

m-from-space commented 1 year ago

You should be able to get it working by setting the variable like this:

pip uninstall -y llama-cpp-python
set CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCUDA_TOOLKIT_ROOT_DIR='C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/'"
set FORCE_CMAKE=1
pip install llama-cpp-python --no-cache-dir
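
If you don't know which toolkit versions are present (the v12.1 path above is just NVIDIA's typical default install location), something like this locates nvcc and any installed CUDA roots (a sketch using only the standard library; adjust the Program Files path if you installed elsewhere):

```python
# Sketch: find nvcc and candidate CUDA toolkit roots on Windows.
# The path below is NVIDIA's default install prefix, not a confirmed location.
import glob
import shutil

print("nvcc on PATH:", shutil.which("nvcc"))
for root in glob.glob(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v*"):
    print("candidate CUDAToolkit_ROOT:", root)
```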
gjmulder commented 1 year ago

Likely a duplicate of #459. You need to fix your CUDA install.