dadupriv opened this issue 2 months ago (status: Open)
$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
Solved! Replace -DLLAMA_CUBLAS=on with -DGGML_CUDA=on
Is there a certain way I need to launch this? I launch using the approach from https://github.com/zylon-ai/private-gpt/issues/2083
After running '$env:CMAKE_ARGS='-GGML_CUDA=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0', it still uses my CPU instead of my GPU.
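One thing worth checking in the failing command above: CMake cache definitions need the -D prefix, and '-GGML_CUDA=on' is missing it (CMake would read a leading -G as a generator selection, not a definition). A minimal sketch of such a sanity check, using a hypothetical check_cmake_args helper (not part of llama-cpp-python or CMake):

```python
# Hypothetical helper: flag CMAKE_ARGS tokens that look like cache
# definitions (VAR=value) but lack the -D prefix, e.g. the
# '-GGML_CUDA=on' from the command above.
def check_cmake_args(args: str) -> list[str]:
    problems = []
    for tok in args.split():
        if "=" in tok and not tok.startswith("-D"):
            problems.append(f"{tok!r}: cache definitions need the -D prefix")
    return problems

print(check_cmake_args("-GGML_CUDA=on"))   # one problem reported
print(check_cmake_args("-DGGML_CUDA=on"))  # []
```

With the -D prefix missing, CMake never defines GGML_CUDA, so the wheel builds CPU-only even though the install succeeds.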
Paste the entire line into the terminal and press Enter:
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
Pre-check
Description
Windows OS: all CUDA requirements are installed (g++ 14). Running PrivateGPT, but it only uses the CPU, not the GPU.
CUDA:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.94                 Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090      WDDM  |   00000000:03:00.0 Off |                  N/A |
|  0%   37C    P8             21W / 350W  |      47MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090      WDDM  |   00000000:04:00.0 Off |                  N/A |
|  0%   45C    P8             31W / 350W  |     340MiB /  24576MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                             GPU Memory  |
|        ID   ID                                                              Usage       |
|=========================================================================================|
|    0   N/A  N/A     14620    C+G   ...crosoft\Edge\Application\msedge.exe   N/A         |
|    1   N/A  N/A      9456    C+G   C:\Windows\explorer.exe                  N/A         |
|    1   N/A  N/A     10884    C+G   ...2txyewy\StartMenuExperienceHost.exe   N/A         |
|    1   N/A  N/A     12132    C+G   ....Search_cw5n1h2txyewy\SearchApp.exe   N/A         |
|    1   N/A  N/A     14668    C+G   ...CBS_cw5n1h2txyewy\TextInputHost.exe   N/A         |
|    1   N/A  N/A     17180    C+G   ...am Files (x86)\VideoLAN\VLC\vlc.exe   N/A         |
|    1   N/A  N/A     18792    C+G   ...5n1h2txyewy\ShellExperienceHost.exe   N/A         |
+-----------------------------------------------------------------------------------------+
I have searched, but I cannot compile llama.cpp with CUDA; the problem is below.
Anaconda Powershell
PS C:\Users\XXXXX>
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [31 lines of output]
    scikit-build-core 0.10.6 using CMake 3.30.3 (wheel)
    Configuring CMake...
    2024-09-11 10:14:43,243 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
    loading initial cache file C:\Users\nasdadu\AppData\Local\Temp\tmp2efzwb2l\build\CMakeInit.txt
    -- Building for: Visual Studio 17 2022
    -- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.19045.
    -- The C compiler identification is MSVC 19.35.32217.1
    -- The CXX compiler identification is MSVC 19.35.32217.1
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Found Git: C:/Users/nasdadu/pinokio/bin/miniconda/Library/bin/git.exe (found version "2.42.0.windows.1")
    CMake Error at vendor/llama.cpp/CMakeLists.txt:95 (message):
      LLAMA_CUBLAS is deprecated and will be removed in the future.
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
[notice] A new release of pip is available: 23.3.1 -> 24.2
[notice] To update, run: python.exe -m pip install --upgrade pip
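The CMake error above is the root cause: the LLAMA_CUBLAS option has been deprecated in llama.cpp, so the flag needs to be renamed before rebuilding. A minimal sketch of that rename (migrate_cmake_args is a hypothetical helper; the LLAMA_CUBLAS -> GGML_CUDA mapping is taken from the deprecation error and the fix given earlier in this thread):

```python
import re

# Hypothetical helper: rewrite the deprecated llama.cpp CMake flag to its
# current name before passing CMAKE_ARGS to pip. Only the rename reported
# in the build log above is included.
RENAMES = {"LLAMA_CUBLAS": "GGML_CUDA"}

def migrate_cmake_args(args: str) -> str:
    for old, new in RENAMES.items():
        args = re.sub(rf"-D{old}(?==)", f"-D{new}", args)
    return args

print(migrate_cmake_args("-DLLAMA_CUBLAS=on"))  # -DGGML_CUDA=on
```

Flags that are already current pass through unchanged, so running the helper on a fixed command line is harmless.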
Steps to Reproduce
Windows OS: enter the following command in PowerShell:
$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
Expected Behavior
Expected BLAS=1 with GPU usage
Actual Behavior
Output BLAS=0 only CPU usage
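The BLAS=0/BLAS=1 indicator comes from the system_info line llama.cpp prints at startup. A quick sketch of checking it programmatically (blas_enabled is a hypothetical helper, and the log-line format is an assumption based on typical llama.cpp output):

```python
import re

# Hypothetical helper: return True if llama.cpp's startup system_info line
# reports BLAS = 1 (GPU/BLAS-accelerated build), False for BLAS = 0
# (CPU-only build, as seen in this issue).
def blas_enabled(log: str) -> bool:
    m = re.search(r"BLAS\s*=\s*(\d)", log)
    return m is not None and m.group(1) == "1"

cpu_log = "system_info: n_threads = 8 | AVX = 1 | BLAS = 0 | SSE3 = 1 |"
print(blas_enabled(cpu_log))  # False
```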
Environment
Windows 10 19045.4780 RTX 3090
Additional Information
No response
Version
No response
Setup Checklist
NVIDIA GPU Setup Checklist
- [ ] CUDA is installed (run nvidia-smi to verify)
- [ ] The GPU is visible inside Docker (run sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi to verify)