SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MIT License

The CUDA compiler identification is unknown And PowerInfer was compiled without cuBLAS #157

Open LHQUer opened 8 months ago

LHQUer commented 8 months ago

Prerequisites

Before submitting your question, please ensure the following:

Question Details

Why is there an error "PowerInfer was compiled without cuBLAS. It is not possible to set a VRAM budget"? And how do I solve it?

Additional Context

The outputs of the commands are as follows:

(1) First, running the command:

```shell
./build/bin/main -m ./ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.powerinfer.gguf -n 256 -t 8 -p "How does math work?" --vram-budget 4
```

produces this warning before loading the model:

```
warning: PowerInfer was compiled without cuBLAS. It is not possible to set a VRAM budget.
Log start
main: build = 1569 (7b09717)
main: built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
main: seed = 1709190866
llama_model_loader: loaded meta data with 18 key-value pairs and 355 tensors from ./ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.powerinfer.gguf (version GGUF V3 (latest))
```

Therefore, I recompiled the code to check whether the previous build was misconfigured.

(2) Second, re-running the configure step:

```shell
cmake -S . -B build -DLLAMA_CUBLAS=ON
```

fails with:

```
(PowerInfer) luohuaqin@labserver:~/PowerInfer$ cmake -S . -B build -DLLAMA_CUBLAS=ON
-- cuBLAS found
-- The CUDA compiler identification is unknown
CMake Error at CMakeLists.txt:258 (enable_language):
  No CMAKE_CUDA_COMPILER could be found.

  Tell CMake where to find the compiler by setting either the environment
  variable "CUDACXX" or the CMake cache entry CMAKE_CUDA_COMPILER to the full
  path to the compiler, or to the compiler name if it is in the PATH.

-- Configuring incomplete, errors occurred!
```
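As the error message says, CMake needs to be told where the CUDA compiler (`nvcc`) lives. A minimal sketch of the two options it mentions, assuming the toolkit is installed under `/usr/local/cuda` (adjust to your actual install path):

```shell
# Assumption: nvcc is at /usr/local/cuda/bin/nvcc.

# Option 1: set the CUDACXX environment variable that CMake checks.
export CUDACXX=/usr/local/cuda/bin/nvcc

# Option 2: pass the compiler path on the configure line instead:
#   cmake -S . -B build -DLLAMA_CUBLAS=ON \
#         -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc
```

Either option lets `enable_language(CUDA)` identify the compiler; delete the `build` directory (or its `CMakeCache.txt`) before reconfiguring so the stale cache is not reused.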

LHQUer commented 8 months ago

OK, maybe I have found the solution to the above question. I asked GPT to analyze the output, and it suggested a possible fix: adding the required CUDA directories to the environment:

```shell
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```

After that, I can successfully configure and build:

```shell
cmake -S . -B build -DLLAMA_CUBLAS=ON
cmake --build build --config Release
```

and run the model inference command:

```shell
./build/bin/main -m ./ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.powerinfer.gguf -n 256 -t 8 -p "How does math work?" --vram-budget 4
```

using 8 RTX 3090 GPUs.
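For reference, a sketch of making those environment changes persistent and verifying them before reconfiguring. The `/usr/local/cuda` prefix is an assumption; versioned installs may live at paths like `/usr/local/cuda-12.1`:

```shell
# Assumption: CUDA Toolkit installed at /usr/local/cuda.
CUDA_HOME=/usr/local/cuda

# Prepend the CUDA binaries and libraries to the current shell's search paths.
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"

# To persist across shells, append the same export lines to ~/.bashrc.
# Verify that nvcc is now discoverable before re-running CMake:
#   command -v nvcc && nvcc --version
```

Note that exports made in one terminal session do not survive a new login, which is a common reason the same CMake error reappears later.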