Expected Behavior
I tried to build the project and run a simple demo as described in the README.
Current Behavior
It built successfully, but an error was reported when I tried to run the demo.
offload_ffn_split: applying augmentation to model - please wait ...
CUDA error 222 at /home/test/test06/jdz/PowerInfer/ggml-cuda.cu:9635: the provided PTX was compiled with an unsupported toolchain.
current device: 0
Environment and Context
SDK version, e.g. for Linux:
Python 3.10.14
cmake version 3.30.1
g++ (conda-forge gcc 11.4.0-13) 11.4.0
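One thing missing above is the NVIDIA driver version, which is likely relevant here: error 222 ("the provided PTX was compiled with an unsupported toolchain") usually means the driver's PTX JIT is older than the toolkit that compiled the code. A quick sanity check I can run, assuming the Linux minimum driver for CUDA 12.4 is 550.54.14 (per NVIDIA's release notes; please correct me if that's wrong) and using a placeholder driver version in place of the real one reported by nvidia-smi:

```shell
# Hypothetical check: is the installed driver new enough for CUDA 12.4 PTX?
# 535.104.05 is a PLACEHOLDER; substitute the value from `nvidia-smi`.
driver=535.104.05
required=550.54.14   # assumed Linux minimum driver for CUDA 12.4
oldest=$(printf '%s\n' "$required" "$driver" | sort -V | head -n1)
if [ "$oldest" = "$required" ]; then
    echo "driver meets the CUDA 12.4 minimum"
else
    echo "driver is older than the CUDA 12.4 minimum"
fi
```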
Failure Information (for bugs)
I followed the README: I built the project with cmake -S . -B build -DLLAMA_CUBLAS=ON and cmake --build build --config Release, and then ran:
./build/bin/main -m /home/test/test06/jdz/PLMs/ReluLLaMA-7B/llama-7b-relu.powerinfer.gguf -n 128 -t 8 -p "Once upon a time" --vram-budget 8
I got the following log output:
Log start
<!--skip some log-->
offload_ffn_split: applying augmentation to model - please wait ...
CUDA error 222 at /home/test/test06/jdz/PowerInfer/ggml-cuda.cu:9635: the provided PTX was compiled with an unsupported toolchain.
current device: 0
Steps to Reproduce
Just follow the README.
Additional info
After running cmake -S . -B build -DLLAMA_CUBLAS=ON I got the following:
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/test/test06/miniconda3/envs/jdz/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/test/test06/miniconda3/envs/jdz/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
-- Found CUDAToolkit: /home/test/test06/cuda-12.4/targets/x86_64-linux/include (found version "12.4.131")
-- cuBLAS found
-- The CUDA compiler identification is NVIDIA 12.4.131
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /home/test/test06/cuda-12.4/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Using CUDA architectures: 52;61;70
GNU ld (GNU Binutils) 2.40
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (17.3s)
-- Generating done (4.5s)
-- Build files have been written to: /home/test/test06/jdz/PowerInfer/build
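Since the build targets architectures 52;61;70, the binary likely carries PTX that the driver must JIT-compile for my GPU, which is where a driver/toolkit mismatch would surface. One workaround I plan to try (a guess, not yet verified) is compiling only for the local GPU's native architecture so no 12.4-generated PTX needs JIT compilation; CMake 3.24+ (I have 3.30.1) supports the native keyword:

```shell
# Reconfigure for the local GPU's architecture only (CMake >= 3.24),
# then rebuild. This avoids forward-compatible PTX entirely.
cmake -S . -B build -DLLAMA_CUBLAS=ON -DCMAKE_CUDA_ARCHITECTURES=native
cmake --build build --config Release
```

If the real cause is an outdated driver, updating the driver would presumably be the proper fix.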