zilunzhang opened this issue 10 months ago
I encountered a similar problem and solved it by adding -DCMAKE_CUDA_ARCHITECTURES=native to the CMake options.
Specifically, I re-ran the build step:
cmake -S . -B build -DLLAMA_CUBLAS=ON -DCMAKE_CUDA_ARCHITECTURES=native
cmake --build build --config Release
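As a quick sanity check (assuming your build directory is named build, as in the commands above), you can confirm which architecture value CMake actually cached after configuring:
grep CMAKE_CUDA_ARCHITECTURES build/CMakeCache.txt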
This problem occurs because the compilation does not use the correct CUDA architecture flag. You need to check your graphics card's Compute Capability on NVIDIA's official website; for example, my NVIDIA GeForce RTX 3070 corresponds to 8.6, so the architecture flag is 86, and I just needed to re-run the following commands:
cmake -S . -B build -DLLAMA_CUBLAS=ON -DCMAKE_CUDA_ARCHITECTURES=86
cmake --build build --config Release
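If you are unsure of your card's Compute Capability, recent NVIDIA drivers can report it directly (this query may not be available on older drivers, in which case the list at https://developer.nvidia.com/cuda-gpus is the reference):
nvidia-smi --query-gpu=name,compute_cap --format=csv
On an RTX 3070 this should print something like "NVIDIA GeForce RTX 3070, 8.6", which maps to the architecture flag 86.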
Prerequisites
Before submitting your issue, please ensure the following:
Expected Behavior
Run the LLaMA-2 7B model on a dual RTX 3090 machine (using "export CUDA_VISIBLE_DEVICES=0" to force PowerInfer to run on a single card).
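For reference, a minimal sketch of such a single-card run, assuming a llama.cpp-style build layout; the model path is a placeholder and the exact PowerInfer flags may differ:
export CUDA_VISIBLE_DEVICES=0   # expose only the first 3090 to the process
./build/bin/main -m /path/to/llama-2-7b.powerinfer.gguf -n 128 -p "Once upon a time"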
Current Behavior
Environment and Context
Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except under certain specific conditions.
$ lscpu
$ uname -a
Failure Information (for bugs)
Please help provide information about the failure / bug.
Steps to Reproduce
Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.
A related issue might be relevant.