Open: 3dluvr opened this issue 11 months ago
Same error here. Running in WSL also.
Same error, also in WSL2 Ubuntu with an RTX 4080.
Command line:
./build/bin/main -m models/13B/llama-13b-relu.q4.powerinfer.gguf -n 128 -t 8 -p "Once upon a time"
CUDA and CUDA Toolkit version: 12.2.
I don't know whether the CUDA version matters.
Same error, also in WSL2 Ubuntu with an RTX 4090. Why?
Running in WSL, all deps satisfied, most recent code pull, on an RTX 3090.
Command line:
./build/bin/main -m models/7B/llama-7b-relu.powerinfer.gguf -n 128 -t 8 -p "Once upon a time" --vram-budget 12
Log output: