Closed · yesbroc closed this issue 1 year ago
For llama.cpp, there's a flag called `--gpu-layers N` that basically offloads some layers to the GPU for processing.
From ooba (text-generation-webui).
Since CPU-only inference is super slow, GPU support would be nice.
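For context, a rough sketch of how that kind of flag is passed in each project; the exact spelling varies by version (llama.cpp's example binary calls it `-ngl`/`--n-gpu-layers`, while text-generation-webui exposes `--gpu-layers`), and the model path here is just a placeholder:

```shell
# Hypothetical invocations; flag names and paths depend on the installed version.
# llama.cpp example binary: offload 35 layers to the GPU
./main -m ./models/model.bin --n-gpu-layers 35 -p "Hello"

# text-generation-webui (ooba): equivalent option
python server.py --gpu-layers 35
```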
You have to set some environment variables before installing llama-cpp-python so that it gets compiled with cuBLAS support. Follow the instructions here: https://pypi.org/project/llama-cpp-python/
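Roughly, the install the linked page described at the time looked like this (the exact CMake flag has changed across llama-cpp-python versions, so check the current instructions):

```shell
# Rebuild llama-cpp-python from source with cuBLAS (GPU) support.
# FORCE_CMAKE=1 forces a source build instead of a prebuilt wheel;
# --no-cache-dir avoids reusing a previously built CPU-only wheel.
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir
```

Once compiled with GPU support, layers can be offloaded by passing `n_gpu_layers=N` when constructing `Llama(...)` in Python.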