Closed · yesbroc closed this issue 1 year ago
For llama.cpp, there's a flag called `--gpu-layers N` that basically offloads some layers to the GPU for processing.
From ooba (text-generation-webui).
Since CPU-only inference is super slow, GPU support would be nice.
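For context, a rough sketch of how that kind of flag is passed in each project; the exact spelling varies by version (llama.cpp's example binary calls it `-ngl`/`--n-gpu-layers`, while text-generation-webui exposes `--gpu-layers`), and the model path here is just a placeholder:

```shell
# Hypothetical invocations; flag names and paths depend on the installed version.
# llama.cpp example binary: offload 35 layers to the GPU
./main -m ./models/model.bin --n-gpu-layers 35 -p "Hello"

# text-generation-webui (ooba): equivalent option
python server.py --gpu-layers 35
```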
You have to set some environment variables before installing llama-cpp-python so that it gets compiled with cuBLAS support. Follow the instructions here: https://pypi.org/project/llama-cpp-python/
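Roughly, the install the linked page described at the time looked like this (the exact CMake flag has changed across llama-cpp-python versions, so check the current instructions):

```shell
# Rebuild llama-cpp-python from source with cuBLAS (GPU) support.
# FORCE_CMAKE=1 forces a source build instead of a prebuilt wheel;
# --no-cache-dir avoids reusing a previously built CPU-only wheel.
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir
```

Once compiled with GPU support, layers can be offloaded by passing `n_gpu_layers=N` when constructing `Llama(...)` in Python.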