keldenl / gpt-llama.cpp

A llama.cpp drop-in replacement for OpenAI's GPT endpoints, allowing GPT-powered apps to run off local llama.cpp models instead of OpenAI.
MIT License

llama.cpp GPU support #46

Open alexl83 opened 1 year ago

alexl83 commented 1 year ago

Hi, since commit 905d87b70aa189623d500a28602d7a3a755a4769, llama.cpp supports GPU inference with NVIDIA CUDA via command-line switches like `--gpu-layers`. Could you please consider adding support to gpt-llama.cpp as well?

Thank you!

msj121 commented 1 year ago

@alexl83 Just looking at the code: since you compile llama.cpp yourself, it appears you can build it with CUDA support and then pass the argument through like any of the others listed, such as threads.

i.e. `npm start ngl 4` to offload 4 layers to the GPU. I don't have a compatible setup to test, though.
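For reference, a rough sketch of what this looks like on the llama.cpp side: build with cuBLAS enabled, then pass the GPU-layers flag at run time. The model path is a placeholder, and the `LLAMA_CUBLAS=1` make flag reflects llama.cpp builds from around the commit referenced above, so double-check against the current llama.cpp README:

```shell
# Sketch, not verified on this setup: build llama.cpp with CUDA (cuBLAS)
# support. Requires the CUDA toolkit (nvcc) to be installed and on PATH.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make clean
make LLAMA_CUBLAS=1

# Standalone llama.cpp run offloading 4 layers to the GPU;
# the model path below is a placeholder, not a real file.
./main -m ./models/your-model.bin -p "Hello" --n-gpu-layers 4
```

If gpt-llama.cpp forwards extra arguments to the llama.cpp binary, the same `--n-gpu-layers` flag should take effect without any server-side changes.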