Closed thiswillbeyourgithub closed 4 months ago
Hi,
Reading your articles made me really curious to try this, but I was wondering whether it would be possible to use HuggingFace's quantized models, or even llama.cpp, or whether that would require deep changes.
Thanks!
Working on a llama.cpp implementation!
There's now a PR live on the llama.cpp repo: https://github.com/ggerganov/llama.cpp/pull/5970
That PR is merged, so closing this issue.