Closed · Arian-D closed this issue 5 months ago
Hey, cheers for the suggestion. I've pushed a change that allows this via the `OPENAI_ENDPOINT`
env var, so as long as the server is fully OpenAI-compatible it should work.
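For anyone else self-hosting, a minimal sketch of what this could look like — note the endpoint URL, port, and the `HANDLER` variable are assumptions for illustration, not confirmed helix-gpt settings; check the project README for the exact variables it reads:

```shell
# Hypothetical setup: point helix-gpt at a local OpenAI-compatible server.
# The path /v1/chat/completions and port 8080 are assumptions; adjust them
# to match whatever your llama.cpp / llamafile server actually exposes.
export OPENAI_ENDPOINT="http://localhost:8080/v1/chat/completions"
export HANDLER=openai   # assumed handler name, verify against the docs
hx                      # start Helix with helix-gpt configured as usual
```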
I tried llama.cpp a while back and didn't get great results, but let me know how you get on.
FYI that change is available in 0.4 https://github.com/leona/helix-gpt/releases/tag/0.4
Thank you for the quick change!
I learned that llama.cpp's own server implementation is not fully compatible with OpenAI's API, but llama-cpp-python's server is.
As you noted, it wasn't great (especially on my old thinkpad), and the requests timed out, but they were valid. I'll try it again on a beefier machine and will report back. Until then I'll suffer with gptel on Emacs :joy:.
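For reference, a rough sketch of spinning up llama-cpp-python's OpenAI-compatible server — the model filename is a placeholder, and flags beyond `--model` may vary by version, so treat this as a starting point rather than exact instructions:

```shell
# Hypothetical: install and run llama-cpp-python's OpenAI-compatible server.
# ./model.gguf is a placeholder path; substitute your own GGUF model file.
pip install 'llama-cpp-python[server]'
python -m llama_cpp.server --model ./model.gguf
# The server then speaks the OpenAI chat-completions API on localhost,
# which is what lets clients like helix-gpt talk to it unmodified.
```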
First of all, thank you for your work on this!
There are server implementations for llama.cpp and llamafile, and I use them very frequently. Their APIs happen to be compatible with the OpenAI API.
It would be very helpful to be able to set the base URI with an environment variable for people who self-host llama.cpp or llamafile.
P.S. I noticed the base URI is hardcoded. I'm not very knowledgeable in TS, but if I find the time I might open a PR.