Closed · Arian-D closed this issue 5 months ago
Hey, cheers for the suggestion. I've pushed a change that allows this via the `OPENAI_ENDPOINT`
env var, so as long as the server is fully OpenAI-compatible it should work.
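For anyone else self-hosting, a minimal sketch of what this could look like — note the endpoint URL, port, and the `HANDLER` variable are assumptions for illustration, not confirmed helix-gpt settings; check the project README for the exact variables it reads:

```shell
# Hypothetical setup: point helix-gpt at a local OpenAI-compatible server.
# The path /v1/chat/completions and port 8080 are assumptions; adjust them
# to match whatever your llama.cpp / llamafile server actually exposes.
export OPENAI_ENDPOINT="http://localhost:8080/v1/chat/completions"
export HANDLER=openai   # assumed handler name, verify against the docs
hx                      # start Helix with helix-gpt configured as usual
```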
I tried llama.cpp a while back and didn't get great results, but let me know how you get on.
FYI that change is available in 0.4 https://github.com/leona/helix-gpt/releases/tag/0.4
Thank you for the quick change!
I learned that llama.cpp's own server implementation is not fully compatible with OpenAI's API, but llama-cpp-python's server is.
As you noted, it wasn't great (especially on my old thinkpad), and the requests timed out, but they were valid. I'll try it again on a beefier machine and will report back. Until then I'll suffer with gptel on Emacs :joy:.
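For reference, a rough sketch of spinning up llama-cpp-python's OpenAI-compatible server — the model filename is a placeholder, and flags beyond `--model` may vary by version, so treat this as a starting point rather than exact instructions:

```shell
# Hypothetical: install and run llama-cpp-python's OpenAI-compatible server.
# ./model.gguf is a placeholder path; substitute your own GGUF model file.
pip install 'llama-cpp-python[server]'
python -m llama_cpp.server --model ./model.gguf
# The server then speaks the OpenAI chat-completions API on localhost,
# which is what lets clients like helix-gpt talk to it unmodified.
```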
First of all, thank you for your work on this!
There are server implementations for llama.cpp and llamafile, and I use them very frequently. Their APIs happen to be compatible with the OpenAI API.
It would be very helpful to be able to set the base URI with an environment variable for people who self-host llama.cpp or llamafile.
P.S. I noticed the base URI is hardcoded. I'm not very knowledgeable in TS, but if I find the time I might open a PR.