Status: Open · its-ven opened this issue 5 months ago
I couldn't reproduce. Could you consider trying a tighter integration via https://github.com/outlines-dev/outlines/blob/main/docs/reference/models/llamacpp.md
I've already tried that, but the llama.cpp library integration is much slower than running an OAI-compatible proxy. I'm launching the server via this batch file:
rem Start the OpenAI-compatible proxy, forwarding requests to the llama.cpp server
start /B python oai_api.py --llama-api http://localhost:8080
rem Start the llama.cpp server (lock model in RAM, offload 35 layers to GPU, 4096-token context)
start /B server --mlock -ngl 35 -m mistral-7b-instruct-v0.2.Q5_K_M.gguf -c 4096
pause
Describe the issue as clearly as possible:
I'm using the api_like_OAI.py script from the llama.cpp repo, which works fine with the official OpenAI Python library. The code below even calls the server correctly:
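The snippet referenced here did not survive in the report. A hypothetical reconstruction of a chat-completions call against the proxy might look like the following; the port (8081, api_like_OAI.py's default) and the model name are assumptions, not taken from the report:

```python
# Hypothetical reconstruction; port and model name are assumed.
BASE_URL = "http://localhost:8081"

def build_chat_request(prompt: str, model: str = "gpt-3.5-turbo") -> dict:
    """Build the JSON body that /v1/chat/completions expects."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("Say hello")

# Sending it requires the proxy from the batch file above to be running:
#   import json, urllib.request
#   req = urllib.request.Request(
#       f"{BASE_URL}/v1/chat/completions",
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   print(urllib.request.urlopen(req).read().decode())
```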
Adding a print statement inside the tokenizer function shows it being called in an endless loop, regardless of model name, including official ones like gpt-4 and gpt-3.5-turbo:
As an additional test, I attempted to use the default OpenAI model by setting:
os.environ["OPENAI_BASE_URL"] = "http://localhost:8081"
which just returns a connection error and no activity from the server.

Steps/code to reproduce the bug:
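The reproduction code itself is missing from the report. A minimal sketch of the setup, under stated assumptions (ports from the batch file, a dummy API key since the local proxy does not check it, and a `/v1` suffix that openai>=1.0 clients generally expect on a custom base URL), could be:

```python
import os

# Assumptions: proxy on port 8081 (api_like_OAI.py default); dummy key because
# the local proxy ignores it; "/v1" suffix because openai>=1.0 clients resolve
# routes like /chat/completions relative to the base URL.
os.environ["OPENAI_API_KEY"] = "sk-local-dummy"
os.environ["OPENAI_BASE_URL"] = "http://localhost:8081/v1"

# The outlines call from the report would then follow, e.g. (API surface of
# outlines 0.0.25, not verified here):
#   import outlines
#   model = outlines.models.openai("gpt-3.5-turbo")
```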
Expected result:
Error message:
Outlines/Python version information:
Outlines version: 0.0.25
Python version: 3.10.6
Context for the issue:
No response