Closed Squirreljetpack closed 2 months ago
Several issues here:

1. `model` arg in completion requests. This PR attempts to add such functionality, but it comes with some caveats.
2. `/v1/model/list` only showing a single model available - this is because a somewhat recent change was made to only allow admins to view the list of models in the directory for security reasons. Try this one again with an admin key.

Ok, my user error with 2: it was an admin API key, but I didn't know env vars overrode variables from .env in docker compose files, so that messed up my docker compose. As for 1, I'll take a look at it, thanks!
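For retrying the model-list check with an admin key, a minimal Python sketch might look like the following. It is only an illustration: the `x-admin-key` header name and the OpenAI-style `{"data": [{"id": ...}]}` response shape are assumptions not confirmed in this thread, and `build_model_list_request`, `model_ids`, and `list_models` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # port taken from the issue report


def build_model_list_request(admin_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request for the model list."""
    return urllib.request.Request(
        f"{base_url}/v1/model/list",
        headers={"x-admin-key": admin_key},  # assumed header name
        method="GET",
    )


def model_ids(payload: dict) -> list[str]:
    """Extract model ids, assuming an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(admin_key: str) -> list[str]:
    """Send the request and return the model ids the server reports."""
    with urllib.request.urlopen(build_model_list_request(admin_key)) as resp:
        return model_ids(json.load(resp))
```

Calling `list_models(...)` once with a regular key and once with an admin key would show whether the single-model result comes from the permission change described above. When running under docker compose, it may also be worth confirming which key value the container actually received, since the shell environment can override `.env`.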
OS
Linux
GPU Library
CUDA 12.x
Python version
3.12
Describe the bug
Likely this is incompetence rather than an actual bug... but when I try (tabbyapi is on 5001)

    curl http://localhost:5001/v1/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "phi48",
        "prompt": "Once upon a time,",
        "max_tokens": 400,
        "stream": false,
        "min_p": 0.05,
        "repetition_penalty": 1.05
      }'

I get back the response
The model did not change. Is this because I need an admin key? EDIT: tested with an admin key and also not working. Okay I also tried
And it seems tabbyAPI is only reading one model, even though I have two in the directory, and I can switch between them by editing the model specified in the config.
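The check described above (send a completion with a `model` value, then see whether that model actually served it) could be scripted roughly as follows. This is a sketch under assumptions: the `x-api-key` header is not part of the original curl command and is assumed here, as is the presence of a `model` field in the completion response; `build_completion_request` and `served_by` are hypothetical names.

```python
import json
import urllib.request


def build_completion_request(
    model: str, api_key: str, base_url: str = "http://localhost:5001"
) -> urllib.request.Request:
    """Build (but do not send) the completion request from the report."""
    body = {
        "model": model,
        "prompt": "Once upon a time,",
        "max_tokens": 400,
        "stream": False,
        "min_p": 0.05,
        "repetition_penalty": 1.05,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # assumed header name
        },
        method="POST",
    )


def served_by(response: dict, requested: str) -> bool:
    """True if the response names the requested model (assumes a "model" field)."""
    return response.get("model") == requested
```

Passing the request to `urllib.request.urlopen`, decoding the body with `json.load`, and feeding the result to `served_by(..., "phi48")` would give a yes/no answer on whether the model changed.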
Reproduction steps
Running with this docker image: squirreljetpacks/exl2:latest
All defaults in config.yml, except
Expected behavior
Model changes
Logs
No response
Additional context
No response