theroyallab / tabbyAPI

An OAI compatible exllamav2 API that's both lightweight and fast
GNU Affero General Public License v3.0

[BUG] Model request doesn't work #165

Closed by Squirreljetpack 2 months ago

Squirreljetpack commented 2 months ago

OS

Linux

GPU Library

CUDA 12.x

Python version

3.12

Describe the bug

Likely this is incompetence rather than an actual bug... but when I try this (tabbyAPI is on port 5001):

curl http://localhost:5001/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi48",
    "prompt": "Once upon a time,",
    "max_tokens": 400,
    "stream": false,
    "min_p": 0.05,
    "repetition_penalty": 1.05
  }'

I get back the response:

{"id":"cmpl-c8cfa51b186b4e2eb40a96f4c88f06c2","choices":[{"index":0,"finish_reason":"length","logprobs":null,"text":" the end of 2013, there was a game called \"Pixel Art\", which is like a game about drawing pictures using \"pixels\".\n\nThe game's main character is a giant robot who loves drawing. One day, he met a cute and curious cat named Pippi. Together with her, they explored many \"pixel art\" scenes around the world.\n\nThis time, the main character's boss asked for his help, because he was busy drawing pictures. He drew some pictures but found that it was hard for him to make them look nice. Therefore, the main character asked his friend for help. His friend was busy too. Soon after that, the main character and his friend both were busy with their tasks.\n\nHowever, there was an important problem. The boss wanted the pictures drawn by them to be nice. But since they were too busy, they had no idea how to make those pictures look nice.\n\nAt last, the end of the year came and everyone had to go home. The main character, his friend, and the boss were all in the same place.\n\nAs soon as they arrived home, the boss thought that he needed to fix all the pictures drawn by them. Then, they went to the same place where they had been busy with their tasks. They gathered together.\n\nAfter a long time, they finally gathered all the pictures drawn by the main character and his friend.\n\nBut they didn't know how to make those pictures look nice. They needed someone to help them.\n\nAnd suddenly, the end of the year arrived. The main character and his friend went back to work, and the boss stayed with them. The main character and his friend helped the boss to make the pictures look nice.\n\nAfter that, the main character and his friend went back to work. The boss stayed with them to finish the pictures"}],"created":1723492974,"model":"dolphinstar","object":"text_completion","usage":{"prompt_tokens":6,"completion_tokens":400,"total_tokens":406}}

The model did not change. Is this because I need an admin key? EDIT: I tested with an admin key and it still does not work. I also tried:

curl http://localhost:5001/v1/model/list -H "Authorization: Bearer b1a658421f69bc17e77aaece91f6f81f"
{"object":"list","data":[{"id":"phi48","object":"model","created":1723500906,"owned_by":"tabbyAPI","logging":null,"parameters":null}]}

And it seems tabbyAPI is only reading one model, even though I have two in the directory, and I can switch between them by editing the model specified in the config.
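The mismatch in the first response can be checked programmatically rather than by eye. Below is a minimal Python sketch that compares the requested model with the "model" field of the completion response; the helper name served_model_matches is made up for illustration, and the response is trimmed to the fields that matter.

```python
import json


def served_model_matches(request_body: dict, response_text: str) -> bool:
    """Return True if the response's "model" field equals the requested model."""
    response = json.loads(response_text)
    return response.get("model") == request_body.get("model")


# Trimmed copy of the exchange quoted in this report.
request_body = {"model": "phi48", "prompt": "Once upon a time,", "max_tokens": 400}
response_text = '{"object": "text_completion", "model": "dolphinstar", "choices": []}'

# False here: the request asked for phi48 but dolphinstar answered.
print(served_model_matches(request_body, response_text))
```

A check like this makes it obvious when a server answers with whatever model is already loaded instead of the one named in the request.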

Reproduction steps

Running with this docker image: squirreljetpacks/exl2:latest. All defaults in config.yml, except:

disable_auth: True
model:
  # Overrides the directory to look for models (default: models)
  model_name: dolphinstar

Expected behavior

Model changes

Logs

No response

Additional context

No response


DocShotgun commented 2 months ago

Several issues here:

  1. TabbyAPI does not handle the model arg in completion requests, so the request is answered by whichever model is currently loaded. This PR attempts to add such functionality, but it comes with some caveats.
  2. Regarding /v1/model/list only showing a single model: a fairly recent change restricts viewing the list of models in the directory to admins, for security reasons. Try this one again with an admin key.
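
For the second point, here is a minimal Python sketch of the retry, assuming the same port and the Authorization: Bearer header used in the curl call earlier in this issue. YOUR_ADMIN_KEY is a placeholder and the helper names are made up for illustration. The script only builds the request and parses a payload shaped like the one quoted earlier; nothing is sent unless the urlopen lines are uncommented.

```python
import json
import urllib.request


def build_model_list_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for /v1/model/list."""
    return urllib.request.Request(
        f"{base_url}/v1/model/list",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def model_ids(payload_text: str) -> list[str]:
    """Extract the model ids from a /v1/model/list style response body."""
    payload = json.loads(payload_text)
    return [entry["id"] for entry in payload.get("data", [])]


req = build_model_list_request("http://localhost:5001", "YOUR_ADMIN_KEY")

# To actually send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(model_ids(resp.read().decode("utf-8")))

# Payload shape copied from the response quoted earlier in this issue.
sample = '{"object":"list","data":[{"id":"phi48","object":"model"}]}'
print(model_ids(sample))
```

If the key really is an admin key, the list returned by the server should contain every model in the directory rather than only the loaded one.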

Squirreljetpack commented 2 months ago

OK, 2 was user error on my part: it was an admin API key, but I didn't know that environment variables override values from .env in Docker Compose files, and that messed up my Docker Compose setup. As for 1, I'll take a look at it, thanks!
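
The precedence behind that mistake can be sketched in a few lines of Python. This is a simplified model, not Docker Compose's actual code: when Compose substitutes a variable, a value already set in the shell environment wins over the same name in the .env file. The variable name TABBY_ADMIN_KEY and both values are hypothetical.

```python
def resolve_compose_variable(name, shell_env, dotenv, default=None):
    """Simplified precedence model: a value set in the shell wins over .env."""
    if name in shell_env:
        return shell_env[name]
    return dotenv.get(name, default)


# Hypothetical variable name and values, for illustration only.
dotenv = {"TABBY_ADMIN_KEY": "key-from-dotenv"}
shell_env = {"TABBY_ADMIN_KEY": "stale-key-from-shell"}

# The stale shell value shadows the one written in .env.
print(resolve_compose_variable("TABBY_ADMIN_KEY", shell_env, dotenv))

# With nothing exported in the shell, the .env value is used.
print(resolve_compose_variable("TABBY_ADMIN_KEY", {}, dotenv))
```

Running docker compose config prints the resolved file, which is a quick way to see which value actually reached the container.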