Open ishaan-jaff opened 1 year ago
Relevant code: https://github.com/oobabooga/text-generation-webui/blob/main/api-examples/api-example-chat-stream.py
Initial thoughts on mapping:
custom_llm_provider = "oobabooga"
user_input -> prompt
max_new_tokens -> max_tokens
URI -> API_BASE
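As a very rough sketch of what that mapping could look like in practice (not an actual implementation; the helper name and anything beyond user_input / max_new_tokens is illustrative, loosely following api-example-chat-stream.py, with api_base standing in for the script's URI):

```python
# Rough sketch of the proposed param mapping, not litellm's real code.
def build_oobabooga_request(prompt: str, max_tokens: int, **generation_params) -> dict:
    """Translate litellm-style params into an oobabooga-style request body."""
    return {
        "user_input": prompt,            # litellm prompt      -> ooba user_input
        "max_new_tokens": max_tokens,    # litellm max_tokens  -> ooba max_new_tokens
        **generation_params,             # pass remaining generation params through as-is
    }


# Example with hypothetical values:
# build_oobabooga_request("Why is the sky blue?", 256, temperature=0.7)
```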
Hi @krrishdholakia, I was thinking of starting with this issue.
So as far as I understand, you want to add support for oobabooga/text-generation-webui based APIs.
So if we call the completion method with custom_llm_provider = "oobabooga", then it should make a request as shown in api-example-chat-stream.py.
Also, how is the API_BASE set by the user? Is it taken as a .env variable?
So if we call the completion method with custom_llm_provider = "oobabooga" then it should make a request as shown in the api-example-chat-stream.py
Yes
@KanishkaHalder1771 here's how api base can be set https://docs.litellm.ai/docs/set_keys let me know if you have questions about it
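For quick reference, a minimal sketch of two of the ways those docs cover: setting it globally on the litellm module, or passing it per call (the model name and URL below are placeholders):

```python
import litellm
from litellm import completion

# Option 1: set the api base globally for the process
litellm.api_base = "http://localhost:5000"

# Option 2: pass it per call
response = completion(
    model="oobabooga/my-local-model",  # placeholder model name
    messages=[{"role": "user", "content": "Hello"}],
    api_base="http://localhost:5000",
)
```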
@KanishkaHalder1771 for an example implementation check out the huggingface implementation - https://github.com/BerriAI/litellm/blob/main/litellm/llms/huggingface_restapi.py
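At a high level that handler does three things: build the provider payload from the messages, POST it to api_base, and copy the generated text back onto litellm's response object. A very rough skeleton of that shape; the endpoint path, payload keys, and response parsing below are assumptions for illustration, not litellm's or ooba's actual API:

```python
import requests

def completion(model, messages, api_base, model_response, optional_params, headers=None):
    # Collapse the chat messages into a single prompt string (simplistic).
    prompt = "\n".join(m["content"] for m in messages)

    # Assumed payload, using the mapping proposed above.
    data = {
        "user_input": prompt,
        "max_new_tokens": optional_params.get("max_tokens", 256),
    }

    # Assumed endpoint path; the real path depends on how ooba exposes its API.
    resp = requests.post(f"{api_base}/api/v1/generate", json=data, headers=headers or {})
    resp.raise_for_status()

    # Assumed response shape: {"results": [{"text": "..."}]}
    model_response["choices"][0]["message"]["content"] = resp.json()["results"][0]["text"]
    model_response["model"] = model
    return model_response
```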
@KanishkaHalder1771 any updates on this?
@KanishkaHalder1771 any updates on this?
Hey @krrishdholakia, didn't get much time yet; expect a PR tomorrow or the day after.
This looks like a requested feature from our users (I can see +3 on the issue).
What do we need to do to add this to our docs @krrishdholakia?
+1 here as well. Have there been any updates?
Hey @iguy0, iirc oobabooga is OpenAI-compatible, so this should work - https://docs.litellm.ai/docs/providers/openai_compatible
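For anyone landing here later, a minimal sketch of that approach with the Python SDK (host, port, and model name are whatever your ooba instance exposes; a dummy key only works if ooba was started without --api-key):

```python
from litellm import completion

response = completion(
    model="openai/my-local-model",        # openai/ prefix routes to an OpenAI-compatible server
    api_base="http://localhost:5000/v1",  # ooba's OpenAI-compatible endpoint (note the /v1)
    api_key="sk-dummy",                   # real key only needed if ooba was started with --api-key
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response.choices[0].message.content)
```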
Hi @krrishdholakia,
I launched ooba with --listen --api --api-port 5312 --api-key sk-12345678901234567890T3BlbkFJ12345678901234567890 and have the following config in litellm:
model_list:
  - model_name: LoneStriker_Nous-Hermes-2-Mixtral-8x7B-DPO-5.0bpw-h6-exl2
    litellm_params:
      model: openai/LoneStriker_Nous-Hermes-2-Mixtral-8x7B-DPO-5.0bpw-h6-exl2
      api_base: "http://LOCAL_LAN_IP:5312"
      api_key: sk-12345678901234567890T3BlbkFJ12345678901234567890
  - model_name: LoneStriker_Yi-9B-200K-5.0bpw-h6-exl2
    litellm_params:
      model: openai/LoneStriker_Yi-9B-200K-5.0bpw-h6-exl2
      api_base: "http://LOCAL_LAN_IP:5312"
      api_key: sk-12345678901234567890T3BlbkFJ12345678901234567890
But when I query the completions endpoint in litellm I'm seeing:
curl http://LOCAL_LAN_IP:8123/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "LoneStriker_Nous-Hermes-2-Mixtral-8x7B-DPO-5.0bpw-h6-exl2", "messages": [{"role": "user", "content": "Why is the sky blue?"}], "temperature": 0.7 }'
{"error":{"message":"OpenAIException - Error code: 404 - {'detail': 'Not Found'}","type":null,"
Could you please point out what I'm doing wrong? Note: the model isn't currently loaded in ooba.
I got it to work by adding /v1 like:
model_list:
  - model_name: LoneStriker_Nous-Hermes-2-Mixtral-8x7B-DPO-5.0bpw-h6-exl2
    litellm_params:
      model: openai/LoneStriker_Nous-Hermes-2-Mixtral-8x7B-DPO-5.0bpw-h6-exl2
      api_base: "http://LOCAL_LAN_IP:5312/v1"
      api_key: sk-12345678901234567890T3BlbkFJ12345678901234567890
  - model_name: LoneStriker_Yi-9B-200K-5.0bpw-h6-exl2
    litellm_params:
      model: openai/LoneStriker_Yi-9B-200K-5.0bpw-h6-exl2
      api_base: "http://LOCAL_LAN_IP:5312/v1"
      api_key: sk-12345678901234567890T3BlbkFJ12345678901234567890
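With the /v1 in place, the proxy can also be hit with the standard OpenAI client instead of curl; a quick sketch reusing the proxy port and model name from the curl above (the api_key is whatever your proxy is configured to accept):

```python
from openai import OpenAI

# Point the standard OpenAI client at the litellm proxy (port 8123 from the curl example above).
client = OpenAI(base_url="http://LOCAL_LAN_IP:8123/v1", api_key="anything")

resp = client.chat.completions.create(
    model="LoneStriker_Nous-Hermes-2-Mixtral-8x7B-DPO-5.0bpw-h6-exl2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    temperature=0.7,
)
print(resp.choices[0].message.content)
```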
However, I can only query the completions endpoint with the model already loaded in ooba. In case it isn't loaded, I get an error message; this seems like an issue with ooba, and not litellm.
Thank you for pointing me to the article! Cheers.
request from open-interpreter community