BerriAI / litellm

Python SDK, Proxy Server to call 100+ LLM APIs using the OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: Add Premp AI provider #3848

Closed tuanlv14 closed 3 months ago

tuanlv14 commented 3 months ago
          > doesn't this already work? @tuanlv14

https://docs.litellm.ai/docs/providers/openai_compatible

Yes, I tried that, but the current method does not work. According to the Prem AI API documentation, the request payload must include a project_id. I do not know how to set a custom project_id through the LiteLLM parameters. Please help me.

import requests

url = "https://app.premai.io/v1/chat/completions"

payload = { "project_id": 123, "session_id": "", "repositories": { "ids": [123], "limit": 3, "similarity_threshold": 0.5 }, "messages": [ { "role": "user", "content": "", "template_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a", "params": {} } ], "model": "", "system_prompt": "", "max_tokens": 1, "stream": True, "temperature": 1 } headers = { "Authorization": "", "Content-Type": "application/json" }

response = requests.request("POST", url, json=payload, headers=headers)

print(response.text)

Originally posted by @tuanlv14 in https://github.com/BerriAI/litellm/issues/3722#issuecomment-2132283713

krrishdholakia commented 3 months ago

any unmapped param is sent straight through - https://docs.litellm.ai/docs/completion/input#provider-specific-params

How did you try to call it with litellm?

krrishdholakia commented 3 months ago
from litellm import completion 

completion(model="openai/<your-prem-model-name", messages=[{"role": "user", "content": "Hey!"}], api_base="https://app.premai.io/v1", api_key="your-prem-key", project_id="1234")

i believe, this should work

kwekewk commented 3 months ago

I believe this should work.

I hit this issue with Open WebUI, NextChat, and ChatKit, while it works via curl with LiteLLM v1.38.10.

POST Request Sent from LiteLLM:
curl -X POST \
https://app.premai.io/v1 \
-d '{'model': 'gpt-4o', 'messages': [{'role': 'user', 'content': 'hi'}], 'temperature': 0.8, 'stream': True, 'max_tokens': 1000, 'user': 'default_user_id', 'extra_body': {'project_id': '1234'}}'

01:04:07 - LiteLLM Router:INFO: router.py:652 - litellm.acompletion(model=openai/gpt-4o) 200 OK
INFO:    - "POST /v1/chat/completions HTTP/1.1" 200 OK
Traceback (most recent call last):
  File "/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/litellm/proxy/proxy_server.py", line 3480, in async_data_generator
    async for chunk in response:
  File "/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/litellm/utils.py", line 11902, in __anext__
    raise e
  File "/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/litellm/utils.py", line 11786, in __anext__
    async for chunk in self.completion_stream:
  File "/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/openai/_streaming.py", line 147, in __aiter__
    async for item in self._iterator:
  File "/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/openai/_streaming.py", line 183, in __stream__
    data = sse.json()
           ^^^^^^^^^^
  File "/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/openai/_streaming.py", line 259, in json
    return json.loads(self.data)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/micromamba/envs/litellm/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/micromamba/envs/litellm/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/micromamba/envs/litellm/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
tuanlv14 commented 3 months ago
from litellm import completion 

completion(model="openai/<your-prem-model-name", messages=[{"role": "user", "content": "Hey!"}], api_base="https://app.premai.io/v1", api_key="your-prem-key", project_id="1234")

i believe, this should work

Yes. liteLLM + Premp worked well with python code. But I do not know how can I use via liteLLM proxy: model_list:

How can I define/config the config.yaml file which have project ID ? Pls guide me. Thanks so much.

krrishdholakia commented 3 months ago
- model_name: Premp-gpt-4o
  litellm_params:
    model: openai/gpt-4o
    api_base: https://app.premai.io/v1
    api_key: PREMP_API_KEY
    rpm: 10
    project_id: <your-project-id>
krrishdholakia commented 3 months ago

it's the same thing - the litellm_params are the params going into the completion call
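
Concretely (a minimal sketch; the key and project ID are placeholders), the YAML entry above corresponds to a call like this, where project_id is an unmapped, provider-specific param that gets forwarded in the request body:

from litellm import completion

# Roughly what the proxy does with the litellm_params above:
# known OpenAI params are mapped, and anything else (here, project_id)
# is passed straight through to the provider.
response = completion(
    model="openai/gpt-4o",                # litellm_params.model
    api_base="https://app.premai.io/v1",  # litellm_params.api_base
    api_key="<your-prem-key>",            # placeholder for litellm_params.api_key
    project_id="<your-project-id>",       # unmapped, provider-specific param
    messages=[{"role": "user", "content": "Hey!"}],
)
print(response)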

tuanlv14 commented 3 months ago

I tried, but I got this error message:

  "unhealthy_endpoints": [
    {
      "model": "openai/llama-3-70b-instruct",
      "api_base": "https://app.premai.io/v1",
      "rpm": 10,
      "project_id": my_ID,
      "error": "Error code: 403 - {'detail': 'You do not have permission to perform this action.'} stack trace: Traceback (most recent call last)."
    }
  ]

Even so, the same Python code still works well. Please help me review and fix this.

krrishdholakia commented 3 months ago

"error": "Error code: 403 - {'detail': 'You do not have permission to perform this action.'} stack trace: Traceback (most recent call last).

it looks like your api key is not allowed to call the premai endpoint?
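
One way to isolate this is to hit the Prem AI endpoint directly, bypassing LiteLLM. This is just a sketch based on the payload shown earlier in the thread; the Bearer prefix matches what the OpenAI-compatible client sends, and the key, model, and project ID are placeholders:

import requests

# Direct call to the Prem AI endpoint, bypassing LiteLLM entirely.
# If this also returns 403, the key/project pair is the problem,
# not the proxy.
resp = requests.post(
    "https://app.premai.io/v1/chat/completions",
    headers={
        "Authorization": "Bearer <your-prem-key>",  # placeholder key
        "Content-Type": "application/json",
    },
    json={
        "project_id": 1234,  # placeholder project ID
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Who are you?"}],
    },
)
print(resp.status_code, resp.text)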

tuanlv14 commented 3 months ago

No. This Python code:

from litellm import completion 

response = completion(model="openai/gpt-4o", messages=[{"role": "user", "content": "Who are you?"}], api_base="https://app.premai.io/v1", api_key="my_key", project_id="my_id")
print(response)

Return result:

ModelResponse(id='chatcmpl-9TYsxGbAIrivfRqRtEFCgtSy8Msif', choices=[Choices(finish_reason='stop', index=0, message=Message(content="I'm an AI language model developed by OpenAI, known as ChatGPT. I'm designed to assist with a wide range of questions and tasks, from providing information on various topics to helping with writing and problem-solving. How can I assist you today?", role='assistant'))], created=1716832639, model='gpt-4o-2024-05-13', object='chat.completion', system_fingerprint=None, usage=Usage(completion_tokens=49, prompt_tokens=11, total_tokens=60))

So I am 100% sure the API key works; only the proxy has a problem.

tuanlv14 commented 3 months ago

When I try using curl:

curl --request POST \
>   --url https://app.premai.io/v1/chat/completions \
>   --header 'Authorization: my_API_key' \
>   --header 'Content-Type: application/json' \
>   --data '{
>   "project_id": 3800,
>   "messages": [
>     {
>       "role": "user",
>       "content": "Who are you ?"
>     }
>   ],
>   "model": "llama-3-8b-fast"
> }'
{"detail":"You do not have permission to perform this action."}r

I also tried a fake project ID and got the same error message. So I am concerned that the LiteLLM proxy does not include "project_id" in the POST it sends to the Prem AI server?

krrishdholakia commented 3 months ago

Unable to repro. This works for me.

Screenshot 2024-05-27 at 11 13 27 AM
- model_name: Premp-gpt-4o
  litellm_params:
    model: openai/gpt-4o
    api_base: https://app.premai.io/v1
    api_key: <your-prem-key> # 👈 PUT KEY HERE
    rpm: 10
    project_id: 4457
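
As a quick end-to-end check of a config like this, the proxy can be exercised with the standard OpenAI client (a sketch; the proxy URL and the sk-1234 key are placeholder defaults, not values from this thread):

from openai import OpenAI

# Point the OpenAI client at the LiteLLM proxy; the proxy injects
# project_id from litellm_params before forwarding to Prem AI.
client = OpenAI(base_url="http://0.0.0.0:4000", api_key="sk-1234")  # placeholders

response = client.chat.completions.create(
    model="Premp-gpt-4o",  # the model_name from the config above
    messages=[{"role": "user", "content": "Who are you?"}],
)
print(response.choices[0].message.content)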
tuanlv14 commented 3 months ago

Hi all, I confirm that @krrishdholakia's code works well when called from Python. But when I call it from the Continue VS Code extension, it still errors. I do not know whether the error comes from the Continue extension or from VS Code. If you can check, please help me clear this up.

kwekewk commented 3 months ago

I don't know if it's relevant, but Prem AI doesn't support streaming.

cat chat.hurl ; hurl chat.hurl                                                                                                                                               
POST http://0.0.0.0:4000/v1/chat/completions
Authorization: Bearer sk-cok
{
  "model": "premai/claude-3-haiku@anthropic",
  "stream": false,
  "messages": [
    {
      "role": "user",
      "content": "morning"
    }
  ]
}
{"id":null,"choices":[{"finish_reason":"stop","index":0,"message":{"content":"Good morning! How can I assist you today?","role":"assistant"}}],"created":1716846908,"mo
del":"claude-3-haiku-20240307","object":"chat.completion","system_fingerprint":null,"usage":{"completion_tokens":10,"prompt_tokens":2,"total_tokens":12}}(litellm)
cat chat.hurl ; hurl chat.hurl 
POST http://0.0.0.0:4000/v1/chat/completions
Authorization: Bearer sk-cok
{
  "model": "premai/claude-3-haiku@anthropic",
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": "morning"
    }
  ]
}

data: {"error": {"message": "Expecting value: line 1 column 1 (char 0)\n\nTraceback (most recent call last):\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/litellm/proxy/proxy_server.py\", line 3480, in async_data_generator\n    async for chunk in response:\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/litellm/utils.py\", line 11902, in __anext__\n    raise e\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/litellm/utils.py\", line 11786, in __anext__\n    async for chunk in self.completion_stream:\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/opentelemetry/instrumentation/openai/shared/chat_wrappers.py\", line 502, in __anext__\n    raise e\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/opentelemetry/instrumentation/openai/shared/chat_wrappers.py\", line 498, in __anext__\n    chunk = await self.__wrapped__.__anext__()\n            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/openai/_streaming.py\", line 144, in __anext__\n    return await self._iterator.__anext__()\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/openai/_streaming.py\", line 183, in __stream__\n    data = sse.json()\n           ^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/openai/_streaming.py\", line 259, in json\n    return json.loads(self.data)\n           ^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/json/__init__.py\", line 346, in loads\n    return _default_decoder.decode(s)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/json/decoder.py\", line 337, in decode\n    obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/json/decoder.py\", line 355, in raw_decode\n    raise JSONDecodeError(\"Expecting value\", s, err.value) from None\njson.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)\n", "type": "None", "param": "None", "code": 500}}
tuanlv14 commented 3 months ago

@krrishdholakia :

When I run the model health check, I get this error:

  {
      "rpm": 10,
      "api_base": "https://app.premai.io/v1",
      "model": "openai/llama-3-8b-fast",
      "project_id": MY_ID,
      "stream": true,
      "error": "Error code: 403 - {'detail': 'You do not have permission to perform this action.'} stack trace: Traceback (most recent call last):\n  File \"/usr/local/lib/python3.11/site-packages/litellm/main.py\", line 4252, in ahealth_check\n    response = await openai_chat_completions.ahealth_check(\n               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/litellm/llms/openai.py\", line 1214, in ahealth_check\n    completion = await client.chat.completions.with_raw_response.create(\n                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/openai/_legacy_response.py\", line 353, in wrapped\n    return cast(LegacyAPIResponse[R], await func(*args, **kwargs))\n                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions.py\", line 1181, in create\n    return await self._post(\n           ^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/site-packages/openai/_base_client.py\", line 1790, in post\n    return awai"
    }

Can you guide me on how to fix this?

krrishdholakia commented 3 months ago

detail': 'You do not have permission to perform this action.'

Please confirm you're able to make the call via curl.

  1. Did you put your API key in the config?
  2. What is the raw request being sent by LiteLLM? Run the proxy with --detailed_debug to see this (a quick SDK-side equivalent is sketched below).
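
For the Python SDK path, the raw request can be inspected the same way by turning on verbose logging before the call (a sketch; the key and project ID are placeholders):

import litellm
from litellm import completion

# Print the raw request/response LiteLLM sends, mirroring what
# --detailed_debug shows on the proxy side.
litellm.set_verbose = True

completion(
    model="openai/llama-3-8b-fast",
    api_base="https://app.premai.io/v1",
    api_key="<your-prem-key>",       # placeholder
    project_id="<your-project-id>",  # placeholder
    messages=[{"role": "user", "content": "hi"}],
)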
kwekewk commented 3 months ago

Sorry, may I ask? stream=true causes an error in both the proxy and the Python SDK for all Prem AI models.

cat chat.hurl ; hurl chat.hurl 
POST http://0.0.0.0:4000/v1/chat/completions
Authorization: Bearer sk-cok
{
  "model": "premai/claude-3-haiku@anthropic",
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": "morning"
    }
  ]
}

data: {"error": {"message": "Expecting value: line 1 column 1 (char 0)\n\nTraceback (most recent call last):\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/litellm/proxy/proxy_server.py\", line 3480, in async_data_generator\n    async for chunk in response:\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/litellm/utils.py\", line 11902, in __anext__\n    raise e\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/litellm/utils.py\", line 11786, in __anext__\n    async for chunk in self.completion_stream:\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/opentelemetry/instrumentation/openai/shared/chat_wrappers.py\", line 502, in __anext__\n    raise e\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/opentelemetry/instrumentation/openai/shared/chat_wrappers.py\", line 498, in __anext__\n    chunk = await self.__wrapped__.__anext__()\n            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/openai/_streaming.py\", line 144, in __anext__\n    return await self._iterator.__anext__()\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/openai/_streaming.py\", line 183, in __stream__\n    data = sse.json()\n           ^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/site-packages/openai/_streaming.py\", line 259, in json\n    return json.loads(self.data)\n           ^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/json/__init__.py\", line 346, in loads\n    return _default_decoder.decode(s)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/json/decoder.py\", line 337, in decode\n    obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/home/user/micromamba/envs/litellm/lib/python3.11/json/decoder.py\", line 355, in raw_decode\n    raise JSONDecodeError(\"Expecting value\", s, err.value) from None\njson.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)\n", "type": "None", "param": "None", "code": 500}}