BerriAI / litellm

Python SDK, Proxy Server to call 100+ LLM APIs using the OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/
12.32k stars 1.43k forks

[Feature]: Adding support for Volcano Engine #3342

Closed Jeffwan closed 2 months ago

Jeffwan commented 4 months ago

The Feature

I am using LiteLLM to proxy requests to different providers.

Motivation, pitch

I am using Volcano Engine internally (https://www.volcengine.com/docs/82379/1133189#python) as well as OpenAI-compatible services, and I want to use LiteLLM as the proxy layer to aggregate the different providers. Right now, the one blocker is that it lacks support for Volcano Engine.
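To illustrate the aggregation idea described above, here is a minimal sketch of routing an OpenAI-format request to one of several backends by model prefix. This is not LiteLLM's internal API; the provider table and prefix convention are illustrative assumptions (the Volcano base URL is taken from the curl example in this thread).

```python
# Hypothetical routing table: provider prefix -> API base URL.
# These entries are examples, not authoritative configuration.
PROVIDER_BASES = {
    "openai": "https://api.openai.com/v1",
    "volcengine": "http://maas-api.ml-platform-cn-beijing.volces.com/api/v2",
}


def resolve_provider(model: str) -> tuple[str, str]:
    """Split a 'provider/model' string and look up the provider's base URL."""
    provider, _, model_name = model.partition("/")
    if provider not in PROVIDER_BASES:
        raise ValueError(f"unknown provider: {provider}")
    return PROVIDER_BASES[provider], model_name
```

A proxy layer like this only works if every backend accepts the same OpenAI-style `messages` payload, which is exactly what the discussion below turns on.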

Twitter / LinkedIn details

https://www.linkedin.com/in/jiaxin-shan/

Jeffwan commented 4 months ago

I checked the code, and it seems the prompt part constructs the prompt using a chat template and special tokens. Is there a way to honor the original chat template? If so, that would be easier and cleaner.

krrishdholakia commented 4 months ago

Hey @Jeffwan, this is the curl I see from the docs.

It looks like Volcano should already respect the messages format, right (no prompt formatting needed)?

// Request
curl --request POST \
  --url http://maas-api.ml-platform-cn-beijing.volces.com/api/v2/endpoint/${YOUR_ENDPOINT_ID}/chat \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --data '{
        "stream": false,
        "parameters": {
                "max_new_tokens": 1024,
                "temperature": 0.9
        },
        "messages": [
                {
                        "role": "user",
                        "content": "Hello"
                },
                {
                        "role": "assistant",
                        "content": "Hello, how can I help you?"
                },
                {
                        "role": "user",
                        "content": "What can you do?"
                }
        ]
}'

// Response
{
        "choices": [
            {
                    "message": {
                            "role": "assistant",
                            "content": "I can answer questions on topics such as history, science, technology, culture, and entertainment. I can also generate text such as summaries, articles, and stories. What would you like me to do?"
                    },
                    "finish_reason": "stop"
            }
        ],
        "usage": {
                "prompt_tokens": 20,
                "completion_tokens": 43,
                "total_tokens": 63
        }
}
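Rebuilding the request payload from the curl above as a plain Python dict makes the key point easy to check: the `messages` array already uses OpenAI-style `role`/`content` entries, so no prompt templating should be needed. (The message text is paraphrased in English; the `parameters` block is copied from the docs example.)

```python
# The Volcano Engine /chat request body from the docs, as a Python dict.
payload = {
    "stream": False,
    "parameters": {"max_new_tokens": 1024, "temperature": 0.9},
    "messages": [
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hello, how can I help you?"},
        {"role": "user", "content": "What can you do?"},
    ],
}

# Every message carries exactly the OpenAI-style keys and roles.
assert all(set(m) == {"role", "content"} for m in payload["messages"])
assert {m["role"] for m in payload["messages"]} <= {"user", "assistant", "system"}
```

The one visible divergence from the OpenAI schema is that sampling options sit under a `parameters` object rather than at the top level.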
krrishdholakia commented 4 months ago

the only blocker to using it today w/ litellm is that their api endpoint is /chat instead of /chat/completions?
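If the path really is the only incompatibility, a thin URL rewrite would bridge it. A minimal sketch, assuming the URL shape from the curl above (the endpoint id below is a placeholder, and this helper is hypothetical, not part of LiteLLM):

```python
def volcano_chat_url(base: str, endpoint_id: str) -> str:
    """Build the Volcano Engine chat URL, which ends in /chat
    rather than the OpenAI-style /chat/completions."""
    return f"{base.rstrip('/')}/api/v2/endpoint/{endpoint_id}/chat"


url = volcano_chat_url(
    "http://maas-api.ml-platform-cn-beijing.volces.com",
    "my-endpoint-id",  # placeholder, not a real endpoint id
)
```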

krrishdholakia commented 4 months ago

bump on this? @Jeffwan

krrishdholakia commented 4 months ago

Hey @Jeffwan is this still an ongoing blocker?

Happy to do a quick call on this to speed up the PR

https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Jeffwan commented 4 months ago

@krrishdholakia sorry for the late reply. It is. It uses different SDKs and role definitions, so I think we need to add a new model wrapper. I scheduled a sync-up with you.

ishaan-jaff commented 2 months ago

done here @Jeffwan https://github.com/BerriAI/litellm/pull/4433