BerriAI / litellm

Python SDK, Proxy Server to call 100+ LLM APIs using the OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: Auto-convert message content for non-vision models #1893

Closed jakobdylanc closed 7 months ago

jakobdylanc commented 7 months ago

The Feature

Vision models support message content as a list:

{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ]
}

vs. the traditional string format:

{
  "role": "user",
  "content": "Hello!"
}

When I pass the list format to a non-vision model, LiteLLM should automatically convert it to the string format for me.

Currently, passing the list format through LiteLLM to e.g. "mistral/mistral-medium" results in a format error from the Mistral API.

Here's a more complex example showing how I think it should work:

If I pass the following message format to a non-vision model:

{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    },
    {
      "type": "text",
      "text": "What’s in this image?"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
      }
    }
  ]
}

LiteLLM should automatically convert it to:

{
  "role": "user",
  "content": "Hello!\nWhat’s in this image?"
}

(All text entries joined with \n, and all image entries removed.)

Maybe this deserves a warning printout too, to let the user know that they attempted to pass images to a non-vision model and the images were removed automatically.
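For reference, the conversion described above could be sketched roughly like this (`flatten_content` is a hypothetical helper name for illustration, not LiteLLM's actual implementation):

```python
def flatten_content(messages):
    """Convert list-format message content to plain strings for non-vision models.

    All "text" entries are joined with newlines; "image_url" (and any other
    non-text) entries are dropped. Messages already using string content
    pass through unchanged.
    """
    converted = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            texts = [part["text"] for part in content if part.get("type") == "text"]
            msg = {**msg, "content": "\n".join(texts)}
        converted.append(msg)
    return converted


messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Hello!"},
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/img.jpg"}},
    ],
}]
print(flatten_content(messages))
# → [{'role': 'user', 'content': "Hello!\nWhat's in this image?"}]
```

A warning could be emitted inside the `isinstance` branch whenever a non-text entry is dropped.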

Motivation, pitch

Supporting "message content as a list" universally is advantageous because it lets developers stick with one format in their code that reliably works with both vision and non-vision models.

Lack of support for this in LiteLLM has forced me to include the following "bandaid fix" in my project: https://github.com/jakobdylanc/discord-llm-chatbot/blob/2eda3f4ad7f7b741776519033f5da23cd70ca2e6/llmcord.py#L94-L95

Twitter / LinkedIn details

No response

krrishdholakia commented 7 months ago

Fix pushed. Should be live soon in v1.23.5.