The Feature
Vision models support message content as a list:
vs. the traditional string format:
When I pass the list format to a non-vision model, LiteLLM should automatically convert it to the string format for me.
Presently, attempting this with LiteLLM and e.g. "mistral/mistral-medium" results in a format error from the Mistral API.
Here's a more complex example showing how I think it should work:
If I pass the following message format to a non-vision model:
LiteLLM should automatically convert it to:
{
    "role": "user",
    "content": "Hello!\nWhat’s in this image?"
}
(Joined all text entries with \n and removed all image entries)
Maybe this deserves a warning printout too? To let the user know that they attempted to pass images to a non-vision model and they were removed automatically.
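The conversion described above could be sketched roughly like this (a hypothetical helper, not LiteLLM's actual implementation; the message shape follows OpenAI's vision-style content list):

```python
import warnings


def flatten_message_content(message: dict) -> dict:
    """Collapse list-style content into a plain string for non-vision models.

    Text entries are joined with "\\n"; image entries are dropped with a
    warning. Hypothetical sketch -- not LiteLLM's actual code.
    """
    content = message.get("content")
    if not isinstance(content, list):
        return message  # already a plain string, nothing to do

    texts = [part["text"] for part in content if part.get("type") == "text"]
    if len(texts) != len(content):
        warnings.warn(
            "Image entries were removed: this model does not support vision input."
        )
    return {**message, "content": "\n".join(texts)}


# Example input: the vision-style list format
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Hello!"},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        {"type": "text", "text": "What's in this image?"},
    ],
}
print(flatten_message_content(message))
# → {'role': 'user', 'content': "Hello!\nWhat's in this image?"}
```

Passing a message whose content is already a string returns it unchanged, so the same code path works for both formats.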
Motivation, pitch
Supporting "message content as a list" universally is advantageous because it lets developers stick with one format in their code that reliably works with both vision and non-vision models.
Lack of support for this in LiteLLM has forced me to include the following "bandaid fix" in my project: https://github.com/jakobdylanc/discord-llm-chatbot/blob/2eda3f4ad7f7b741776519033f5da23cd70ca2e6/llmcord.py#L94-L95
Twitter / LinkedIn details
No response