BerriAI / litellm


[Feature]: Add citations support to completion for Cohere models #6814

Open linksmith opened 2 days ago

linksmith commented 2 days ago

The Feature

Cohere models return citations when you include a documents parameter in the completion request (with both stream=True and stream=False); the chat completion response then contains those citations.
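
For illustration, a minimal sketch of such a request, assuming LiteLLM forwards the documents kwarg through to the Cohere chat API (the model string and document shape here are examples, not prescriptions):

```python
import litellm

# Sketch only: assumes LiteLLM passes `documents` through to Cohere.
# The document shape follows Cohere's {"id", "title", "text"} style.
response = litellm.completion(
    model="cohere_chat/command-r",
    messages=[{"role": "user", "content": "What is the cat's favorite snack?"}],
    documents=[
        {"id": "1", "title": "Cat facts", "text": "The cat's favorite snack is popcorn."}
    ],
)

print(response.choices[0].message.content)
# The request works today, but the citations Cohere returns are not
# surfaced anywhere on `response`.
```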

See this example of DEBUG output from LiteLLM:

```
PROCESSED CHUNK PRE CHUNK CREATOR: b'{"is_finished":false,"event_type":"citation-generation","citations":[{"start":92,"end":99,"text":"popcorn","document_ids":["1"]}]}'; custom_llm_provider: cohere_chat
chunk: {"is_finished":false,"event_type":"citation-generation","citations":[{"start":92,"end":99,"text":"popcorn","document_ids":["1"]}]}
PROCESSED CHUNK POST CHUNK CREATOR: None
```
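
To make the event concrete, here is that citation-generation chunk parsed in plain Python (the payload is taken verbatim from the DEBUG output above):

```python
import json

raw = b'{"is_finished":false,"event_type":"citation-generation","citations":[{"start":92,"end":99,"text":"popcorn","document_ids":["1"]}]}'
event = json.loads(raw)

if event["event_type"] == "citation-generation":
    for citation in event["citations"]:
        # `start`/`end` are character offsets into the generated text;
        # `document_ids` point back to the documents sent in the request.
        print(citation["text"], citation["start"], citation["end"], citation["document_ids"])
```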

The citations are returned by the Cohere API but dropped when LiteLLM creates the chunks, so they are not accessible when iterating over the stream.
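
For reference, this is roughly how a caller iterates the stream today; the citations check at the end is what this feature would enable (the attribute name is an assumption, not LiteLLM's current API):

```python
import litellm

stream = litellm.completion(
    model="cohere_chat/command-r",
    messages=[{"role": "user", "content": "What is the cat's favorite snack?"}],
    documents=[{"id": "1", "text": "The cat's favorite snack is popcorn."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="")
    # Hypothetical: once this feature lands, citation-generation events
    # could surface here instead of being dropped.
    citations = getattr(chunk, "citations", None)
    if citations:
        print("\ncitations:", citations)
```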

Update the chunk-creation function for Cohere models to read the citations from the Cohere API response and include them in the chunks, possibly in the same way this already appears to work for Perplexity completions in LiteLLM.
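
A hypothetical sketch of the kind of change being requested in the Cohere chunk handler (the function name and returned fields are illustrative assumptions, not LiteLLM's actual internals):

```python
import json
from typing import Optional

def handle_cohere_chat_chunk(chunk: bytes) -> Optional[dict]:
    """Illustrative only: map a raw Cohere stream event to a chunk dict."""
    data = json.loads(chunk)
    event_type = data.get("event_type")

    if event_type == "text-generation":
        return {"text": data.get("text", ""), "is_finished": False}
    if event_type == "citation-generation":
        # Today this event is effectively dropped ("POST CHUNK CREATOR: None");
        # the request is to surface it, e.g. as a field on the chunk.
        return {"text": "", "is_finished": False, "citations": data.get("citations")}
    if event_type == "stream-end":
        return {"text": "", "is_finished": True}
    return None
```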

Motivation, pitch

Many people use the Cohere API for their RAG use cases precisely because it provides these citations. For them, migrating to LiteLLM is currently not possible when using the completion API.

Twitter / LinkedIn details

No response

linksmith commented 19 hours ago

@krrishdholakia Glad to see you're picking this up. I can't wait to get started with LiteLLM. Let me know if a more detailed description would help.