BerriAI / litellm

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
https://docs.litellm.ai/docs/

[Feature]: Support `include_usage` for bedrock #4407

Open Manouchehri opened 1 week ago

Manouchehri commented 1 week ago

The Feature

It'd be really nice if `include_usage` worked on providers other than just OpenAI. I think LiteLLM should be able to do this, since we already calculate the cost elsewhere.

Motivation, pitch

It's really useful for users to know how many tokens they've spent on each streaming request.

Twitter / LinkedIn details

https://www.linkedin.com/in/davidmanouchehri/

krrishdholakia commented 1 week ago

this already works @Manouchehri

e.g. response from predibase.

[Screenshot, 2024-06-27 3:48:45 PM]

can you share a case where it didn't work? and we can file that as an issue

Manouchehri commented 1 week ago

It's missing from Bedrock.

curl -v "${OPENAI_API_BASE}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 10,
    "seed": 4242,
    "stream": true,
    "temperature": 0.0,
    "messages": [
      {
        "role": "user",
        "content": "Hello"
      }
    ],
    "stream_options": {
      "include_usage": true
    }
  }'
data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"index":0,"delta":{"content":"Hello","role":"assistant"}}],"created":1719529524,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"index":0,"delta":{"content":"!"}}],"created":1719529524,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"index":0,"delta":{"content":" How"}}],"created":1719529524,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"index":0,"delta":{"content":" can"}}],"created":1719529524,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"index":0,"delta":{"content":" I"}}],"created":1719529524,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"index":0,"delta":{"content":" assist"}}],"created":1719529524,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"index":0,"delta":{"content":" you"}}],"created":1719529524,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"index":0,"delta":{"content":" today"}}],"created":1719529525,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"index":0,"delta":{"content":"?"}}],"created":1719529525,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"index":0,"delta":{"content":" Feel"}}],"created":1719529525,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: {"id":"chatcmpl-9569e7af-aaf6-4744-9d8c-0be2bd77f528","choices":[{"finish_reason":"length","index":0,"delta":{}}],"created":1719529525,"model":"anthropic.claude-3-5-sonnet-20240620-v1:0","object":"chat.completion.chunk"}

data: [DONE]

I think it's missing from Azure OpenAI as well, haven't confirmed yet though.
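For anyone who wants to check other providers the same way, here is a minimal Python sketch that scans a captured SSE stream for a chunk carrying a `usage` object. It assumes the OpenAI streaming convention, where `include_usage` adds one extra chunk with a `usage` field before `data: [DONE]`. The helper name `find_usage`, the shortened sample chunks, and the token counts are illustrative only and are not taken from LiteLLM or from a real response.

```python
import json
from typing import Iterable, Optional


def find_usage(sse_lines: Iterable[str]) -> Optional[dict]:
    """Return the first non-empty `usage` object found in an SSE stream, else None."""
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        usage = chunk.get("usage")
        if usage:
            return usage
    return None


# Shaped like the Bedrock output above (shortened): no chunk carries `usage`.
bedrock_like = [
    'data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{"content":"Hello","role":"assistant"}}],"object":"chat.completion.chunk"}',
    "",
    'data: {"id":"chatcmpl-x","choices":[{"finish_reason":"length","index":0,"delta":{}}],"object":"chat.completion.chunk"}',
    "",
    "data: [DONE]",
]

# Hypothetical stream with the extra usage chunk; token counts are made up.
with_usage = bedrock_like[:-1] + [
    'data: {"id":"chatcmpl-x","choices":[],"object":"chat.completion.chunk","usage":{"prompt_tokens":8,"completion_tokens":10,"total_tokens":18}}',
    "",
    "data: [DONE]",
]

print(find_usage(bedrock_like))
print(find_usage(with_usage))
```

Piping the `curl` output into a file and passing its lines to `find_usage` should show whether a given provider's stream includes the usage chunk.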

Manouchehri commented 1 week ago

Confirmed, it is also missing/not working for Azure OpenAI requests.

Manouchehri commented 1 week ago

It's missing from Anthropic (directly) too.