BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

Add response cost in model response (headers/hidden params) #4335

Closed: krrishdholakia closed this 3 months ago

krrishdholakia commented 3 months ago

The proxy response could look like:

```json
{
  ...
  "usage": {
    "completion_tokens": 163,
    "prompt_tokens": 14684,
    "total_tokens": 14847,
    "cost": 1.123   <-- new
  }
}
```

This may be controlled by a new proxy config parameter.
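For illustration, here is a minimal client-side sketch of what reading that field could look like against the proxy's OpenAI-compatible `/chat/completions` endpoint. The proxy URL, the API key, and the `"cost"` key itself are assumptions taken from the proposal above, not confirmed litellm behavior:

```python
# Minimal sketch: read the proposed usage.cost field from a litellm proxy
# response. The URL, API key, and "cost" key are assumptions based on the
# proposal in this issue, not confirmed litellm behavior.
import httpx

resp = httpx.post(
    "http://localhost:4000/v1/chat/completions",  # assumed local proxy address
    headers={"Authorization": "Bearer sk-1234"},  # placeholder proxy key
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "hello"}],
    },
    timeout=60,
)
usage = resp.json()["usage"]
print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])
print(usage.get("cost"))  # <-- the new field proposed above
```

Because the field is additive, existing clients that ignore unknown keys in `usage` would keep working unchanged.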

Originally posted by @olad32 in https://github.com/BerriAI/litellm/discussions/3982

krrishdholakia commented 3 months ago

Addressed with https://github.com/BerriAI/litellm/pull/4436
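Per the issue title ("headers/hidden params"), the cost is exposed alongside the OpenAI-format body rather than inside `usage`. A hedged sketch of reading it via the Python SDK; the exact `"response_cost"` key is an assumption about what the linked PR exposes:

```python
# Hedged sketch: read the response cost from the SDK response's hidden
# params. The "response_cost" key is an assumption about what the linked
# PR exposes; a valid provider API key must be set in the environment.
import litellm

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hello"}],
)
print(response._hidden_params.get("response_cost"))
```

For proxy callers not using the Python SDK, the same value would surface as a response header (e.g. an `x-litellm-*` header; the exact name is assumed), so it can be read without any change to the response body schema.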

@superpoussin22 can we do a 10min call sometime this/next week?

https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Would love to learn how you're using the litellm proxy so we can improve.