Open tischi opened 11 months ago
related issues: #24 #34
Hi @haesleinhuepf @tischi, I'm the maintainer of LiteLLM (https://github.com/BerriAI/litellm). We let you do cost tracking for 100+ LLMs.
Docs: https://docs.litellm.ai/docs/#calculate-costs-usage-latency
```python
from litellm import completion, completion_cost
import os

os.environ["OPENAI_API_KEY"] = "your-api-key"

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
)

cost = completion_cost(completion_response=response)
print("Cost for completion call with gpt-3.5-turbo: ", f"${float(cost):.10f}")
```
We also let you create a self-hosted, OpenAI-compatible proxy server to make your LLM calls (100+ LLMs) and track costs and token usage. Docs: https://docs.litellm.ai/docs/simple_proxy
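For reference, a minimal sketch of what a proxy configuration might look like, assuming LiteLLM's `model_list` config format as described in the proxy docs linked above (the exact key names and schema may have changed, so treat this as illustrative, not authoritative):

```yaml
# config.yaml — hypothetical minimal LiteLLM proxy config
model_list:
  - model_name: gpt-3.5-turbo        # name clients will request
    litellm_params:
      model: gpt-3.5-turbo           # underlying provider model
      api_key: os.environ/OPENAI_API_KEY
```

Since the proxy is OpenAI-compatible, existing OpenAI client code should only need its base URL pointed at the proxy.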
I hope this is helpful; if not, I'd love your feedback on what we can improve.
If we want to compute the price for a request, we may not have to count the tokens ourselves, as the counts are already provided in the model response: https://platform.openai.com/docs/guides/text-generation/chat-completions-response-format
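To illustrate, the chat-completions response carries a `usage` object with `prompt_tokens`, `completion_tokens`, and `total_tokens`, so pricing a request reduces to multiplying those counts by per-token rates. A minimal sketch, where the `PRICE_PER_1K` rates are illustrative placeholders (not current OpenAI pricing) and `completion_price` is a hypothetical helper, not part of any library:

```python
# Illustrative per-1000-token prices in USD; look up the real rates
# for the model you actually use.
PRICE_PER_1K = {
    "gpt-3.5-turbo": {"prompt": 0.0015, "completion": 0.002},
}

def completion_price(model: str, usage: dict) -> float:
    """Price one request from the token counts the API already reports."""
    rates = PRICE_PER_1K[model]
    return (usage["prompt_tokens"] / 1000 * rates["prompt"]
            + usage["completion_tokens"] / 1000 * rates["completion"])

# "usage" as it appears in a chat-completions response payload:
usage = {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
print(f"${completion_price('gpt-3.5-turbo', usage):.6f}")
```

This avoids re-tokenizing the prompt locally; the server-side counts are the ones you are actually billed for.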