BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: Auto-update OpenRouter pricing #2407

Open krrishdholakia opened 7 months ago

krrishdholakia commented 7 months ago

The Feature

Openrouter pricing is here - https://openrouter.ai/api/v1/models

We need a GitHub bot that can fetch this data, translate it to our model_cost_map format, and open a PR updating the model_cost_map with it.

Our model cost map - https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json

Any community help with this would be appreciated!
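For anyone picking this up, here is a minimal sketch of the fetch step. It assumes the public endpoint returns a top-level `{"data": [...]}` list (the function names here are placeholders, not existing litellm helpers):

```python
import json
import urllib.request

OPENROUTER_MODELS_URL = "https://openrouter.ai/api/v1/models"

def parse_models(raw: str) -> list[dict]:
    """Extract the model list from the endpoint's {"data": [...]} payload."""
    return json.loads(raw)["data"]

def fetch_openrouter_models(url: str = OPENROUTER_MODELS_URL) -> list[dict]:
    """Download the OpenRouter model list; each item carries id, pricing,
    context_length, architecture, etc."""
    with urllib.request.urlopen(url) as resp:
        return parse_models(resp.read().decode("utf-8"))
```

Keeping the parsing separate from the network call makes the translation logic easy to unit-test in CI without hitting the live endpoint.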

Motivation, pitch

Help users get more accurate pricing.

Twitter / LinkedIn details

No response

vietpham1911 commented 6 months ago

Hello, I would like to work on this issue. Can you assign it to me, please? I'd also like to check that I'm understanding the issue correctly.

{
      "id": "openai/gpt-3.5-turbo",
      "name": "OpenAI: GPT-3.5 Turbo",
      "description": "GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data: up to Sep 2021.",
      "pricing": {
        "prompt": "0.000001",
        "completion": "0.000002",
        "image": "0",
        "request": "0"
      },
      "context_length": 4095,
      "architecture": {
        "modality": "text",
        "tokenizer": "GPT",
        "instruct_type": null
      },
      "top_provider": {
        "max_completion_tokens": null,
        "is_moderated": true
      },
      "per_request_limits": null
},
"openrouter/openai/gpt-3.5-turbo": {
        "max_tokens": 4095,
        "input_cost_per_token": 0.0000015,
        "output_cost_per_token": 0.000002,
        "litellm_provider": "openrouter",
        "mode": "chat"
},

And the model_prices_and_context_window.json should be updated like this:

"openrouter/openai/gpt-3.5-turbo": {
    "max_tokens": 4095,
    "input_cost_per_token": 0.000001,
    "output_cost_per_token": 0.000002,
    "litellm_provider": "openrouter",
    "mode": "chat"
},

That is, I would update "max_tokens", "input_cost_per_token", and "output_cost_per_token" in the JSON file by taking the values of "context_length", "prompt", and "completion" from https://openrouter.ai/api/v1/models. Is my understanding correct?
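If that mapping is right, the translation step could look something like the sketch below, which converts one OpenRouter model entry into a model_cost_map entry (the function name is hypothetical; the field names on both sides come from the examples above):

```python
def to_model_cost_entry(model: dict) -> tuple[str, dict]:
    """Map one OpenRouter /api/v1/models entry to a model_cost_map entry.

    OpenRouter reports per-token prices as strings; the cost map uses floats.
    """
    key = f"openrouter/{model['id']}"
    entry = {
        "max_tokens": model["context_length"],
        "input_cost_per_token": float(model["pricing"]["prompt"]),
        "output_cost_per_token": float(model["pricing"]["completion"]),
        "litellm_provider": "openrouter",
        "mode": "chat",
    }
    return key, entry

# Example, using the GPT-3.5 Turbo entry from above:
sample = {
    "id": "openai/gpt-3.5-turbo",
    "context_length": 4095,
    "pricing": {"prompt": "0.000001", "completion": "0.000002"},
}
key, entry = to_model_cost_entry(sample)
# key   -> "openrouter/openai/gpt-3.5-turbo"
# entry -> {"max_tokens": 4095, "input_cost_per_token": 1e-06, ...}
```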

vietpham1911 commented 5 months ago

@krrishdholakia @ishaan-jaff

Merlinvt commented 5 months ago

I think it would be great if it could also pull whether the model supports vision. This can probably be done by checking whether "modality" is "multimodal". Example (Gemini):

"architecture": {
    "modality": "multimodal",
    "tokenizer": "Gemini",
    "instruct_type": null
},
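As a sketch of that check (a hypothetical helper, assuming the "architecture" shape shown above):

```python
def supports_vision(model: dict) -> bool:
    """True when the OpenRouter entry lists its modality as multimodal."""
    return model.get("architecture", {}).get("modality") == "multimodal"

# supports_vision({"architecture": {"modality": "multimodal"}}) is True
# supports_vision({"architecture": {"modality": "text"}}) is False
```

The translated cost-map entry could then carry a supports_vision flag alongside the pricing fields.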

Merlinvt commented 5 months ago

I created a pull request adding some additional models. It is not the implementation requested in this issue, but it may be helpful in the meantime:

https://github.com/BerriAI/litellm/pull/3545