BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: track gemini image tokens #3269

Open krrishdholakia opened 6 months ago

krrishdholakia commented 6 months ago

The Feature

https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/send-multimodal-prompts#image-requirements

Motivation, pitch

accurate cost tracking

Twitter / LinkedIn details

No response

krrishdholakia commented 5 months ago

For Gemini 1.0 Pro Vision, each image accounts for 258 tokens.

For Gemini 1.5 Flash and Gemini 1.5 Pro:

If both dimensions of an image are less than or equal to 384 pixels, then 258 tokens are used. If either dimension is greater than 384 pixels, the image is cropped into tiles. Each tile size defaults to the smallest dimension (width or height) divided by 1.5. If necessary, each tile is adjusted so that it's not smaller than 256 pixels and not greater than 768 pixels. Each tile is then resized to 768x768 and uses 258 tokens.
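
A rough sketch of how the rule above could be turned into a token estimate for cost tracking. This is not existing litellm code: `estimate_image_tokens` and its `fixed_cost` flag are hypothetical names. The quoted text also doesn't say how many tiles a given image produces, so the sketch assumes a simple grid that covers the image (ceil of width / tile size times ceil of height / tile size). That part is a guess and should be checked against the token counts the API actually reports.

```python
import math

TOKENS_PER_TILE = 258  # per image (1.0 Pro Vision) or per tile (1.5 models)


def estimate_image_tokens(width: int, height: int, fixed_cost: bool = False) -> int:
    """Estimate Gemini input tokens for one image (hypothetical helper).

    fixed_cost=True models Gemini 1.0 Pro Vision, where every image
    counts as 258 tokens regardless of size.
    """
    if fixed_cost:
        return TOKENS_PER_TILE

    # Small images are not tiled.
    if width <= 384 and height <= 384:
        return TOKENS_PER_TILE

    # Tile size defaults to the smallest dimension / 1.5,
    # clamped to the range [256, 768].
    tile = min(width, height) / 1.5
    tile = max(256.0, min(768.0, tile))

    # ASSUMPTION: tiles form a grid covering the whole image.
    # The quoted docs do not spell out how the tile count is derived.
    tiles = math.ceil(width / tile) * math.ceil(height / tile)
    return tiles * TOKENS_PER_TILE


if __name__ == "__main__":
    for size in [(300, 300), (1024, 1024), (4000, 3000)]:
        print(size, estimate_image_tokens(*size))
```

The image dimensions would have to be read from the request payload (for example by decoding the base64 image or fetching the URL) before a helper like this could be called.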