BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: Unknown context window size and costs can be determined at runtime (enhancement) #5639

Open iplayfast opened 2 months ago

iplayfast commented 2 months ago

The Feature

This feature request was originally submitted to aider, but the maintainer said it belongs here.

https://github.com/paul-gauthier/aider/issues/1380#issuecomment-2339128896

Issue

When using Ollama, I ran the following:

aider --model=ollama/yi-coder:latest
Model ollama/yi-coder:latest: Unknown context window size and costs, using sane defaults.
For more info, see: https://aider.chat/docs/llms/warnings.html

However if in another terminal I run the following:

 ollama run yi-coder
>>> /show
Available Commands:
  /show info         Show details for this model
  /show license      Show model license
  /show modelfile    Show Modelfile for this model
  /show parameters   Show parameters for this model
  /show system       Show system message
  /show template     Show prompt template

>>> /show info
  Model                      
    arch                llama        
    parameters          8.8B         
    quantization        Q4_0         
    context length      131072       
    embedding length    4096         

  Parameters                 
    stop    "<|endoftext|>"         
    stop    "<|im_end|>"            
    stop    "<fim_prefix>"          
    stop    "<fim_suffix>"          
    stop    "<fim_middle>"          

  License                    
    Apache License                
    Version 2.0, January 2004     

You can see that the context length is actually given; this particular model prides itself on its large context window. This information can be retrieved via the Ollama API: https://github.com/ollama/ollama/blob/main/docs/api.md#show-model-information
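
A minimal sketch of that lookup, assuming a local Ollama server on its default port. The request field ("model" vs. "name") and the exact response keys (model_info, llama.context_length) vary with the Ollama version, so treat the key handling below as an assumption:

import requests

def get_ollama_context_length(model: str, base_url: str = "http://localhost:11434") -> int | None:
    # POST /api/show returns the model's metadata; older Ollama servers
    # expect the field "name" instead of "model" (assumption: recent server).
    resp = requests.post(f"{base_url}/api/show", json={"model": model}, timeout=10)
    resp.raise_for_status()
    model_info = resp.json().get("model_info", {})
    # The context length is keyed by architecture, e.g. "llama.context_length".
    for key, value in model_info.items():
        if key.endswith(".context_length"):
            return int(value)
    return None

print(get_ollama_context_length("yi-coder:latest"))  # 131072 for the model shown above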

Version and model info

No response

Motivation, pitch

It would be useful to be able to pull a model's metadata from Ollama at runtime, including its context window size.

Twitter / LinkedIn details

No response

krrishdholakia commented 2 months ago

Checking with aider which function is being used for this information.

krrishdholakia commented 2 months ago

Action item: run the Ollama show request inside the get_model_info call.
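
A hypothetical sketch of that fallback, reusing the get_ollama_context_length helper from the earlier comment. The wrapper name and the zero-cost assumption for local models are illustrative; the returned keys mirror litellm's model-map entries but this is not the actual implementation:

import litellm

def get_model_info_with_ollama_fallback(model: str) -> dict:
    try:
        # Use litellm's static model map when the model is known.
        return litellm.get_model_info(model=model)
    except Exception:
        if not model.startswith("ollama/"):
            raise
        # Fall back to asking the running Ollama server at runtime.
        ctx = get_ollama_context_length(model.removeprefix("ollama/"))
        # Assumption: locally served models have zero per-token cost.
        return {
            "max_tokens": ctx,
            "max_input_tokens": ctx,
            "max_output_tokens": ctx,
            "input_cost_per_token": 0.0,
            "output_cost_per_token": 0.0,
            "litellm_provider": "ollama",
            "mode": "chat",
        }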