Aider-AI / aider

aider is AI pair programming in your terminal
https://aider.chat/
Apache License 2.0

litellm bug is causing "unknown model" warnings in aider for Ollama models #2318

Open sirus20x6 opened 5 days ago

sirus20x6 commented 5 days ago
Warning for ollama/vanilj/supernova-medius:q6_k_l: Unknown context window size and costs, using sane defaults.
Did you mean one of these?
- ollama/vanilj/supernova-medius:q6_k_l
You can skip this check with --no-show-model-warnings

.env

OLLAMA_API_BASE=http://127.0.0.1:11434
AIDER_DARK_MODE=true
AIDER_MODEL="ollama/vanilj/supernova-medius:q6_k_l"
AIDER_SHOW_DIFFS=true
AIDER_GIT=true
AIDER_GITIGNORE=true
AIDER_AIDERIGNORE=.aiderignore
AIDER_TEST_CMD="cd build && ./build.sh"
AIDER_AUTO_TEST=true
AIDER_TEST=true

.aider.model.metadata.json

{
    "ollama/vanilj/supernova-medius:q6_k_l": {
        "max_tokens": 131072,
        "max_input_tokens": 131072,
        "max_output_tokens": 4096,
        "input_cost_per_token": 0.0,
        "output_cost_per_token": 0.0,
        "litellm_provider": "ollama",
        "mode": "chat"
    }
}

It seems to me that those model names match, so it should be picking up the settings?
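For what it's worth, the lookup can be reproduced outside aider. As far as I can tell, aider registers the metadata file's contents with litellm via litellm.register_model() and later resolves the model with litellm.get_model_info(), so a minimal sketch of the same path (assuming litellm is installed and the JSON file is in the working directory) looks like this:

import json
import litellm

# Load the same metadata file that aider is pointed at.
with open(".aider.model.metadata.json") as f:
    metadata = json.load(f)

# Register the custom entries with litellm, roughly as aider does on startup.
litellm.register_model(metadata)

# Ask litellm what it knows about the model. If this raises or falls back
# to defaults, the problem is in the lookup, not in the JSON file.
print(litellm.get_model_info("ollama/vanilj/supernova-medius:q6_k_l"))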

paul-gauthier commented 3 days ago

When reporting problems, it is very helpful if you can provide:

- the aider version
- the LLM model you are using

Including the "announcement" lines that aider prints at startup is an easy way to share some of this helpful info.

Aider v0.37.1-dev
Models: gpt-4o with diff edit format, weak model gpt-3.5-turbo
Git repo: .git with 243 files
Repo-map: using 1024 tokens
exclamaforte commented 3 days ago

I'm seeing the same thing when starting aider in an empty git project:

$ aider --model ollama/qwen2.5-coder:32b --model-metadata-file .aider.model.metadata.json
──────────────────────────────────────────────────────────────────────
Warning for ollama/qwen2.5-coder:32b: Unknown context window size and costs, using sane defaults.
Did you mean one of these?
- ollama/qwen2.5-coder:32b
You can skip this check with --no-show-model-warnings

https://aider.chat/docs/llms/warnings.html
Open documentation url for more info? (Y)es/(N)o [Yes]: N

Aider v0.62.1
Model: ollama/qwen2.5-coder:32b with whole edit format
Git repo: .git with 0 files
Repo-map: disabled
Use /help <question> for help, run "aider --help" to see cmd line args
──────────────────────────────────────────────────────────────────────
>

Platform: macOS

Metadata file:

{
    "ollama/qwen2.5-coder:32b": {
        "max_tokens": 4096,
        "max_input_tokens": 128000,
        "max_output_tokens": 128000,
        "input_cost_per_token": 0.000000000014,
        "output_cost_per_token": 0.000000000028,
        "litellm_provider": "ollama",
        "mode": "chat"
    }
}
paul-gauthier commented 3 days ago

Ok, aider was swallowing an Ollama exception, which I've now fixed so it is allowed to explode. You need to have your Ollama server running and your API base set.

Exception: OllamaError: Error getting model info for ollama/qwen2.5-coder:32b. Set Ollama API Base via `OLLAMA_API_BASE` environment variable. Error: [Errno 61] Connection refused
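If you want to sanity check that the server is actually reachable at the default API base before retrying, a quick probe of Ollama's /api/tags endpoint (which lists the locally installed models) works; a standard-library sketch:

import json
import urllib.request

# Default Ollama API base; adjust if OLLAMA_API_BASE points elsewhere.
url = "http://127.0.0.1:11434/api/tags"

# A connection-refused error here is the same failure litellm hits above.
with urllib.request.urlopen(url) as resp:
    tags = json.load(resp)

print([m["name"] for m in tags.get("models", [])])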

The change is available in the main branch. You can get it by installing the latest version from github:

aider --install-main-branch

# or...

python -m pip install --upgrade --upgrade-strategy only-if-needed git+https://github.com/Aider-AI/aider.git

If you have a chance to try it, let me know if it works better for you.

paul-gauthier commented 3 days ago

Looks like this may be an underlying litellm bug.

https://github.com/BerriAI/litellm/issues/6703

sirus20x6 commented 3 days ago

I did some man-in-the-middling:

POST /api/show HTTP/1.1
Host: 127.0.0.1:11434
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
User-Agent: litellm/1.51.2
Content-Length: 28
Content-Type: application/json

{"name": "ollama/qc:latest"}HTTP/1.1 404 Not Found
Content-Type: application/json; charset=utf-8
Date: Tue, 12 Nov 2024 04:38:02 GMT
Content-Length: 46

{"error":"model 'ollama/qc:latest' not found"}

"ollama/" isn't being stripped off the model in the request

I edited `def get_model_info` to pass in my model name, built from source, and pip-installed the wheel, but no change. The problem must be in `return litellm.get_model_info(model)`.

andrew528i commented 1 day ago

I found a temporary solution: create a Modelfile and then run `ollama create ollama/qwen2.5-coder:32b -f Modelfile`.

The content of the Modelfile:

FROM qwen2.5-coder:32b

After these manipulations I was able to run aider with qwen2.5-coder:32b. (I made a custom model with my own system prompt, but you can do the same by just renaming the original model.)

sirus20x6 commented 9 hours ago

I updated litellm and aider to the latest versions (litellm==1.52.8). The fix he pushed didn't seem to make a difference. The Modelfile workaround did, but then Ollama crashed on me: it seems to load too many layers onto my GPU, when I have a server pretty much built for CPU inference and really only want one GPU layer for the prompt-processing speedup. I gave up trying to find where to change that setting in Ollama and decided to try switching to llama-cpp-python, since llama.cpp has always worked the best for me out of anything. Well...

aider
──────────────────────────────────────────────────────────────────────
Warning for llama-cpp-python/qwen2.5-coder:32b-instruct-q8_0: Unknown context window size and costs, using sane defaults.
Did you mean one of these?
- llama-cpp-python/qwen2.5-coder:32b-instruct-q8_0
You can skip this check with --no-show-model-warnings

https://aider.chat/docs/llms/warnings.html
Open documentation url for more info? (Y)es/(N)o/(D)on't ask again [Yes]:
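On the GPU layers point above: Ollama does expose that as a per-model setting. The Modelfile `PARAMETER num_gpu` controls how many layers are offloaded to the GPU, so a variant of the workaround Modelfile above could pin it to one layer; a sketch, assuming the same base model:

FROM qwen2.5-coder:32b
PARAMETER num_gpu 1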
paul-gauthier commented 7 hours ago

Litellm just released a fixed version. The main branch of aider uses it now.

The change is available in the main branch. You can get it by installing the latest version from github:

aider --install-main-branch

# or...

python -m pip install --upgrade --upgrade-strategy only-if-needed git+https://github.com/Aider-AI/aider.git

If you have a chance to try it, let me know if it works better for you.

paul-gauthier commented 6 hours ago

Sorry, the above install approaches apparently won't bump the Litellm version.

Try:

pip install -U litellm
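To confirm the bump actually took effect in the same environment aider runs from, a quick standard-library check works:

import importlib.metadata

# Should now report a litellm release that includes the /api/show fix.
print(importlib.metadata.version("litellm"))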
sirus20x6 commented 6 hours ago

I ended up just hacking together a llama.cpp backend.