paul-gauthier / aider

aider is AI pair programming in your terminal
https://aider.chat/
Apache License 2.0

Yi coder cycles #1462

Open kv-gits opened 1 week ago

kv-gits commented 1 week ago

Issue

The new model Yi Coder has unusual BOS/EOS tokens (<|startoftext|> and <|endoftext|>), so it can't stop generating and goes into an infinite loop. It would be nice to have an option to change these tokens.

Version and model info

Aider 0.55.0, Yi-Coder-9B-Chat
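
(Aside, not part of the original report: until such an option exists, a stop sequence can paper over the runaway generation when calling the model through LiteLLM directly. A minimal sketch, assuming a local llama.cpp server behind an OpenAI-compatible endpoint; the model name, URL, and key below are placeholders:)

import litellm

# Hypothetical workaround: pass Yi Coder's EOS token as a stop sequence
# so the client cuts generation off even if the backend never stops.
response = litellm.completion(
    model="openai/yi-coder-9b-chat",      # placeholder model name
    api_base="http://localhost:8080/v1",  # placeholder local server URL
    api_key="dummy",                      # local servers typically ignore this
    messages=[{"role": "user", "content": "Write hello world in Python."}],
    stop=["<|endoftext|>"],               # the EOS token reported above
)
print(response.choices[0].message.content)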

fry69 commented 1 week ago

I have given this a shot, but I have not tested if it actually works. Please try it and report back ->

pip install git+https://github.com/fry69/aider.git@eos-bos

The new command line arguments are --bos-token "<|startoftext|>" and --eos-token "<|endoftext|>".
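
For example, a hypothetical invocation (assuming the branch is installed; the Ollama model name is a placeholder):

aider --model ollama/yi-coder:9b --bos-token "<|startoftext|>" --eos-token "<|endoftext|>"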

Update: I tested it locally, but I can't tell if this makes a difference. Can you please provide a test case?

I have seen one kind of cycle with BOS/EOS tokens set (using Yi Coder 9B via Ollama). aider did not loop infinitely, but it asked questions and answered itself, see the screenshot ->

[screenshot: short cycle]

Is this what the BOS/EOS tokens should prevent, or just the infinite loops (which I don't know how to trigger yet)?

kv-gits commented 1 week ago

Again no difference. I think it is a problem on the GGUF model side.

fry69 commented 6 days ago

@kv-gits

Can you please try again? I fixed my branch to actually send the BOS/EOS tokens to LiteLLM; it involved a bit more than I initially thought. Here is how a request now looks in the Ollama debug log:

--- snip ---
time=2024-09-11T11:22:14.745+02:00 level=DEBUG source=routes.go:211 msg="generate request" prompt="<|im_start|>user\n\n<|startoftext|>### System:\nAct as an expert code analyst.\nAnswer questions about the supplied code.\n\nAlways reply to the user in the same language they are using.\n<|endoftext|><|startoftext|>### User:\nI am not sharing the full contents of any files with you yet.<|endoftext|><|startoftext|>### Assistant:\nOk.<|endoftext|><|startoftext|>### User:\nJust another test.<|endoftext|>\n<|im_end|>\n<|im_start|>assistant\n" images=[]
--- snap ---

Update: branch rebased to current main, still seems to work fine, including tests now.
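
For illustration, a rough sketch of the kind of wrapping the log above suggests (hypothetical; not the actual code on the branch):

def wrap_messages(messages, bos="<|startoftext|>", eos="<|endoftext|>"):
    # Flatten chat messages into one prompt string, surrounding each
    # message with the model's BOS/EOS tokens, mirroring the sections
    # visible in the debug log (### System:, ### User:, ### Assistant:).
    parts = []
    for msg in messages:
        role = msg["role"].capitalize()
        parts.append(f"{bos}### {role}:\n{msg['content']}{eos}")
    return "".join(parts)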

fry69 commented 6 days ago

As I dig deeper into this topic, I think BOS/EOS tokens are best left alone. It seems to be the responsibility of the inference software to apply them correctly, and Ollama has access to Yi Coder's special_tokens_map.json file to read them from.
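
For reference, the relevant entries in that file look roughly like this (abridged; only the two tokens discussed in this issue):

{
  "bos_token": "<|startoftext|>",
  "eos_token": "<|endoftext|>"
}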

If I understand this correctly, the client (aider + LiteLLM in this case) just has to supply the correct format (ChatML) to Ollama, and the inference software (Ollama/llama.cpp) has to translate that into the format the actual model expects (token-wise). But please correct me if I am wrong.
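
Concretely (illustrative only; the exact chat template lives in the model files, not in the client):

# What the client (aider + LiteLLM) sends: plain role/content messages.
messages = [
    {"role": "system", "content": "Act as an expert code analyst."},
    {"role": "user", "content": "Just another test."},
]

# What the inference side renders via its ChatML-style template:
#   <|im_start|>system
#   Act as an expert code analyst.<|im_end|>
#   <|im_start|>user
#   Just another test.<|im_end|>
#   <|im_start|>assistant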

paul-gauthier commented 6 days ago

When reporting problems, it is very helpful if you can provide the “announcement” lines that aider prints at startup:

Aider v0.37.1-dev
Models: gpt-4o with diff edit format, weak model gpt-3.5-turbo
Git repo: .git with 243 files
Repo-map: using 1024 tokens

kv-gits commented 6 days ago

@fry69 Most probably you are right. I use llama.cpp, not Ollama. Also, the aider benchmarks work fine with Yi Coder. I will try Ollama and give feedback.

fry69 commented 6 days ago

Maybe also relevant -> https://huggingface.co/01-ai/Yi-Coder-9B-Chat/discussions/4

kv-gits commented 6 days ago

@fry69 You are right. With Ollama the model works fine, and there is no need to set BOS/EOS.