abishekmuthian opened this issue 1 month ago
The issue seems to be related to context size (https://github.com/continuedev/continue/issues/1776); starting a new session in chat makes Codestral usable, but still not as fast as open-webui.
Can confirm this. Codestral is extremely slow (about one word every ~20 seconds) in Continue, while it is blazing fast when used through Ollama's console directly. A striking difference.
P.S. Yep, looks like adjusting context size fixes this.
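For anyone hitting the same slowdown, the context-size workaround can be applied in Continue's `config.json`. This is only a sketch: the exact model tag and the `contextLength`/`maxTokens` values below are assumptions and should be tuned to your available VRAM.

```json
{
  "models": [
    {
      "title": "Codestral",
      "provider": "ollama",
      "model": "codestral:latest",
      "contextLength": 8192,
      "completionOptions": {
        "maxTokens": 1024
      }
    }
  ]
}
```

Lowering `contextLength` keeps Ollama from allocating an oversized KV cache, which is what appears to push the model off the fast path.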
Before submitting your bug report
Relevant environment info
Description
Codestral responses via Ollama have suddenly become very slow with the latest updates: at least 4 times slower than in other apps such as open-webui and curl, which remain very fast. Other models, such as deepseek-coder-v2, work fine in Continue.
To reproduce
docker logs --follow ollama
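To rule out Continue itself, the same model can be timed directly against Ollama's HTTP API while watching the logs above. This is a sketch assuming Ollama listens on the default port 11434; the prompt and `num_ctx` value are arbitrary and should mirror whatever context size you are testing in Continue.

```shell
# Time a single non-streaming completion straight against Ollama,
# bypassing Continue. Assumes Ollama on the default port 11434.
time curl -s http://localhost:11434/api/generate -d '{
  "model": "codestral",
  "prompt": "Write a hello world in Python.",
  "stream": false,
  "options": { "num_ctx": 8192 }
}'
```

Comparing this timing with a request issued from Continue (with the same `num_ctx`) makes the log output below easier to interpret.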
Log output
Note: Model is already loaded in VRAM before testing.
Ollama logs for Codestral via Continue
Ollama logs for Codestral via open-webui