Open vinoudropdrop opened 3 weeks ago
Hi! Thank you so much for your exceptional work, you’re incredible!
❤️
> adjust the token limit by retrieving the maximum allowed token length (`context_length`) via the REST API
This is a great idea! Thank you for the suggestion
This would be a very useful feature, thanks for the suggestion and the awesome plugin!
Curious, which model do you find best for coding on ollama?
However, with the LLM I'm using through Ollama, I'm encountering errors that seem related to token length. Could you add a feature that adjusts the token limit by retrieving the maximum allowed token length (`context_length`) via the REST API of the LLM being used?
More issues might stem from other things, but I've noticed that when I ask Claude-Dev to perform a simple "Hello World," it can do it. However, as soon as I ask for something more complex, it seems to lose track. I suspect there might be a problem with memory limits.
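For what it's worth, Ollama's REST API does expose this today: a POST to `/api/show` returns a `model_info` map whose keys include `<architecture>.context_length` (e.g. `qwen2.context_length`). A minimal sketch of the lookup — the helper names and the 2048-token fallback are my own, and older Ollama versions may expect the request field to be `name` rather than `model`:

```python
import json
import urllib.request

def extract_context_length(show_response: dict, default: int = 2048) -> int:
    """Pull <arch>.context_length out of an /api/show response body."""
    info = show_response.get("model_info", {})
    for key, value in info.items():
        if key.endswith(".context_length"):
            return int(value)
    return default  # assumed fallback when the server doesn't report it

def get_context_length(model: str, host: str = "http://localhost:11434") -> int:
    """Ask a running Ollama server for a model's maximum context length."""
    payload = json.dumps({"model": model}).encode()
    req = urllib.request.Request(f"{host}/api/show", data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return extract_context_length(json.load(resp))

# Illustrative /api/show response fragment (values are made up):
sample = {"model_info": {"general.architecture": "qwen2",
                         "qwen2.context_length": 32768}}
print(extract_context_length(sample))  # 32768
```

An extension could call `get_context_length()` once at startup and clamp its prompt size accordingly, instead of relying on a hard-coded limit.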
It depends on what you want to do. I don't use it with Claude-dev but with Continue. I use deepseek-coder, kangali/room-research, kangali/room-coder, mistral-nemo, ALIENTELLIGENCE/pythoncoderv2, and sometimes other models. Most of them don't support tools, which Claude-dev needs (mistral-nemo does). People said MFDoom/deepseek-coder-v2-tool-calling works with Claude-dev, but for me it's a buggy-woogy festival! No idea why.
When I ask it to write a simple 'hello world', it never succeeds; the output contains wrong information, such as list_files, ....
The Ollama implementation is currently only at the preview stage, so its API is likely not yet fully implemented. Moreover, due to architectural differences, it's unlikely to be fully functional in the near future. When you observe basic errors that even a large and powerful language model like Gemini can make, you realize the long road ahead before we have a reliable and bug-free local LLM.
I recently tested the newly released Qwen2.5-Coder, which now comes with tool support.
Just a heads-up: Ollama's default context window is small, so you might want to raise `num_ctx` to take advantage of the model's full 128K-token context.
For example:
```shell
$ ollama run qwen2.5-coder
>>> /set parameter num_ctx 131072
Set parameter 'num_ctx' to '131072'.
>>> /save qwen2.5-coder
Created new model 'qwen2.5-coder'
>>> /bye
```
So far, the results seem decent compared to other models supported by Ollama. Just sharing for reference in case anyone else wants to give it a try.
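If you drive Ollama over HTTP instead of the interactive CLI, the same override can be sent per request through the `options` field of `/api/generate` (or `/api/chat`), with no need to `/save` a modified model. A small sketch — the prompt and endpoint values are illustrative:

```python
import json

# Per-request context-window override for Ollama's /api/generate endpoint.
payload = {
    "model": "qwen2.5-coder",
    "prompt": "Write hello world in Python.",
    "stream": False,
    "options": {"num_ctx": 131072},  # same value as the /set parameter example
}
body = json.dumps(payload)
print(body)
# POST this body to http://localhost:11434/api/generate
# with Content-Type: application/json.
```

This keeps the saved model untouched, which is handy when different clients want different context sizes.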
> I recently tested the newly released Qwen2.5-Coder, which now comes with tool support.
Just tested it, and for a local LLM it's better than anything in my previous tests! I still get errors where tools fail to make modifications in files or the terminal, but it's better than any other model I've tried with Ollama. Thank you!