Open vinoudropdrop opened 3 weeks ago
Hi! Thank you so much for your exceptional work, you’re incredible!
❤️
> adjust the token limit by retrieving the maximum allowed token length (`context_length`) via the REST API
This is a great idea! Thank you for the suggestion
This would be a very useful feature, thanks for the suggestion and the awesome plugin!
Curious, which model do you find best for coding on ollama?
However, with the LLM I'm using through Ollama, I'm encountering errors that seem related to token length. Could you add a feature that adjusts the token limit by retrieving the maximum allowed token length (`context_length`) via the REST API of the LLM being used?
More issues might stem from other things, but I've noticed that when I ask Claude-Dev to perform a simple "Hello World," it can do it. However, as soon as I ask for something more complex, it seems to lose track. I suspect there might be a problem with memory limits.
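For what it's worth, Ollama's REST API does expose this today: a POST to `/api/show` returns a `model_info` map whose keys include `<architecture>.context_length` (e.g. `qwen2.context_length`). A minimal sketch of the lookup — the helper names and the 2048-token fallback are my own, and older Ollama versions may expect the request field to be `name` rather than `model`:

```python
import json
import urllib.request

def extract_context_length(show_response: dict, default: int = 2048) -> int:
    """Pull <arch>.context_length out of an /api/show response body."""
    info = show_response.get("model_info", {})
    for key, value in info.items():
        if key.endswith(".context_length"):
            return int(value)
    return default  # assumed fallback when the server doesn't report it

def get_context_length(model: str, host: str = "http://localhost:11434") -> int:
    """Ask a running Ollama server for a model's maximum context length."""
    payload = json.dumps({"model": model}).encode()
    req = urllib.request.Request(f"{host}/api/show", data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return extract_context_length(json.load(resp))

# Illustrative /api/show response fragment (values are made up):
sample = {"model_info": {"general.architecture": "qwen2",
                         "qwen2.context_length": 32768}}
print(extract_context_length(sample))  # 32768
```

An extension could call `get_context_length()` once at startup and clamp its prompt size accordingly, instead of relying on a hard-coded limit.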
It depends on what you want to do. I don't use it with Claude-dev but with Continue. I use deepseek-coder, kangali/room-research, kangali/room-coder, mistral-nemo, ALIENTELLIGENCE/pythoncoderv2, and sometimes other models. Most of them don't support tools, which Claude-dev needs (mistral-nemo does). People said MFDoom/deepseek-coder-v2-tool-calling works with Claude-dev, but for me it's a buggy-woogy festival! No idea why.
When I ask it to write a simple 'hello world', it never succeeds; the output contains wrong information, such as list_files, ....
The Ollama implementation is currently only at the preview stage, so its API is likely not yet fully implemented. Moreover, due to architectural differences, it's unlikely to be fully functional in the near future. When you observe basic errors that even a large and powerful language model like Gemini can make, you realize the long road ahead before we have a reliable and bug-free local LLM.
I recently tested the newly released Qwen2.5-Coder, which now comes with tool support.
Just a heads-up: Ollama's default context window is small, so you might want to raise `num_ctx` to take advantage of the model's full 128K-token context.
For example:
```shell
$ ollama run qwen2.5-coder
>>> /set parameter num_ctx 131072
Set parameter 'num_ctx' to '131072'.
>>> /save qwen2.5-coder
Created new model 'qwen2.5-coder'
>>> /bye
```
So far, the results seem decent compared to other models supported by Ollama. Just sharing for reference in case anyone else wants to give it a try.
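If you drive Ollama over HTTP instead of the interactive CLI, the same override can be sent per request through the `options` field of `/api/generate` (or `/api/chat`), with no need to `/save` a modified model. A small sketch — the prompt and endpoint values are illustrative:

```python
import json

# Per-request context-window override for Ollama's /api/generate endpoint.
payload = {
    "model": "qwen2.5-coder",
    "prompt": "Write hello world in Python.",
    "stream": False,
    "options": {"num_ctx": 131072},  # same value as the /set parameter example
}
body = json.dumps(payload)
print(body)
# POST this body to http://localhost:11434/api/generate
# with Content-Type: application/json.
```

This keeps the saved model untouched, which is handy when different clients want different context sizes.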
> I recently tested the newly released Qwen2.5-Coder, which now comes with tool support.
Just tested it, and for a local LLM it's better than anything in my previous tests! I still get errors where tools fail to make modifications in files or the terminal, but it's better than any other model I've tried with Ollama. Thank you!