Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
https://anythingllm.com
MIT License

[FEAT]: Allow for user to modify context window size after supplying a suitable default. #2289

Open JDelekto opened 1 month ago

JDelekto commented 1 month ago

What would you like to see?

I followed another thread, which looks as if it has already been closed, requesting the ability to set a context size per workspace. I am currently using the Windows version of AnythingLLM (version 1.6.7) and notice that when I create new workspaces set to different models, I cannot set a context window size at the workspace level, only in the main settings. This is contrary to what I had seen in earlier videos showing AnythingLLM, where it looked as if the context window size could be set per workspace.

I surmise AnythingLLM looks at model properties for convenience, as Ollama provides model information that identifies the context window size. Still, I've run into a problem, specifically with DeepseekCoder v2, where the context size in the Ollama model file does not match what the model actually supports. I ran across some documentation mentioning that this is the case for the smaller models trained with fewer parameters.

If the context window is inferred from model parameters, I would like the workspace to still show a context window size whose value changes when a different provider/model is selected, populated with whatever it can infer, but that still lets the user supply an override. That would work around models that identify themselves as supporting a larger context size but, in actuality, do not.
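To make the "infer, then allow an override" idea concrete, here is a minimal sketch in TypeScript. It assumes Ollama's `/api/show` endpoint (which returns model metadata) exposes a family-prefixed `*.context_length` field in `model_info`; the function names, the default value, and the response handling are all illustrative, not AnythingLLM's actual implementation.

```ts
// Hypothetical sketch: infer a model's context window from Ollama metadata,
// but let a user-supplied override win when the advertised value is wrong
// (e.g. a model file that claims a larger context than the weights support).

type ContextResolution = { contextWindow: number; source: "override" | "ollama" | "default" };

const DEFAULT_CONTEXT_WINDOW = 4096; // assumed fallback, not AnythingLLM's real default

async function resolveContextWindow(
  ollamaBaseUrl: string,
  model: string,
  userOverride?: number,
): Promise<ContextResolution> {
  // 1. An explicit per-model override always wins.
  if (userOverride && userOverride > 0) {
    return { contextWindow: userOverride, source: "override" };
  }

  // 2. Otherwise ask Ollama what the model file advertises.
  const res = await fetch(`${ollamaBaseUrl}/api/show`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model }),
  });
  if (res.ok) {
    const info = (await res.json()) as { model_info?: Record<string, unknown> };
    // model_info keys are family-prefixed, e.g. "llama.context_length",
    // so scan for any "*.context_length" entry.
    const entry = Object.entries(info.model_info ?? {}).find(([key]) =>
      key.endsWith(".context_length"),
    );
    if (entry && typeof entry[1] === "number") {
      return { contextWindow: entry[1], source: "ollama" };
    }
  }

  // 3. Fall back to a conservative default.
  return { contextWindow: DEFAULT_CONTEXT_WINDOW, source: "default" };
}
```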

timothycarambat commented 1 month ago

The context window has never been set at the workspace level; it is a model property, so it would not make sense to tie it directly to a workspace.

With Ollama specifically, the context window can be configured from the LLM selection.

[Screenshot (2024-09-16): Ollama LLM selection settings showing the configurable context window field]

However, this issue reads very oddly to me. Is the request here to add a context window size override based on the model selection, since you can select a model per workspace, which could indeed result in the model running on a workspace having a different context window from the selected default LLM?

Just trying to better understand the ask

JDelekto commented 1 month ago

Hi @timothycarambat, I think I'm trying to convey the idea of making MaxTokens configurable per unique provider and model. So, for example, if I'm using the Ollama provider and Ollama has three models downloaded and available, then MaxTokens is configurable per model. If I'm not mistaken (and this is specifically the case for Ollama models), metadata in the model file provides information about the context size.
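As a rough illustration of the per-provider/per-model MaxTokens idea, a sketch along these lines (TypeScript) would keep one global setting and override it per "provider:model" pair. The map keys, the specific numbers, and the function name are all hypothetical examples, not values from AnythingLLM or Ollama.

```ts
// Hypothetical sketch: per-provider/per-model token limits that override a
// single global MaxTokens setting. All names and numbers are illustrative.

const GLOBAL_MAX_TOKENS = 128_000;

// Example overrides for a few Ollama models with different real limits.
const perModelMaxTokens: Record<string, number> = {
  "ollama:deepseek-coder-v2": 8_192,
  "ollama:llama3.1": 128_000,
  "ollama:phi3": 4_096,
};

function maxTokensFor(provider: string, model: string): number {
  return perModelMaxTokens[`${provider}:${model}`] ?? GLOBAL_MAX_TOKENS;
}

// maxTokensFor("ollama", "deepseek-coder-v2") -> 8192
// maxTokensFor("ollama", "mistral")           -> 128000 (falls back to the global setting)
```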

I can see why setting it per workspace is impractical, since the same model can be used in different workspaces. However, I have my MaxTokens set to 128k, but I am using some models that only have an 8k context window. My concern is that setting MaxTokens that large is problematic when I have two different workspaces set up with Ollama as the provider but using two different models.

In some cases, I might want to use a lower MaxTokens for certain models, but use the largest possible for those that support it. Hopefully that makes a little more sense.
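One simple way to express the "128k setting vs. 8k model" concern is to clamp the global MaxTokens to whatever the workspace's selected model can actually hold. This is only a sketch under assumed names and shapes (`WorkspaceLLM`, `modelContextWindow`, `effectiveMaxTokens` are illustrative), not how AnythingLLM resolves limits today.

```ts
// Hypothetical sketch: clamp the global MaxTokens to the selected model's
// real context window so one large global setting cannot overflow a small model.

interface WorkspaceLLM {
  provider: string;
  model: string;
  modelContextWindow: number; // inferred from the provider or user-supplied, as discussed above
}

function effectiveMaxTokens(globalMaxTokens: number, llm: WorkspaceLLM): number {
  // Never request more tokens than the model can actually handle.
  return Math.min(globalMaxTokens, llm.modelContextWindow);
}

// With the global setting at 128k:
// effectiveMaxTokens(128_000, { provider: "ollama", model: "deepseek-coder-v2", modelContextWindow: 8_192 })
//   -> 8192
// effectiveMaxTokens(128_000, { provider: "ollama", model: "llama3.1", modelContextWindow: 128_000 })
//   -> 128000
```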