With the default context length settings, each query to the Llama model is limited to roughly 2,000 tokens (Ollama's default `num_ctx` is 2048). To make full use of the Llama 3 model, especially for complex tasks that require a larger context, the maximum context length must be set explicitly.
Below is an example payload configuration that sets the maximum context length to 8,192 tokens; this is not currently the default behavior:
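A minimal sketch, assuming the query is sent to Ollama's `/api/generate` endpoint; the model name and prompt are placeholders, and `num_ctx` is Ollama's documented option for the context window size:

```json
{
  "model": "llama3",
  "prompt": "...",
  "options": {
    "num_ctx": 8192
  }
}
```

With this payload, a single request can use up to 8,192 tokens of context instead of the ~2,048-token default.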
Expected Behavior:
The context length should be configurable, up to the maximum supported by the model, directly through the payload options.
Actual Behavior:
The context length defaults to roughly 2,000 tokens, which is often insufficient for in-depth analyses or the larger data contexts users need.
Suggested Fix:
Expose an option in the model configuration to set the maximum context length to match the user's needs or the task at hand.
Note: this issue applies only to Ollama queries.
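For what it's worth, Ollama already accepts `num_ctx` both per request (as in the payload above) and per model via a Modelfile line such as `PARAMETER num_ctx 8192`, so the fix may amount to exposing that existing option through the configuration.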