Weaxs closed this issue 2 weeks ago
I found two similar discussions that might be relevant to your issue.

Regarding your specific issue, the `max_tokens` parameter for the `meta-llama/Meta-Llama-3.1-405B-Instruct` model is indeed set to a high value in the configuration interface, but the actual operational limit is 4096 tokens. This discrepancy can occur for various reasons, such as implementation constraints, API limitations, or practical performance considerations that are not reflected in the configuration interface [3][4].
The default value for the `max_tokens` parameter in the Dify configuration for the `meta-llama/Meta-Llama-3.1-405B-Instruct` model is 512, with a maximum allowed value of 4096 [5].
The validation for the `max_tokens` parameter is implemented in the `api/core/model_runtime/model_providers/openrouter/llm/llama-3.1-8b-instruct.yaml` file, where it is restricted to a maximum of 4096 tokens [6][7].
If you need further assistance, you might want to check the configuration file `api/core/model_runtime/model_providers/openrouter/llm/llama-3.1-405b-instruct.yaml` for detailed settings and limitations [8][9].
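For reference, the relevant rule in such a provider YAML usually takes roughly the shape below. This is a minimal sketch based on the general structure of Dify's model-runtime parameter rules; the surrounding fields are abbreviated and the exact keys and values should be verified against the actual file in the repository.

```yaml
# Minimal sketch of a Dify OpenRouter model definition (abbreviated).
# Field names follow the shape of Dify's model-runtime schema; the
# default (512) and maximum (4096) mirror the limits cited above.
model: meta-llama/llama-3.1-405b-instruct
model_type: llm
model_properties:
  mode: chat
parameter_rules:
  - name: max_tokens
    use_template: max_tokens
    type: int
    default: 512
    min: 1
    max: 4096   # values above this are rejected by Dify's validation
```

Note that raising `max` here would only change what Dify's own validation accepts; the upstream API can still enforce its own limit, which is one source of the discrepancy described above.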
To continue talking to Dosu, mention @dosu.
Self Checks
Dify version
0.7.3
Cloud or Self Hosted
Cloud
Steps to reproduce
✔️ Expected Behavior
No response
❌ Actual Behavior
No response