langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

bug: max_tokens limit exceeded when changing LLM #8027

Closed Weaxs closed 2 weeks ago

Weaxs commented 2 months ago

Self Checks

Dify version

0.7.3

Cloud or Self Hosted

Cloud

Steps to reproduce

  1. choose GPT-4o-mini
  2. set max_tokens to 16384 and save
  3. change the model to SiliconFlow meta-llama/Meta-Llama-3.1-405B-Instruct
  4. run the workflow
  5. the LLM node raises an error (see the config sketch below)

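For context, the relevant part of the LLM node's settings in an exported workflow DSL looks roughly like the sketch below. The field names and the temperature value are illustrative assumptions rather than an exact export; the point is that the max_tokens value saved for GPT-4o-mini is carried over unchanged when the model is switched.

```yaml
# Hypothetical excerpt of the LLM node config after switching models.
# Key names follow Dify's exported DSL; exact keys may differ by version.
model:
  provider: siliconflow                            # was: openai
  name: meta-llama/Meta-Llama-3.1-405B-Instruct    # was: gpt-4o-mini
  mode: chat
  completion_params:
    temperature: 0.7
    max_tokens: 16384   # carried over from GPT-4o-mini; exceeds the new model's limit
```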

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

dosubot[bot] commented 2 months ago

I found two similar discussions that might be relevant to your issue:

  1. Max Token Limits Incorrect? - This discussion is still open and was last updated on August 14, 2024.
  2. Error when I modify the max_tokens of the custom Maximum chunk length from 1000 to 2000 - This discussion is also open and was last updated on May 4, 2024 [1][2].

Regarding your specific issue, the max_tokens parameter for the meta-llama/Meta-Llama-3.1-405B-Instruct model is indeed set to a high value in the configuration interface, but the actual operational limit is 4096 tokens. This discrepancy can arise from implementation constraints, API limitations, or practical performance considerations that are not reflected in the configuration interface [3][4].

The default value for the max_tokens parameter in the Dify configuration for the meta-llama/Meta-Llama-3.1-405B-Instruct model is 512, with a maximum allowed value of 4096 [5].

The validation for the max_tokens parameter is implemented in the api/core/model_runtime/model_providers/openrouter/llm/llama-3.1-8b-instruct.yaml file, where it is restricted to a maximum of 4096 tokens [6][7].

If you need further assistance, you might want to check the configuration file api/core/model_runtime/model_providers/openrouter/llm/llama-3.1-405b-instruct.yaml for detailed settings and limitations [8][9].
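As a rough sketch (not the verbatim file contents), the max_tokens rule in one of these model YAML files follows Dify's parameter_rules schema, using the default of 512 and ceiling of 4096 cited above; any other fields shown are assumptions:

```yaml
# Illustrative parameter_rules entry, modeled on the model-provider YAML files
# referenced above; exact fields and values may differ in the actual file.
parameter_rules:
  - name: max_tokens
    use_template: max_tokens
    default: 512
    min: 1
    max: 4096
```

If that reading is correct, raising max in the YAML only widens what the configuration interface accepts; the provider's API can still reject values above its real operational limit.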

To continue talking to Dosu, mention @dosu.