langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

glm-4v series max_tokens wrong #7880

Closed: orangeclk closed this issue 2 weeks ago

orangeclk commented 2 weeks ago

Self Checks

Dify version

0.7.2

Cloud or Self Hosted

Self Hosted (Source)

Steps to reproduce

  1. Use a glm-4v series model and set max_tokens above 1024.
  2. Send any message.

✔️ Expected Behavior

No response

❌ Actual Behavior

(Screenshot of the error: [zhipuai] Error: RetryError(<Future at 0x7fd05a3c41f0 state=finished raised APIRequestFailedError>))

dosubot[bot] commented 2 weeks ago

I found two similar discussions that might be relevant to your issue:

  1. Max Token Limits Incorrect? (open) [1]
  2. Error when I modify the max_tokens of the custom Maximum chunk length from 1000 to 2000 (open) [2]

To resolve the "[zhipuai] Error: RetryError(<Future at 0x7fd05a3c41f0 state=finished raised APIRequestFailedError>)" that appears when max_tokens is set above 1024 for a glm-4v series model, ensure the max_tokens parameter stays within the range allowed by the model configuration. The provided YAML configuration allows max_tokens values from 1 to 8192.
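The RetryError(<Future ...>) wrapper in that message is characteristic of the tenacity retry library, which hides the underlying API error. Here is a minimal sketch of surfacing the real exception, assuming tenacity is what wraps the request (call_glm4v is a hypothetical wrapper around the zhipuai call):

from tenacity import RetryError

try:
    response = call_glm4v("describe this image", max_tokens=2048)  # hypothetical wrapper
except RetryError as e:
    # tenacity stores the last attempt as a Future; .exception() returns
    # the real error, here zhipuai's APIRequestFailedError
    underlying = e.last_attempt.exception()
    print(type(underlying).__name__, underlying)

Seeing the unwrapped APIRequestFailedError usually reveals the provider's actual complaint, for example an out-of-range parameter.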

Here is the relevant part of the configuration:

- name: max_tokens
  use_template: max_tokens
  default: 1024
  min: 1
  max: 8192

Make sure that the value you set for max_tokens does not exceed 8192. If it is within this range and the error still occurs, other constraints imposed by the API or the environment in which the model runs may be responsible. Also check that parameters such as temperature and top_p are set correctly, since improper settings can contribute to the issue [3][4].
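Because the YAML range only constrains what the Dify UI accepts, it can disagree with the limit the provider actually enforces. One way to check the provider-side limit is to call the ZhipuAI API directly with the same max_tokens values; a minimal sketch, assuming the v2 zhipuai Python SDK (the API key and probe values are placeholders):

from zhipuai import ZhipuAI

client = ZhipuAI(api_key="YOUR_API_KEY")  # placeholder key

# Probe whether the provider itself accepts max_tokens above 1024 for glm-4v.
for max_tokens in (1024, 1025, 2048):
    try:
        client.chat.completions.create(
            model="glm-4v",
            messages=[{"role": "user", "content": "hello"}],
            max_tokens=max_tokens,
        )
        print(max_tokens, "accepted")
    except Exception as exc:  # e.g. APIRequestFailedError
        print(max_tokens, "rejected:", exc)

If values above 1024 are rejected here as well, that would suggest the YAML's max: 8192 overstates what the glm-4v endpoint accepts, and the configured maximum, not the user's setting, is what needs fixing.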

To continue talking to Dosu, mention @dosu.