langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Loss of Chat History Retention in GLM Text Generation Model When Prompts Exceed 1000 Characters (Agent) #10571

Open Ranyi176 opened 2 weeks ago

Ranyi176 commented 2 weeks ago

Self Checks

Dify version

0.11.0

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

In Dify versions higher than 0.9.0, when using the agent module with a prompt longer than 1000 characters, the GLM-4 series models lose the ability to retain chat history.

1. I entered a 1200-character Chinese prompt. My first input was "Please remember my name."; my second input was "What is my name?". The model gave an incorrect answer.
2. As a comparison, I entered a 900-character Chinese prompt with the same two inputs. The model answered my question correctly.

✔️ Expected Behavior

Even when the prompt exceeds 1000 Chinese characters, the Dify agent module's calls to the GLM-4 series models should still preserve chat-history context.

❌ Actual Behavior

For prompts exceeding 1000 Chinese characters, the GLM-4 chatbot built with the agent module fails to maintain chat-history context.

dosubot[bot] commented 2 weeks ago

@Ranyi176 Hello there! I'm here to help you with any bugs, questions, or contributions while you wait for a human maintainer. I'm looking forward to assisting you!

The issue you're experiencing with the GLM-4 series models losing chat history retention when prompts exceed 1000 characters might be related to how Dify manages input prompts and chat history. Dify uses a class called SimplePromptTransform to handle prompt transformations for both chat and completion models. This class ensures that the total token count does not exceed the model's limit by managing chat history with a TokenBufferMemory [1].
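As a rough illustration of that mechanism, the sketch below shows what token-buffer trimming typically looks like. It is a minimal, generic sketch, not Dify's actual `SimplePromptTransform` or `TokenBufferMemory` code; the function name, signature, and budget formula are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple


def trim_history(
    history: List[Tuple[str, str]],
    prompt_tokens: int,
    context_window: int,
    max_tokens: int,
    count_tokens: Callable[[str], int],
) -> List[Tuple[str, str]]:
    """Keep only the most recent (user, assistant) turns that fit the remaining budget.

    Generic sketch of token-buffer trimming; every name and the budget formula
    here are assumptions for illustration, not Dify's actual implementation.
    """
    # Whatever the prompt and the completion budget (max_tokens) do not consume
    # is what remains available for chat history.
    budget = context_window - max_tokens - prompt_tokens
    kept: List[Tuple[str, str]] = []
    used = 0
    # Walk from the newest turn backwards, keeping turns until the budget runs out.
    for user_msg, assistant_msg in reversed(history):
        turn_tokens = count_tokens(user_msg) + count_tokens(assistant_msg)
        if used + turn_tokens > budget:
            break
        kept.append((user_msg, assistant_msg))
        used += turn_tokens
    kept.reverse()
    return kept
```

If the prompt alone pushes the budget to zero or below, the loop drops every turn, which would look exactly like the model "forgetting" earlier messages.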

In your case, it seems that when the prompt length exceeds a certain threshold, the chat history might not be retained correctly. You might want to check the configuration settings for token limits in the glm_4_long.yaml file, where the max_tokens parameter can be customized. The default value is set to 1024, with a maximum of 4095 [2]. Adjusting this setting might help in retaining chat history for longer prompts.
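Plugging hypothetical numbers into the sketch above shows how a long prompt combined with a large `max_tokens` setting can leave no budget at all, at which point every history turn is dropped. The values below are illustrative assumptions, not the real GLM-4 context size or Dify defaults:

```python
# Hypothetical values only -- not the real GLM-4 limits or Dify defaults.
history = [("Please remember my name.", "Okay, I will remember your name.")]
count = len  # crude stand-in tokenizer for the sketch

trimmed = trim_history(
    history,
    prompt_tokens=2400,      # a ~1200-character Chinese prompt, assuming ~2 tokens per character
    context_window=4096,
    max_tokens=2048,
    count_tokens=count,
)
print(trimmed)  # [] -- budget = 4096 - 2048 - 2400 < 0, so no history survives
```

Lowering `max_tokens` (or shortening the prompt) frees that budget back up, which would be consistent with the 900-character prompt working in your comparison test.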

To continue talking to Dosu, mention @dosu.