Closed: juniorsereno closed this issue 2 months ago
@juniorsereno Hello there! I'm Dosu, a bot here to help you with bug fixes, questions, and contributions while you wait for a human maintainer.
I found a similar unsolved discussion regarding limiting the number of messages to control token usage [1].
To address the issue of exorbitant token usage after multiple interactions, you can implement a context window limit for the agent. This can be done by modifying the iteration limit in the agent settings. Specifically, you can adjust the `max_iteration_steps` parameter in the `FunctionCallAgentRunner` class to control the number of iterations the agent can perform, thereby managing token usage.
Here is an example of how you can set this limit:
```python
class FunctionCallAgentRunner(BaseAgentRunner):
    def run(self, message: Message, query: str, **kwargs: Any) -> Generator[LLMResultChunk, None, None]:
        # ... other method code ...
        iteration_step = 1
        max_iteration_steps = min(app_config.agent.max_iteration, 5) + 1  # set the max iteration steps
        # ... other method code ...
        while function_call_state and iteration_step <= max_iteration_steps:
            function_call_state = False
            if iteration_step == max_iteration_steps:
                # the last iteration: remove all tools
                prompt_messages_tools = []
            # ... other loop code ...
            iteration_step += 1
        # ... other method code ...
```
By setting `max_iteration_steps` to a desired value, you can limit the number of iterations and thus control the token usage effectively [2].
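To see how this cap bounds the loop, here is a minimal standalone simulation. It is an illustrative sketch, not Dify code: `needs_tools_after` is a made-up knob standing in for how long the model keeps requesting tools.

```python
def run_agent_loop(max_iteration: int, needs_tools_after: int) -> int:
    """Illustrative sketch of how max_iteration_steps bounds the agent loop."""
    max_iteration_steps = min(max_iteration, 5) + 1
    iteration_step = 1
    function_call_state = True
    tools = ["search", "calculator"]
    while function_call_state and iteration_step <= max_iteration_steps:
        if iteration_step == max_iteration_steps:
            tools = []  # last iteration: no tools, forcing a final answer
        # pretend the model keeps calling tools until needs_tools_after steps
        function_call_state = bool(tools) and iteration_step < needs_tools_after
        iteration_step += 1
    return iteration_step - 1  # number of iterations actually executed
```

However long the model would keep calling tools, the loop never runs more than `max_iteration_steps` times, e.g. `run_agent_loop(99, 100)` executes only 6 iterations.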
Additionally, token usage is monitored and controlled in other parts of the codebase, such as the `CotAgentRunner` and `AgentChatAppRunner` classes, where functions like `increase_usage` and `get_pre_calculate_rest_tokens` help manage token consumption [3][4].
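The accumulation pattern behind `increase_usage` can be sketched as follows. The field names here are illustrative stand-ins, not Dify's exact usage dataclass:

```python
from dataclasses import dataclass

@dataclass
class LLMUsage:
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

def increase_usage(final_usage: LLMUsage, usage: LLMUsage) -> None:
    """Accumulate per-call usage into a running total across agent iterations."""
    final_usage.prompt_tokens += usage.prompt_tokens
    final_usage.completion_tokens += usage.completion_tokens

# usage: sum two model calls into one running total
total = LLMUsage()
increase_usage(total, LLMUsage(prompt_tokens=120, completion_tokens=30))
increase_usage(total, LLMUsage(prompt_tokens=80, completion_tokens=20))
```

Tracking a running total like this is what lets the runner report (and react to) cumulative token spend across a multi-step conversation.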
To continue talking to Dosu, mention @dosu.
@dosu How do I make this modification with Dify already installed? What is the path of this file after installation on a VPS? And how do I make this modification in a Dify installation on Easypanel, do you have this information?
To modify the `max_iteration_steps` parameter in the `FunctionCallAgentRunner` class for a Dify installation on a VPS or Easypanel, follow these steps:
Locate the Dify Source Code Directory: Navigate to the directory where you cloned the Dify source code. Typically:
```shell
cd /path/to/dify
```
Modify the `FunctionCallAgentRunner` Class: Open the `fc_agent_runner.py` file located in the `api/core/agent` directory:
```shell
nano api/core/agent/fc_agent_runner.py
```
Update the `max_iteration_steps` Parameter: Locate the following line in the `run` method:
```python
max_iteration_steps = min(app_config.agent.max_iteration, 5) + 1
```
Change the value `5` to your desired maximum number of iteration steps. For example, to set it to 10:
```python
max_iteration_steps = min(app_config.agent.max_iteration, 10) + 1
```
Restart the Docker Containers: Navigate to the Docker directory and restart the containers:
```shell
cd docker
docker compose down
docker compose up -d
```
Access the Easypanel Interface: Log in to your Easypanel dashboard and navigate to the Dify application.
Open the Terminal: Use the terminal feature in Easypanel to access the Dify container.
Modify the `FunctionCallAgentRunner` Class: Open the `fc_agent_runner.py` file located in the `api/core/agent` directory:
```shell
nano /path/to/dify/api/core/agent/fc_agent_runner.py
```
Update the `max_iteration_steps` Parameter: Locate and modify the `max_iteration_steps` parameter as described above.
Restart the Docker Containers: Use the Easypanel interface to restart the Dify application. This will apply the changes.
By following these steps, you can modify the `max_iteration_steps` parameter in both VPS and Easypanel installations to control token usage effectively [1].
For more detailed instructions on deploying updates in Easypanel, you can refer to the Dify documentation [2][3].
@dosu The Dify container in EasyPanel does not have a folder called `api`. Apparently it is not possible to make this adjustment in an EasyPanel installation.
To modify the `max_iteration_steps` parameter in a Dify installation on Easypanel, even if the container does not have an `api` folder, you can follow these steps:
Access the Container's File System: Use the terminal feature in Easypanel to access the Dify container.
Locate the `fc_agent_runner.py` File: The file might be located in a different directory. You can use the `find` command to locate it:
```shell
find / -name fc_agent_runner.py
```
Modify the `max_iteration_steps` Parameter: Once you locate the `fc_agent_runner.py` file, open it with a text editor like `nano`:
```shell
nano /path/to/fc_agent_runner.py
```
Locate the following line in the `run` method:
```python
max_iteration_steps = min(app_config.agent.max_iteration, 5) + 1
```
Change the value `5` to your desired maximum number of iteration steps. For example, to set it to 10:
```python
max_iteration_steps = min(app_config.agent.max_iteration, 10) + 1
```
Restart the Dify Application: Use the Easypanel interface to restart the Dify application. This will apply the changes.
If you still cannot find the `fc_agent_runner.py` file, you might need to check the Dify documentation or contact the support team for further assistance.
@dosu In Easypanel I got this error in the terminal:
```
bash: nano: command not found
```
To modify the `max_iteration_steps` parameter in a Dify installation on Easypanel without the `nano` command, you can use another text editor available on your system, such as `vi` or `vim`. Here are the steps:
Access the Container's File System: Use the terminal feature in Easypanel to access the Dify container.
Locate the `fc_agent_runner.py` File: If you haven't already located the file, you can use the `find` command:
```shell
find / -name fc_agent_runner.py
```
Open the File with `vi` or `vim`: Once you locate the `fc_agent_runner.py` file, open it:
```shell
vi /path/to/fc_agent_runner.py
```
Modify the `max_iteration_steps` Parameter: Locate the following line in the `run` method:
```python
max_iteration_steps = min(app_config.agent.max_iteration, 5) + 1
```
Change the value `5` to your desired maximum number of iteration steps. For example, to set it to 10:
```python
max_iteration_steps = min(app_config.agent.max_iteration, 10) + 1
```
Save the Changes and Exit the Editor: In `vi` or `vim`, press `Esc`, type `:wq`, and press `Enter`.
Restart the Dify Application: Use the Easypanel interface to restart the Dify application. This will apply the changes.
This will update the `max_iteration_steps` parameter in your Dify installation [1].
@dosu I need to limit the agent's context window, not the number of interactions with tools. The context window is unbounded, and it is causing an abusive token bug; I need to limit the context window so that the agent does not accumulate too many messages.
To limit the agent's context window and prevent it from accumulating too many messages, you can use the `AgentHistoryPromptTransform` class. This class includes logic to manage the number of messages based on token usage: it calculates the maximum token limit and ensures that the accumulated messages do not exceed this limit.
Here is the relevant code snippet that handles this:
```python
def get_prompt(self) -> list[PromptMessage]:
    prompt_messages = []
    num_system = 0
    for prompt_message in self.history_messages:
        if isinstance(prompt_message, SystemPromptMessage):
            prompt_messages.append(prompt_message)
            num_system += 1

    if not self.memory:
        return prompt_messages

    max_token_limit = self._calculate_rest_token(self.prompt_messages, self.model_config)
    model_type_instance = self.model_config.provider_model_bundle.model_type_instance
    model_type_instance = cast(LargeLanguageModel, model_type_instance)
    curr_message_tokens = model_type_instance.get_num_tokens(
        self.memory.model_instance.model, self.memory.model_instance.credentials, self.history_messages
    )
    if curr_message_tokens <= max_token_limit:
        return self.history_messages

    num_prompt = 0
    for prompt_message in self.history_messages[::-1]:
        if isinstance(prompt_message, SystemPromptMessage):
            continue
        prompt_messages.append(prompt_message)
        num_prompt += 1
        if isinstance(prompt_message, UserPromptMessage):
            curr_message_tokens = model_type_instance.get_num_tokens(
                self.memory.model_instance.model, self.memory.model_instance.credentials, prompt_messages
            )
            if curr_message_tokens > max_token_limit:
                prompt_messages = prompt_messages[:-num_prompt]
                break
            num_prompt = 0

    message_prompts = prompt_messages[num_system:]
    message_prompts.reverse()
    prompt_messages = prompt_messages[:num_system]
    prompt_messages.extend(message_prompts)
    return prompt_messages
```
This method ensures that the number of tokens used by the messages does not exceed the maximum token limit, effectively limiting the context window [1].
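The trimming idea can be illustrated with a self-contained sketch. This is not Dify's code: messages are simplified to `(role, text)` tuples, and `count_tokens` is a crude stand-in (one "token" per word) for the model's real tokenizer.

```python
# Self-contained sketch of token-budget history trimming, assuming simplified
# (role, text) message tuples and a word-count token counter.
def count_tokens(messages: list[tuple[str, str]]) -> int:
    # crude stand-in: one "token" per word
    return sum(len(text.split()) for _, text in messages)

def trim_history(history: list[tuple[str, str]], max_token_limit: int) -> list[tuple[str, str]]:
    """Keep system messages, then as many recent messages as fit the budget."""
    system = [m for m in history if m[0] == "system"]
    rest = [m for m in history if m[0] != "system"]
    if count_tokens(system + rest) <= max_token_limit:
        return history
    kept: list[tuple[str, str]] = []
    # walk newest-to-oldest, stopping once adding a message would bust the budget
    for msg in reversed(rest):
        if count_tokens(system + kept + [msg]) > max_token_limit:
            break
        kept.append(msg)
    kept.reverse()
    return system + kept
```

The real implementation is more careful (it drops whole user/assistant rounds rather than individual messages), but the shape is the same: pin the system prompt, then fill the remaining token budget from the most recent history backwards.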
Additionally, the `FunctionCallAgentRunner` class in the Dify platform provides a comprehensive implementation for managing the agent's context and token usage. It includes methods to organize prompt messages, handle tool invocations, and manage the agent's scratchpad, ensuring that the agent's context is efficiently managed and excessive token usage is prevented.
```python
class FunctionCallAgentRunner(BaseAgentRunner):
    # ... other methods ...

    def _organize_prompt_messages(self):
        prompt_template = self.app_config.prompt_template.simple_prompt_template or ""
        self.history_prompt_messages = self._init_system_message(prompt_template, self.history_prompt_messages)
        query_prompt_messages = self._organize_user_query(self.query, [])
        self.history_prompt_messages = AgentHistoryPromptTransform(
            model_config=self.model_config,
            prompt_messages=[*query_prompt_messages, *self._current_thoughts],
            history_messages=self.history_prompt_messages,
            memory=self.memory,
        ).get_prompt()
        prompt_messages = [*self.history_prompt_messages, *query_prompt_messages, *self._current_thoughts]
        if len(self._current_thoughts) != 0:
            # clear messages after the first iteration
            prompt_messages = self._clear_user_prompt_image_messages(prompt_messages)
        return prompt_messages
```
This implementation ensures that the agent's context window is effectively managed, preventing excessive token usage and maintaining optimal performance [2].
@dosu I understood practically nothing; I just want to limit the agent's context window to 20 messages. Will an admin still look into my question?
To limit the agent's context window to 20 messages, you can adjust the logic in the `AgentHistoryPromptTransform` class. Here is an example of how to do this:
```python
class AgentHistoryPromptTransform(PromptTransform):
    # ... other methods and initialization ...

    def get_prompt(self) -> list[PromptMessage]:
        max_history_messages = 20  # cap the context window at 20 history messages

        # always keep the system messages at the front
        prompt_messages = []
        num_system = 0
        for prompt_message in self.history_messages:
            if isinstance(prompt_message, SystemPromptMessage):
                prompt_messages.append(prompt_message)
                num_system += 1

        if not self.memory:
            return prompt_messages

        # keep only the most recent non-system messages, up to the cap
        non_system_messages = [
            m for m in self.history_messages if not isinstance(m, SystemPromptMessage)
        ]
        prompt_messages.extend(non_system_messages[-max_history_messages:])
        return prompt_messages
```
This example caps the history at 20 messages, so the number of messages in the final prompt never exceeds that limit [1][2].
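Independently of Dify's internals, the same fixed-size window can be expressed as a small helper. This is a framework-independent sketch using simplified `(role, text)` message tuples, not Dify's API:

```python
# Fixed-size message window: keep system messages plus only the most
# recent N non-system messages.
def window_messages(history: list[tuple[str, str]], max_messages: int = 20) -> list[tuple[str, str]]:
    system = [m for m in history if m[0] == "system"]
    rest = [m for m in history if m[0] != "system"]
    return system + rest[-max_messages:]
```

Note that this bounds the *number* of messages, not the token count; if individual messages can be very long, a token-based budget (as in the earlier snippet) is the safer guard.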
If you need more help, one of the maintainers can take a closer look at your case. @takatost, could you help with this question?
Waiting for @takatost. I would really like a limit on the agent's context window, because this is the best and easiest agent builder I have ever used. I love the agent, but without a context window it becomes madness in token consumption.
Not only the agent but also the basic chatbot has this problem; however, we are not considering fixing this at the moment.
Dify version
0.8.3
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
I have an agent that I use as a personal assistant. I disabled it because it was consuming 125,000 to 200,000 tokens per response after 200 interactions.
I suggest that the community or the devs add a context window to the Agent; otherwise it will only be used for testing, never for a real application, because it will be unsustainable in the long run, forcing us to build a Chatflow just to get a context window and avoid this expense.
I love the ease of the Agent, but with this problem it becomes unfeasible to use in the medium to long term.
✔️ Expected Behavior
A context window limit on the agent.
❌ Actual Behavior
A gradual, unsustainable token expense occurs after a number of interactions (100 or more) with the agent.